Data Mining & Machine Learning

Base Knowledge

No specific prior knowledge is recommended for this curricular unit.

Teaching Methodologies

Classes will follow a theoretical-practical format and combine three pedagogical methods: expository, demonstrative and project-based learning.

The expository method will be used to present the concepts and main contents of the curricular unit. The teacher organizes and orally presents the contents, structuring the reasoning and the result to be obtained. The presentation will be supported by slides, which will later be made available to students, and complemented with selected references.

The demonstrative method will be used to illustrate applications of the concepts, namely the different techniques covered for each data mining and machine learning task. Working from the practical worksheets provided, the teacher shares know-how, demonstrates the exercises and helps students complete them successfully, sometimes on paper and sometimes on a computer using a dedicated tool.

The project-based learning (PBL) method will be used to build knowledge through sustained, continuous study aimed at solving a challenge or problem: the development of a data mining and machine learning project using data from an organization (private or public), open data, or data collected through a survey.

Learning Outcomes

Many organizations across various sectors implement data mining projects, which allow them to obtain new knowledge, such as behavior patterns and future trends, and thus make decisions more proactively. This Data Mining (DM) & Machine Learning (ML) curricular unit aims to teach how to build more robust models that can analyze increasingly complex data, deliver faster and more accurate results, and make it easier to identify opportunities or risks.

The main objectives to be achieved are:

O1 – Introduce the concepts of data mining and machine learning, their differences and complementarities

O2 – Present the motivations and application domains of data mining and machine learning

O3 – Provide students with the essential knowledge for each type of activity

O4 – Know and understand the principles of some of the most common techniques

O5 – Be able to apply the techniques covered in practice

O6 – Understand the steps of an adequate methodology to develop a DM/ML project

The main competences to be developed are:

C1 – Ability to elaborate questions that can be answered by a DM/ML project

C2 – Ability to frame a DM/ML project and analyze its feasibility

C3 – Ability to plan and implement a DM/ML project in its various stages

C4 – Ability to propose, create and interpret models suitable for real problems and challenges

C5 – Ability to critique existing models and propose alternative models

Program

1. Introduction to Data Mining and Machine Learning

2. Machine learning categories

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

3. Predictive activities

  • Classification
  • Forecasting
  • Trend analysis (time series)

4. Descriptive activities

  • Grouping (clustering)
  • Summarization (and visualization)
  • Association

5. CRISP-DM Methodology

  • Business understanding
  • Data understanding
  • Data preparation
  • Modeling
  • Evaluation
  • Deployment

6. Main techniques

  • Decision trees
  • Association rules
  • Linear regression
  • Artificial neural networks
  • Fuzzy sets and fuzzy logic
  • Bayesian networks
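
As an illustration of one of the techniques listed above, the following is a minimal sketch of simple linear regression fitted by ordinary least squares. It is not course material: the function names `fit_line` and `predict` and the toy data are assumptions made for this example, and only the Python standard library is used.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new x value."""
    return slope * x + intercept


# Toy data invented for illustration; the points lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction for x=5: {predict(slope, intercept, 5):.2f}")
```

In a project following CRISP-DM, a fit like this would belong to the Modeling phase, with its quality checked on held-out data before any deployment.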

Curricular Unit Teachers

Internship(s)

No

Bibliography

Main Bibliography:

Tan, P. N., Steinbach, M., Karpatne, A., & Kumar, V. (2019). Introduction to Data Mining. Global Edition. New York: Pearson Education.

Yang, X. S. (2019). Introduction to algorithms for data mining and machine learning. Academic Press.

Complementary Bibliography:

Camilo, C., & Silva, J. (2009). Mineração de dados. Goiânia: Universidade Federal de Goiás.

Chakrabarti, S., Cox, E., Frank, E., Güting, R., Han, J., Jiang, X., Neapolitan, R. (2008). Data Mining: Know It All. Burlington: Morgan

Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). CRISP-DM 1.0: Step-By-Step Data Mining

Delen, D. (2014). Real-World Data Mining. FT Press.

Dogan, A., & Birant, D. (2021). Machine learning and data mining in manufacturing. Expert Systems with Applications, 166, 114060.

Han, J., Pei, J., & Tong, H. (2023). Data mining: concepts and techniques. Morgan Kaufmann.

Larose, D. (2005). Discovering Knowledge in Data. Hoboken: John Wiley & Sons, Inc.

Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of machine learning. MIT Press.

North, M. (2012). Data mining for the masses: A Global Text Project Book.

Santos, M., & Azevedo, C. (2005). Data Mining: Descoberta de Conhecimento em Bases de Dados. Lisboa: FCA.

Wang, W., & Siau, K. (2019). Artificial intelligence, machine learning, automation, robotics, future of work and future of humanity: A review and research agenda. Journal of Database Management (JDM), 30(1), 61-79.

Witten, I., & Frank, E. (2005). Data Mining: Practical machine learning tools and techniques. San Francisco: Morgan Kaufmann.