Data Mining

Learn about data mining tools and the importance of choosing the right one.
This course is part of
Transforming Logistics with Analytics
Raymond Hoogendoorn & Marianne Peeters

About the course

The module on data mining begins by outlining the learning objectives: understanding the data mining process and gaining familiarity with its tasks, methods, and tools. The course aims to equip participants to select and structure data mining projects that extract valuable insights from data for solving real-world problems.

The data mining process, based on the widely adopted CRISP-DM approach, consists of six phases. It starts with business understanding, where a business problem or opportunity is translated into a data mining project. The subsequent phases are data understanding, data preparation, modeling, evaluation, and deployment. Iteration between phases is highlighted as common practice throughout the process. Each phase requires specific skills, including a deep understanding of the business, knowledge of databases, and proficiency in data mining methods. The module also categorizes data mining methods into supervised (classification, regression, time series analysis) and unsupervised (clustering, association analysis) types. Various techniques and algorithms are mentioned, with a focus on the goal each method aims to achieve.
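
The difference between supervised and unsupervised methods can be illustrated with a small sketch. The example below is not taken from the course, which uses KNIME and R; it is written in Python purely for illustration, and the parcel weights, the labels, and the function names are hypothetical. A nearest-centroid classifier stands in for the supervised methods, because it learns from examples with known labels, and a simple two-cluster routine stands in for the unsupervised methods, because it groups the same values without seeing any labels.

```python
from statistics import mean

# Hypothetical toy data: parcel weights in kg (not from the course).
weights = [1.0, 1.2, 0.8, 9.5, 10.1, 10.4]
labels = ["small", "small", "small", "large", "large", "large"]


def fit_centroids(xs, ys):
    """Supervised: learn one mean value per known label."""
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }


def classify(x, centroids):
    """Assign x to the label whose learned mean is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(xs, iterations=10):
    """Unsupervised: split xs into two groups without using any labels."""
    low, high = min(xs), max(xs)  # initial cluster centers
    for _ in range(iterations):
        group_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        group_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(group_low), mean(group_high)
    return group_low, group_high


centroids = fit_centroids(weights, labels)
predicted = classify(2.0, centroids)   # uses the labels learned above
clusters = two_means(weights)          # never sees the labels
```

In this sketch the classifier can name its prediction because the training data carried labels, while the clustering routine only returns two unnamed groups that an analyst still has to interpret.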

Finally, the module emphasizes the importance of choosing the right data mining tools. These include commercial options such as IBM SPSS Modeler and SAS Enterprise Miner, as well as free/open-source tools such as RapidMiner, KNIME, WEKA, and Orange. Programming languages such as Python and R are also noted for data analysis and modeling; KNIME and R are the tools used in the course. The overall goal of this module is to give students the skills needed to run successful data mining projects that align with business objectives and make effective use of tools and methodologies.

More about the authors

Raymond Hoogendoorn
Dr. Raymond Hoogendoorn is Professor in Artificial Intelligence for Logistics at the Center of Expertise HRTech of Rotterdam University of Applied Sciences and Senior Scientist at TNO. He focuses on developing and implementing Artificial Intelligence methods and techniques to make strategic, tactical and operational processes in logistics more efficient and sustainable. Raymond has an MSc in Psychology and a PhD in Civil Engineering. He has held various roles, including Assistant Professor at TU Delft, Head of Data Science at Transavia and Director Consumer Insights at Lowell Financial.
Marianne Peeters
Drs. Marianne Peeters is a teacher and internship and graduation coordinator within the logistics courses of Fontys School of Technology and Logistics. Marianne's fields of expertise are supply chain management, operations research and data science.

This publication is part of the project 'small projects 2022 route transport and logistics' with project number NWA.1418.22.023, which is financed by the Dutch Research Council (NWO).