Data mining
General data
Course ID: | 1000-2M03DM |
Erasmus code / ISCED: | 11.303 |
Course title: | Data mining |
Name in Polish: | Data mining |
Organizational unit: | Faculty of Mathematics, Informatics, and Mechanics |
Course groups: | Elective courses for second-cycle studies in Bioinformatics; Elective courses for Computer Science; Elective courses for Machine Learning |
ECTS credit allocation (and other scores): | 6.00 |
Language: | English |
Main fields of studies for MISMaP: | computer science |
Type of course: | elective monographs |
Prerequisites: | Machine learning 1000-2N09SUS |
Prerequisites (description): | It is recommended that students registering for the course have basic knowledge of machine learning methods and data processing. |
Mode: | Classroom |
Short description: |
Presents the main problems in data mining and the methods for solving them. Discusses efficient implementations, on large data collections, of basic tasks such as association rule mining, data preparation, discretization of real-valued attributes, and decision tree induction. Also presents modern computational techniques such as parallel processing, evolutionary computation, heuristics, and the use of standard databases or specially constructed data structures. |
Full description: |
1. Introduction to KDD and data mining; templates and patterns
2. Transaction data analysis and association rules; main algorithms for association rule generation: Apriori, AprioriTid, FP-tree
3. The classification problem and classifier evaluation methods; case-based methods, naive Bayes classifiers, Bayesian networks; improving nearest-neighbor classifiers
4. The entropy measure and decision tree methods
5. The clustering problem and clustering algorithms
6. Computational learning theory
7. Rule-based classifiers
8. Data cleaning and data preprocessing techniques
9. Hidden Markov models and their applications
10. Searching for sequence patterns in time series data
11. OLAP and data mining
12. Web mining and text mining |
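As an illustration of topic 2, the level-wise idea behind Apriori can be sketched as follows. This is a minimal sketch, not course material: the function name `apriori`, the `min_support` parameter (an absolute transaction count), and the toy `baskets` data are all assumptions made for this example. It finds frequent itemsets only and omits rule generation and the optimizations used by AprioriTid or FP-tree.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate itemset,
        # then keep only the candidates that meet the support threshold.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = count(items)
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The prune step relies on the downward-closure property: an itemset can be frequent only if all of its subsets are frequent, so candidates with an infrequent subset are discarded before any counting pass over the data.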
Bibliography: |
1. J. Han and M. Kamber. "Data Mining: Concepts and Techniques". Morgan Kaufmann Publishers, 2001.
2. I. Witten and E. Frank. "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations". Morgan Kaufmann Publishers, 2000.
3. Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy (eds.). "Advances in Knowledge Discovery and Data Mining". The MIT Press, 1995.
4. J. Leskovec, A. Rajaraman, and J. D. Ullman. "Mining of Massive Datasets" (2nd ed.). Cambridge University Press, 2014. |
Learning outcomes: |
Knowledge and skills:
1. Knows the basic classes of problems in data mining and knowledge discovery.
2. Knows the methods of market basket analysis and can use them in practice; understands the algorithms for finding frequent itemsets and can apply them in practice.
3. Knows basic ML algorithms and can apply them.
4. Can evaluate the effectiveness of ML models in classification, regression, and clustering problems.
5. Knows the basic text processing techniques used to build ML models and can apply them in practice.
6. Can construct simple recommendation systems and understands how they work.
7. Knows the basic methods of constructing predictive models for time series; can apply them to real-world data sets and assess their actual effectiveness.
8. Knows the current major trends in the fields of science related to machine learning and knowledge discovery from databases.
Social competence:
1. Can prepare a report on exploratory data analysis that presents the most important information using data visualization techniques.
2. Can present the results of the analyses conducted. |
Assessment methods and assessment criteria: |
The final grade is based on the total number of points from the laboratory and the exam. Doctoral students may also pass the course by completing a special project that involves participating in an international data mining competition. |
Classes in period "Summer semester 2023/24" (in progress)
Time span: | 2024-02-19 - 2024-06-16 |
Type of class: | Lab, 30 hours; Lecture, 30 hours |
Coordinators: | Hung Son Nguyen | |
Group instructors: | Hung Son Nguyen | |
Examination: | Examination |
Classes in period "Summer semester 2024/25" (future)
Time span: | 2025-02-17 - 2025-06-08 |
Type of class: | Lab, 30 hours; Lecture, 30 hours |
Coordinators: | Hung Son Nguyen | |
Group instructors: | Hung Son Nguyen, Marcin Szczuka | |
Examination: | Examination |
Copyright by University of Warsaw.