Unsupervised Learning
General data
Course ID: | 2400-DS1UL |
Erasmus code / ISCED: | 14.3 |
Course title: | Unsupervised Learning |
Name in Polish: | Unsupervised Learning |
Organizational unit: | Faculty of Economic Sciences |
Course groups: |
Elective major courses - second-cycle studies IE - group 1 (6*30h)
English-language course offering of the Faculty of Economics
Mandatory courses for 1st year students of Data Science and Business Analytics |
ECTS credit allocation (and other scores): | 3.00 |
Language: | English |
Type of course: | obligatory courses |
Short description: |
The course focuses on exploring data structure and extracting key information from unlabeled data, with no designated variable of interest. It covers three subfields of unsupervised learning: clustering, dimension reduction and association rule learning. Both theoretical and practical aspects are discussed. Classes are held in a computer lab. Passing requirement: group projects. The course is intended for graduate students (Econometrics and Informatics, Data Science). |
Full description: |
The main aim of the course is to familiarize students with the research opportunities offered by data mining algorithms (Knowledge Discovery in Databases, KDD) and with their use in business applications. Three blocks of topics are covered: clustering, dimension reduction and association rule learning. Each block is divided into four stages: i) introduction and construction of basic algorithms, ii) familiarization with the commands available in R, their comparison and evaluation, iii) work with the newest literature sources, iv) group project.

BLOCK 1: Clustering
Data are explored through clustering. Several methods are introduced: distance-based methods, k-means, Partitioning Around Medoids (PAM), Clustering Large Applications (CLARA), Clustering Large Applications based on RANdomized Search (CLARANS), nonparametric clustering, hierarchical clustering, dictionary learning, linkage methods and probabilistic methods. Different ways of identifying the optimal number of clusters (Calinski-Harabasz (CH) index, Silhouette index) and agreement indices are presented.

BLOCK 2: Dimension reduction
Principal component analysis (PCA), multidimensional scaling (classical and metric), and current non-linear dimension reduction methods.

BLOCK 3: Association rule learning
The main association rule algorithms are introduced (Apriori, Eclat, FP-growth, OPUS). They are applied, among other uses, in market basket analysis to find common patterns among purchased goods. Key measures of rules and transactions (support, confidence, lift, difference of confidence) are described. Models are built on real data, including cleaning and transforming the input data. Methods for visualizing transactions, rules and clusters (including interactive ones), as well as sampling-based tools for simplifying big data, are also covered.

Examples of packages used: arules, arulesViz, stats, cluster, pdfCluster, clues and others (see the R Task View "Cluster": Cluster Analysis & Finite Mixture Models). |
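The rule measures named in Block 3 (support, confidence, lift) can be illustrated on a toy basket data set. The sketch below is a minimal illustration only: the course itself works in R (for example with the arules package), while this example uses plain Python, and the transactions and the function names `support`, `confidence` and `lift` are assumptions made up for the example.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Share of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               lhs: Iterable[str], rhs: Iterable[str]) -> float:
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs).

    Assumes lhs occurs in at least one transaction (otherwise division by zero).
    """
    both = frozenset(lhs) | frozenset(rhs)
    return support(transactions, both) / support(transactions, lhs)


def lift(transactions: List[FrozenSet[str]],
         lhs: Iterable[str], rhs: Iterable[str]) -> float:
    """confidence(lhs -> rhs) / support(rhs); 1.0 means lhs and rhs look independent."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Hypothetical toy transactions, invented for illustration.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
rule_lift = lift(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence, rule_lift)
```

In this toy data the rule bread -> milk has a lift below 1, so buying bread is slightly negatively associated with buying milk; a lift above 1 would indicate a positive association.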
Bibliography: |
Papers provided by lecturers as well as:
Bousquet, O.; von Luxburg, U.; Raetsch, G., eds. (2004). Advanced Lectures on Machine Learning. Springer-Verlag.
Duda, Richard O.; Hart, Peter E.; Stork, David G. (2001). "Unsupervised Learning and Clustering". Pattern Classification (2nd ed.). Wiley.
Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer. |
Learning outcomes: |
- Student has knowledge of unsupervised learning principles
- Student is familiar with unsupervised learning methodology
- Student is able to analyze data using an unsupervised learning approach
- Student is able to use knowledge of unsupervised learning to conduct his/her own research
- Student gathers, processes and analyzes data independently
- Student is capable of working in groups and co-operating with others
- Student is able to formulate his/her point of view and express it in discussion
- Student shows research curiosity and openness towards economic phenomena
K_W01, K_U01, K_U02, K_U03, K_U04, K_U05, KS_01 |
Assessment methods and assessment criteria: |
Evaluation of group projects |
Classes in period "Winter semester 2023/24" (past)
Time span: | 2023-10-01 - 2024-01-28 |
Type of class: | Lab, 30 hours | |
Coordinators: | Katarzyna Kopczewska, Jacek Lewkowicz | |
Group instructors: | Katarzyna Kopczewska, Jacek Lewkowicz | |
Credit: | Course - Grading; Lab - Grading |
Classes in period "Winter semester 2024/25" (past)
Time span: | 2024-10-01 - 2025-01-26 |
Type of class: | Seminar, 30 hours | |
Coordinators: | Katarzyna Kopczewska, Jacek Lewkowicz | |
Group instructors: | Katarzyna Kopczewska, Jacek Lewkowicz | |
Credit: | Course - Grading; Seminar - Grading |
Copyright by University of Warsaw.