Machine Learning 1: classification methods

General data

Course ID: 2400-DS1ML1
Erasmus code / ISCED: 14.3 (The course classification code consists of three to five digits: the first three denote the field of study according to the list of subject-area codes used in the Socrates/Erasmus programme, the fourth (so far usually 0) gives an optional refinement of the discipline, and the fifth gives the level of advancement of the course, based on the year of study for which the course is intended.) / (0311) Economics (The ISCED, International Standard Classification of Education, code was designed by UNESCO.)
Course title: Machine Learning 1: classification methods
Name in Polish: Machine Learning 1: classification methods
Organizational unit: Faculty of Economic Sciences
Course groups: 4EU+ courses (from the offer of teaching units)
Elective field-of-study courses - second-cycle studies IE - group 2 (2*30h)
English-language course offering of the Faculty of Economics
Mandatory courses for 1st year students of Data Science and Business Analytics
ECTS credit allocation (and other scores): 4.00 Basic information on ECTS credits allocation principles:
  • the annual hourly workload of the student’s work required to achieve the expected learning outcomes for a given stage is 1500-1800h, corresponding to 60 ECTS;
  • the student’s weekly hourly workload is 45 h;
  • 1 ECTS point corresponds to 25-30 hours of student work needed to achieve the assumed learning outcomes;
  • the weekly student workload necessary to achieve the assumed learning outcomes allows the student to earn 1.5 ECTS;
  • work required to pass the course, which has been assigned 3 ECTS, constitutes 10% of the semester student load.
Language: English
Type of course:

obligatory courses

Short description:

This course gives a broad overview of how machine learning methods are applied to supervised learning problems, both regression and classification. It covers the theoretical background as well as practical examples and illustrations. Topics include the basics of machine learning (performance measurement, model testing, validation methods, feature engineering and feature selection) and the following modelling methods: simple linear and logistic regression, discriminant analysis, k-nearest neighbours, Support Vector Machines, and ridge and Lasso regression.

Full description:

1. Introduction to Machine Learning

a. What is and what is not machine learning

b. Differences between classification, regression and clustering

c. Introducing a cost function

d. Sample parametric methods - linear regression and logistic regression

2. Measuring performance, machine learning diagnostics

a. Performance measures of supervised learning algorithms (model performance, error, confusion matrix and ratios, ROC curve, AUC, RMSE)

b. Learning curves

c. Training set and test set

3. Testing the model

a. Extending model complexity to increase fit

b. The concept of bias and variance and their trade-off

c. Cross-validation, selection of number of folds

4. Feature engineering

a. Feature transformation

b. Discretization of continuous features

c. Feature standardization/normalization

5. k-NN

a. Classification with k-nearest neighbours

b. Regression with k-nearest neighbours

6. Support Vector Machines

a. Optimization objective

b. Separating the data with a maximum margin

c. Kernel selection for more complex data

d. Modification of SVM algorithm for regression problems

7. Feature selection methods

a. Wrapper methods including automated selection (forward, backward and stepwise)

b. Filter methods – applying scoring to features (e.g. Chi squared test, information gain and correlation coefficient scores)

8. Regularization methods

a. Introducing a penalty for complexity

b. L1 regularization for additional sparsity in coefficients

c. L2 regularization for penalization of large coefficients

d. Regularized linear regression

e. Regularized logistic regression

9. Lasso regression

10. Workshops on real data

11. Project presentations
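
As a rough illustration of how several topics from the outline above fit together (feature standardization, classification with k-nearest neighbours, and k-fold cross-validation), the sketch below uses only the Python standard library. It is not course material: the function names, the synthetic data and the choice of five folds are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def standardize(train, test):
    """Scale features to zero mean and unit variance using training-set statistics only."""
    n_features = len(train[0])
    means = [sum(row[j] for row in train) / len(train) for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((row[j] - means[j]) ** 2 for row in train) / len(train)
        stds.append(math.sqrt(var) or 1.0)  # guard against constant features

    def scale(rows):
        return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds=5, seed=0):
    """Average accuracy of k-NN over n_folds train/test splits."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        # Standardize inside the loop so the held-out fold never leaks into the scaling.
        train_X, test_X = standardize([X[i] for i in train_idx], [X[i] for i in fold])
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, x, k) == y[i] for x, i in zip(test_X, fold))
        scores.append(hits / len(fold))
    return sum(scores) / n_folds


# Synthetic two-class data: two Gaussian clusters (illustrative only).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] + \
    [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

# Compare a few neighbourhood sizes by their cross-validated accuracy.
for k in (1, 5, 15):
    print(k, round(cross_val_accuracy(X, y, k), 3))
```

Comparing the printed accuracies for different values of k is one simple way to choose a hyperparameter by validation; in practice a library such as scikit-learn (Python) or caret (R) would normally handle these steps.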

Bibliography:

Harrington, Peter. Machine learning in action. Vol. 5. Greenwich, CT: Manning, 2012.

Zumel, Nina, John Mount, and Jim Porzak. Practical data science with R. Manning, 2014.

Lantz, Brett. Machine learning with R. Packt Publishing Ltd, 2013.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, 2009.

Learning outcomes:

After completing the course, the average student will have reliable, structured knowledge of a wide range of supervised learning algorithms for regression and classification problems, such as linear and logistic regression, linear discriminant analysis, kNN, ridge regression, LASSO and Support Vector Machines. They will know the theoretical foundations of these algorithms and have the programming skills needed to apply them in practice. They will be able to select the predictive modelling algorithms best suited to a specific research problem, validate models reliably, select and transform variables, and carry out an independent research project using the methods learned.

K_U02, K_U05

Assessment methods and assessment criteria:

Classes in period "Summer semester 2023/24" (in progress)

Time span: 2024-02-19 - 2024-06-16
Type of class:
Seminar, 30 hours
Coordinators: Piotr Wójcik
Group instructors: Szymon Lis, Michał Woźniak, Piotr Wójcik
Examination: Course - Grading
Seminar - Grading
Course descriptions are protected by copyright.
Copyright by University of Warsaw.
Krakowskie Przedmieście 26/28
00-927 Warszawa
tel: +48 22 55 20 000 https://uw.edu.pl/