
Statistical machine learning

General data

Course ID: 1000-317bSML
Erasmus code / ISCED: (unknown) / (0612) Database and network design and administration
Course title: Statistical machine learning
Name in Polish: Uczenie statystyczne
Organizational unit: Faculty of Mathematics, Informatics, and Mechanics
Course groups: Obligatory courses for 1st year Machine Learning
ECTS credit allocation (and other scores): 6.00
Language: English
Type of course:

elective monographs

Short description:

The goal of the course is to introduce the fundamental notions and statistical tools used in machine learning, such as linear, logistic, and multivariate regression, classifiers, dimensionality reduction methods, and Bayesian methods.

Full description:

The detailed program

  1. Exploratory statistics (2-3 lessons)
    1. Basic summary statistics (sample mean, median, sample variance, etc.)
    2. Visualization of data (histograms, box plots, kernel density estimation)
    3. Principal component analysis
    4. Clustering: hierarchical clustering, k-means, k-medoids
  2. Statistical theory (4-5 lessons)
    1. Basic definitions (statistical models, statistics, likelihood, etc.)
    2. Estimation theory (maximum likelihood estimators, efficiency, mean squared error, bias-variance trade-off, confidence intervals)
    3. Statistical hypothesis testing (type I and type II errors, power of a test, significance, p-value)
    4. Problems with p-values (effect size, multiple hypothesis testing)
    5. Bayesian inference (prior distribution, posterior distribution, Bayes risk and Bayes estimator, credible intervals)
    6. Distances between probability measures (Kullback-Leibler divergence, total variation distance, etc.)
  3. Simple regression and classification models (3-4 lessons)
    1. Linear regression
    2. Classification. Logistic regression, LDA, QDA
    3. Cross-validation and bootstrap
    4. Model selection and regularization: lasso, ridge regression, forward-backward procedure
  4. Advanced models (3 lessons)
    1. Tree-based models, bagging, random forests, boosting
    2. Support vector machines
    3. Non-linear models: splines, generalized additive models
Bibliography:

1. Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, Berlin

2. Andrew Ng, Machine Learning Yearning, https://github.com/ajaymache/machine-learning-yearning

Learning outcomes:

Knowledge: the student

* has an in-depth understanding of the branches of mathematics necessary to study machine learning (probability theory, statistics, multivariable calculus, and linear algebra) [K_W05];

* has theoretically grounded and well-organized knowledge of the fundamental statistical techniques used in modeling and data analysis [K_W07].

Abilities: the student is able to

* construct mathematical reasoning [K_U06];

* express problems in the language of mathematics [K_U07];

* apply techniques of modern statistical data analysis [K_U10].

Social competences: the student is ready to

* critically evaluate acquired knowledge and information [K_K01];

* recognize the significance of knowledge in solving cognitive and practical problems and the importance of consulting experts when difficulties arise in finding a self-devised solution [K_K02];

* think and act in an entrepreneurial way [K_K03].

Assessment methods and assessment criteria:

Impact on the final grade: final test 50%, two programming assignments 50%, in-lab activity 10%.

Classes in period "Winter semester 2023/24" (past)

Time span: 2023-10-01 - 2024-01-28
Type of class:
Lab, 30 hours
Lecture, 30 hours
Coordinators: Dorota Celińska-Kopczyńska
Group instructors: Dorota Celińska-Kopczyńska, Jakub Krajewski, Andrzej Mizera, Grzegorz Preibisch
Examination: Grading

Classes in period "Winter semester 2024/25" (future)

Time span: 2024-10-01 - 2025-01-26
Type of class:
Lab, 30 hours
Lecture, 30 hours
Coordinators: Dorota Celińska-Kopczyńska
Group instructors: Maria Bochenek, Dorota Celińska-Kopczyńska, Jakub Krajewski, Andrzej Mizera
Examination: Grading
Course descriptions are protected by copyright.
Copyright by University of Warsaw.