Statistical data analysis
General data
Course ID: | 1000-714SAD |
Erasmus code / ISCED: | 11.303 |
Course title: | Statistical data analysis |
Name in Polish: | Statystyczna analiza danych |
Organizational unit: | Faculty of Mathematics, Informatics, and Mechanics |
Course groups: |
Obligatory courses for 2nd year Bioinformatics
Obligatory courses for 3rd year Mathematics |
ECTS credit allocation (and other scores): | 6.00 |
Language: | Polish |
Type of course: | obligatory courses |
Prerequisites (description): | A good knowledge of the topics covered in the syllabi of the courses Analiza matematyczna II.1 (Mathematical Analysis II.1) and Rachunek prawdopodobieństwa I (Probability Theory I) is expected. |
Short description: |
Introduction to basic statistical notions and tools, such as parameter estimation and hypothesis testing. Introduction to data science, covering classification and clustering methods. Mathematics students may instead take an alternative course that has a different character. |
Full description: |
1. Basic notions of probability calculus and statistics: random variables, their distributions, expected value and variance, probability space.
2. Basic notions of statistics: statistical space, random experiment, statistic, statistical model, model evaluation methods.
3. Parameter estimation. Bias and efficiency, maximum likelihood estimators, confidence intervals.
4. Summary and visualisation of data. Quantile-quantile plots, histograms, kernel density estimation, boxplots.
5. Hypothesis testing. The notion of a statistical hypothesis, the procedure of hypothesis testing, type I and type II errors, power of a test, Neyman-Pearson lemma, parametric statistical significance tests, significance tests for a mean, significance tests for a variance.
6. The notion of p-value, its potential misunderstandings and misuse, effect size, multiple hypothesis testing.
7. Useful statistical tests. Statistical significance test for two means, non-parametric tests for two medians, Pearson's chi-squared test, analysis of variance.
8. Linear regression (simple, multiple, and with extensions): assumptions, parameter estimation, evaluation of goodness of fit.
9. Classification. Logistic regression, LDA, QDA, KNN.
10. Resampling methods. Cross-validation, bootstrap.
11. Model selection and regularisation. Feature selection, use of a validation set, use of cross-validation, analysis of high-dimensional data, lasso and ridge regression, partial least squares.
12. Tree-based models. Decision trees, bagging, random forests, boosting.
13. Support vector machines. Separating hyperplanes, maximum margin classifier, support vector machines.
14. Dimensionality reduction. PCA.
15. Unsupervised learning. The notion of clustering, methods of hierarchical clustering and k-means.
16. Nonlinear models. Polynomial regression, splines, generalized additive models. |
Bibliography: |
Lesław Gajek, Marek Kałuszka. Wnioskowanie statystyczne, modele i metody.
John A. Rice. Mathematical Statistics and Data Analysis.
Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. An Introduction to Statistical Learning with Applications in R. |
Learning outcomes: |
Knowledge:
1. General knowledge of the problems of statistical data analysis.
2. Basic knowledge of the statistical tools used in the modeling and analysis of data.
3. Basic notions and methods of probability calculus and statistics, including parameter estimation and hypothesis testing methods.
Skills:
1. Performing simple statistical analysis and statistical testing.
2. Using modern statistical analysis tools.
Social skills:
1. Ability to explain statistical inference in plain words. |
Assessment methods and assessment criteria: |
Contribution to the final grade: exam 40%, mid-term test 20%, assignment 10%, in-class activity 10%, in-lab activity 10%. |
Classes in period "Summer semester 2023/24" (in progress)
Time span: | 2024-02-19 - 2024-06-16 |
Type of class: |
Classes, 15 hours
Lab, 30 hours
Lecture, 30 hours |
Coordinators: | Błażej Miasojedow |
Group instructors: | Barbara Domżał, Błażej Miasojedow, Szymon Nowakowski, Piotr Pokarowski, Łukasz Rajkowski |
Examination: | Examination |
Classes in period "Summer semester 2024/25" (future)
Time span: | 2025-02-17 - 2025-06-08 |
Type of class: |
Classes, 15 hours
Lab, 30 hours
Lecture, 30 hours |
Coordinators: | Błażej Miasojedow |
Group instructors: | Błażej Miasojedow, Szymon Nowakowski, Piotr Pokarowski, Łukasz Rajkowski |
Examination: | Examination |
Copyright by University of Warsaw.