
Data Analysis in R

General information

Course code: 2500-EN-F-233
Erasmus code / ISCED: 14.4 / (0313) Psychology
Course name: Data Analysis in R
Unit: Faculty of Psychology
Groups: Academic basket
Elective courses
Electives for years 3, 4 and 5
Methodology, Statistics and Psychometrics basket
ECTS credits and other: (none)
Language of instruction: English
Short description:

The course provides an introductory but thorough grounding in the R environment for statistical programming. R is among the most widely used, flexible, powerful and complete statistical software environments. Moreover, it is free and therefore easily accessible even to students at the beginning of their research work. It is customizable, and when new statistical procedures emerge, chances are good that they are implemented in an R package. While some of the most widely used standard statistical analyses and data visualizations will be presented briefly, the focus of the course will be on data management and programming in R.

Full description:

In psychology, as in other fields, technological advances provide researchers with a growing quantity of collected data. Researchers therefore need to complement theoretical knowledge of their subject with knowledge of how to handle and extract information from this increasingly available data. Often, spectacular scientific advances are made possible by creatively connecting data with the scientific questions being asked. Simple, standard statistical knowledge and methods are often no longer enough, and the need for a new competence for the empirically minded researcher is emerging. This competence is taking the form of a new discipline in the quantitative sciences, called Data Science. Data science involves theoretically informed data management decisions and requires robust and customizable tools to perform these activities.

In this context, R is emerging as the standard statistical software for the next generations of analytically minded researchers. It is a flexible and powerful programming language and environment focused on data analysis. Although the core set of functions already offers more functionality than you will ever want to use, R is also an open-source, freely available platform, which means that anyone can contribute to it by writing specialized packages that are made public. For that reason, it is also among the most comprehensive statistical software environments, and many innovative statistical methods are already available for your specialized needs. In addition, R has advanced graphical capabilities that can produce publication-quality data visualizations with just a few lines of code.


The course takes an applied, hands-on approach: we will lead students from performing their first simple operations on data to creating their own set of scripts and functions in the R language. The course is meant to be the first in a series of lectures specifically centered on R. Its focus will be on data management and programming, the basis of data science. For that reason, statistical modeling and data visualization will receive little space, and only a few basic statistical analyses and plotting functions will be covered. Advanced, specialized courses on these aspects of data science will be offered separately.


Literature:

Introduction to R

Lander, J. (2013). “R for Everyone”. Addison-Wesley.

Peng, R. D. (2015). “R Programming for Data Science”. Leanpub.

Learning outcomes:

• Students will be able to perform many basic but important operations on data within the R statistical environment: importing the data, understanding the basic data types, and visualizing and summarizing the data.

• Students will be trained to go through all the steps needed to organize, restructure and clean data for subsequent statistical analysis.

• Students will learn how to write simple programs (scripts) in R in order to automate recurring problems in data cleaning.
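
The outcomes listed here can be illustrated with a minimal base-R sketch. The inline CSV text, its column names, and the helper `clean_scores()` are hypothetical, made up purely for illustration; they are not taken from the course materials.

```r
# Hypothetical data set, imported from inline CSV text so the example is
# self-contained (in practice the data would usually come from a file).
csv_text <- "id,group,score
1,control,12
2,Control ,15
3,treatment,n/a
4,treatment,9"
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the basic data types: 'score' is read as character because of "n/a".
str(raw)

# A small reusable function that automates a recurring cleaning task:
# normalize the group labels and convert non-numeric scores to NA.
clean_scores <- function(df) {
  df$group <- factor(tolower(trimws(df$group)))
  df$score <- suppressWarnings(as.numeric(df$score))
  df
}

clean <- clean_scores(raw)

# Summarize the cleaned data: mean score per group, ignoring missing values.
group_means <- tapply(clean$score, clean$group, mean, na.rm = TRUE)
print(group_means)
```

Wrapping the cleaning steps in a function means the same rules can be reapplied to every new data file instead of being retyped by hand.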

Assessment methods and criteria:

Most classes will start with a short quiz (3-4 questions) on the material presented in the previous class. Short polls gauging students’ confidence in and understanding of the current material will also be administered and used to fine-tune the presentation of materials. These quizzes will not contribute to the final grade.

Home assignments will contribute to the evaluation and track the progress made. There will be 5 home assignments during the course (approximately one every two to three classes; 30 points total). In the final exam, students will solve one or two practical problems in R using most of the concepts covered in class (70 points total). For these reasons attendance is deemed essential: students are expected to attend ALL classes, be on time, and be prepared for discussion and activities.

Overall, home assignments account for 30% of the final grade and the final exam for the remaining 70%.

Grades will be assigned according to the following scale:

• 5 – 90-100% – outstanding performance

• 4+ – 79-89%

• 4 – 73-78% – good performance

• 3+ – 67-72%

• 3 – 60-66% – minimum passing performance

• 2 – 59% or less – performance not suitable for passing

Attendance rules

Attendance is a very important factor in passing the class. Up to two unexcused absences are allowed. Additional absences should be documented (e.g. sick leave). In exceptional and justified situations, please contact me personally to evaluate whether additional assignments can make up for the missed classes.

The course is not offered in any of the current teaching cycles.
Course descriptions in USOS and USOSweb are protected by copyright.
The copyright holder is the University of Warsaw.