
Big Data Analytics

General data

Course ID: 2400-DS2BDA
Erasmus code / ISCED: 14.3 / (0311) Economics
Course title: Big Data Analytics
Name in Polish: Big Data Analytics
Organizational unit: Faculty of Economic Sciences
Course groups: Elective field-of-study courses - second-cycle studies, IE - group 1 (6*30h)
English-language course offering of the Faculty of Economics
Mandatory courses for 2nd year students of Data Science and Business Analytics
ECTS credit allocation (and other scores): 2.00
Language: English
Type of course:

obligatory courses

Short description:

The laboratory gives practical experience in using Hadoop ecosystem technologies for Big Data analytics. Participants will learn how to apply the data analysis and machine learning techniques learned in previous courses to Big Data datasets. Students will not learn new analytical techniques; the course focuses on practical experience and on understanding the concepts behind the technologies used.

Full description:

1. Introduction to Linux environment

2. Introduction to Big Data concepts

• Hadoop ecosystem

• MapReduce paradigm

3. Data preparation and exploration using Apache Hive and Apache Spark

• Differences vs. RDBMS

• Optimization

• Traps

4. Introduction to ML with Apache Spark

• Transferring models built in R or Python to the big data world (possibilities and limitations)

5. Interactive analytics

6. Visualization in Big Data

7. Scheduling tools (Apache Airflow)
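
As an illustration of the MapReduce paradigm listed under topic 2, the following is a minimal single-machine sketch in plain Python, not course material and not Hadoop code. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example; a real Hadoop or Spark job would distribute each phase across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit one (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework would
    # do between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence on an in-memory list of lines.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big Data analytics"])
print(counts)
```

The point of the sketch is that the map and reduce functions are independent per record and per key, which is what allows a framework to run them in parallel on separate machines.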

Bibliography:

Readings and up-to-date online resources are provided during each laboratory as preparation for the next one.

Learning outcomes:

Students will learn how to use Hadoop ecosystem technologies for data preparation and analysis, and how to apply basic machine learning algorithms to big data datasets using Apache Hive and Apache Spark.

K_U02, K_U05

Assessment methods and assessment criteria:

All students will be obliged to:

• attend the classes (according to the common University of Warsaw rules),

• give a presentation on examples of applying the methods presented in the course (based on academic articles),

• complete a Big Data project.

Classes in period "Winter semester 2023/24" (past)

Time span: 2023-10-01 - 2024-01-28
Type of class:
Lab, 15 hours
Coordinators: Michał Bryś
Group instructors: Michał Bryś
Examination: Course - Grading
Lab - Grading

Classes in period "Winter semester 2024/25" (future)

Time span: 2024-10-01 - 2025-01-26
Type of class:
Lab, 15 hours
Coordinators: (unknown)
Group instructors: (unknown)
Examination: Course - Grading
Lab - Grading
Course descriptions are protected by copyright.
Copyright by University of Warsaw.