The course takes place in period
- 3 (2022-01-01 to 2022-03-13)
- 4 (2022-03-14 to 2022-07-31)
Level/category
Teaching language
English
Type of course
Compulsory
Cycle/level of course
Second
Recommended year of study
1
Total number of ECTS
5 cr
Competency aims
The aim of the course is to provide the student with the necessary tools for handling big data sources for machine learning modeling.
Learning outcomes
At the end of the course, the student is expected to understand when supercomputer facilities are needed to solve analytical problems. The student will be able to run machine learning algorithms on supercomputer facilities and to run machine learning models using the Spark and Dask frameworks.
Course contents
The students get an overview of machine learning modeling with supercomputing facilities and of how to utilize big data. Descriptive and predictive modeling are first introduced for small data, and the students are then shown how similar models can be modified to work with big data. The students are also introduced to the analytical process: data-related requirement handling, domain knowledge, modeling, and verification of results.
Prerequisites and co-requisites
Basic Python programming skills are required. Previous courses in Machine Learning for Predictive and Descriptive problems are recommended.
Recommended or required reading
Hamstra, M., & Zaharia, M. (2013). Learning Spark: lightning-fast big data analytics. O'Reilly & Associates.
Daniel, J. (2019). Data Science with Python and Dask. Simon and Schuster.
https://docs.csc.fi/support/tutorials/ml-guide/
Study activities
- Lectures - 30 hours
- Small-group work - 70 hours
- Individual studies - 35 hours
Workload
- Total workload of the course: 135 hours
- Of which autonomous studies: 135 hours
- Of which scheduled studies: 0 hours
Mode of Delivery
Multiform education
Assessment requirements
To pass this course, the student should present a final project, completed in a group or individually, that uses big data facilities for machine learning modeling.
Teacher
- Björk Kaj-Mikael
- Espinosa Leal Leonardo
- Majd Amin
- Scherbakov-Parland Andrej
Examiner
Espinosa Leal Leonardo
Group size
No limit (46 students enrolled)
Assignments valid until
12 months after course has ended
Course enrolment period
2021-12-24 to 2022-01-20
Date | Time | Room | Title | Description | Organizer |
---|---|---|---|---|---|
2022-02-24 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |
2022-02-25 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |
2022-03-10 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |
2022-03-11 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |
2022-03-24 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |
2022-03-25 | 13:00 - 18:00 | | Big Data Analytics | | Espinosa Leal Leonardo, Majd Amin, Scherbakov-Parland Andrej |