EBA 3501 Foundations of Data Science
In this course, we will embark on an exciting journey into the realm of machine learning and data science. Machine learning, a subset of AI, empowers computers to learn from data and make predictions or decisions without explicit programming.
Throughout this course, we will use Python, the most popular language for machine learning. Libraries such as scikit-learn provide a rich set of tools for data preprocessing, model building, evaluation, and much more.
This course will equip you with the foundational knowledge and practical skills needed to tackle real-world problems involving data analysis, classification, and regression.
Upon completion of this course, students will be able to:
- Understand the basic concepts of machine learning and exploratory data analysis.
- Know the definitions, interpretations, and properties of the most popular machine learning methods.
- Appreciate the reasoning behind basic workflows in data science.
Upon completion of this course, students will be able to:
- Apply Python libraries for loading, cleaning, and exploration of data.
- Fit a variety of important machine learning models in Python and interpret the results.
- Present data analyses professionally using Quarto and Jupyter notebooks.
Upon completion of the course, students will have stronger competence in:
- Working on difficult problems, independently and in teams.
- Reading and understanding technical documentation.
- Presenting analyses professionally.
The course covers the following topics:
- Loading, cleaning, and exploration of data.
- Quarto and Jupyter notebooks for the presentation of exploratory data analyses and applications of machine learning methods.
- Fitting of basic classifiers.
- Running and making informed choices among linear regression models.
- Interpretation of the coefficients of regression models along with their p-values.
- Splitting data into test and training sets.
- Construction of estimator pipelines, including data loading, preprocessing, fitting, and model evaluation.
- Regularized regression, such as ridge and lasso.
- Usage of non-linear features such as polynomials and splines.
- Feature transformations such as one-hot encoding.
- Making informed choices between different models using cross-validation.
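Several of the topics above fit together in a single workflow. The following is a minimal sketch, not course material: it assumes scikit-learn, pandas, and NumPy are installed, and the data set, the column names (`size`, `region`), the helper `make_model`, and the candidate penalty values are all invented for illustration. It combines a train/test split, one-hot encoding, polynomial features, an estimator pipeline, ridge regression, and model choice by cross-validation.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Synthetic data (hypothetical): one numeric and one categorical feature.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "size": rng.uniform(0, 10, n),
    "region": rng.choice(["north", "south", "west"], n),
})
region_effect = X["region"].map({"north": 0.0, "south": 2.0, "west": -1.0})
y = 1.5 * X["size"] + 0.3 * X["size"] ** 2 + region_effect + rng.normal(0, 1, n)

# Hold out a test set before any fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Preprocessing: polynomial terms for the numeric column,
# one-hot encoding for the categorical column.
numeric = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["size"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])


def make_model(alpha):
    """Build a preprocessing + ridge regression pipeline (hypothetical helper)."""
    return Pipeline([("preprocess", preprocess), ("ridge", Ridge(alpha=alpha))])


# Compare ridge penalties by 5-fold cross-validation on the training set only.
alphas = [0.01, 1.0, 100.0]
cv_scores = {
    a: cross_val_score(make_model(a), X_train, y_train, cv=5).mean()
    for a in alphas
}
best_alpha = max(cv_scores, key=cv_scores.get)

# Refit the chosen model on all training data and evaluate once on the test set.
final_model = make_model(best_alpha).fit(X_train, y_train)
test_r2 = final_model.score(X_test, y_test)
print(f"best alpha: {best_alpha}, test R^2: {test_r2:.3f}")
```

Keeping the preprocessing inside the pipeline means it is refit on each cross-validation fold, so no information from the held-out fold or the test set leaks into the fitted transformations.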
The course uses the following methods for teaching and learning:
The course will be a combination of lectures and tutorials. Please note that while attendance is not compulsory in all courses, it is the student’s own responsibility to obtain any information provided in class that is not included on Itslearning or in the textbook.
Higher Education Entrance Qualification
Disclaimer
Deviations in teaching and exams may occur if external conditions or unforeseen events call for this.
EBA3400 Programming, data extraction and visualisation; EBA 1180 Mathematics for Data Science; or equivalent courses.
Assessments | |
---|---|
Exam category | Submission |
Form of assessment | Submission other than PDF |
Exam/hand-in semester | First Semester |
Weight | 100 |
Grouping | Group/Individual (1 - 3) |
Duration | 1 Week(s) |
Comment | Home exam. |
Exam code | EBA 35011 |
Grading scale | ECTS |
Resit | Examination every semester |
Activity | Duration | Comment |
---|---|---|
Prepare for teaching | 45 Hour(s) | |
Teaching | 45 Hour(s) | |
Examination | 25 Hour(s) | |
Student's own work with learning resources | 85 Hour(s) | |
A course of 1 ECTS credit corresponds to a workload of 26-30 hours. Therefore, a course of 7.5 ECTS credits corresponds to a workload of at least 200 hours.