EBA 3500 Data Analysis with Programming

Course code: 
EBA 3500
Department: 
Data Science and Analytics
Credits: 
7.5
Course coordinator: 
Jonas Moss
Course name in Norwegian: 
Data Analysis with Programming
Product category: 
Bachelor
Portfolio: 
Bachelor of Data Science for Business - Programme Courses
Semester: 
2023 Autumn
Active status: 
Active
Level of study: 
Bachelor
Teaching language: 
English
Course type: 
One semester
Introduction

This course introduces the basics of statistics and machine learning in the context of Python. It covers:

  • Inferential statistics, such as the bootstrap, p-values, and confidence intervals.
  • Methods for constructing and evaluating statistical estimators.
  • The fundamentals of the two most important regression models: linear regression and logistic regression.

Additionally, the course introduces students to the Python packages NumPy, SciPy, and statsmodels.
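
To give a flavour of the inferential methods listed above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, written with NumPy. It is an illustration only, not official course material: the function name `bootstrap_ci`, its parameters, and the simulated data are assumptions made for this example.

```python
import numpy as np


def bootstrap_ci(data, statistic=np.mean, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    # Draw resampling indices with replacement: one row per bootstrap replicate.
    idx = rng.integers(0, data.size, size=(n_boot, data.size))
    # Evaluate the statistic on every resampled data set in one vectorized call.
    replicates = statistic(data[idx], axis=1)
    alpha = 1.0 - level
    lower, upper = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Simulated data for illustration only; not taken from the course.
sample = np.random.default_rng(1).normal(loc=5.0, scale=2.0, size=200)
lower, upper = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is short to write down.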

Learning outcomes - Knowledge

Upon completion of this course, students will be able to:

  • Understand, explain, and use fundamental statistical concepts such as:

    • Population values and estimators.
    • Construction of estimators using maximum likelihood.
    • Unbiased estimators.
    • Mean squared error.
    • Statistical tests.
    • Type I and type II errors.
    • p-values.
    • Confidence intervals.
    • Basic bootstrap.
    • Rudimentary exploratory data analysis.
    • Data summaries such as the mean, standard deviation, and kurtosis.
  • Understand the key differences between statistics and machine learning.

  • Understand how to evaluate the performance of machine learning models.

  • Understand and explain the fundamentals of linear regression modeling and logistic regression.

  • Understand and explain the role of ANOVA and omnibus tests in science.

Learning outcomes - Skills

Upon completion of this course, students will be able to:

  • Perform vectorized computations in NumPy.
  • Run statistical simulations using NumPy and SciPy.
  • Perform reasonable exploratory data analysis in Python, including constructing graphs.
  • Create novel estimators for population values.
  • Choose between estimators in a principled way.
  • Calculate confidence intervals, p-values, and test statistics in both traditional and new settings.
  • Program bootstrap procedures for statistical inference.
  • Implement complex linear regressions and logistic regressions in Python and interpret their output.
  • Conduct simple ANOVA experiments.
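
As a rough illustration of several of the skills above (vectorized computation, statistical simulation, and choosing between estimators), the sketch below compares the sample mean and the sample median as estimators of the centre of a normal distribution by their simulated mean squared error. The sample sizes, the seed, and the choice of distribution are assumptions made for this example, not course requirements.

```python
import numpy as np

rng = np.random.default_rng(42)
true_mean = 0.0
n_sims, n_obs = 10_000, 50

# One row per simulated data set, so no explicit Python loop is needed.
samples = rng.normal(loc=true_mean, scale=1.0, size=(n_sims, n_obs))

# Apply both estimators to every simulated data set at once.
mean_estimates = samples.mean(axis=1)
median_estimates = np.median(samples, axis=1)

# Mean squared error of each estimator around the true population value.
mse_mean = np.mean((mean_estimates - true_mean) ** 2)
mse_median = np.mean((median_estimates - true_mean) ** 2)

print(f"MSE of sample mean:   {mse_mean:.4f}")
print(f"MSE of sample median: {mse_median:.4f}")
```

For normally distributed data the sample mean should come out with the smaller mean squared error; with heavier-tailed data the ranking can reverse, which is why the comparison is worth simulating rather than assuming.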
General Competence

Upon completion of the course, students will have stronger competence in:

  • Making sound statistical judgments.
  • Using online resources to solve problems.
  • Working independently on difficult problems.
  • Reading and understanding technical documentation.
Course content

The course covers the following topics:

  • NumPy and SciPy.
  • Statistical simulation in Python.
  • Exploratory data analysis.
  • Statistical models and the bootstrap.
  • Unbiased estimators and the efficiency of estimators.
  • Construction of estimators.
  • Confidence intervals.
  • Hypothesis tests and p-values.
  • The t-test.
  • Foundations of machine learning.
  • Linear regression.
  • Inference for linear regression.
  • Linear regression using categorical covariates (ANOVA).
  • Binary regression, such as logistic regression.
Teaching and learning activities

The course uses the following methods for teaching and learning:

  • Video lectures.
  • Lab sessions where the teacher and TAs help students with problems.
  • Homework exercises.
Software tools
Software tools are defined under the section "Teaching and learning activities".
Additional information

Starting in the autumn of 2023, the form of evaluation in this course has changed from two exam codes (EBA 35001 and EBA 35002) to one exam code (EBA 35003).

A re-sit examination is offered in the former exam codes EBA 35001 and EBA 35002 in autumn 2023 and, for the last time, in spring 2024.

Qualifications

Higher Education Entrance Qualification

Disclaimer

Deviations in teaching and exams may occur if external conditions or unforeseen events call for this.

Required prerequisite knowledge

EBA 3400 Programming, data extraction and visualisation, EBA 1180 Mathematics for Data Science, or equivalent courses.

Assessments
Exam category: 
Submission
Form of assessment: 
Handin - all file types
Weight: 
100
Grouping: 
Group/Individual (1 - 3)
Duration: 
1 Week(s)
Comment: 
Home exam.
Exam code: 
EBA 35003
Grading scale: 
ECTS
Resit: 
Examination every semester
Type of Assessment: 
Ordinary examination
Total weight: 
100
Student workload
Activity                                      Duration      Comment
Digital resources (interactive video)         21 Hour(s)
Seminar groups                                36 Hour(s)
Examination                                   25 Hour(s)
Student's own work with learning resources    118 Hour(s)
Sum workload:                                 200 Hour(s)

A course of 1 ECTS credit corresponds to a workload of 26-30 hours. Therefore, a course of 7.5 ECTS credits corresponds to a workload of at least 200 hours.