GRA 4142 Data Management and Python Programming
According to Statista, the annual amount of data created, captured, copied, and consumed worldwide will reach 120 zettabytes in 2023. Using available data to gain insights and make correct decisions is becoming essential for almost any business in today’s world.
This course introduces two of the most popular and indispensable programming languages for data analysts:
- Python (with a focus on data cleaning, processing, analysis and visualization)
- SQL
In addition, the course covers the basics of data management, with a focus on relational databases.
Upon completion of the course the student shall be able to:
- understand, explain and use fundamental programming concepts, with a focus on the Python programming language, including:
  - syntax and semantics,
  - variables,
  - types,
  - basic data structures,
  - expressions and statements,
  - control flow (conditionals and loops),
  - functions and libraries,
  - input/output operations,
  - exceptions,
- understand and explain principles of data modeling and relational databases,
- understand, explain and use SQL statements and queries.
Upon completion of the course the student shall be able to:
- use integrated development environments to create computer programs,
- design, implement, run, test and debug programs in Python based on a given textual description of a problem,
- process, analyze, summarize and visualize datasets using Python with NumPy, Matplotlib, Seaborn and other libraries,
- read and understand Python source code implemented by others,
- create a data model based on a given textual description of a problem,
- implement this data model in a relational database using the SQL language,
- query and modify relational databases using the SQL language,
- create computer programs in Python that store, modify and query data in relational databases,
- set up indexes to improve the performance of databases.
Upon completion of the course the student shall have stronger competence in:
- processing and analyzing data with the help of computers,
- using online resources as aids to solve problems,
- reading and understanding technical documentation,
- working in groups.
- Introduction, installation of Python, JupyterLab and IDEs.
- Executing Python code.
- Variables, basic types, user input and output.
- Control flow (conditional execution, loops).
- Organizing code (functions and libraries).
- Data structures.
- Strings, reading, writing and processing text files.
- Vectors and matrices (NumPy), random numbers and the Monte Carlo method.
- Processing and analyzing tabular data with Pandas (reading, cleaning, manipulating, grouping and aggregating data).
- Plotting and visualization (Matplotlib, Seaborn).
- Introduction to relational databases.
- Structured Query Language (SQL).
- Relational model.
- Programming with databases.
- Indexes.
- Transactions.
- Organized (synchronous) classes combining classical lectures with discussing and solving practical problems. (Students are expected to prepare for these sessions by going through given Jupyter notebooks and other reading material and/or watching selected videos online.)
- Homework exercises (ungraded, solved individually or in groups of 2-3 students).
Software tools: open-source software (more information will be given at the beginning of the course).
Please note that while attendance is not compulsory, it is the student’s own responsibility to obtain any information provided in class.
All courses in the Master's programme will assume that students have fulfilled the admission requirements for the programme. In addition, courses in the second, third and/or fourth semester can have specific prerequisites and will assume that students have followed normal study progression. For double degree and exchange students, please note that equivalent courses are accepted.
Disclaimer
Deviations in teaching and exams may occur if external conditions or unforeseen events call for this.
Assessments |
---|
Exam category: Submission |
Form of assessment: Handin - all file types |
Weight: 100 |
Grouping: Individual |
Duration: 30 Hour(s) |
Exam code: GRA 41423 |
Grading scale: ECTS |
Resit: Examination when next scheduled course |
Activity | Duration | Comment |
---|---|---|
Teaching | 48 Hour(s) | |
Group work / Assignments | 84 Hour(s) | |
Prepare for teaching | 12 Hour(s) | |
Examination | 16 Hour(s) | |
A course of 1 ECTS credit corresponds to a workload of 26-30 hours. Therefore, a course of 6 ECTS credits corresponds to a workload of at least 160 hours.