Introduction to Applied Data Science

Course Description

This course provides a practical introduction to the tools and techniques of modern data science. It familiarizes you with the basics of data science and with data acquisition through APIs and web scraping, so that you can independently collect data from online sources. We then introduce the toolkit for processing and analyzing text data, with special attention to LLM-related workflows. Finally, we focus on the analysis of spatial data.

Most of the applications and assignments in this course ask you to answer concrete economic questions. The philosophy behind these assignments is that you answer questions from the ground up, just as researchers do and just as you will have to do at a later stage of your studies. As such, this course is also an introduction to what economists do when they conduct empirical research. We develop these skills using the R programming language.

Format

This course features one weekly lecture and one weekly tutorial (2 contact hours each). In addition, you can email questions to the course coordinator.

Lecture Schedule

Event      Date   Subject
Lecture 1  21-04  Introduction to Data and Data Science
Lecture 2  28-04  Getting Data: APIs and Databases
Lecture 3  07-05  Getting Data: Web Scraping
Lecture 4  26-05  Text as Data
Lecture 5  27-05  Introduction to LLMs
Lecture 6  09-06  Prompt Engineering and Structured Data
Lecture 7  16-06  Spatial Data and Geocomputation

Tutorial Schedule

Event       Date   Subject
Tutorial 1  23-04  Introduction to Data and Data Science
Tutorial 2  30-04  Getting Data: APIs and Databases
Tutorial 3  12-05  Getting Data: Web Scraping
Tutorial 4  21-05  Discussion of Midterm
Tutorial 5  28-05  Text as Data
Tutorial 6  04-06  Introduction to LLMs
Tutorial 7  11-06  Prompt Engineering and Structured Data
Tutorial 8  18-06  Spatial Data & Mock Exam

Course Materials

You don’t need to buy any books for this course, and the slides are largely self-contained. We do use a few resources that you should read in preparation for lectures and assignments. These are reference materials that are regularly updated to follow changes in the R community. The most important textbook is R for Data Science; the chapters mentioned from this book complement the first lectures. In later lectures, we’ll use ideas from the books Text Mining with R, Speech and Language Processing, and Spatial Analysis with R. The rest of the material is purely supplementary.

More advanced supplementary material:

Supplementary material in Python:

Assessment

This course has a mid-term exam and a final exam. The mid-term counts for 40% of the final grade, and the final exam for the remaining 60%. Completing both exams is part of the effort requirement. Both are pen-and-paper exams with multiple-choice and open questions. The answers will be posted on Blackboard, and perusal sessions will be organized. If the final grade is below \(5.50\) but \(\geq 4\), a resit is possible, but only if the effort requirement is satisfied. Students with a final grade of 5.50 or higher are not eligible for a resit.

Assessment   Date
Mid-term     19-05
Final Exam   23-06
Retake Exam  07-07
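
As a worked illustration of the 40%/60% weighting described in this section, write \(M\) for the mid-term grade and \(F\) for the final exam grade. These symbols and the example grades are chosen here purely for illustration, and the sketch assumes no intermediate rounding, which this syllabus does not specify:

\[
\text{Final grade} = 0.4\,M + 0.6\,F
\]

For example, with \(M = 6.0\) and \(F = 5.0\):

\[
0.4 \times 6.0 + 0.6 \times 5.0 = 2.4 + 3.0 = 5.4
\]

Since \(4 \leq 5.4 < 5.50\), a student with these grades would qualify for a resit, provided the effort requirement is satisfied.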

Effort Requirement

In order to meet the effort requirement for this course, students must attend at least 6 out of 8 tutorials.

Learning Objectives

On successful completion of the course, students should:

  • Understand the basics of R programming in a data science context
  • Be able to independently acquire data from a variety of sources
  • Be able to understand common data formats such as HTML, JSON and XML
  • Understand and be able to analyze non-standard formats of data such as text and spatial data
  • Be able to use R in the contexts mentioned in the points above

Overview

Code                  ECB1ID
Period                4
Timeslot              B (Tuesday Morning, Thursday Afternoon)
Level                 1
ECTS                  7.5
Course Type           Optional Minor Course
Programme             BSc Economics & Business Economics
Department            U.S.E., Applied Economics
Coordinator/Lecturer  Bas Machielsen
Tutorial Teachers     Tina Dulam, Jozef Patrnciak, Bas Machielsen
Language              English