Parallel and Distributed Data Management
(Non-Standard Database Systems)

News

About this Course

This course comes in two variants:

Data Science students are encouraged to enroll in Variant A (5 ECTS UV rather than VO+PS) to take advantage of the midterms for the lecture part of the course. Recognition of Variant A as equivalent to Variant B (as required for Data Science students) is ensured.

Lecture

Questions and discussions

For questions and discussions (also among students) regarding course-specific topics, please use the Slack channel #pddm-uv (Workspace dbteaching.slack.com).

Slack registration: Students register with their university email here: https://dbteaching.slack.com/signup

Schedule

The course schedule follows PlusOnline. Deviations will be announced explicitly in the Slack channel #pddm-uv and/or on the course website.

Slides

Each set of slides treats a specific topic area and will be discussed in one or more lecture units. Slides that have not yet been discussed during the lecture may be subject to change. Once a slide set has been discussed in class, only bug fixes will be applied. Slide sets have a version (date) on the title page.

The slides and their discussion during the lecture are essential for the exam preparation.

Note: Last year's slides are already online to give you an overview, but they may still change.

Topics Slides Handouts Literature
Database System Architectures [1x1] [2x2] DSC6 17
Parallel Databases [1x1] [2x2] DSC6 18
Distributed Databases [1x1] [2x2] DSC6 19 (2-Phase-Commit, Persistent Messaging, Distributed Locking)

Previous Knowledge Expected

Basics of transactions:
DSC6 14.1–14.2, 14.4–14.6
Concurrency Control:
2-Phase Locking (2PL): DSC6 15.1.3
Timestamp-Based Protocols: DSC6 15.4
Deadlocks: DSC6 15.2

Literature

DSC6 — Database System Concepts
Silberschatz, Korth, Sudarshan. Database System Concepts. McGraw-Hill, 2011, 6th edition.
Multiple copies of the book are available in the textbook collection of the department library (Itzling).
DSC7 — Database System Concepts
Silberschatz, Korth, Sudarshan. Database System Concepts. McGraw-Hill, 2019, 7th edition.
The book is available online from our university library.

Grading

The grading of the course is based on:

  1. Two midterms: you will write two midterm exams with 15 points each (30 points in total).
  2. Hands-on project: consisting of 3 assignments (see below) with 10 points each (30 points in total).
The overall score is the sum of the midterm score and the project score. The maximum overall score is 60. You need to achieve a midterm score of at least 12 points and a project score of at least 12 points to pass the course. The final grade is computed from the overall score as follows:

Score Grade
≥ 52.5 1
[45, 52.5) 2
[37.5, 45) 3
[30, 37.5) 4
< 30 5
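The grading rule above can be sketched as a small function. This is only an illustration of the stated thresholds, not an official grade calculator; the function name is hypothetical, and it assumes that missing either 12-point minimum results in grade 5.

```python
def final_grade(midterm_score: float, project_score: float) -> int:
    """Return the final grade (1 = best, 5 = fail) from the two part scores."""
    # Both parts must reach at least 12 points to pass the course.
    # Assumption: missing either minimum yields grade 5.
    if midterm_score < 12 or project_score < 12:
        return 5

    overall = midterm_score + project_score  # maximum overall score is 60

    # Grade bands from the table above (lower bound inclusive).
    if overall >= 52.5:
        return 1
    if overall >= 45:
        return 2
    if overall >= 37.5:
        return 3
    if overall >= 30:
        return 4
    return 5
```

For example, 20 midterm points and 26 project points give an overall score of 46, which falls into the band [45, 52.5) and therefore yields grade 2. Note that meeting both minimums (12 + 12 = 24) is not sufficient on its own, since the overall score must also reach 30.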

If you have already taken the lecture (e.g., last year) and passed the lecture exam, the grade of the exam can be accredited for the midterms. You will receive 15 points for grade 4, 18.75 points for grade 3, 22.5 points for grade 2, and 26.25 points for grade 1. Hence, you don’t have to write the midterms if you have already passed the exam.
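The accreditation rule can be written as a simple lookup. The sketch below is illustrative only; the names are hypothetical, and it assumes that a failed exam (grade 5) cannot be accredited, since the text only covers passed exams.

```python
# Midterm points credited for a previously passed lecture exam,
# taken from the paragraph above.
CREDITED_MIDTERM_POINTS = {1: 26.25, 2: 22.5, 3: 18.75, 4: 15.0}


def credited_midterm_points(exam_grade: int) -> float:
    """Return the midterm points credited for a passed lecture exam grade."""
    if exam_grade not in CREDITED_MIDTERM_POINTS:
        # Assumption: only passing grades (1 to 4) can be accredited.
        raise ValueError(f"grade {exam_grade} cannot be accredited")
    return CREDITED_MIDTERM_POINTS[exam_grade]
```

For instance, a previous exam grade of 2 counts as 22.5 midterm points; together with 24 project points this gives an overall score of 46.5.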

Midterm Exams

The midterm exams are planned for:

  1. Tue April 28, 15:00 (T03)
  2. Tue June 30, 15:00 (T03)

Each midterm exam lasts 60 minutes. You can get a total of 30 points across the two midterms.

Cheat sheet: You may use one A4 sheet with your personal notes (single-sided, handwritten or printed).

Previous exams: 20230627, 20230712, 20230919
Please note that these exams were a part of the lecture (VO) variant of this course and differ in their format from the midterm exams.

Project (Lab)

The goal of the project is to gain hands-on experience by working on three major programming assignments throughout the semester, in which we will implement parallel join algorithms using the Apache Spark framework.

For questions and discussions (also among students), please use the Slack channel #pddm-lab (Workspace dbteaching.slack.com).

Assignments

You will work on the project in groups of two people each. The assignment sheets will be published on Blackboard and we will provide skeleton files in which you are expected to implement your solution. We will ship the skeleton files in a single repository for all assignments using GitHub Classroom. More details on the setup will be given in the kick-off meeting and in the first assignment.

Assignment Total Points Release Date Due Date
A1: Setup & Warmup 5+5 points 17.03.2026 12.04.2026 @ 23:59
A2: Parallel Set Similarity Join 5+5 points 14.04.2026 10.05.2026 @ 23:59
A3: Fragment-and-Replicate Join 5+5 points 12.05.2026 07.06.2026 @ 23:59

Submission

Commit and push your team's solution to the repository provided by GitHub Classroom. You can push as many changes as you want; only the most recent commit on the main branch will be graded, and the last commit before the deadline counts. Do not post your project on a public GitHub repository and do not copy solutions from anyone else. In both cases, you will fail the respective assignment (i.e., receive 0 points).
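To make the "last commit before the deadline" rule concrete, here is a minimal sketch that picks the graded commit from a list of commit timestamps on the main branch. The function name is hypothetical, and it assumes that a commit made exactly at the deadline timestamp still counts; the actual grading tooling is not described here.

```python
from datetime import datetime
from typing import Optional


def graded_commit(commit_times: list[datetime], deadline: datetime) -> Optional[datetime]:
    """Return the timestamp of the commit that would be graded, or None."""
    # Keep only commits made on or before the deadline
    # (assumption: the deadline timestamp itself is still on time).
    on_time = [t for t in commit_times if t <= deadline]
    # The most recent on-time commit counts; later commits are ignored.
    return max(on_time) if on_time else None
```

With the A1 deadline of 12.04.2026 at 23:59, commits at 23:30 on 12.04. and 00:05 on 13.04. would result in the 23:30 commit being graded.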

Grading

Every assignment consists of two parts (implementation and questionnaire), and each part is labeled with the number of points that can be achieved. Due to the exceptionally high number of enrollments in this course, students will complete three mini-tests, each as the concluding part of an assignment. A mini-test is a short questionnaire on practical aspects of the previous assignment and accounts for 50% of the max. points on an assignment, i.e., you receive up to 5 points for your assignment submission and up to 5 points on the corresponding mini-test. Mini-tests are taken individually (i.e., no groups) and are not to be confused with the midterms. In total, assignment submissions (3 x 5 = 15 points) and mini-tests (3 x 5 = 15 points) sum up to 30 points.

For students who take only the lab (PS), the lab grade is computed as follows:

Score Grade
≥ 26.25 1
[22.50, 26.25) 2
[18.75, 22.50) 3
[15, 18.75) 4
< 15 5
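As an illustration of the lab-only (PS) grading, the sketch below sums the submission and mini-test points and maps the total to a grade using the table above. The function and parameter names are hypothetical.

```python
def lab_grade(submission_points: list[float], mini_test_points: list[float]) -> int:
    """Return the lab grade (1 = best, 5 = fail) for PS-only students."""
    # Each of the three assignments contributes up to 5 points for the
    # submission and up to 5 points for the mini-test (30 points in total).
    total = sum(submission_points) + sum(mini_test_points)

    # Grade bands from the table above (lower bound inclusive).
    if total >= 26.25:
        return 1
    if total >= 22.5:
        return 2
    if total >= 18.75:
        return 3
    if total >= 15:
        return 4
    return 5
```

For example, submissions worth 5, 4, and 4 points plus mini-tests worth 3, 3, and 3.5 points give 22.5 points in total, which is the lower bound of grade 2.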

Meetings

We will meet at 02:00 p.m. (14:00) or 03:00 p.m. (15:00) according to the schedule below. One or two team(s) will present their solutions and we will discuss preliminaries for the upcoming assignment (attendance is obligatory). Between a compulsory meeting and the deadline of the next assignment, we offer optional Q&A units (attendance is optional; it is your responsibility to resolve unclear aspects, issues, and the like early).

Date Unit Attendance
03.03.2026 Kick-off meeting compulsory
10.03.2026 No class
17.03.2026 Unit 1: MapReduce and Apache Spark compulsory
24.03.2026 Q&A optional
31.03.2026 No class
07.04.2026 No class
14.04.2026 Unit 2: Set Similarity Joins + Mini-Test 1 compulsory
21.04.2026 Q&A optional
28.04.2026 Midterm 1 compulsory (UV only)
05.05.2026 Q&A optional
12.05.2026 Unit 3: Fragment-and-Replicate Joins + Mini-Test 2 compulsory
19.05.2026 Q&A optional
26.05.2026 No class
02.06.2026 Q&A optional
09.06.2026 Unit 4: Conclusion and Outlook + Mini-Test 3 compulsory
16.06.2026 Backup
23.06.2026 Backup
30.06.2026 Midterm 2 compulsory (UV only)

Course Unenrollment & Absences

Unenrollment is possible only before the 3rd lab unit, i.e., all students who are still enrolled at the time of the 3rd lab unit will be graded.

You may have at most one absence. If you have an objective reason (e.g., illness with medical certificate), we excuse at most one additional absence. With more than 2 absences in total (even if excused), you fail the course.