Pearls of Database Literature
During this seminar, you will read scientific articles on database topics that are particularly relevant and worth reading. Each article will be discussed in class in a separate meeting.
The discussion follows a system that was developed by Curtis E. Dyreson from Utah State University. In a nutshell, the students will be divided into three groups for each paper. Each group assumes one of the following roles:
- Questioners
Your role is to send three questions on the paper to the lecturer before the discussion. A question can be about a specific topic, a general question about future trends, or a criticism/comment on the paper.
- Your role is to answer questions in class.
- Note Takers
Your role is to take notes on the discussion, write up the notes, and send them to the lecturer.
The kick-off meeting takes place on 2015-03-11 at 9:00am. Participation in the kick-off meeting is compulsory.
When to submit questions and notes?
- The discussions will take place on Wednesdays, 9:15-11:00am, about every second week. The dates are listed in the table below. Attendance is compulsory!
- If you are in the questioner group for a paper, send the questions to the lecturer by the Monday before the paper is discussed. The questions will be made available online the morning before the discussion.
- If you are in the note taker group for a paper, send your notes to the lecturer by the Tuesday after the discussion date of the paper. The notes will be made available online.
How to submit questions and notes?
Please submit your questions/notes to email@example.com. It helps the lecturer if you use a subject line of the form
<your name>: [questions|notes] <paper-id>
for example, "Peter Pan: questions Codd 1970".
- Codd 1970
E. F. Codd. A relational model of data for large shared data banks. Communications of the ACM. 1970.
In this foundational paper, E. F. "Ted" Codd introduces the relational model, which is "widely recognized as one of the great technical achievements of the 20th century" [ACM Turing Award for Codd in 1981].
- Selinger 1979
P. Griffiths Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. ACM SIGMOD. 1979.
System R pioneered the implementation of the relational data model. Patricia "Pat" Selinger gave a "Distinguished Profiles in Databases" interview.
- Goldstein 2001
J. Goldstein and P. Larson. Optimizing queries using materialized views: a practical, scalable solution. ACM SIGMOD. 2001.
This paper from the SIGMOD 2001 Conference describes a new technique for rewriting queries using materialized views. The paper introduces an efficient view matching algorithm and a new index structure, the filter tree, on view definitions that is used to determine which parts of a query can be computed from materialized views. The technique incurs very little penalty in the time taken to generate the query plan. The paper has been highly influential in subsequent research on the use of materialized views for query processing and in other contexts such as federated systems, probabilistic databases and XML databases; and it has influenced the design of query optimizers in at least two commercial database systems. [2011 SIGMOD Test of Time Award]
- Börzsönyi 2001
S. Börzsönyi, D. Kossmann, and K. Stocker. The Skyline Operator. IEEE ICDE. 2001.
Skyline computation (a.k.a. the maximum vector problem) is a fundamental concept in multi-criteria decision making. This highly influential paper opened a new research topic in the database community. It framed the skyline concept in a database setting and offered a study of fundamental techniques for skyline query processing. The paper laid a solid foundation for a multitude of studies that have refined the concept of skylining and proposed efficient implementations in a variety of settings. [ICDE 2011 Influential Paper Award]
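To make the skyline concept concrete before reading the paper, here is a minimal sketch in Python. It assumes that smaller values are better in every dimension, and it uses a naive quadratic comparison of all pairs rather than any of the optimized techniques studied in the paper. The names `dominates`, `skyline`, and the sample hotel data (price, distance to the beach in km) are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to the beach in km).
hotels = [(50, 2.0), (60, 1.0), (80, 0.5), (70, 1.5), (90, 2.5)]
result = skyline(hotels)
print(result)
```

In this sample, (70, 1.5) drops out because (60, 1.0) is both cheaper and closer, and (90, 2.5) drops out because (50, 2.0) beats it on both criteria. The remaining hotels each represent a different trade-off between price and distance.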
- Dean 2004
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. OSDI. 2004.
The MapReduce programming model has had a dramatic impact on the way large datasets are processed in distributed systems. MapReduce received much attention in the scientific community (the paper has received more than 10k citations), led to widespread implementations (e.g., Apache Hadoop), and is supported by a number of NoSQL database systems (e.g., MongoDB).
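As a rough illustration of the programming model, the following Python sketch simulates word counting in a single process. It only shows the division of work into a map function, a grouping step by key, and a reduce function; distribution over a cluster, fault tolerance, and everything else the real system handles are left out. The names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical and not part of any library.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word."""
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(word: str, counts: Iterable[int]) -> int:
    """Reduce: combine all values emitted for one key."""
    return sum(counts)


def run_mapreduce(documents: Dict[str, str]) -> Dict[str, int]:
    """Single-process stand-in for the framework: run map on each input,
    group the intermediate values by key, then run reduce per key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for name, text in documents.items():
        for key, value in map_fn(name, text):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical input documents.
docs = {"a.txt": "the quick brown fox", "b.txt": "the lazy dog and the fox"}
counts = run_mapreduce(docs)
print(counts)
```

The user of the model only writes the two small functions; the grouping step in `run_mapreduce` is the part that a real framework would carry out across many machines.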
- Chang 2006
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A Distributed Storage System for Structured Data. OSDI. 2006.