A list of learning materials for understanding database internals, including but not limited to:
- papers
- blogs
- courses
- talks
Please submit a pull request if there is any material that you think should be included in this collection.
- SQL & Relational Algebra
- Query Optimizer
- Query Execution
- DDL
- Transaction
- Network
- Storage
- Serializing & RPC
- Data Partitioning
- Replication & Consistency
- Consensus
- Scale & Balance
- Benchmark & Testing
Courses:
- CMU Database Systems (15-445/645), thanks to Andy Pavlo
- UC Berkeley Introduction to Database Systems
  - Introduction + SQL I
  - SQL II
  - Relational Algebra
Blogs:
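- 数据库内核杂谈 (Miscellaneous Talks on Database Kernels), thanks to 顾仲贤
- SQL优化器原理 - 查询优化器综述 (Principles of SQL Optimizers: A Survey of Query Optimizers), thanks to 勿烦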
Blogs:
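- 数据库内核杂谈 (Miscellaneous Talks on Database Kernels), thanks to 顾仲贤
- SQL 查询优化原理与 Volcano Optimizer 介绍 (Principles of SQL Query Optimization and an Introduction to the Volcano Optimizer), thanks to 张茄子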
- Cascades Optimizer, thanks to hellocode
Papers:
- 1979, Access Path Selection in a Relational Database Management System, SIGMOD
- 1986, Query Processing in Main Memory Database Management Systems, SIGMOD
- 1987, Query Optimization by Simulated Annealing, SIGMOD
- 1988, Grammar-like Functional Rules for Representing Query Optimization Alternatives, SIGMOD
- 1993, The Volcano Optimizer Generator: Extensibility and Efficient Search, ICDE
- 1995, The Cascades Framework for Query Optimization, IEEE Data Engineering Bulletin
- 1998, An Overview of Query Optimization in Relational Systems, PODS
- 2001, LEO – DB2’s LEarning Optimizer, VLDB
- 2004, Robust Query Processing through Progressive Optimization, SIGMOD
- 2014, Orca: A Modular Query Optimizer Architecture for Big Data, SIGMOD
- 2016, Parallelizing Query Optimization on Shared-Nothing Architectures, VLDB
- 2016, The MemSQL Query Optimizer: A modern optimizer for real-time analytics in a distributed database, VLDB
Papers:
- 1996, Modelling Costs for a MM-DBMS, in Real-Time Databases
- 2014, Approximation Schemes for Many-Objective Query Optimization, SIGMOD
- 2015, Multi-Objective Parametric Query Optimization, VLDB
Papers:
- 2005, An Improved Data Stream Summary: The Count-Min Sketch and its Applications, Journal of Algorithms
- 2007, New Estimation Algorithms for Streaming Data: Count-min Can Do More
- 2017, Adaptive Statistics in Oracle 12c, VLDB
Books:
Papers:
- 1994, Volcano: An Extensible and Parallel Query Evaluation System, IEEE Transactions on Knowledge and Data Engineering, February
- 2014, Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age, SIGMOD
Blogs:
- Overhead of a Generalized Query Execution Engine, from The Pivotal Engineering Journal, thanks to the Pivotal Engineering team
Papers:
- 2005, MonetDB/X100: Hyper-Pipelining Query Execution, CIDR
- 2011, Efficiently Compiling Efficient Query Plans for Modern Hardware, VLDB
- 2017, Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last, VLDB
- 2018, Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask, VLDB
Papers:
- 2013, Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited, VLDB
- 2017, Looking Ahead Makes Query Plans Robust, VLDB
- 2013, Online, Asynchronous Schema Change in F1, VLDB
Blogs:
- 一致性模型 (Consistency Models), thanks to siddontang
Papers:
- 1995, A Critique of ANSI SQL Isolation Levels, SIGMOD
Courses:
- CMU Database Systems (15-445/645), thanks to Andy Pavlo
- CMU Advanced Database Systems (15-721), thanks to Andy Pavlo
Papers:
- 2012, Serializable Snapshot Isolation in PostgreSQL, VLDB
- 2015, Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems, SIGMOD
- 2017, An Empirical Evaluation of In-Memory Multi-Version Concurrency Control, VLDB
- 2019, Scalable Garbage Collection for In-Memory MVCC Systems, VLDB
Courses:
- CMU Advanced Database Systems (15-721), thanks to Andy Pavlo
Papers:
- 2016, The End of Slow Networks: It's Time for a Redesign, VLDB
- 2016, Accelerating Relational Databases by Leveraging Remote Memory and RDMA, SIGMOD
- 2017, Don't Hold My Data Hostage: A Case for Client Protocol Redesign, VLDB
Blogs:
- On Disk IO, Part 1: Flavors of IO, thanks to Alex
- On Disk IO, Part 2: More Flavours of IO, thanks to Alex
- Read, write & space amplification - pick 2, thanks to Mark Callaghan
Papers:
- 2016, Design Tradeoffs of Data Access Methods, SIGMOD
- 2016, Designing Access Methods: The RUM Conjecture, EDBT
Courses:
Blogs: