Fully-Managed Serverless SQL with Aurora DSQL - A Deep-Dive into Its Internals

Fully-Managed Serverless SQL with Aurora DSQL - A Deep-Dive into Its Internals

What if there would be something like DynamoDB but for SQL?

With Amazon Aurora DSQL, this is finally possible. It's not just a "scales-to-zero" solution like Aurora Serverless V2. It is a true distributed, serverless, pay-per-use database. It also claims unlimited scalability, 99.99% single-region availability, and 99.999% multi-region availability.

As with many of our other articles, we’ve also provided you with a demo application. This will help you start a simple application using an Aurora DSQL database in your own AWS account.

💡
We’ll extend this repository with our upcoming articles to DSQL.

Introduction

At AWS re:Invent 2024, Marc Brooker, AWS VP, and Distinguished Engineer presented a deep-dive session on Aurora DSQL’s internals. In this talk, he states that Amazon Aurora DSQL is

A relational SQL database optimized for transactional workloads.

Let’s explore what this means from AWS's promotional perspective.

The Promises of Aurora DSQL

Aurora comes with a huge set of promises

  • 🔀 Automatic Scaling: Aurora is designed to scale automatically, based on the demand. Amazon promises that you can start with Aurora on the green field without worrying that you’ll hit limits in the future.

  • ⚡️ Serverless: Aurora joins the ranks of other serverless services such as S3, Lambda, DynamoDB, and SQS. This means no patching of the system, no creation of read-replicates, no maintenance downtimes, and no infrastructure management. You have to create your cluster and you can run queries against the endpoint of your database.

  • 🌎 Active-Active & Multi-Region: Your database can span multiple regions while still having strongly consistent transactions.

  • 🐘 PostgreSQL Compatible: Most of the PostgreSQL surface is compatible with Aurora DSQL. This means you should be able to use the clients and ORMs you already know and love.

Let’s explore how Aurora exactly wants to achieve all of this in the following paragraphs.

Deep Dive Into the Internals

Let’s dive into the internal based on Marc’s talk.

Respecting the ACID Principles

Generally, every database wants to respect the ACID principle:

  • Atomicity: Ensures that all operations in a transaction are completed; if any part fails, the entire transaction fails and the database state is unchanged.

  • Consistency: Ensures that a transaction brings the database from one valid state to another, maintaining all predefined rules, including constraints and cascades.

  • Isolation: Ensures that transactions are executed independently of one another. The intermediate state of a transaction is invisible to other transactions.

  • Durability: Ensures that once a transaction has been committed, it will remain so, even in the event of a system failure.

These are also the main goals for Aurora DSQL. But how does Aurora achieve this? Marc explains this nicely with some simple examples.

The Journal - A Log As the Database

Aurora stores everything in what is called a Journal. This is a record of every transaction ever made, serving as a history of all changes.

This Journal is an internal log server, that is not visible to customers. It’s been built by Amazon for over 10 years. It also powers S3, Kineses, DynamoDB, Lambda, and other AWS services. It’s an “atomic, distributed, scalable, replication system”. With this, Amazon doesn’t need to “re-solve” these really hard problems around atomicity and durability.

Inserting records in the Journal (log storage) of Aurora DSQL

But what about isolation if another transaction tries to insert a record with the same primary key, causing a conflict? That's where another component in Aurora's internal architecture comes into play.

The Adjudicator - Ensuring Isolation & Avoiding Conflicts

Aurora needs to avoid these conflicting transactions both commit. They need to be isolated.

The distributed layer that is responsible for this is the Adjudicator. Its job is to look for conflicts of recently committed transactions. Only if the adjudicator accepts a transaction, the commit will be successful.

The adjudicator avoids that conflicts can be successfully committed to the Journal.

The best part of every distributed layer in Aurora is that it's scalable. This means Aurora can create as many Adjudicators as needed as the database and the transaction rate grows.

The Adjudicator is scalable, so it can grow together with the database and transaction rate.

If there are multiple adjudicators, they communicate with each other using a distributed commit protocol designed from decades of AWS experience.

Each part of Aurora DSQL is designed to ensure that customers do not face limits, regardless of their capacity needs. Internally, this is achieved by partitioning the key space in a way that specializes in detecting conflicting transactions. This means that even though this component (like the Journal) is used in multiple AWS services, it is always optimized for the specific use case it needs to serve.

The Storage - Enhancing Query Performance

As mentioned in the beginning, the Journal is a log that contains all changes that have been made to the database. Querying from a log is neither fun nor efficient. It will be very slow and will always take o(n) time. That's why Aurora takes these Journal entries and indexes them. It builds a data structure on top of them that is optimized for querying data.

What Marc repeatedly mentions: the storage does not need to be durable. Aurora can lose every bit of data in the storage.

  • It’s not responsible for durability - that’s the Journal’s responsibility.

  • It’s also not responsible for concurrency control - that’s the Adjudicator’s responsibility.

The storage is simply there to provide an efficient way to query data. It doesn't need to handle anything else. Unlike other database engines, Aurora has separated these responsibilities into distinct layers that are scalable and operate independently.

The storage layer of Aurora is as scalable as every other layer.

How does the storage achieve the scalability? With the traditional way of partitioning keys across the storage. This means it can add as much storage as necessary when the database or workload grows.

💡
The Journal can also scale out! There can be as many Journals as needed.

Operations within a transaction are done via a query processor that runs on the storage itself. It will perform operations like

  • Getting a single record (key-value pair)

  • Getting all records (scan)

  • Counting all records (aggregate)

  • etc…

Since this runs directly on the storage, the SQL engine needs to communicate with the storage much less often. This greatly improves performance.

Aurora Works With a Full-Powered Compute Service: Lambda

Let’s have a look at a fairly simple SQL transaction:

BEGIN;
SELECT * FROM employees WHERE id = 1;
UPDATE employees SET salary = salary + 5000 WHERE id = 1;
INSERT INTO employees (name, position, salary) VALUES ('Jane Doe', 'Developer', 75000);
DELETE FROM employees WHERE id = 2;
COMMIT;

It consists of multiple things:

  1. BEGIN - Starts a new transaction.

  2. SELECT - Retrieves all columns for the employee with id = 1 from the employees table.

  3. UPDATE - Increases the salary of the employee with id = 1 by 5000.

  4. INSERT - Adds a new employee record with the name 'Jane Doe', position 'Developer', and salary of 75000 to the employees table.

  5. DELETE - Removes the employee record with id = 2 from the employees table.

  6. COMMIT - Saves all the changes made during the transaction to the database.

With SQL, we can accomplish many tasks. Between each SQL command, we might also execute additional business logic, which means we have a round-trip to our JavaScript (or whatever programming language we’re using). We could even run computations within SQL itself, as it’s a full-powered programming language itself!

This is what makes implementing scalable SQL databases very complicated. It also means Aurora needs a full-powered computing service.

What would be a good fit? AWS Lambda.

Diagram showing a "Transaction and session router" directing to three "Query Processor" units labeled "Firecracker."

At the front is the transaction and session router, which uses the Postgres protocol to direct the transaction to the correct destination. The compute layer consists of Firecracker micro-VMs, which also power Lambda and container services.

As you probably already guessed: these are also scalable independently!

Isolation of Reads

How does Aurora make sure that transactions are isolated from each other?

It’s done via the Query Processor and via the AWS Time Sync Service.

Diagram showing a "Query Processor" connected to two "Storage" units, with arrows indicating data flow. A clock labeled "Time t" is underneath the processor. Text reads "Read at time t."

The query processor will read the clock at the beginning of the transaction and will then use the time to read the records for this exact time. Or better said: “Hey storage - do these things for me, at time t”.

This is possible, as each storage not only stores the current version of a record but also its history. This means, Aurora supports Multiversioning.

Since the time remains consistent throughout the entire transaction, it can access any storage node and still retrieve a consistent snapshot of the data at that specific time, all without communication between the nodes.

The Overall Picture

Let’s sum up the internals of Aurora DSQL in a single picture:

The layers are completely decoupled and can scale independently of other layers. So if the workload is…

  • ⬆️ read heavy: storage can scale out with more replicas

  • ⬇️ write heavy: storage will be split up into more places and more journals

  • ⚡️ running a lot of SQL commands: there will be more query processors

Aurora can always meet the demands of the workload.

Limitations

Currently, Amazon Aurora DSQL is in preview mode. This means can use it for free. Nevertheless, it comes with many limitations (that may be temporary).

  • No mention of the upcoming pricing.

  • Postgres compatibility is not 100% - see the list of known issues.

  • Transaction limit of 10k rows.

  • No foreign keys.

  • No views.

  • No temporary tables.

You can also find the complete list of cluster and database limits in the documentation.

Bonus: Drizzle ORM

Drizzle ORM is a lightweight, TypeScript-first ORM for SQL databases.

It helps with type safety and simplicity so you can write SQL queries in a type-safe manner while relying on TypeScript's features. Drizzle ORM supports various SQL databases - including Postgres - and aims to provide a straightforward API for interacting with them.

We’re also using Drizzle in our example repository.

If you’ve bootstrapped the table via bootstrap-db.sh, you can run the drizzle commands easily via the provided NPM scripts:

  • pnpm run db:schema:generate: generating the types for your schema.

  • pnpm run db:schema:migrate: generating the SQL migration file.

  • pnpm run db:schema:push: applying the changes to the actual database.

  • pnpm run db:schema:studio: running the database viewer in your local browser.

With Drizzle Studio, you can explore your database and run queries.

Further Reading

There are numerous excellent resources available for those interested in learning more about Amazon Aurora DSQL and its capabilities. This article mostly summaries information from the original resources that are listed in the following:

Frequently Asked Questions

  1. What is Amazon Aurora DSQL? Amazon Aurora DSQL is a fully managed, serverless, distributed SQL database service offered by AWS. It is designed to handle transactional workloads with automatic scaling, pay-per-use pricing, and high availability across multiple regions.

  2. How does Aurora DSQL ensure data consistency and durability? Aurora DSQL ensures data consistency and durability through its use of a Journal, which logs every transaction. The Adjudicator component manages transaction isolation and conflict resolution, while the storage layer is optimized for efficient data querying.

  3. What are the main limitations of Aurora DSQL in its preview mode? In its preview mode, Aurora DSQL has several limitations, including a transaction limit of 10,000 rows, lack of support for foreign keys, views, and temporary tables, and incomplete PostgreSQL compatibility. Pricing details are also not yet available.

  4. How does Aurora DSQL achieve scalability? Aurora DSQL achieves scalability by decoupling its components, allowing each layer (such as the Journal, Adjudicator, and storage) to scale independently. This design enables the system to handle varying workloads efficiently, whether they are read-heavy, write-heavy, or involve complex SQL commands.

  5. What is Drizzle ORM, and how can it be used with Aurora DSQL? Drizzle ORM is a lightweight, TypeScript-first ORM for SQL databases, including PostgreSQL. It provides type safety and a straightforward API for interacting with databases. In the context of Aurora DSQL, Drizzle ORM can be used to generate types, create migrations, and manage database schemas efficiently.