Rails Multiple Databases Gotchas
June 12, 2023
In Rails 6, we were introduced to the amazing feature of handling multiple databases, streamlining the process without requiring extensive work, just some configuration.
Recently, I attended a talk discussing a project where a Rails monolith easily connected to over 5 databases. After the talk, I had a conversation with some people who raised an interesting question about consistency and transactions.
How do we handle transactions across databases? Does ActiveRecord automatically handle that for us, or do we need to tackle this task ourselves?
The answer at the time was that using ActiveRecord transaction blocks was sufficient. But is it? This article aims not to teach you how to use multiple databases, but to shed light on potential issues that can be easily overlooked.
Expected Behavior
When using Rails, we consciously choose to let the framework handle a significant portion of the burden for us. This is one of Rails’ strong points, but sometimes we assume it will do everything automatically. If we use a transaction block with a single database connection, we know that any error within the block will rollback any changes made. However, what happens if we have multiple database connections?
Since Rails simplifies handling these connections, it would make sense for it to handle transactions in the same way, right? Let’s consider an example. Suppose we have an app with two models, each residing in its own database. On one side, we have a User model in the users database, and on the other side, we have a Post model in the posts database. We’ll use a simple example where we create a User and raise a rollback to see if it behaves as expected:
User.count # => 0
User.transaction do
User.create(email: 'test@example.com')
raise ActiveRecord::Rollback
end
User.count # => 0
This example successfully rolls back the created user. So, even with multiple database connections, as long as the transaction block handles only one connection, it behaves in the same way.
What can go wrong?
The problems arise when multiple connections are involved in the transaction. If we need to modify models residing in different databases within the same transaction block, ActiveRecord doesn't have the ability to infer rollbacks correctly. Let's examine an example:
User.count # => 0
Post.count # => 0
User.transaction do
User.create(email: 'test@example.com')
Post.create(title: 'Test')
raise ActiveRecord::Rollback
end
User.count # => 0
Post.count # => 1
That doesn’t look right, does it?
Since the transaction is using the User model database connection it only rollbacks the changes for that connection, so we end up with an inconsistent state in the posts database.
What if we use nested transactions?
Great question! Lets test it.
User.count # => 0
Post.count # => 0
User.transaction do
Post.transaction do
Post.create(title: 'Test')
raise ActiveRecord::Rollback
end
User.create(email: 'test@example.com')
end
User.count # => 0
Post.count # => 0
Success! Well, not really. It may appear that we have a workaround to the issue but this doesn’t work if the one raising the error is the User creation.
User.count # => 0
Post.count # => 0
User.transaction do
Post.transaction do
Post.create(title: 'Test')
end
User.create(email: 'test@example.com')
raise ActiveRecord::Rollback
end
User.count # => 0
Post.count # => 1
We can see that if the Post creation succeeds but the User one doesn’t we still end with an inconsistent state, even when using nested transactions.
Solutions Exist
There are a few potential approaches you can consider to achieve distributed transaction-like behavior in this scenario. If your database supports distributed transactions, such as with PostgreSQL's two-phase commit (2PC), coordinating the transaction across multiple databases becomes possible. However, this approach requires both databases to support 2PC.
Another approach is to use the Saga pattern, a compensating transaction-based pattern that manages distributed transactions through explicit compensating actions. In this pattern, you would define and execute compensating actions to rollback or compensate for the changes made by each step of the distributed transaction. Usually, the Saga pattern is implemented with events.
I won't delve into the implementation details of these solutions in this article, as it can become quite lengthy. But no worries, a follow-up article on Saga is already in the works, so stay tuned for that.
Conclusions
In conclusion, while Rails offers a convenient feature for handling multiple databases, it’s crucial to be aware of the potential challenges that can arise, particularly when it comes to transactions across databases. ActiveRecord’s default behavior may not guarantee consistent rollbacks when multiple connections are involved.
To ensure data integrity and consistency, it’s important to explore alternative solutions. If your database supports distributed transactions, such as two-phase commit (2PC), coordinating the transaction across multiple databases becomes possible. Alternatively, you can consider implementing the Saga pattern, a compensating transaction-based pattern that manages distributed transactions through explicit compensating actions.
It’s essential to thoroughly evaluate your application’s requirements and the capabilities of your databases when working with multiple databases in Rails. Stay informed about best practices, and consider following up with the upcoming Saga pattern article, which will delve deeper into resolving these issues.
Keep learning, exploring, and experimenting to enhance your database skills, and stay tuned for more insights on tackling these challenges.