If you've been following along, you have a multi-tenant application that is able to detect the tenant on each request. Now we need to provide the right data for each user, based on their tenant.
How much segregation is enough?
One strategy to multi-tenant data storage is to simply have a single database, but to use associations to control the siloing. A User belongs to a Tenant, and all of the models associated to that Tenant also have a belongs-to relationship. (In
ActiveRecord terms, you might use
belongs_to :tenant, through: something_else for more deeply-nested models.)
This works, and it has some advantages: you only maintain a single database; you can easily use all the tricks of improving database performance (caching, sharding, etc.) and only have to use them once; and you don't need to write any additional code.
However, all it takes is one programmer slip-up and a user is able to access someone else's data. If a
show actions finder doesn't check that the model it finds belongs to the right tenant/user, then it's possible for a user to simply change the id portion of their URL and see information from another tenant.
If data segregation is important, consider having multiple independent copies of the application's database. The downside to this is that it requires configuration changes to switch out the database connection on each request (reducing or eliminating the effectiveness of keeping a pool of connections) and can't make use of some of the RMDBS's own optimizations. The upside is that once the connection is switched, siloing is complete and guaranteed.
PostgreSQL has a feature known as "schemas" (no not,
schema.rb). While the term is overloaded in databaseland, in this case the feature is essentialy a namespace within a single database. The advantage here is that Rails can maintain a persistent connection, so the pool remains useful. A single command to the database switches schemas for subsequent requests on the same connection. Additionally, since the same database is in use, you don't actually need to spin up another instance – making it possible to dynamically add tenants quickly and easily, even on a host like Heroku.
In addition to preventing programmer slip-ups, completely segregating your database is especially advantageous when refactoring a single-tenant application to handle multiple tenants. By changing out the database entirely, the application's structure remains largely unmodified: there are no extra models for tenants, no logic to hide data from users from other tenants, etc.
Fully segregating your database is pretty simple, thanks to the Apartment gem. If you're working with PostgreSQL, it will employ schemas by default.
Apartment is well-documented, so I won't go into its specifics here. Its Github page goes into details on adding new tenant databases, switching databases, and maintaining certain schemas across all tenants (e.g. for application-wide common configuration, or storing a tenant list).
I do want to touch on Apartment's mechanisms for switching out databases.
Apartment ships with cleverly-named "elevators" for switching between tenants. (Get it? Ha ha … oh.)
As we find in Apartment's README, the library ships with three Elevators (and can build custom ones, too). They're simple to use. For example, to switch databases on subdomains, we invoke the appropriate middleware:
module MyApplication class Application < Rails::Application config.middleware.use 'Apartment::Elevators::Subdomain' end end
Elevators really are that simple – both fortunately and unfortunately. The downside is that they only switch the database. If you have data sources you need to modify, we recommend using
in your tenant-switching code – as we'll discuss in future posts.
Databases are just one component of a modern web application, but an important one. If it's feasible for your data model, keeping all of your tenants in a single database has several advantages, not the least of which is compatibility with all RMDBSs.
On the other hand, it's much simpler programmatically – especially if you're migrating a single-tenant app to be multi-tenant – to switch out the database in Rack before your Rails app begins handling the request; your application is able to remain largely similar to a single-tenant structure. Apartment makes this process very easy. As we'll see in later posts in this series, we can work it into a custom process that switches out all the application's data sources.
Our application was built on PostgreSQL, which has a very handy feature called "schemas." Confusingly named with the same term we use to describe the structure of a database, PostgreSQL's "schemas" act as namespaces within a database, and each can contain independent tables (even with the same name).
Using these schemas, we rearchitected the application to have a separate schema for each tenant, each with a complete copy of the app's database structure – albiet with different data. Thus our production database could have a
abc_inc.users table, but also an
A Ruby gem called 'Apartment' manages the details of pointing ActiveRecord at the right schema. Apartment also comes with its own Rack middleware units, cleverly named 'elevators'. Apartment's elevators are very simple to use, and well documented to boot. Three elevators are provided:
- Tenant switching by subdomain
- Switching by domain
- Switching via a
Procthat takes the request and returns a tenant name.
The first two use the subdomain or domain as the tenant name, and so also the database (or database schema, if you're using PostgreSQL schemas within one database). The third can be used to match each request to an arbitrary tenant identifier, and switch the database connection to the appropriate storage.
We chose not to use Apartment's elevator system to switch tenants. The built-in middleware works just fine to switch out the database connection, but for our application, we needed to switch out more features, so we wrote our own switcher, based on Apartment's tools:
def switch_database(tenant) Apartment::Database.switch tenant end
Since our app also relied on Redis, we used the namespacing provided by the
def switch_redis(tenant) if Redis.current.is_a? Redis::Namespace Redis.current.namespace = tenant else Redis.current = Redis::Namespace.new(tenant, redis: Redis.current) end end