How to Create a Multi-Tenant Database with Shared Table Structures

How to create a multi-tenant database with shared table structures?


However there are some companies of
course who fear that their data might
be compromised, so we are evaluating
other solutions.

This is unfortunate, as customers sometimes suffer from a misconception that only physical isolation can offer enough security.

There is an interesting MSDN article, titled Multi-Tenant Data Architecture, which you may want to check. This is how the authors addressed the misconception towards the shared approach:

A common misconception holds that
only physical isolation can provide an
appropriate level of security. In
fact, data stored using a shared
approach can also provide strong data
safety, but requires the use of more
sophisticated design patterns.

As for technical and business considerations, the article makes a brief analysis on where a certain approach might be more appropriate than another:

The number, nature, and needs of the
tenants you expect to serve all affect
your data architecture decision in
different ways. Some of the following
questions may bias you toward a more
isolated approach, while others may
bias you toward a more shared
approach.

  • How many prospective tenants do you expect to target? You may be nowhere
    near being able to estimate
    prospective use with authority, but
    think in terms of orders of magnitude:
    are you building an application for
    hundreds of tenants? Thousands? Tens
    of thousands? More? The larger you
    expect your tenant base to be, the
    more likely you will want to consider
    a more shared approach.

  • How much storage space do you expect the average tenant's data to occupy?
    If you expect some or all tenants to
    store very large amounts of data, the
    separate-database approach is probably
    best. (Indeed, data storage
    requirements may force you to adopt a
    separate-database model anyway. If so,
    it will be much easier to design the
    application that way from the
    beginning than to move to a
    separate-database approach later on.)

  • How many concurrent end users do you expect the average tenant to support?
    The larger the number, the more
    appropriate a more isolated approach
    will be to meet end-user requirements.

  • Do you expect to offer any per-tenant value-added services, such
    as per-tenant backup and restore
    capability? Such services are easier
    to offer through a more isolated
    approach.


UPDATE: Further to update about the expected number of tenants.

That expected number of tenants (10k) should exclude the multi-database approach, for most, if not all scenarios. I don't think you'll fancy the idea of maintaining 10,000 database instances, and having to create hundreds of new ones every day.

From that parameter alone, it looks like the shared-database, single-schema approach is the most suitable. The fact that you'll be storing just about 50Mb per tenant, and that there will be no per-tenant add-ons, makes this approach even more appropriate.

The MSDN article cited above mentions three security patterns that tackle security considerations for the shared-database approach:

  • Trusted Database Connections
  • Tenant View Filter
  • Tenant Data Encryption

When you are confident with your application's data safety measures, you would be able to offer your clients a Service Level Agrement that provides strong data safety guarantees. In your SLA, apart from the guarantees, you could also describe the measures that you would be taking to ensure that data is not compromised.

UPDATE 2: Apparently the Microsoft guys moved / made a new article regarding this subject, the original link is gone and this is the new one: Multi-tenant SaaS database tenancy patterns (kudos to Shai Kerer)

Multi Tenant Database with some Shared Data

A colleague gave me an insight that worked well. Instead of thinking about the tenant access as per-tenant think about it as group access. A tenant can belong to multiple groups, including it's own specified group. Data then belongs to a group, possibly the Tenant's specific group, or maybe a more general one.

So, "My Bank" would belong to the Tenant's group, "Local Bank" would belong to a regional grouping which the tenant has access to, and "Global Bank" would belong to the "Everyone" group.

This keeps integrity, FK's and also adds in the possibility of having hierarchies of tenants, not something I need at all in my scenario, but a nice little possibility.

Design database schema to support multi-tenant in MYSQL

There are multiple possible ways to design schema to support multi-tenant. The simplicity of the design depends on the use case.

Separate the data of every tenant (school) physically, i.e., one
schema must contain data related to only a specific tenant.

Pros:

  1. Easy for A/B Testing. You can release updates which require database changes to some tenants and over time make it available for others.
  2. Easy to move the database from one data-center to another. Support different SLA for backup for different customers.
  3. Per tenant database level customization is easy. Adding a new table for customers, or modifying/adding a field becomes easy.
  4. Third party integrations are relatively easy, e.g., connecting your data with Google Data Studio.
  5. Scaling is relatively easy.
  6. Retrieving data from one tenant is easy without worrying about the mixing up foreign key values.

Cons:

  1. When you have to modify any field/table, then your application code needs to handle cases where the alterations are not completed in some databases.
  2. Retrieving analytics across customers becomes difficult. Designing Queries for usage analysis becomes harder.
  3. When integrating with other databases system, especially NoSQL, you will need more resources. e.g., indexing data in Elasticsearch for every tenant will require index per tenant, and if there are thousands of customers, it will result in creating thousands of shards.
  4. Common data across tenants needs to be copied in every database

Separate data for every tenant (school) logically, i.e., one schema
contains data for all the tenants.

Pros:

  1. Software releases are simple.
  2. Easy to query usage analytics across multiple tenants.

Cons:

  1. Scaling is relatively tricky. May need database sharding.
  2. Maintaining the logical isolation of data for every tenant in all the tables requires more attention and may cause data corruption if not handled at the application level carefully.
  3. Designing database systems for the application that support multiple regions is complicated.
  4. Retrieving data from a single tenant is difficult. (Remember: all the records will be associated with some other records using foreign keys.)

This is not a comprehensive list. These are based on my experiences with working on both the type of designs. Both the designs are common and are used by multiple organization based on the usecase.



Related Topics



Leave a reply



Submit