[ACCEPTED]-Multitenant DB: Why put a TenantID column in every table?-multi-tenant
If one of your key design considerations 25 is security--specifically, one client can 24 no way no how no when access another client's 23 data--then, depending on how you implement 22 this security, sticking that qualifying 21 column in every table may be necessary. One 20 such tactic described here requires building 19 a view on every table; assuming each table 18 contains a tenantId column, then if properly 17 configured each view could contain a "WHERE 16 tenantId = SUSER_SID()" clause (and 15 of course you configure the database so 14 that clients can only access the views).
Another 13 factor (as in my current job) is loading 12 warehouse data (ETL). Tables are partitioned 11 on tenantId (we use table partitioning, but 10 partitioned views would also work), and 9 data can be easily loaded or unloaded for 8 a client without seriously impacting any 7 other client.
But as ever, there's a lot 6 of "it depends" involved. If there 5 is no clear and present need, and a very low 4 likelihood of future need, then normalize 3 that column out. Just realize that it's 2 more a devise of physical implementation 1 than of conceptual or logical database design.
Its there for convenience and performance 12 - in terms of normalisation you're absolutely 11 right, it only needs to go in at the top. The 10 problem then becomes that to get to some 9 data (say zoo -> animal -> food -> supplier) you 8 have to have horribly complex joins on what 7 are notionally very simple queries.
So in 6 the real world one has to compromise - question 5 then becomes where and to what extent.
See 4 this article Maybe Normalizing Isn't Normal - and its conclusion:
As the 3 old adage goes, normalize until it hurts, denormalize 2 until it works
as a place to start exploring 1 the subject
If I had tenantID at the top of the hierarchy 2 (i.e. at the zoo level) you have several 1 issues to consider.
- The top of the hierarchy can never change, for example if you need to add a node on the tree above the zoo level (say regions -> zoos -> animals) then it will force a re-org every time.
- For certain queries, you will be forced to start at the top of the hierarchy, i.e. give me a list of all animals available will force you to start at the top of the tree
- Why not use schemas ? Each tenant is isolated within their own schema. This will also separate the data-sets nicely.
The first thing that springs to mind is 19 that it's slower to look up animals > zoos > tenants
than simply 18 animals > tenants
. And most likely this is a lookup you will 17 do often (for example, "get all animals for a 16 certain tenant, regardless of zoo").
For 15 small to mid-sized applications you can 14 get away with a more normalized structure, but 13 for the sake of efficiency, you should go 12 with extraneous data (and generally speaking, multitenancy 11 applications are not small). Just make sure 10 it doesn't go "out of sync", which is a 9 risk that comes with having redundant data.
To 8 answer your last paragraph, the reason is 7 performance, pure and simple. Joins are 6 no bad thing; they help you keep a piece 5 of data in one place rather than three. It's 4 definitely not to prevent bugs. Adding a 3 tenant_id
field to more tables will increase the 2 risk of bugs (although for an id that never 1 changes, it wouldn't be as much of an issue).
Well, Bob may own a giraffe in zoo No1, while 2 Joe may own a lion in the same zoo. They 1 are not supposed to look at each others' data.
More Related questions
We use cookies to improve the performance of the site. By staying on our site, you agree to the terms of use of cookies.