Shared Schema Strategy With Postgres (Part 1)

Apply the shared schema strategy in Postgres by using the row-level security policy.

Szymon Tarnowski

CORE ·

Feb. 09, 21 · Tutorial

Likes (5)

Comment

Save

12.0K Views

The shared schema strategy is one of the main multi-tenancy strategies besides the separate database or separate schema. Although there is no limitation, there has to be only one schema in the database, but all tenants should have the same access. The strategy assumes that all tables in a database (except tables that store vocabulary data or data available for all tenants) have a column that stores tenant identifiers. Based on this value column, we know which tenant is the owner of the table row.

Obviously, this approach offers a lower isolation level than the separate database and shared schema strategies but, just like the shared schema strategy, saves infrastructure costs. In comparison to the shared schema strategy, we can say that, although Postgres can handle multiple schemas, we can find comments from people who use this strategy with Postgres for a larger number of tenants. The maintenance for this approach, however, is harder than when the schema is shared between tenants. Executing DDL scripts for hundreds of schemas might be easily automated, but still, it might create complexity and slower deployment. It does not mean that executing DDL scripts for a database with a shared schema approach is error-free.

Implementation

The shared schema strategy assumes that all tables shared by the tenant have a column that defines its owner. This means that the value of this column has to be checked in each SQL query and operation. The column must be checked during the SELECT statement execution as well as all operations that modify data like UPDATE, INSERT and DELETE.

One approach would be that the application should handle these checks. In every SQL statement sent to the database, the application must attach a condition that checks the tenant column's value. It might be easier if the application uses some ORM framework. But still, if the application code contains any custom query, the developer has to be aware that he needs to add checks for the tenant identifier column. The pros for this approach are that if we would ever consider changing the database engine for the application, it would be quick and easy. However, there are cons for such a situation. Say, for example, we had many applications that use the same database, then ll applications should have implemented tenant column checks; not to mention if those applications are written in different programming languages.

The Implementation That Uses a Database Engine

Another approach similar to the previous one, but from a different side, is when the database engine is doing tenant column value checks. Postgres has a feature called the row security policy that allows for the implementation of such a solution. Probably the biggest advantage of this approach is that it takes a huge amount of responsibility from the application code. Of course, the application needs to implement how to create connections per specific tenants. But, despite that, the programmer does not have to worry about the tenant column's missing condition in the SQL statement unless the SQL statement is so custom that the developer changes the tenant during its execution.

Implementation Tips

No matter if we decide to choose the application approach or database approach, it needs to be mentioned that our database constraints need to be wisely designed, especially those constraints that check if the record values are unique. Designing table primary keys that would contain tenant columns is also worth considering; at least for tables that store data of different tenants. Dictionary tables used by all tenants don't need to have such requirements.

Posmulten

Posmulten is an open-source project that generates DDL statements that create a shared schema strategy for your project that uses the Postgres database. Posmulten uses a row security policy mechanism to handle tenant data isolation. Below there's a script example with DDL statements that create a simple schema. In this script, all tables have a primary key, which part is the tenant column. There is also the possibility to generate a shared schema strategy for a model where the tenant column is not a part of the primary key. Such an approach might be easier to adopt to existed applications. This approach will be described in the next part of this article.

     SQL 
   
 
  
    
 
      
 
     
 
      
 
     
 
     
 
    
 
          x 
         
 
          
 
         
 
          
 
         

              
          
CREATE TABLE public.user_info
(
    id bigint NOT NULL,
    tenant character varying(255) COLLATE pg_catalog."default" NOT NULL,
    name character varying(32) COLLATE pg_catalog."default" NOT NULL,
    manager_id bigint,
    CONSTRAINT user_info_pkey PRIMARY KEY (id, tenant),
    CONSTRAINT manager_fk FOREIGN KEY (manager_id, tenant)
        REFERENCES public.user_info (id, tenant) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.user_info
    OWNER to "postgresql-core-owner";
    
CREATE TABLE public.groups
(
    id bigint NOT NULL,
    tenant character varying(255) COLLATE pg_catalog."default" NOT NULL,
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT groups_pkey PRIMARY KEY (id, tenant)
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.groups
    OWNER to "postgresql-core-owner";
    
CREATE TABLE public.posts
(
    id bigint NOT NULL,
    tenant character varying(255) COLLATE pg_catalog."default" NOT NULL,
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    user_id bigint,
    CONSTRAINT posts_pkey PRIMARY KEY (id, tenant),
    CONSTRAINT users_fk FOREIGN KEY (tenant, user_id)
        REFERENCES public.user_info (tenant, id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.posts
    OWNER to "postgresql-core-owner";
    
CREATE TABLE public.user_groups
(
    user_id bigint,
    group_id bigint,
    tenant character varying(255) COLLATE pg_catalog."default",
    CONSTRAINT group_fk FOREIGN KEY (group_id, tenant)
        REFERENCES public.groups (id, tenant) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION,
    CONSTRAINT user_fk FOREIGN KEY (tenant, user_id)
        REFERENCES public.user_info (tenant, id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.user_groups
    OWNER to "postgresql-core-owner";
 
      
 
     

As Posmulten is a Java library, there is a wrapper script that us it, and based on the configuration file, it creates two files with DDL statements (create_script.sql and drop_script.sql).

The drop_script.sql file contains statements that drop all changes applied by the create_script.sql file. Be in mind that those instructions are not compensations operations. They do not restore state before applying the create_script.sql script. Below there is an example of a configuration file. Please check the site for project configuration-yaml-interpreter to know how to prepare a configuration file.

     YAML 
   
xxxxxxxxxx
 
default_schema: public
current_tenant_id_property_type:  "VARCHAR(255)"
default_tenant_id_column: "tenant"
grantee: "postgresql-core-user"
set_current_tenant_identifier_as_default_value_for_tenant_column_in_all_tables: true
valid_tenant_value_constraint:
  is_tenant_valid_function_name:  is_t_valid
  is_tenant_valid_constraint_name:  "is_tenant_valid_constraint_SDFA"
  tenant_identifiers_blacklist:
    - "DUMMY_TENANT"
tables:
  - name: user_info
    rls_policy:
      name: users_table_rls_policy
  - name: posts
    rls_policy:
      name: "posts_table_rls_policy"
  - name: groups
    rls_policy:
      name: groups_table_rls_policy
  - name: user_groups
    rls_policy:
      name: user_groups_table_rls_policy

After executing the wrapper script for our configuration file:

     Shell 
   
          x 
         
posmulten-ddl.sh configuration.yaml

We get the create_script.sql script just like below:

     SQL 
   
 
  
    
 
      
 
     
 
      
 
     
 
     
 
    
xxxxxxxxxx
 
          
 
         
 
          
 
         

              
          
CREATE OR REPLACE FUNCTION public.get_current_tenant_id() RETURNS VARCHAR(255) AS $$
SELECT current_setting('posmulten.tenant_id')
$$ LANGUAGE sql
STABLE
PARALLEL SAFE;
CREATE OR REPLACE FUNCTION public.set_current_tenant_id(VARCHAR(255)) RETURNS VOID AS $$
BEGIN
PERFORM set_config('posmulten.tenant_id', $1, false);
END
$$ LANGUAGE plpgsql
VOLATILE;
CREATE OR REPLACE FUNCTION public.is_id_equals_current_tenant_id(VARCHAR(255)) RETURNS BOOLEAN AS $$
SELECT $1 = public.get_current_tenant_id()
$$ LANGUAGE sql
STABLE
PARALLEL SAFE;
CREATE OR REPLACE FUNCTION public.tenant_has_authorities(VARCHAR(255), VARCHAR(255), VARCHAR(255), VARCHAR(255), VARCHAR(255)) RETURNS BOOLEAN AS $$
SELECT public.is_id_equals_current_tenant_id($1)
$$ LANGUAGE sql
STABLE
PARALLEL SAFE;
CREATE OR REPLACE FUNCTION public.is_t_valid(VARCHAR(255)) RETURNS BOOLEAN AS $$
SELECT $1 <> CAST ('DUMMY_TENANT' AS VARCHAR(255))
$$ LANGUAGE sql
IMMUTABLE
PARALLEL SAFE;
ALTER TABLE public."user_info" ENABLE ROW LEVEL SECURITY;
ALTER TABLE public."user_groups" ENABLE ROW LEVEL SECURITY;
ALTER TABLE public."posts" ENABLE ROW LEVEL SECURITY;
ALTER TABLE public."groups" ENABLE ROW LEVEL SECURITY;
CREATE POLICY users_table_rls_policy ON public.user_info
FOR ALL
TO "postgresql-core-user"
USING (public.tenant_has_authorities(tenant, 'ALL', 'USING', 'user_info', 'public'))
WITH CHECK (public.tenant_has_authorities(tenant, 'ALL', 'WITH_CHECK', 'user_info', 'public'));
CREATE POLICY user_groups_table_rls_policy ON public.user_groups
FOR ALL
TO "postgresql-core-user"
USING (public.tenant_has_authorities(tenant, 'ALL', 'USING', 'user_groups', 'public'))
WITH CHECK (public.tenant_has_authorities(tenant, 'ALL', 'WITH_CHECK', 'user_groups', 'public'));
CREATE POLICY posts_table_rls_policy ON public.posts
FOR ALL
TO "postgresql-core-user"
USING (public.tenant_has_authorities(tenant, 'ALL', 'USING', 'posts', 'public'))
WITH CHECK (public.tenant_has_authorities(tenant, 'ALL', 'WITH_CHECK', 'posts', 'public'));
CREATE POLICY groups_table_rls_policy ON public.groups
FOR ALL
TO "postgresql-core-user"
USING (public.tenant_has_authorities(tenant, 'ALL', 'USING', 'groups', 'public'))
WITH CHECK (public.tenant_has_authorities(tenant, 'ALL', 'WITH_CHECK', 'groups', 'public'));
ALTER TABLE "public"."user_info" ADD CONSTRAINT is_tenant_valid_constraint_SDFA CHECK (tenant IS NULL OR public.is_t_valid(tenant));
ALTER TABLE "public"."user_groups" ADD CONSTRAINT is_tenant_valid_constraint_SDFA CHECK (tenant IS NULL OR public.is_t_valid(tenant));
ALTER TABLE "public"."posts" ADD CONSTRAINT is_tenant_valid_constraint_SDFA CHECK (tenant IS NULL OR public.is_t_valid(tenant));
ALTER TABLE "public"."groups" ADD CONSTRAINT is_tenant_valid_constraint_SDFA CHECK (tenant IS NULL OR public.is_t_valid(tenant));
ALTER TABLE public.user_info ALTER COLUMN tenant SET DEFAULT public.get_current_tenant_id();
ALTER TABLE public.user_groups ALTER COLUMN tenant SET DEFAULT public.get_current_tenant_id();
ALTER TABLE public.posts ALTER COLUMN tenant SET DEFAULT public.get_current_tenant_id();
ALTER TABLE public.groups ALTER COLUMN tenant SET DEFAULT public.get_current_tenant_id();
 
      
 
     

When the superuser or table owner can apply changes, the security policy is designed for the "postgresql-core-user" user, which should not be a superuser. However, the user can be the table owner. Now let's try to insert some data.

For this article's purpose, we assume that the "postgresql-core-user" user has privileges that allow modifying tables (INSERT, SELECT, UPDATE, DELETE) in the schema for which DDL statements were applied.

Insert Data

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
INSERT INTO user_info(id, tenant, name) VALUES (1, 'XXX', 'manager1');
INSERT INTO user_info(id, tenant, name, manager_id) VALUES (2, 'XXX', 'user1', 1);
INSERT INTO user_info(id, name) VALUES (3, 'user 2');
SELECT set_current_tenant_id('YYY');
INSERT INTO user_info(id, name) VALUES (1, 'manager1');
INSERT INTO user_info(id, name, manager_id) VALUES (2, 'user1', 1);
INSERT INTO user_info(id, name) VALUES (3, 'second user for YYY');
SELECT set_current_tenant_id('XXX');
INSERT INTO posts(id, tenant, name) VALUES (1, 'XXX', 'post 1');
INSERT INTO posts(id, tenant, name, user_id) VALUES (2, 'XXX', 'post 2', 1);

The above script is an example of how data should be inserted by the "postgresql-core-user" The first thing that catches our attention is calling the set_current_tenant_id(?) method. Btw, all names of methods generated by the posmulten library can be customized. The function sets a tenant identifier for the current connection session. You may notice that some INSERT instruction does not have specified the tenant column in the VALUES section. The posmulten library can generate the DDL statement that sets the default value defined for the tenant column.

Executing SQL statement without setting tenant:

Let's try to check what will happen when the tenant is not being set, and we are trying to add data.

     SQL 
   
xxxxxxxxxx
 
INSERT INTO user_info(id, tenant, name) VALUES (4, 'XXX', 'user4');

We can get such error:

     PLSQL 
   
xxxxxxxxxx
 
ERROR:  unrecognized configuration parameter "posmulten.tenant_id"
SQL state: 42704

Executing SQL statement with setting different tenant:

Let's try to check what will happen when we set a different tenant than an INSERT statement:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('YYY');
INSERT INTO user_info(id, tenant, name) VALUES (4, 'XXX', 'user4');

We can get such error:

     PLSQL 
   
xxxxxxxxxx
 
ERROR:  new row violates row-level security policy for table "user_info"
SQL state: 42501

Modify Data

Below there is an example of how to execute UPDATE instruction:

     SQL 
   
 x 
         
SELECT set_current_tenant_id('XXX');
UPDATE user_info SET name = 'main manager' WHERE id = 1

Output:

     PLSQL 
   
xxxxxxxxxx
 
UPDATE 1

Based on the output, we can see that one record was updated.

Executing UPDATE statement for data that belongs to the different tenant:

Below there is an example where we executed UPDATE instruction for a tenant that does not have any records. However, the rows with the specified value for id exists.

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('No_such_tenant');
UPDATE user_info SET name = 'Boss' WHERE id = 1;

Output:

     PLSQL 
   
xxxxxxxxxx
 
UPDATE 0

Based on the output, we can see that no record was updated.

Executing SQL statement without setting tenant:

Let's try to check what will happen when the tenant is not being set, and we are trying to modify data.

     SQL 
   
xxxxxxxxxx
 
UPDATE user_info SET name = 'Boss' WHERE id = 1;

We can get such error, the same which had with the similar case for INSERT statement:

     PLSQL 
   
xxxxxxxxxx
 
ERROR:  unrecognized configuration parameter "posmulten.tenant_id"
SQL state: 42704

Searching Data

Let's add some more data for the presentation:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
INSERT INTO posts(id, name, user_id) VALUES (11, 'post 11', 1);
INSERT INTO posts(id, name, user_id) VALUES (12, 'post 12', 1);
INSERT INTO posts(id, name, user_id) VALUES (13, 'post 13', 2);
SELECT set_current_tenant_id('YYY');
INSERT INTO posts(id, name, user_id) VALUES (21, 'post 21', 1);

Selecting data for "XXX" tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
SELECT COUNT(*) FROM posts;

Output:

     PLSQL 
   
xxxxxxxxxx
 
"5"

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
SELECT * FROM posts;

Output:

Selecting data for "YYY" tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('YYY');
SELECT COUNT(*) FROM posts;

Output:

     PLSQL 
   
xxxxxxxxxx
 
"1"

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('YYY');
SELECT * FROM posts;

Output:

Search statement with JOIN instruction:

Selecting data for "XXX" tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
SELECT u.id, u.tenant, u.name, p.* FROM user_info u JOIN posts p ON u.id = p.user_id;

Output:

Selecting data for "YYY" tenant:

     SQL 
   
 x 
         
SELECT set_current_tenant_id('YYY');
SELECT u.id, u.tenant, u.name, p.* FROM user_info u JOIN posts p ON u.id = p.user_id;

Output:

Search statement with LEFT JOIN instruction:

Selecting data for "XXX" tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
SELECT u.id, u.tenant, u.name, p.* FROM user_info u LEFT JOIN posts p ON u.id = p.user_id;

Output:

Search statement with RIGHT JOIN instruction:

Selecting data for "XXX" tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
SELECT u.id, u.tenant, u.name, p.* FROM user_info u RIGHT JOIN posts p ON u.id = p.user_id;

Output:

Selecting all data:

It is always possible to select all data by superuser or table owner (unless row level policy is forced for checking a table owner) without setting the current tenant.

     SQL 
   
xxxxxxxxxx
 
SELECT * FROM posts;

Output:

Deleting Data

The Row Level Security is applied for the DELETE statement. Below there is an example deleting all data for one tenant (without WHERE phrase) and then displaying data for the other tenant.

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('XXX');
DELETE FROM posts;

Output:

     PLSQL 
   
xxxxxxxxxx
 
DELETE 5

Selecting data for the different tenant:

     SQL 
   
xxxxxxxxxx
 
SELECT set_current_tenant_id('YYY');
SELECT * FROM posts;

Output:

Summary

Based on the presented examples, we can admit that it is possible to implement the shared schema strategy in the Postgres database by using the row-level security policy.

Database engine sql Schema Data (computing) PostgreSQL application

Opinions expressed by DZone contributors are their own.

Related

Trending

Shared Schema Strategy With Postgres (Part 1)

Apply the shared schema strategy in Postgres by using the row-level security policy.

Implementation

The Implementation That Uses a Database Engine

Implementation Tips

Posmulten

Insert Data

Modify Data

Searching Data

Deleting Data

Summary

Related

Partner Resources