Problem Statement:
Team decided to maintain a
common metastore database for all the source deployed in Azure Data Lake. Since
the database is common across all the compute, it will allow developer to
create\drop\modify tables in a database which team doesn’t own it. We need to
ensure proper governance is maintained to prevent developer not to make any
change to database which team doesn’t own it.
Understand how Metastore Table are used internally:
Metastore is the central repository.
It stores metadata for tables (like their schema and location) and
partitions in a relational database. It provides client access to this
information by using metastore service API.
dbo.DBS and dbo.TBLS are the two tables used by metastore whenever we create any table from any compute. DB_ID is the foreign key which used to validate between DBS and TBLS table.
What is ROW Level Security?
Row-Level Security enables
you to use group membership or execution context to control access
to rows in a database table. Row-Level Security (RLS)
simplifies the design and coding of security in your application. RLS
helps you implement restrictions on data row access.
RLS
supports two types of security predicates:
·
Filter predicates silently filter the rows
available to read operations (SELECT, UPDATE, and DELETE).
We’re not going
to implement this predicate because we want everyone to read other source
system tables.
·
Block predicates explicitly block write
operations (AFTER INSERT, AFTER UPDATE, BEFORE UPDATE, BEFORE DELETE) that
violate the predicate.
We’re going to implement this block
predicate and it will prevent developer to create table which they don’t own
it.
By Default, Databricks cluster
use ROOT as a user and all the session request to metastore will get
established with ROOT only.
Step
1: Create Schema
to make sure it is separate from other purpose
create schema security;
go
Step
2: Create a
function which accept UserName and DataBase ID as a parameter. And validate
those value in DBS table. If it is valid it will return 1 else NULL
create function
[security].[fn_DBsecuritypredicate](@User as sysname,@db_id
bigint)
RETURNS TABLE
with schemabinding
as
return select 1 as
fn_DBsecuritypredicate_result from [dbo].[DBS]
where
@user = [OWNER_NAME] and @db_id = [DB_ID];
GO
Step
3: Create a
security policy to block the insert when the above condition is not satisfied
CREATE SECURITY POLICY DB_CHECK
ADD BLOCK PREDICATE security.fn_DBsecuritypredicate(owner,DB_ID) ON [dbo].[TBLS] AFTER INSERT
with
(state=on)
Conclusion:
Post security policy implementation. I'm able to prevent use to create table in database which they don't own.