Wednesday, March 25, 2020

Copy Table Schema From One Metastore To Another


Requirement:

Migrate table schemas from a metastore on version 2.1 to one on version 3.2, and copy the underlying data from Azure Data Lake Storage Gen1 to Gen2.

Components Involved:

Azure Databricks, Azure Data Lake Storage (Gen1 and Gen2)

Language: Scala

Code Snippet:

%scala
// Read the notebook parameters: the source folder, the temp view name for the
// source table, and the destination table name in the new metastore
val path = dbutils.widgets.get("path")
val tname = dbutils.widgets.get("tname")
val des_tname = dbutils.widgets.get("des_tname")
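
If the widgets have not been created in the notebook yet, a minimal sketch of defining them is below; the labels and empty defaults are placeholders, not values from the original post:

%scala
// Define the input widgets once so the notebook can be run interactively
// or invoked as a job with parameters (labels/defaults are placeholders)
dbutils.widgets.text("path", "", "Relative folder under /data")
dbutils.widgets.text("tname", "", "Temp view name for the source table")
dbutils.widgets.text("des_tname", "", "Destination table name")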

%scala
// Read the source parquet data from ADLS Gen1 and register it as a temp view
val data = spark.read.parquet("adl://adls.azuredatalakestore.net/data/" + path)
data.createOrReplaceTempView(tname)
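
For the adl:// read above and the abfss:// write below to work, the cluster needs credentials for both storage accounts. A minimal sketch using a service principal follows; the <application-id>, <client-secret>, and <tenant-id> placeholders are assumptions, not values from the original setup:

%scala
// ADLS Gen1 (adl://) OAuth credentials via a service principal -- placeholder values
spark.conf.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential")
spark.conf.set("fs.adl.oauth2.client.id", "<application-id>")
spark.conf.set("fs.adl.oauth2.credential", "<client-secret>")
spark.conf.set("fs.adl.oauth2.refresh.url", "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

// ADLS Gen2 (abfss://) OAuth credentials for the destination account
spark.conf.set("fs.azure.account.auth.type.adls.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.adls.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.adls.dfs.core.windows.net", "<application-id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.adls.dfs.core.windows.net", "<client-secret>")
spark.conf.set("fs.azure.account.oauth2.client.endpoint.adls.dfs.core.windows.net", "https://login.microsoftonline.com/<tenant-id>/oauth2/token")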

%scala
// Build a CREATE TABLE ... AS SELECT that recreates the table in the new
// metastore with its data stored in ADLS Gen2
val sqlBuilder = StringBuilder.newBuilder
val df = spark.sql("show columns in " + tname)

sqlBuilder.append("CREATE TABLE IF NOT EXISTS " + des_tname +
  " LOCATION 'abfss://data@adls.dfs.core.windows.net/" + path + "' AS SELECT ")

// Append every column name, skipping the ones to drop
// ("remove_column_list" and "modifycolumnname" below are placeholders)
df.collect().foreach(row => {
  if (row.toString() != "[remove_column_list]") {
    sqlBuilder.append(row.toString().replace("[", "").replace("]", "") + ",")
  }
})

// Add a derived soft-delete flag as the final column, then select from the temp view
sqlBuilder.append("CASE WHEN modifycolumnname = '1' THEN 'Yes' ELSE 'No' END `hdisdeletedrecord` FROM ")
sqlBuilder.append(tname)

spark.sql(sqlBuilder.toString())
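
As an optional sanity check after the CREATE TABLE ... AS SELECT completes, row counts between the temp view and the new table can be compared. This snippet is an addition for illustration, not part of the original post:

%scala
// Optional check: both counts should match after the copy
val srcCount = spark.table(tname).count()
val desCount = spark.table(des_tname).count()
println(s"Source rows: $srcCount, destination rows: $desCount")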

