Requirement:
Migrate the table schema from version 2.1 to 3.2 and copy the data from Azure Data Lake Storage Gen1 to Gen2.
Components Involved:
Azure Databricks, Azure Data Lake Storage (Gen1 and Gen2)
Language: Scala
Code Snippet:
%scala
// Notebook widgets supply the source folder, the source (temp view) name, and the destination table name
val path = dbutils.widgets.get("path")
val tname = dbutils.widgets.get("tname")
val des_tname = dbutils.widgets.get("des_tname")
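Note: the cells below assume the cluster can already authenticate against both storage accounts. If it cannot, the credentials can be set per session. The following is only a minimal sketch using a service principal; the client id, secret, and tenant id values are placeholders and are not part of the original notebook.
%scala
// Hypothetical service-principal credentials -- replace with your own (e.g. via dbutils.secrets.get)
val clientId = "<application-id>"
val clientSecret = "<client-secret>"
val tenantId = "<tenant-id>"

// ADLS Gen1 (adl://) OAuth configuration for the source account
spark.conf.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential")
spark.conf.set("fs.adl.oauth2.client.id", clientId)
spark.conf.set("fs.adl.oauth2.credential", clientSecret)
spark.conf.set("fs.adl.oauth2.refresh.url", s"https://login.microsoftonline.com/$tenantId/oauth2/token")

// ADLS Gen2 (abfss://) OAuth configuration for the destination account
spark.conf.set("fs.azure.account.auth.type.adls.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.adls.dfs.core.windows.net",
  "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.adls.dfs.core.windows.net", clientId)
spark.conf.set("fs.azure.account.oauth2.client.secret.adls.dfs.core.windows.net", clientSecret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint.adls.dfs.core.windows.net",
  s"https://login.microsoftonline.com/$tenantId/oauth2/token")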
%scala
// Read the source parquet data from Data Lake Gen1 and expose it as a temp view
val data = spark.read.parquet("adl://adls.azuredatalakestore.net/data/" + path)
data.createOrReplaceTempView(tname)
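Optionally, a quick sanity check on the load before running the schema migration (not in the original snippet):
%scala
// Optional: confirm the source data loaded as expected
data.printSchema()
println(s"Source row count: ${data.count()}")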
%scala
// Build a CREATE TABLE ... AS SELECT statement that writes the destination table to Data Lake Gen2
val SQL_builder = new StringBuilder
val df = spark.sql("show columns in " + tname)
val ins_query = "CREATE TABLE IF NOT EXISTS " + des_tname +
  " LOCATION 'abfss://data@adls.dfs.core.windows.net/" + path + "' AS SELECT "
SQL_builder.append(ins_query)
// Copy every source column except the ones dropped in the 3.2 schema
// ("remove_column_list" is a placeholder for those column names)
df.collect().foreach { row =>
  val col = row.getString(0)
  if (col != "remove_column_list") SQL_builder.append(col + ",")
}
// Append the derived column introduced in the 3.2 schema as the last select item
SQL_builder.append("CASE WHEN modifycolumnname = '1' THEN 'Yes' ELSE 'No' END `hdisdeletedrecord` FROM ")
SQL_builder.append(tname)
spark.sql(SQL_builder.toString())
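Once the CREATE TABLE AS SELECT finishes, it is worth validating that the Gen2 copy matches the Gen1 source. A minimal sketch is below; the row-count comparison is an assumption about what "valid" means for this migration and is not part of the original notebook.
%scala
// Compare row counts between the Gen1-backed temp view and the new Gen2-backed table
val srcCount = spark.table(tname).count()
val destCount = spark.table(des_tname).count()
println(s"source=$srcCount destination=$destCount match=${srcCount == destCount}")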