Wednesday, August 12, 2020

Call Data bricks Job Using REST API

Below power shell will help to call Data bricks Job with parameter


 [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

$args.length

$job_name= $args[0]

$dbname= $args[1]

$adlpath= $args[2]

$prefix= $args[3]

$loadtype= $args[4]

$rerunid= $args[5]

$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"

$headers.Add("Authorization", "Bearer <<<dapisfsadasd2132432434>>>")

$headers.Add("Content-Type", "text/plain")

$job_nm = "https://<<<Workspace Name>>>.azuredatabricks.net/api/2.0/jobs/list"

$job_nm_response = Invoke-RestMethod $job_nm  -Method 'GET' -Headers $headers

foreach ($r in $job_nm_response.jobs)

{

        if($r.settings.name -eq $job_name)

        {

        $job_id = $r.job_id

        break

        }

}

$body = "{

`n  `"job_id`": $job_id,

`n  `"notebook_params`": {

`n  `"databasename`":`"$dbname`",

`n  `"adlpath`": `"$adlpath`",

`n  `"loadtype`": `"$loadtype`",

`n  `"failed_run_id`": `"$rerunid`",

`n  `"prefix`": `"$prefix`"

`n}

`n}"

$response = Invoke-RestMethod 'https://<<<Workspace Name>>>.azuredatabricks.net/api/2.0/jobs/run-now' -Method 'POST' -Headers $headers -Body $body

#$response | ConvertTo-Json

$batch_id =  $response.run_id

$run_sta = "https://<<<Workspace Name>>>.azuredatabricks.net/api/2.0/jobs/runs/get?run_id=" + $batch_id 

$response_status = Invoke-RestMethod $run_sta  -Method 'GET' -Headers $headers

$response_status1= $response_status.state.life_cycle_state

while($response_status1 -ne "success" -Or $response_status1 -ne "timedout" -Or $response_status1 -ne "failed" -Or $response_status1 -ne "canceled")

{

$response_status = Invoke-RestMethod $run_sta -Method 'GET' -Headers $headers

$response_status1 = $response_status.state.result_state

$life_cycle_state = $response_status.state.life_cycle_state

if ($response_status1 -eq $null)

{

#$life_cycle_state + ": " + $state_message

if($life_cycle_state -eq "PENDING" -Or $life_cycle_state -eq "RUNNING") { $life_cycle_state }

Else 

echo "FAILURE" 

exit 1

}

}

Else {

$response_status1

If ($response_status1 -eq "SUCCESS") 

echo "DONE" 

exit 0  

}

Else 

echo "FAILURE"

exit 1

}        

}

}


Sunday, June 21, 2020

Databricks Instance Pools API

What is the Databricks REST API? 
REST API allows you to run commands directly on Databricks. In this article I’m going to explain about POOL api and how I utilized in my project. An instance pool reduces cluster start and auto-scaling times by maintaining a set of idle, ready-to-use cloud instances. When a cluster attached to a pool needs an instance, it first attempts to allocate one of the pool’s idle instances. If the pool has no idle instances, it expands by allocating a new instance from the instance provider in order to accommodate the cluster’s request. When a cluster releases an instance, it returns to the pool and is free for another cluster to use.
 Problem Statement:
During data bricks implementation we’re not able to predict how much node allocated and how much we need for future. Since data bricks cluster allow auto scale up, many time our jobs reached the maximum subnet allocated size. This lead multiple job failure and data issues.
Solution:
In order to prevent allocation issues. I have decided to built a framework which monitor the usage in near real time and capture those data in a table. From that I have created a trend dashboard which indicate how much maximum node we’re utilizing and based on the trend I can tweak my job schedule to avoid collision J
Code Snippet
To implement this I have used POWER SHELL to call the rest api. 
cls
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Add-Type -AssemblyName System.Web.Extensions
$JS = New-Object System.Web.Script.Serialization.JavaScriptSerializer
$servername="ENTER SQL SERVER NAME"
$databasename="ENTER Database Name"
$username="ENTER USER NAME" ## <--- ENTER USER NAME
$passwd="ENTER PASSWORD"     ## <--- ENTER PASSWORD
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Authorization", "Bearer <<Databricks TOKEN will be punched here>>")
$response = Invoke-RestMethod 'https://<<WORKSAPCE URL>>/api/2.0/instance-pools/list' -Method
'GET' -Headers

$headers -Body $body
$json = $response | ConvertTo-Json 
$data  = $JS.DeserializeObject($json)
foreach
(
$i in
$data.instance_pools)
{   
foreach ($stat in $i.stats)   
{   
$a,$b,$c,$d = $stat.split(';')   
}  
$Table_query = "INSERT INTO [dbo].[databricks_pool_usage] VALUES   ('" + $i.instance_pool_name +
"'," +
$i.min_idle_instances
+ ",'"
+ $i.node_type_id +
"','"                  +
$stat +
"', CURRENT_TIMESTAMP," + $a.Replace("@{used_count=","")
+ ","
+ $b.Replace("idle_count=","")
+ ","                  +
$c.Replace("pending_used_count=","")
+ ","
+ $d.Replace("pending_idle_count=","").Replace("}","")
+ ")"       
Invoke-Sqlcmd -ServerInstance $servername  -Database $databasename -Username $username 
-Password $passwd
-Query $Table_query}


Sample Graph:




Call Data bricks Job Using REST API

Below power shell will help to call Data bricks Job with parameter  [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]...