Who Should Read This Blog Post?
This post is intended for people who want to run their analytics and machine learning workloads on Databricks and have chosen Azure as their cloud provider.
Skills Required To Follow:
- Machine Learning
Why do we need both Databricks and Azure services? Because we can train a model on Databricks and then deploy it using Azure services.
Before we can run a Spark job we need to create a compute cluster, and before we can create a compute cluster we need to create a Databricks workspace. Not all work done in Databricks has to happen within Azure; we need Azure only if we import data from Azure services or want to deploy our model as a service there.
When you create a Databricks account in Azure, it looks like this:
And if you are using a standalone Databricks account, it looks like this:
The login page in Azure looks like this:
And in standalone Databricks it looks like this:
First, create a cluster for Spark: on the left side, click Clusters, then Create Cluster, and fill in the required fields. A snapshot is shown here:
Now that we have our cluster, we can start working with it right away. We will use a Databricks notebook, which is similar to a Jupyter notebook or Datalab in GCP. It resides in the workspace: go there, click the drop-down menu, and choose the Create Notebook option.
Name it test, select SQL as the language, and click Create.
As the name suggests, Azure Databricks integrates with the other data services provided by Azure, so if you have data residing in those services you can use it directly with Databricks.
As this is a demo tutorial, we will use a demo file that ships with Databricks. To find it, type %fs ls. Here %fs tells the notebook that we are running a command against the Databricks file system (DBFS), and ls lists the contents of the file system.
The commands you need to type to create a table for running SQL queries are given below:

```
%fs ls
%fs ls databricks-datasets
%fs head --maxBytes=1000 dbfs:/databricks-datasets/Rdatasets/data-001/csv/Ecdat/Computers.csv
```
```sql
DROP TABLE IF EXISTS computers;

CREATE TABLE computers
USING csv
OPTIONS (
  path "/databricks-datasets/Rdatasets/data-001/csv/Ecdat/Computers.csv",
  header "true",
  inferSchema "true"
);
```
Now run some SQL queries against the table to sharpen your SQL knowledge.
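As a quick sketch, here are a couple of queries you could try. The column names (price, speed) are assumptions based on the Ecdat Computers dataset, not something stated in this post, so check them with DESCRIBE computers first:

```sql
-- Preview a few rows to see the schema that inferSchema produced
SELECT * FROM computers LIMIT 5;

-- Assumed columns: price, speed (verify with DESCRIBE computers)
-- Average price grouped by processor speed
SELECT speed, ROUND(AVG(price), 2) AS avg_price, COUNT(*) AS n
FROM computers
GROUP BY speed
ORDER BY speed;
```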
Now, to build a machine learning model, visit the link.
You can also schedule jobs to run at regular intervals from the Jobs portal, for example to watch for changes in your database.
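As a minimal sketch, such a scheduled job could simply run a notebook containing a query like the one below; the monitoring idea and the reuse of the computers table are illustrative assumptions, not part of the original post:

```sql
-- Illustrative: record the current row count so successive job runs can be compared
SELECT COUNT(*) AS row_count, current_timestamp() AS checked_at
FROM computers;
```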