Azure ML Notebooks

The new Azure ML environment contains Azure Notebooks, where you can write Python code. In this post, I will go through a sample experiment and show how we can use this environment for regression analysis.

First, you need to set up the environment inside the Azure portal as below: click on new resource, then AI and Machine Learning. Next, in the new workspace form, enter the name of the workspace and the resource group, and make sure to choose the workspace edition as Enterprise.

 

After creating the workspace, click on Overview, and then in the overview click on Experiment to navigate to the new Studio.

 

In the new Studio we are going to explore the Notebooks. Start from an existing sample in the Python folder: click on the regression experiment for taxi drivers in New York, and then in the experiment click on Clone and assign it to a virtual machine (compute).

If you have already created a compute, just start it; otherwise you need to create a new compute.

Now you can open the Python file in Jupyter. Just make sure the compute is running, then click on Edit and then Edit in Jupyter.

 

Next, you are able to run the code.

The code has some simple steps, from preparing the data to applying the model.

First, we need to import some libraries for accessing the Azure ML functions and the data cleaning functions.

The data is about the taxi driver …. The dataset is huge, so the code below extracts the data step by step: it extracts one month at a time, takes a sample of 2,000 rows, appends it to the rest of the data, and then checks the first 10 rows of the data with the head command.

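import pandas as pd
from datetime import datetime
from dateutil.relativedelta import relativedelta
from azureml.opendatasets import NycTlcGreen

green_taxi_df = pd.DataFrame([])
start = datetime.strptime("1/1/2015", "%m/%d/%Y")
end = datetime.strptime("1/31/2015", "%m/%d/%Y")
for sample_month in range(12):
    temp_df_green = NycTlcGreen(start + relativedelta(months=sample_month), end + relativedelta(months=sample_month)) \
        .to_pandas_dataframe()
    # pd.concat stands in for DataFrame.append, which pandas 2.0 removed
    green_taxi_df = pd.concat([green_taxi_df, temp_df_green.sample(2000)])
green_taxi_df.head(10)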
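
To see which date ranges the loop asks for, here is a small sketch that only builds the monthly windows and does not download anything. The helper name month_windows is my own, not part of the sample notebook. It shows that relativedelta clips the end date to the last day of shorter months.

```python
from datetime import datetime
from dateutil.relativedelta import relativedelta


def month_windows(start, end, months=12):
    """Return (window_start, window_end) pairs, one per month offset."""
    return [
        (start + relativedelta(months=m), end + relativedelta(months=m))
        for m in range(months)
    ]


start = datetime.strptime("1/1/2015", "%m/%d/%Y")
end = datetime.strptime("1/31/2015", "%m/%d/%Y")
windows = month_windows(start, end)

# Print the first three windows; the February window ends on the 28th
for window_start, window_end in windows[:3]:
    print(window_start.date(), "to", window_end.date())
```

Each offset is added to the original January dates, so the end date goes back to the 31st for months that have 31 days.
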
Next comes some data cleaning: creating the month number, the day of the month and so forth, and removing some of the unwanted columns.
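
As a rough sketch of what this cleaning step can look like, the code below derives the calendar columns from a pickup timestamp and drops a couple of columns. The helper name add_time_features, the tiny synthetic data frame, and the choice of columns to drop are my own assumptions for illustration; the notebook itself may do this differently.

```python
import pandas as pd


def add_time_features(df, pickup_col="lpepPickupDatetime"):
    """Return a copy of df with calendar features derived from the pickup time."""
    out = df.copy()
    pickup = pd.to_datetime(out[pickup_col])
    out["month_num"] = pickup.dt.month
    out["day_of_month"] = pickup.dt.day
    out["day_of_week"] = pickup.dt.dayofweek  # Monday = 0
    out["hour_of_day"] = pickup.dt.hour
    return out


# Tiny synthetic frame standing in for the sampled taxi data (assumed columns)
sample = pd.DataFrame({
    "lpepPickupDatetime": ["2015-01-05 08:15:00", "2015-07-19 23:40:00"],
    "storeAndFwdFlag": ["N", "N"],
    "totalAmount": [12.5, 30.0],
})

features = add_time_features(sample)
# Drop columns we do not want in the model; errors="ignore" skips missing ones
features = features.drop(
    columns=["storeAndFwdFlag", "lpepPickupDatetime"], errors="ignore"
)
print(features)
```
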

Just run all the cells up to the one that configures the workspace.

We need to create a workspace object using the command from_config().

Then you need to create the train and test datasets using the function train_test_split, with 20% of the data for testing and the rest for training.
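
A minimal sketch of the split, using scikit-learn's train_test_split on a small made-up data frame. The column names, the values and the random_state are assumptions for illustration only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned taxi data (10 rows)
data = pd.DataFrame({
    "tripDistance": [0.5 * i for i in range(1, 11)],
    "passengerCount": [1, 2, 1, 3, 1, 2, 4, 1, 2, 1],
    "totalAmount": [3.0 + 2.5 * i for i in range(1, 11)],
})

# test_size=0.2 keeps 20% of the rows for testing; random_state makes it repeatable
train_df, test_df = train_test_split(data, test_size=0.2, random_state=223)
print(len(train_df), len(test_df))
```
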

Now you can train the model, but before that you need to store some of the settings in a variable named automl_settings.
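
For illustration, a settings dictionary could look like the sketch below. The keys are AutoMLConfig parameters from the Azure ML SDK v1, but the values are my own assumptions and not necessarily the ones used in the notebook.

```python
import logging

# Illustrative values only; tune them for your own experiment
automl_settings = {
    "iteration_timeout_minutes": 2,    # cap on each individual model run
    "experiment_timeout_hours": 0.3,   # cap on the whole experiment
    "enable_early_stopping": True,     # stop if the score stops improving
    "primary_metric": "spearman_correlation",
    "featurization": "auto",
    "verbosity": logging.INFO,
    "n_cross_validations": 5,
}
```

Keeping the settings in one dictionary means they can be passed to the configuration with ** unpacking.
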

Then, in the next cell, run AutoMLConfig with the task set to regression, passing in the train and test datasets.

In the next step, it runs the model using the Experiment function on the local machine.
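
Put together, the workspace, configuration and run steps might look roughly like the sketch below. It assumes the Azure ML SDK v1 (azureml-sdk with the automl extra) and a config.json for the workspace. The function name run_local_automl and the experiment name are hypothetical. This sketch passes only the training data and relies on the cross-validation setting, and the label column is whatever you pass in.

```python
def run_local_automl(train_df, label_column, settings,
                     experiment_name="taxi-experiment"):
    """Submit a local AutoML regression run and return the run object.

    The Azure ML imports are deferred so that this function can be defined
    even where the SDK is not installed; calling it requires the SDK.
    """
    from azureml.core import Workspace
    from azureml.core.experiment import Experiment
    from azureml.train.automl import AutoMLConfig

    # Reads the workspace details from config.json
    ws = Workspace.from_config()

    # No compute target is given, so the run executes on the local machine
    automl_config = AutoMLConfig(
        task="regression",
        training_data=train_df,
        label_column_name=label_column,
        **settings,
    )

    experiment = Experiment(ws, experiment_name)
    return experiment.submit(automl_config, show_output=True)
```

With the SDK available, a call might look like run_local_automl(train_df, "totalAmount", automl_settings), where the label column name is an assumption about the taxi data.
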

 

You can explore the rest of the code, and then stop the compute if you do not want to run it again.

 

Leila Etaati
Trainer, Consultant, Mentor
Leila is the first Microsoft AI MVP in New Zealand and Australia. She has a Ph.D. in Information Systems from the University of Auckland. She is the co-director and data scientist at RADACAD, which has more than 100 clients around the world. She is the co-organizer of the Microsoft Business Intelligence and Power BI User Group (meetup) in Auckland, with more than 1200 members. She is the co-organizer of three main conferences in Auckland: SQL Saturday Auckland (2015 till now) with more than 400 registrations, Difinity (2017 till now) with more than 200 registrations, and Global AI Bootcamp 2018. She is a data scientist, BI consultant, trainer, and speaker. She is a well-known international speaker at many conferences such as Microsoft Ignite, SQL PASS, Data Platform Summit, SQL Saturday, Power BI World Tour and so forth in Europe, USA, Asia, Australia, and New Zealand. She has over ten years' experience working with databases and software systems. She has been involved in many large-scale projects for big-sized companies. She is also a Microsoft AI and Data Platform MVP. Leila is an active technical Microsoft AI blogger for RADACAD.
