Anexinet BI Tweets

Since this is my first post on this blog, I figured it would be best to write about something that has not been mentioned here in the past.  After scouring the blog for a few hours, I decided a simple post on using Azure ML with a little bit of R would be useful.

In this post, I am going to take a small dataset of the team's tweets and create a word cloud based on how often each user tweets.

The dataset is a simple CSV file that looks like this:

I am going to assume that you have created a free account on Azure ML.  Once you navigate to the Azure ML Studio site, log in and click on "my experiments".  You will then be presented with a screen similar to this:

At the bottom left-hand side of the screen there is a link to create a new project.  Click on the new link and the following window will slide up from the bottom of the screen:

The modal window will default to EXPERIMENT on the left-hand side, but we first need to set up the custom CSV dataset referenced above, so we will select DATASET first.

Next, we will select FROM LOCAL FILE to the right of the DATASET and be presented with the following:

The process is as follows:

 1. Select a file.

 2. Name the dataset.

 3. Select the type of dataset.
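Before uploading, it can help to confirm that the file parses cleanly as a CSV with a header row.  The snippet below is only a sketch: it writes a tiny stand-in file to a temporary path so that it runs on its own, and the column names `user` and `text` are my assumption, not the real layout of the team's file.  Point `path` at your own file instead.

```r
# Hypothetical stand-in for the real tweet file; replace `path` with your own CSV.
# ASSUMPTION: the columns are named "user" and "text".
path <- tempfile(fileext = ".csv")
writeLines(
  c("user,text",
    "alice,Hello world",
    "bob,Azure ML is fun",
    "alice,\"R, too\""),   # a quoted field containing a comma
  path
)

# Read the file the way a CSV with a header row would be read.
tweets <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)

# Show the column names, types, and the first few values.
str(tweets)
```

If `str()` shows the columns you expect, a CSV type with a header should be the right choice in step 3.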

Next we need to create an experiment.  Select the new link from the bottom left-hand side and choose EXPERIMENT, then Blank Experiment.

Once you select the Blank Experiment you will be presented with a blank canvas.  The first thing we will need is the data source.  If you navigate to the left-hand side under Saved Datasets you will find the dataset we created in the previous step.  To save time, you can also start typing its name in the search box to find it more quickly.

Next, drag the dataset onto the canvas.  Once the dataset is on the canvas, you can single-click on the component and see a list of properties on the right-hand side.

Also, if you want to see a sample visualization of your dataset, you can click on the circle on the bottom of the component (called a port) and select visualize.

The next component we will need is the Execute R Script, which is found under the R Language Modules section.  Drag this onto the canvas below our dataset.

We now need to connect the components.  Select the circle at the bottom of the dataset and connect it to the left circle of the Execute R Script like below.

All of the ports (circles) on each component have names, which can be seen by hovering over any of them.

In our example, the port we connected from is the dataset port, and the left input port of the Execute R Script is the dataset1 port.  The Execute R Script can take multiple datasets, so the port to the right of our connected port can also be used for input.

Now all we have left to do is add some R code to generate the word cloud.  Single-click on the Execute R Script and add the following code in the R Script window.
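A script along these lines can be sketched as follows.  Treat it as a minimal sketch under a few assumptions rather than the exact code: I am assuming the user name lives in a column called `user`, that the `wordcloud` package is available in your workspace, and the helper `tweet_counts` is a name I made up for illustration.  `maml.mapInputPort` and `maml.mapOutputPort` are the functions the Execute R Script module provides for reading from and writing to its ports.

```r
# Count how many tweets each user has posted.
# ASSUMPTION: the column holding the user name is called "user".
tweet_counts <- function(tweets, user_col = "user") {
  counts <- table(tweets[[user_col]])
  data.frame(
    word = names(counts),        # the user name becomes the "word" in the cloud
    freq = as.integer(counts),   # number of tweets by that user
    stringsAsFactors = FALSE
  )
}

# The Azure ML specific part only runs inside an Execute R Script module,
# where maml.mapInputPort() is defined.
if (exists("maml.mapInputPort")) {
  library(wordcloud)

  # Port 1 is the left input port (Dataset1) that we connected above.
  dataset1 <- maml.mapInputPort(1)
  counts <- tweet_counts(dataset1)

  # Draw the cloud: the more a user tweets, the larger the name.
  wordcloud(words = counts$word, freq = counts$freq,
            min.freq = 1, random.order = FALSE)

  # Pass the counts on to the module's result dataset port.
  maml.mapOutputPort("counts")
}
```

Keeping the counting in its own function means you can try it on any data frame in a local R session before pasting the script into Azure ML.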

Code Overview


To run the experiment, click Run on the bottom toolbar and wait until you see green check marks indicating a successful run.

You should get an output similar to the following: