At the bottom left-hand side of the screen there is a link to create a new project. Click on the new link and the following window will slide up from the bottom of the screen:
3. Select the type of dataset
Next we need to create an experiment. Select the new link from the bottom left-hand side and choose EXPERIMENT, then Blank Experiment.
Once you select Blank Experiment, you will be presented with a blank canvas. The first thing we need is the data source. On the left-hand side, under Saved Datasets, you will find the dataset we created in the previous step. To find it more quickly, you can start typing its name in the search box.
Next, drag the dataset onto the canvas. Once the dataset is on the canvas, you can single-click on the component and see a list of properties on the right-hand side.
Also, if you want to see a sample visualization of your dataset, you can click on the circle at the bottom of the component (called a port) and select Visualize.
The next component we need is the Execute R Script, which is found under the R Language Modules section. Drag this onto the canvas below our dataset.
We now need to connect the components. Select the circle at the bottom of the dataset and connect it to the left circle of the Execute R Script like below.
All of the ports (circles) on each component have names, which can be seen by hovering over any of them.
In our example, the port at the top end of the connection (on the dataset) is the dataset port, and the port at the bottom end (the left input of the Execute R Script) is the Dataset1 port. The Execute R Script can accept more than one dataset, so the port to the right of our connected port can also be used for input.
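To illustrate the point about multiple inputs, here is a minimal sketch of how a script could read from both input ports. It assumes a second dataset (hypothetical, not part of this walkthrough) is connected to the Dataset2 port and has the same columns as the first. The maml.mapInputPort and maml.mapOutputPort functions are supplied by the Execute R Script environment, so this will not run in a plain local R session.

```r
# Read the dataset connected to the left input port (Dataset1)
tweets <- maml.mapInputPort(1)

# Read a second dataset from the port to its right (Dataset2).
# Assumption: something is actually connected there; this walkthrough
# only connects the first port.
more_tweets <- maml.mapInputPort(2)

# Stack the two data frames; assumes both have identical columns
all_tweets <- rbind(tweets, more_tweets)

# Send the combined data frame to the result output port
maml.mapOutputPort("all_tweets")
```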
Now all we have left to do is add some R code to generate the word cloud. Single-click on the Execute R Script and add the following code in the R Script window.
Code Overview
- Lines 1 & 2 are just requiring the necessary libraries to generate the word cloud.
- Line 4 assigns the first input port to the variable tweets. The first port holds our dataset because we connected the dataset to the left-most input port on the Execute R Script.
- Lines 6-11 call the wordcloud function from the wordcloud library. tweets$handle and tweets$tweets refer to the two columns in our dataset: the first is the user handle and the second is the number of tweets.
- More info on this library can be found in the wordcloud package documentation.
- Line 13 defines an output port that could be used by subsequent components. The data frame is passed along on that port, while the image of the word cloud itself appears on the module's R Device output port.
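For reference, here is a minimal sketch of a script that matches the line-by-line overview above. It is a reconstruction under assumptions, not necessarily the exact original: the second library (RColorBrewer) and the wordcloud argument values (min.freq, random.order, rot.per, colors) are guesses, while the column names handle and tweets come from the overview. The maml functions exist only inside the Execute R Script environment.

```r
library(wordcloud)     # provides the wordcloud() function
library(RColorBrewer)  # assumption: used here only for the color palette

tweets <- maml.mapInputPort(1)  # data frame from the first (left-most) input port

wordcloud(words = tweets$handle,            # user handles are drawn as the "words"
          freq = tweets$tweets,             # tweet counts control the word size
          min.freq = 1,                     # assumed value: keep every handle
          random.order = FALSE,             # assumed value: largest words in the center
          rot.per = 0.35,                   # assumed value: share of rotated words
          colors = brewer.pal(8, "Dark2"))  # assumed palette

maml.mapOutputPort("tweets")  # pass the data frame on to the result output port
```

The comments are kept at the end of each line so that the line numbers still match the overview.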
Output
To run the experiment, click Run on the bottom tool bar and wait until you see green check marks indicating a successful run.
You should get an output similar to the following:
Enjoy.