IBM Watson Studio : Drag and Drop Machine Learning Model Development

Abhishek Ghosh

By Abhishek Ghosh November 28, 2018 8:58 am Updated on November 28, 2018

This article is a guide for readers who are not used to data science but are interested in developing an application with some AI or data analytics functions.
Earlier, we showed how to use IBM Watson for text analysis (on Google Docs). This article is aimed more at helping you develop something like a WordPress plugin that analyzes the emotion of a post using AI. IBM Watson Studio brings a variety of data science products and services into one environment for working with data and deploying machine learning models. IBM Watson Studio is the new name of the previously discussed IBM Data Science Experience (DSX). In our earlier article, we introduced readers to cloud-based machine learning services, including a working classification.

Table of Contents

1 Introduction
2 Get Started
3.a Instance of Watson Studio on IBM Cloud
3.b Install IBM Watson Studio local
4 Play with IBM SPSS Modeler
5 How to work from Jupyter Notebook
6 Conclusion

IBM Watson Studio Basics to Get Started

Watson Studio is a set of tools (and also a collaborative environment) for data scientists, developers and domain experts. It is also an approachable set of tools for new developers who want to integrate machine learning or AI capabilities into their applications. Watson Studio gives data scientists access to substantial computing power, and IBM’s cloud object storage (which uses OpenStack technology) is used throughout the platform.

IBM Watson Studio local is an IDE for data preparation and data modelling. It integrates with a private cloud backed by powerful CPU and GPU infrastructure and with IBM Cloud Object Storage. It can also be integrated with Apache Spark clusters for distributed processing.

IBM Cloud (formerly Bluemix) offers a SaaS version of Watson Studio with a Lite plan, which is free up to a monthly limit of 50 capacity unit-hours on an environment with 1 virtual CPU and 4 GB RAM. The Lite plan is a good place to start experimenting. IBM also provides a guided tour of Watson Studio.
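As a rough illustration of what a 50 capacity unit-hour monthly limit means in practice, the sketch below converts that budget into hours of runtime. The consumption rates are hypothetical values chosen for the arithmetic only, not IBM figures; the real rate of each environment should be taken from IBM's current pricing information.

```python
def runtime_hours(budget_cuh: float, rate_cuh_per_hour: float) -> float:
    """Return how many hours an environment can run on a capacity unit-hour budget."""
    if rate_cuh_per_hour <= 0:
        raise ValueError("rate must be positive")
    return budget_cuh / rate_cuh_per_hour


LITE_BUDGET_CUH = 50  # monthly limit quoted in this article

# Hypothetical consumption rates, for illustration only
for rate in (0.5, 1.0, 2.0):
    hours = runtime_hours(LITE_BUDGET_CUH, rate)
    print(f"{rate} CUH per hour -> {hours:.0f} hours per month")
```

The point is only that the same budget lasts longer on a smaller environment, so it pays to stop idle runtimes.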

The basic capabilities of IBM Watson Studio are the same as those of its predecessor, IBM Data Science Experience: a data science environment with Jupyter Notebook, Python, R and Scala to start working in. IBM Watson Studio provides the tools and services to store and catalogue data and models, transform and prepare data for analysis, and analyze data in a collaborative environment. With Watson Studio, we get:

the capability to work with deep learning (including TensorFlow)
access to pre-trained models, such as Watson Visual Recognition
a chance to work with unstructured data
insight into model management
a drag-and-drop interface to build analytics models using SPSS Modeler
easy visualization of insights with dynamic dashboards

We have already discussed the basic theoretical aspects of applying machine learning to text recognition and the approaches of deep learning. For a quick recap, we can divide learning into:

Supervised Learning
Unsupervised Learning
Reinforcement Learning

The official examples and help tutorials for IBM SPSS Modeler actually create a supervised learning model. With IBM SPSS Modeler, we can build machine learning models by simple drag and drop.
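For readers new to the term, supervised learning means learning from examples that already carry the correct answer. The following is a minimal sketch of the idea using a one-nearest-neighbour rule in plain Python. The data and the function name are made up for illustration, and this is not the algorithm SPSS Modeler uses.

```python
import math


def nearest_neighbour_predict(train_rows, train_labels, query):
    """Predict the label of `query` as the label of its closest training row."""
    best_label, best_dist = None, math.inf
    for row, label in zip(train_rows, train_labels):
        dist = math.dist(row, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Made-up labelled examples: (age, purchases per year) -> preferred product line
train_rows = [(22, 2), (25, 3), (47, 12), (52, 15)]
train_labels = ["camping", "camping", "golf", "golf"]

# Unlabelled records are classified from the labelled ones
print(nearest_neighbour_predict(train_rows, train_labels, (24, 4)))
print(nearest_neighbour_predict(train_rows, train_labels, (50, 11)))
```

The labels supplied with the training rows are what make this "supervised"; unsupervised methods would have to find structure without them.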

IBM Watson Studio Drag and Drop Machine Learning Model Development

How to Create an Instance of Watson Studio on IBM Cloud (suggested way for the new users)

As with any other cloud service (such as the former Bluemix), the user first needs to create an instance of Watson Studio. The instance has an associated billing plan and geographic location. The Lite plan should be enough to start testing. Note that Watson Studio does not include SPSS functionality in Peru, Ecuador, Colombia and Venezuela (that is what is officially written; we have not tested it). After creating the service instance, the next step is to create a project, which acts as a container for the datasets, models, deployments, and API credentials. Watson Studio offers several project types, each pre-configured for a specific task that data scientists usually perform. The generic project type for working on any type of asset is Standard. We can also add project collaborators and control their access level. A detailed example guide can be found on IBM’s official site.

How to install IBM Watson Studio local (optional way for the new users)

IBM Watson Studio local is an enterprise-grade software solution for data scientists, with data science tools such as RStudio, Spark, Jupyter Notebooks, and Zeppelin notebooks. Integration basically means configuring a private cloud. It can be installed on a server running RHEL, on IBM private cloud, on an HDP cluster or on a Cloudera cluster; it can be installed via a web browser and integrated with other clusters. The many installation options are there to deliver maximum flexibility. IBM Watson Studio local has a 60-day trial and a $99/month/user plan. IBM maintains a dedicated official website with documentation on IBM Watson Studio local.

How to Use IBM SPSS Modeler (Drag and Drop)

With IBM SPSS Modeler, we can build machine learning models by simple drag and drop. The visual interface lets us load data, sample it, transform it, apply algorithms, and evaluate predictive model performance through a series of nodes in order to find patterns or important variables. We need some sample CSV data to start working (readers need to use their own data set).

Within Watson Studio, we need to select New Modeler Flow, give it a name, keep everything at the default settings, and then click the Create button. Then, from the Import menu, we can drag the Data Asset node onto the stream canvas. Next, we need to select the CSV data file to load in the node settings. If we right-click the node and select Preview, we can see the dataset in detail.

Next, to build a modeler stream, we need to navigate to Record Operations, pick Sample and drag it onto the canvas. Then click the visual cue on a node and drag a line from it to connect the nodes. We can right-click Sample and open its settings; keeping the defaults should work. To experiment with algorithms, we need to navigate to the Modeling menu and pick machine learning models from those provided. We need to drag the chosen nodes onto the canvas and connect them to the Data Types node. To run the streams, we click the small blue triangle in the top menu of the stream canvas. When the run ends, orange nodes appear containing the model performance results. We can right-click each of them to check the results. This is the basic process for creating simple supervised machine learning models with IBM SPSS Modeler.
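It can help to read the flow above as a chain of small steps, each feeding the next. The sketch below mimics that chain in plain Python: load CSV data, sample it, fit a very simple model, and evaluate it. Everything in it is an assumption made for illustration (the inline data, the function names, and the "most common outcome per category" model); it does not reproduce what the SPSS Modeler nodes do internally.

```python
import csv
import io
import random
from collections import Counter

# Made-up CSV standing in for the file behind the Data Asset node
CSV_TEXT = """plan,churned
basic,yes
basic,yes
basic,yes
basic,no
premium,no
premium,no
premium,no
premium,yes
"""


def load_rows(text):
    """Data Asset step: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def sample_rows(rows, fraction, seed=0):
    """Sample step: keep a random fraction of the records."""
    rng = random.Random(seed)
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


def train_majority_by_key(rows, feature, target):
    """Modeling step: for each feature value, remember the most common target."""
    counts = {}
    for row in rows:
        counts.setdefault(row[feature], Counter())[row[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def accuracy(model, rows, feature, target, default):
    """Evaluation step: fraction of rows whose target the model predicts correctly."""
    hits = sum(model.get(row[feature], default) == row[target] for row in rows)
    return hits / len(rows)


rows = load_rows(CSV_TEXT)
sampled = sample_rows(rows, fraction=0.75)
model = train_majority_by_key(sampled, "plan", "churned")
print("model:", model)
print("accuracy:", accuracy(model, rows, "plan", "churned", default="no"))
```

Each function plays the role of one node on the canvas, and the last two lines correspond to opening the result node after a run.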

Machine Learning Models in Jupyter Notebook

Although this article’s title points to drag-and-drop work, Jupyter Notebook is used by many people who are not data scientists; it is not especially complex and gives us the opportunity to run commands. It is therefore practical to touch on this part.

We can create machine learning models in a (Jupyter) notebook in the usual way, by writing code and using the IBM-specific machine learning API. After a model is created, we can train and deploy it. Official examples include a sample notebook showing the commands and steps to load data, create an Apache Spark model, create a pipeline, and train the model. To install the required packages, we can run the command in the usual format:

!pip install wget --user --upgrade

The linked sample notebook shows how to make the CSV file available on GPFS; it is simple and easy:

import os
import wget

filename = 'GoSales_Tx_NaiveBayes.csv'
# Download the sample CSV only if it is not already present locally
if not os.path.isfile(filename):
    link_to_data = 'https://apsportal.ibm.com/exchange-api/v1/entries/8044492073eb964f46597b4be06ff5ea/data?accessKey=9561295fa407698694b1e254d0099600'
    filename = wget.download(link_to_data)
print(filename)
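Before building a model, it is worth checking what was actually downloaded. The following is a small sketch using only the Python standard library; the helper name preview_csv is our own, not part of the IBM sample notebook, and it assumes the file is UTF-8 encoded with a header row.

```python
import csv
import os


def preview_csv(path, n=3):
    """Return the header and the first n data rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows


filename = "GoSales_Tx_NaiveBayes.csv"  # file name used in this article
if os.path.isfile(filename):
    header, rows = preview_csv(filename)
    print(header)
    for row in rows:
        print(row)
else:
    print(f"{filename} not found; run the download step first")
```

If the header or the first rows look wrong, it is better to find out here than after training.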

Conclusion

The Jupyter notebook of IBM Data Science Experience is our favourite part. It is clearly the open source part; even though the Watson API is included in it, it remains “open”. IBM Watson Studio’s drag-and-drop interface avoids the complexity of training machine learning models and makes the data preparation and model deployment workflow easy for newcomers. For data scientists, it works as a ready-to-use platform with huge computing resources.

However, like anything on this earth, IBM Watson Studio has some cons as well. Watson Studio does not yet support exporting a fully trained model, and it has no way to import a machine learning model trained on a different system (do not misunderstand: neural networks can be exported in TensorFlow, Keras, PyTorch, Caffe and JSON formats for sharing). It is a wizard-like development environment, not a container-based training environment. If these points are not a big headache for the developer, there are probably not many other cons.

Watson Studio is delivered mostly like a PaaS for machine learning. From that angle, it is really successful from a development point of view.
