Getting actionable insights from text data requires software that “knows” about language. The foundation of any text mining application is good Natural Language Processing (NLP) software, whether you’re a marketing analyst monitoring online discussions about your products, or a digital humanities researcher tracing attitudes towards women through literary history. Anywhere there’s language, there’s potential for an NLP application.
The Google Cloud Natural Language API is one of the newest additions to the Google Cloud Machine Learning platform, and to Temboo’s Choreo library. Harnessing Google’s Machine Learning platform, the same technology behind the Google Assistant and many other Google applications, is remarkably easy. Compared to installing and maintaining NLP frameworks, relying on an API to do the work is a lightweight choice for applications that don’t require custom NLP implementations. Furthermore, your application will benefit from the continued training of Google’s Machine Learning platforms, carried out by their team of linguistics PhDs.
The Natural Language API provides support for three major kinds of analysis:
- Sentiment Analysis
Sentiment analysis estimates how positive, neutral, or negative a text is. The AnalyzeSentiment Choreo returns the input text’s sentiment score on a scale from -1.0 to 1.0, where -1.0 is very negative, 0.0 is neutral, and 1.0 is very positive. The sentiment is also given a magnitude, a non-negative number with no upper bound, which indicates the intensity of the emotion expressed. Sentiment scores are returned for individual sentences within a text, as well as for the document as a whole.
- Determining customer satisfaction over the course of a customer support interaction by monitoring the sentiment score and magnitude within a chat application or support emails
- Making decisions about which products to purchase based on sentiment statistics of online reviews
- Analyzing the attitude towards a given public figure across different news organizations
- Predicting changes in the market based on attitudes expressed by influencers and experts in news sources and social media
- Identifying which aspects of a product have the most potential for improvement given social media chatter and online product reviews
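As a rough illustration of how score and magnitude can be read together, the sketch below labels a response sentence by sentence. It assumes the response has already been parsed into a dictionary shaped like Google's `analyzeSentiment` JSON (a `documentSentiment` object plus a `sentences` list). The sample values, the helper names, and the 0.25 and 1.0 thresholds are invented for illustration and are not part of the API or the Choreo.

```python
# Hypothetical sample shaped like an analyzeSentiment response; values are invented.
sample_response = {
    "documentSentiment": {"score": 0.1, "magnitude": 1.7},
    "sentences": [
        {"text": {"content": "The app crashed twice."},
         "sentiment": {"score": -0.8, "magnitude": 0.8}},
        {"text": {"content": "Support fixed it quickly, thanks!"},
         "sentiment": {"score": 0.9, "magnitude": 0.9}},
    ],
}


def label_sentiment(score, threshold=0.25):
    """Map a score in [-1.0, 1.0] to a coarse label (threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def summarize(response):
    """Return (document label, [(sentence text, label), ...]) for a response dict."""
    doc = response["documentSentiment"]
    per_sentence = [
        (s["text"]["content"], label_sentiment(s["sentiment"]["score"]))
        for s in response.get("sentences", [])
    ]
    return label_sentiment(doc["score"]), per_sentence


def is_mixed(response, min_magnitude=1.0, threshold=0.25):
    """Near-neutral score plus high magnitude hints at mixed rather than absent emotion."""
    doc = response["documentSentiment"]
    return abs(doc["score"]) < threshold and doc["magnitude"] >= min_magnitude


doc_label, sentence_labels = summarize(sample_response)
print(doc_label, sentence_labels, is_mixed(sample_response))
```

Tracking the per-sentence labels over the course of a support conversation is one way to approach the customer satisfaction use case above; suitable thresholds would need tuning on real data.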
- Entity Analysis
The AnalyzeEntities Choreo identifies the nouns found in a text. Each entity is given a salience score on a scale from 0.0 to 1.0, which indicates the significance of that entity to the text. The salience score can help you programmatically determine what a text is about.
The API also returns whether the given noun is a proper noun. Entities are further classified by type, including person, work of art, location, and organization. The full list of entity types identified by Google may be found in the Natural Language API documentation.
Further metadata provided in the API response includes a link to the entity’s Wikipedia article and its Google Knowledge Graph machine-generated identifier (MID), when they exist.
- Summarizing social media posts about your business by identifying the topic and sentiment score
- Finding trends in the topics discussed in an online community over time
- Discovering what people and places are commonly discussed in the same context as a public figure or business
- Automatically linking nouns in a text to their Wikipedia entries
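To make the salience score concrete, here is a minimal sketch that ranks entities and collects Wikipedia links. It assumes a response dictionary shaped like Google's `analyzeEntities` JSON, with an `entities` list whose items carry `name`, `type`, `salience`, and an optional `metadata.wikipedia_url`. The sample data and the helper names are hypothetical.

```python
# Hypothetical sample shaped like an analyzeEntities response; values are invented.
sample_response = {
    "entities": [
        {"name": "coffee", "type": "OTHER", "salience": 0.12, "metadata": {}},
        {"name": "Ada Lovelace", "type": "PERSON", "salience": 0.71,
         "metadata": {"wikipedia_url": "https://en.wikipedia.org/wiki/Ada_Lovelace"}},
        {"name": "London", "type": "LOCATION", "salience": 0.17,
         "metadata": {"wikipedia_url": "https://en.wikipedia.org/wiki/London"}},
    ]
}


def top_entities(response, n=3):
    """Return the n entities with the highest salience, most salient first."""
    entities = response.get("entities", [])
    return sorted(entities, key=lambda e: e["salience"], reverse=True)[:n]


def wikipedia_links(response):
    """Map entity name -> Wikipedia URL for entities that have one."""
    links = {}
    for entity in response.get("entities", []):
        url = entity.get("metadata", {}).get("wikipedia_url")
        if url:
            links[entity["name"]] = url
    return links


print([e["name"] for e in top_entities(sample_response, n=2)])
print(wikipedia_links(sample_response))
```

Taking the top one or two entities by salience is a simple heuristic for "what is this text about"; it is only a starting point, not a full topic model.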
- Syntax Analysis
The AnalyzeSyntax Choreo parses the sentences within a text. It returns each individual sentence found within the text, a description of the morphological properties of each word in each sentence, and a dependency tree for each sentence describing the grammatical relationships between the words within it.
This API is designed to return grammatical data for multiple languages, so some of the morphological properties that the API accounts for may not be relevant to the language your application is analyzing. Find more detail about the morphological information returned, as well as information about how to interpret dependency trees, in Google’s API documentation.
- Discovering which adjectives are most commonly used to describe a product or public figure
- Building a large dataset of a particular writer’s typical language use to determine whether they were the author of a text in question
- Analyzing the typical language use of customers in order to produce marketing and instructional materials that feel more familiar to them
- Extracting facts from news sites and press releases about market performance in order to inform financial decisions
- Identifying the dictionary form of all the words in a text to easily link them to definitions
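The sketch below shows one way to walk the syntax data for two of the use cases above: collecting dictionary forms and finding adjectives attached to a noun. It assumes a `tokens` list shaped like Google's `analyzeSyntax` JSON, where each token has a `lemma`, a `partOfSpeech.tag`, and a `dependencyEdge.headTokenIndex` pointing at its head (with the root pointing at itself). The sample parse is hand-written and the helper names are hypothetical.

```python
# Hand-written sample shaped like the "tokens" list of an analyzeSyntax response
# for the sentence "The new phone is great"; the parse itself is an assumption.
sample_tokens = [
    {"text": {"content": "The"}, "lemma": "the", "partOfSpeech": {"tag": "DET"},
     "dependencyEdge": {"headTokenIndex": 2, "label": "DET"}},
    {"text": {"content": "new"}, "lemma": "new", "partOfSpeech": {"tag": "ADJ"},
     "dependencyEdge": {"headTokenIndex": 2, "label": "AMOD"}},
    {"text": {"content": "phone"}, "lemma": "phone", "partOfSpeech": {"tag": "NOUN"},
     "dependencyEdge": {"headTokenIndex": 3, "label": "NSUBJ"}},
    {"text": {"content": "is"}, "lemma": "be", "partOfSpeech": {"tag": "VERB"},
     "dependencyEdge": {"headTokenIndex": 3, "label": "ROOT"}},
    {"text": {"content": "great"}, "lemma": "great", "partOfSpeech": {"tag": "ADJ"},
     "dependencyEdge": {"headTokenIndex": 3, "label": "ACOMP"}},
]


def lemmas(tokens):
    """Dictionary form of every token, in sentence order."""
    return [tok["lemma"] for tok in tokens]


def root_index(tokens):
    """Index of the root token, assumed to list itself as its own head."""
    for i, tok in enumerate(tokens):
        if tok["dependencyEdge"]["headTokenIndex"] == i:
            return i
    return None


def children(tokens):
    """Build head index -> list of child indices from the dependency edges."""
    tree = {}
    for i, tok in enumerate(tokens):
        head = tok["dependencyEdge"]["headTokenIndex"]
        if head != i:  # skip the root's self-reference
            tree.setdefault(head, []).append(i)
    return tree


def adjectives_modifying(tokens, noun_lemma):
    """Lemmas of adjectives whose head token has the given lemma."""
    found = []
    for tok in tokens:
        head = tokens[tok["dependencyEdge"]["headTokenIndex"]]
        if tok["partOfSpeech"]["tag"] == "ADJ" and head["lemma"] == noun_lemma:
            found.append(tok["lemma"])
    return found


print(lemmas(sample_tokens))
print(children(sample_tokens))
print(adjectives_modifying(sample_tokens, "phone"))
```

Note that this only catches adjectives directly attached to the noun ("new"); predicate adjectives such as "great" hang off the verb in this sample parse and would need an extra step through the tree.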
- Sentiment + Entity + Syntax Analysis
If you would like to perform all three types of analysis on a text at once, the convenient AnnotateText Choreo is the tool for the job.
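For reference, a combined request can be sketched as a single JSON body with one flag per analysis type. The field names below follow Google's REST `documents:annotateText` method as best understood here and should be checked against the API documentation; the AnnotateText Choreo's own input names may differ, and `build_annotate_request` is a hypothetical helper.

```python
import json


def build_annotate_request(text, sentiment=True, entities=True, syntax=True):
    """Build a request body asking for several analyses in one call."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "features": {
            "extractDocumentSentiment": sentiment,
            "extractEntities": entities,
            "extractSyntax": syntax,
        },
        "encodingType": "UTF8",
    }


body = build_annotate_request("The new phone is great.")
print(json.dumps(body, indent=2))
```

Switching a flag off (for example `syntax=False`) keeps the response smaller when only some of the analyses are needed.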
At the time of publication, the languages supported by the Natural Language API are English, Japanese, and Spanish. Other languages are currently in beta, with additional languages coming soon. See the latest list of supported languages in the API documentation.
Authenticating to the Natural Language API is simple and only requires an API key:
- You’ll need a Google account. If you don’t already have one, you can sign up here.
- Log in to Google’s Developer Console, and create a new Project if you haven’t done so already.
- Using the API Manager, make sure you’ve enabled API Access for the Natural Language API in the Library tab.
- Under the Credentials tab, create a new API Key.
Be sure to follow Google’s best practices for using your API key securely. You can further enhance security by saving your Google credentials inside a Temboo Profile within your Temboo account. With your API key saved in a Profile, it will never need to appear in your code at all. Updating your API key in your Profile applies the update to all instances of your application automatically, so it’s easier than ever to periodically regenerate your API keys.
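If you call the API directly rather than through a Temboo Profile, the same principle of keeping the key out of your code can be sketched with an environment variable. This is a different mechanism from a Profile, shown only for comparison. The variable name `GOOGLE_NL_API_KEY` and the helper are hypothetical, and the endpoint is Google's v1 REST URL as best understood here.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumed v1 REST endpoint; check Google's documentation for the current version.
API_URL = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text, env_var="GOOGLE_NL_API_KEY"):
    """Build (but do not send) a request whose key comes from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    url = API_URL + "?" + urllib.parse.urlencode({"key": api_key})
    body = json.dumps({
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


# To send the request (requires a valid key and network access):
#     with urllib.request.urlopen(build_sentiment_request("Great service!")) as resp:
#         result = json.load(resp)
```

Because the key is read at call time, regenerating it only requires updating the environment, not the code.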
It’s easy to extend functionality by combining multiple APIs in one application with Temboo. For example, you can combine natural language analytics data with related data points to train a machine learning model using Amazon Web Services’ Machine Learning API.
Temboo Choreos that retrieve text could pair nicely with the Google Cloud Natural Language API. Depending on the purpose of your application, Zendesk, Gmail, and any social media API, such as Twitter, Facebook, and Google Plus, are potentially valuable sources for textual data. Any cloud file storage service, like Box or Dropbox, can store text files to be analyzed.