How To Perform Sentiment Analysis and Classification on Text (In Java)
This article provides context for NLP Sentiment Analysis and Classification and demonstrates an API solution.
In much the same way mutual empathy defines the development of long-term relationships with our friends, it also plays a key role in defining the success of our business’s relationship with its customers. When customers take the time to type their thoughts and feelings into a review for a product or service, share their feelings through a social media platform, or provide feedback through some similar medium, it behooves us to empathize with them as fellow human beings and determine how they collectively feel about what they experienced. Using programmatic solutions, we can quickly analyze and then adjust (or maintain) the experience we provide to our customers at scale, efficiently improving customer relationships with our brand.
Of course, unlike the human brain, computers aren’t raised and socialized to draw specific emotional conclusions from an evolving human language. They need to be trained to do so – and that’s where the field of sentiment analysis and classification comes into play. Using Natural Language Processing (NLP) techniques, we can train Machine Learning algorithms to analyze and classify unique sentiments in text.
Like many NLP fields, sentiment analysis is a complex, multi-step process modeled with a simple set of classification outcomes. For a classifier model to return a simple sentiment tag (i.e., positive, negative or neutral), it must be trained to extract specific features from a text input and quickly reference those features against a database full of pre-determined tags. Getting to that point involves pairing myriad feature vectors with their respective sentiment tags ahead of time – an exhaustive task requiring vast amounts of thoroughly vetted (and often peer-reviewed) data. When it comes to finally creating a classification prediction, a statistical model must be applied to match input text with tagged features from the reference dataset; after that, it must determine the sentiment of the entire sentence based on the balance of its sentiment tags relative to a given subject.
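To make the idea of matching input text against pre-tagged features more concrete, here is a deliberately tiny sketch. It is not how the API described later works internally; it assumes a hand-picked, hypothetical word list in place of a trained reference dataset, averages the weights of the words it recognizes, and uses an arbitrary neutral band of ±0.1.

```java
import java.util.Locale;
import java.util.Map;

/** Toy lexicon-based sentiment scorer; illustrative only, not the API's model. */
final class LexiconSentimentSketch {

    // Hypothetical, hand-picked word weights standing in for a trained reference dataset.
    private static final Map<String, Double> LEXICON = Map.of(
            "nice", 0.5,
            "great", 0.8,
            "love", 0.9,
            "bad", -0.6,
            "hate", -0.9,
            "broken", -0.7);

    /** Averages the weights of known words; returns 0.0 when no known word is present. */
    static double score(String text) {
        double total = 0.0;
        int matched = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            Double weight = LEXICON.get(token);
            if (weight != null) {
                total += weight;
                matched++;
            }
        }
        return matched == 0 ? 0.0 : total / matched;
    }

    /** Maps a score to a tag using an arbitrary neutral band of +/-0.1 (an assumption). */
    static String classify(double score) {
        if (score > 0.1) {
            return "Positive";
        }
        if (score < -0.1) {
            return "Negative";
        }
        return "Neutral";
    }

    public static void main(String[] args) {
        String[] samples = {
            "the package is nice",
            "the package was red, but I was expecting the color brown",
            "I hate the color red"
        };
        for (String text : samples) {
            double s = score(text);
            System.out.println(classify(s) + " (" + s + "): " + text);
        }
    }
}
```

A word list like this has no notion of context, negation, or sarcasm, which is exactly why production models need far larger tagged datasets and a statistical layer on top.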
It's important to note that the baseline complexity of sentiment analysis classification is exacerbated by everyday inconsistencies in human expression – some of which are difficult even for human analysts to interpret without hearing the audible cues of spoken language or fully understanding the context of a discussion. For example, it’s easy for any model to get tripped up by language quirks like sarcasm (e.g., “Oh yeah, sure, this product was really great”), out-of-context comments (e.g., “Wasn’t worth it”), out-of-context comparisons (e.g., “This service is much better than others”), and much more. Training a model to work around these challenges entails extra preprocessing work.
Immensely beneficial as sentiment analysis is, the complexity and cost of training a productive model, and of processing the vast quantities of data that model needs to function accurately, often outweigh the incentive to create a new one from scratch. Given the labor involved, incorporating sentiment analysis is best accomplished by leveraging an existing service with exhaustively validated prediction outcomes and powerful underlying infrastructure. This is a problem best solved with Sentiment Analysis APIs, which enable us to rapidly interface with powerful underlying NLP logic without having to take on any responsibility for training or updating that model over time.
Demonstration
The goal of this article is to provide you with a low-code, free-to-use Sentiment Analysis and Classification API. The underlying service analyzes raw text sentences against a rigorously trained reference database to determine if the input is positive, negative, or neutral (only English language inputs are supported). API calls can be authenticated with a free-tier API key, which you can get by registering a free account on the Cloudmersive website.
Each request (formatted as a “TextToAnalyze” string) will return the following information:
SentimentClassificationResult – A string describing if the input text was positive, negative, or neutral
SentimentScoreResult – A classification score (float) between -1.0 and +1.0; scores closest to zero are considered neutral sentiment, scores closest to -1.0 are considered negative sentiment, and scores closest to +1.0 are considered positive sentiment
SentenceCount – The number of sentences (integer) in the input text string
Positive, Neutral, and Negative Response Examples
Let’s look at a few examples of how this model reacts to and classifies certain text inputs.
Let’s pretend a customer ordered and received a package from an online store. In the review section on that business's website, the customer wrote:
{
"TextToAnalyze": "the package is nice"
}
The Sentiment Analysis API will classify this sentence like so:
{
"Successful": true,
"SentimentClassificationResult": "Positive",
"SentimentScoreResult": 0.42149999737739563,
"SentenceCount": 1
}
As humans with context for this response, we can easily validate the accuracy of this outcome. The sentence is indeed “positive” in nature, but not overwhelmingly so; thus, the score does not exceed +0.5.
Let’s process a second example. This time, the customer received a package that was a different color than the one they expected, noting that:
{
"TextToAnalyze": "the package was red, but I was expecting the color brown"
}
The Sentiment Analysis API will classify this sentence like so:
{
"Successful": true,
"SentimentClassificationResult": "Neutral",
"SentimentScoreResult": 0,
"SentenceCount": 1
}
While we might be tempted to read this input text as a dissatisfied customer response, the model correctly identifies that there are no specific negative or positive sentiments present. Without understanding this customer’s feelings further, we can’t know if the discrepancy they noticed was a good or bad thing – we can only wait for further information.
In one final example, let’s add a second sentence to the previous input, in which the customer clarifies that:
{
"TextToAnalyze": "the package was red, but I was expecting the color brown. I hate the color red."
}
The Sentiment Analysis API response will categorize this two-sentence input like so:
{
"Successful": true,
"SentimentClassificationResult": "Negative",
"SentimentScoreResult": -0.7226999998092651,
"SentenceCount": 2
}
As we can see, the sentiment score result has dropped from 0 to -0.72, which falls firmly in the “negative” sentiment category. It’s perfectly clear based on the two-part customer response that they were very unhappy with the change, which means their dissatisfaction is probably worth addressing directly.
These are only basic examples, of course - I would certainly encourage running through as many complex examples as you see fit and validating results against your own intuition (and/or other models).
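One way to act on results like these is to flag negative reviews for follow-up, most negative first. The sketch below is only an illustration: the AnalyzedReview class and the needsFollowUp method are hypothetical names, not part of any SDK, and the data is simply the three inputs and results shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Sketch of triaging analyzed reviews; every name here is hypothetical. */
final class ReviewTriageSketch {

    /** Stand-in for one analyzed review, mirroring the response fields described earlier. */
    static final class AnalyzedReview {
        final String text;
        final String classification; // "Positive", "Negative", or "Neutral"
        final double score;          // between -1.0 and +1.0

        AnalyzedReview(String text, String classification, double score) {
            this.text = text;
            this.classification = classification;
            this.score = score;
        }
    }

    /** Returns only the negative reviews, most negative first. */
    static List<AnalyzedReview> needsFollowUp(List<AnalyzedReview> reviews) {
        List<AnalyzedReview> flagged = new ArrayList<>();
        for (AnalyzedReview review : reviews) {
            if ("Negative".equals(review.classification)) {
                flagged.add(review);
            }
        }
        flagged.sort(Comparator.comparingDouble((AnalyzedReview r) -> r.score));
        return flagged;
    }

    /** The three inputs and results shown in the examples above. */
    static List<AnalyzedReview> articleExamples() {
        return Arrays.asList(
                new AnalyzedReview("the package is nice",
                        "Positive", 0.42149999737739563),
                new AnalyzedReview("the package was red, but I was expecting the color brown",
                        "Neutral", 0.0),
                new AnalyzedReview("the package was red, but I was expecting the color brown. "
                        + "I hate the color red.",
                        "Negative", -0.7226999998092651));
    }

    public static void main(String[] args) {
        for (AnalyzedReview review : needsFollowUp(articleExamples())) {
            System.out.println(review.score + " -> " + review.text);
        }
    }
}
```

Filtering on the classification string rather than on a score cutoff avoids guessing where the service draws its own boundaries between categories.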
Implementation
Below, I’ll demonstrate how you can install the SDK and structure your API call in Java.
To install the client SDK, first, add a reference to the repository in your Maven POM file (we use JitPack to dynamically compile the library):
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
And then add a reference to the dependency:
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>
With the installation complete, copy and paste from the ready-to-run Java code examples below to structure your API call. Include your API key and configure your text inputs in their respective lines:
import com.cloudmersive.client.AnalyticsApi;
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.ApiKeyAuth;
// Model classes; package name assumed from the SDK's generated layout
import com.cloudmersive.client.model.SentimentAnalysisRequest;
import com.cloudmersive.client.model.SentimentAnalysisResponse;

public class SentimentExample {
    public static void main(String[] args) {
        ApiClient defaultClient = Configuration.getDefaultApiClient();
        // Configure API key authorization: Apikey
        ApiKeyAuth apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
        apikey.setApiKey("YOUR API KEY");
        // Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
        //apikey.setApiKeyPrefix("Token");
        AnalyticsApi apiInstance = new AnalyticsApi();
        // Build the request and set the English text to classify
        SentimentAnalysisRequest input = new SentimentAnalysisRequest();
        input.setTextToAnalyze("the package is nice");
        try {
            SentimentAnalysisResponse result = apiInstance.analyticsSentiment(input);
            System.out.println(result);
        } catch (ApiException e) {
            System.err.println("Exception when calling AnalyticsApi#analyticsSentiment");
            e.printStackTrace();
        }
    }
}
Please note that each request will consume 1-2 API calls per sentence, and you’ll have a limit of 800 API calls per month (with no commitments) when authenticating with a free-tier API key.
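To get a feel for how far the free tier stretches, the following back-of-the-envelope sketch estimates the worst-case call cost of an input before sending it. It assumes the upper end of the 1-2 calls per sentence range and counts sentences with a naive split on terminal punctuation, which may not match the count the API itself reports; all names are hypothetical.

```java
/** Rough free-tier budget estimate; all names are hypothetical. */
final class QuotaEstimateSketch {

    static final int MONTHLY_FREE_CALLS = 800;   // free-tier limit stated above
    static final int MAX_CALLS_PER_SENTENCE = 2; // upper end of "1-2 API calls per sentence"

    /** Naive sentence count: splits on runs of '.', '!' or '?'. The API's own count may differ. */
    static int roughSentenceCount(String text) {
        int count = 0;
        for (String part : text.split("[.!?]+")) {
            if (!part.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Worst-case number of API calls a single request could consume. */
    static int worstCaseCalls(String text) {
        return roughSentenceCount(text) * MAX_CALLS_PER_SENTENCE;
    }

    public static void main(String[] args) {
        String review = "the package was red, but I was expecting the color brown. "
                + "I hate the color red.";
        System.out.println("Worst-case calls for this review: " + worstCaseCalls(review));
        System.out.println("Sentences per month at worst case: "
                + MONTHLY_FREE_CALLS / MAX_CALLS_PER_SENTENCE);
    }
}
```

Under these assumptions, the two-sentence review from the earlier example could cost up to 4 calls, and the monthly allowance covers at least 400 sentences.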