Use natural language processing to quantify the opinion expressed in text
The Sentiment analysis module is an opinion mining system based on natural language processing and machine learning that quantifies the sentiment expressed in a piece of text.
This system takes a batch of sentences, processes them, and assigns each sentence a sentiment score ranging from -1 (very negative) to +1 (very positive).
A state-of-the-art regression model uses recent advances in natural language processing and deep learning to analyze the sentiment behind text data.
Batch processing and parallelization allow you to parse hundreds of sentences in a matter of seconds.
Extremely simple to use: select your source of text. Sentences can come from a parsed document or from our Twitter request API.
No code needed: once your app is set up in the studio, everyone can use it. No need to be a data aficionado to tune the system to your needs.
Applicable to virtually any piece of text: our module handles sentences coming from a broad variety of sources.
Accelerate analysts' work: our sentiment score can be used directly by analysts or fed to other numerical analysis modules.
How it works
Our Sentiment analysis system consists of a single algorithm trained on a dataset specifically curated by our team of annotators.
What does it mean when we say that an algorithm is “trained”?
Through supervised learning, we teach our algorithm to assign a score to a sentence. To do so, we feed the system labeled data (in our case, pairs of sentences and the scores given to them by a human annotator). By comparing the true scores to the predicted scores, we quantify the error made by the algorithm and adjust its parameters accordingly. Repeating this over large datasets trains the system to label sentences with the correct sentiment score.
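As a minimal sketch of this training loop, the toy example below fits a one-feature linear model by gradient descent on the mean squared error. The single numeric feature, the labeled pairs, the learning rate and the names `predict` and `train` are all assumptions made for illustration; the actual model is a deep neural network trained on raw text.

```python
# Toy supervised regression: learn score = w * x + b from labeled pairs.
# x stands for a hypothetical numeric feature of a sentence.

def predict(w, b, x):
    return w * x + b

def mean_squared_error(w, b, data):
    # Average squared gap between predicted and human-annotated scores.
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train(data, lr=0.05, epochs=500):
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        # Move the parameters in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# (feature, annotated score) pairs, invented for illustration only.
labeled = [(-2, -0.8), (-1, -0.4), (0, 0.0), (1, 0.4), (2, 0.8)]
error_before = mean_squared_error(0.0, 0.0, labeled)
w, b = train(labeled)
error_after = mean_squared_error(w, b, labeled)
```

The error after training is lower than before it, which is the whole point of the loop: predictions are compared to annotated scores and the parameters are nudged to shrink the gap.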
How are predictions made?
First, we split the input sentence into tokens using a character-level byte-pair encoding (BPE) tokenizer. This tokenizer is trained on a large corpus: starting from single characters, it repeatedly merges the most frequent adjacent pair of symbols into a new subword, until the vocabulary reaches a predefined size.
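To make the idea concrete, here is a small sketch of byte-pair encoding training and tokenization. It follows the textbook BPE procedure and is not our production tokenizer; the function names `train_bpe`, `merge_pair` and `tokenize`, the tiny corpus and the vocabulary size are assumptions chosen for illustration.

```python
from collections import Counter

def merge_pair(symbols, pair):
    # Replace every adjacent occurrence of `pair` with one merged symbol.
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def train_bpe(corpus, vocab_size):
    # Start from characters: each word is a tuple of one-character symbols.
    words = Counter(tuple(word) for word in corpus.split())
    vocab = {ch for word in words for ch in word}
    merges = []
    while len(vocab) < vocab_size:
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab.add(best[0] + best[1])
        updated = Counter()
        for word, freq in words.items():
            updated[merge_pair(word, best)] += freq
        words = updated
    return merges, vocab

def tokenize(word, merges):
    # Apply the learned merges in the order they were learned.
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

corpus = "low low low low lower lowest"
merges, vocab = train_bpe(corpus, vocab_size=9)  # 7 characters + 2 merges
tokens = tokenize("lowest", merges)  # ['low', 'e', 's', 't']
```

Because "low" is by far the most frequent character sequence in this corpus, it becomes a single subword after two merges, while the rarer endings stay split into characters.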
Next, we must turn each sentence into a form the machine can process. To this end, the second step of our algorithm embeds the tokens of the sentence into a vector space, so that each token is represented by a vector.
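An embedding step of this kind boils down to a lookup table from tokens to vectors. The sketch below assumes randomly initialized vectors, a 4-dimensional space and a hypothetical `<unk>` entry for tokens outside the vocabulary; in a trained model the vectors are learned together with the rest of the network.

```python
import random

def build_embedding_table(vocab, dim, seed=0):
    # One vector per token; random here, learned during training in practice.
    rng = random.Random(seed)
    return {token: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for token in vocab}

def embed(tokens, table, unk="<unk>"):
    # Look up each token; unknown tokens fall back to the <unk> vector.
    return [table.get(token, table[unk]) for token in tokens]

table = build_embedding_table(["<unk>", "low", "e", "s", "t"], dim=4)
vectors = embed(["low", "e", "s", "t", "zzz"], table)
```

The sentence is now a sequence of five 4-dimensional vectors, one per token, which is the input format the next layers expect.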
Then, we use a deep neural network architecture, a bidirectional GRU (gated recurrent unit) with attention, to represent the sentence as a single vector that carries the information needed to run the regression and assign a sentiment score.
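The sketch below illustrates only the attention pooling part of this step: it assumes the bidirectional GRU has already produced one hidden state per token, and collapses them into a single sentence vector. The dot-product scoring against a `query` vector is one common form of attention and is an assumption here, as are the function names and the toy numbers.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    # Score each hidden state by its dot product with the query vector.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # The sentence vector is the attention-weighted sum of hidden states.
    sentence = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return sentence, weights

# Three toy hidden states (one per token) and a toy query vector.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
sentence_vector, weights = attention_pool(hidden_states, query)
```

The weights sum to 1, and the token whose hidden state aligns best with the query contributes most to the sentence vector.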
Finally, the last layer of the model performs the regression task, by computing the sentiment score from the previously built sentence vector.
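A regression layer of this kind can be sketched as a weighted sum of the sentence vector followed by a squashing function. Using `tanh` to keep the output between -1 and +1 is an assumption made for this illustration, as are the function name `sentiment_score` and the toy weights; they are not taken from the actual model.

```python
import math

def sentiment_score(sentence_vector, weights, bias):
    # Linear combination of the sentence vector's components.
    z = sum(w * x for w, x in zip(weights, sentence_vector)) + bias
    # Assumption: tanh bounds the score to the [-1, 1] range.
    return math.tanh(z)

positive = sentiment_score([0.8, 0.1], weights=[1.5, -0.5], bias=0.0)
negative = sentiment_score([-0.8, 0.1], weights=[1.5, -0.5], bias=0.0)
```

With these toy numbers the first sentence vector yields a positive score and the second a negative one, both within the expected range.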
The model can be schematized as follows:
Figure 1. Deep Neural Network Architecture used in Sentiment Analysis
Sentiment analysis of Tweets
Nowadays, social networks are the go-to places to quickly spread news, opinions or any other type of information. Opinion mining on social networks could have multiple benefits for your business, as an automated tool can provide you with real-time analysis of textual content.
For this use case, let’s say that you want to assign a sentiment score to tweets about the topic of your choice. How can you use our software and its AI-enabled applications to carry out your task?
Step 1: Specify your keywords for the Twitter request
Behind this app, a workflow that calls our Twitter request API gets the relevant tweets and feeds them to our Sentiment analysis API. The workflow looks like this:
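In code terms, the same chain could be sketched as follows. The app itself requires no code; `run_workflow`, `fetch_tweets` and `score_sentences` are hypothetical names that stand in for the Twitter request API and the Sentiment analysis API, and the two stub functions only exist so that the sketch runs offline.

```python
def run_workflow(keywords, fetch_tweets, score_sentences):
    # 1. Collect the tweets matching the keywords.
    tweets = fetch_tweets(keywords)
    # 2. Score the whole batch: one score per tweet, each between -1 and +1.
    scores = score_sentences(tweets)
    # 3. Pair each tweet with its score, ready to display in a table.
    return [{"tweet": t, "score": s} for t, s in zip(tweets, scores)]

# Offline stand-ins for the two APIs (placeholders, not real responses).
def stub_fetch_tweets(keywords):
    return ["@ArChoisan Choi San is awesome"]

def stub_score_sentences(sentences):
    return [0.9 for _ in sentences]

rows = run_workflow(["awesome"], stub_fetch_tweets, stub_score_sentences)
```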
Step 2: Done! You can observe the results in the app’s tables.
We looked for the keyword “awesome”. One of the collected Tweets is “@ArChoisan Choi San is awesome” and our algorithm assigned it a score of 0.9, which is pretty positive!