How does BrandBastion analyze sentiment?

Sentiment analysis is a popular tool and a field of study in multiple research areas such as computational linguistics, data science, social sciences, and market research. Frequently referred to as opinion mining, sentiment analysis has proved to be an efficient method for solving many problems and for gaining additional insight into a broad audience’s opinions. However, as researchers note,

“the expansion of the method away from its original contexts has produced misunderstandings and misinterpretations about how the method works, which are detrimental both to its application and to our broader social understanding of social media.” (Puschmann, Powell, 2018)

These days it is very easy to find multiple definitions of sentiment analysis. They range from very pragmatic but not really explanatory ones, describing sentiment as an “associated score between 0 and 1”, to very vague ones that emphasize certain applications of sentiment analysis rather than define it, such as “[Sentiment analysis] crawls social media for any mention of your product and analyses what people are saying about a brand to produce insights.” This shows how the practical application of sentiment analysis has drifted away from its original methodology. This disconnect has produced an application without a concept: the measurement of something called “sentiment” frequently fails to establish what sentiment actually stands for. This is why it is important to get back to definitions and to clarify what sentiment analysis is, what its components are, and how it works, as the vast number of its applications has left this unclear to the broader audience.

Accuracy of Sentiment Analysis

Measuring the accuracy of sentiment analysis is probably a more difficult challenge than performing the sentiment analysis itself. There are known challenges in defining the accurate measurement for it, but no commonly agreed standard. In practice, any vendor performing sentiment analysis can easily define a correctly designed set of tests that would make their sentiment analysis look better than anyone else's. As this easily becomes very biased, most commercial systems explicitly prohibit using their products in accuracy assessments (Liu, 2015: xii).

Evidence of this is the so-called problem of inter-annotator agreement level. Simply put, since no two humans can agree on what sentiment should be, there will always be someone who will agree with an algorithm's results, as well as someone who will disagree with them.

Many sources argue that analyzing sentiment is a very intuitive task making humans the perfect fit for the task. At the same time, research shows that while it is really intuitive on an individual level, it is hardly possible to achieve consistency in sentiment analysis performed by humans. Every person needs at least some basic instructions to mark given documents with sentiment polarities, but the agreement between different annotators is still very low.

Bermingham and Smeaton (2009) report agreement statistics of 0.422, while the recommended level for even tentative reliability starts at 0.667.
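To make the idea of an agreement statistic concrete, here is a minimal sketch that computes Cohen's kappa for two annotators and compares it with the 0.667 level mentioned above. Cohen's kappa is used purely as an illustration: this article does not state which statistic the reported figures refer to, and the function name and sample labels are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels from two annotators for the same ten comments.
annotator_1 = ["pos", "neg", "neu", "neu", "pos", "neg", "neu", "pos", "neg", "neu"]
annotator_2 = ["pos", "neu", "neu", "neg", "pos", "neg", "pos", "pos", "neu", "neu"]

kappa = cohen_kappa(annotator_1, annotator_2)
RELIABILITY_THRESHOLD = 0.667  # level quoted in the text above
print(f"kappa = {kappa:.3f}, reliable: {kappa >= RELIABILITY_THRESHOLD}")
```

Even though the two annotators agree on six of ten comments, the chance-corrected score stays well below the threshold, which is the effect the agreement figures above describe.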

Furthermore, we at BrandBastion observed that the agreement level of the same annotation from the same annotator is often below the reliability threshold once a certain amount of time has passed, as many annotators change their minds over time.

BrandBastion’s sentiment analysis

The methodology we use at BrandBastion to approach sentiment analysis includes several stages. Each includes its own unique elements to ensure we provide our clients with the best possible quality of sentiment insights.

Defining Sentiment

An important foundation for our approach to sentiment analysis at BrandBastion is to have a clear definition of the sentiment that can be used consistently across all stages, from the data preparation to the final evaluation of the algorithms in production.

Context-free Sentiment: To make sure that the sentiment we measure is the same KPI as the one we communicate, we start by defining it as state-of-the-art research publications do. We will refer to this as Context-free Sentiment:

“positive or negative evaluation [of an opinion] expressed through language”

This already implies several elements that are important to take into account at all stages of working on sentiment analysis:

1. The Opinion aspect (in contrast to facts). As some of the top researchers state:

The input to a sentiment classifier is not necessarily always strictly opinionated. Classifying a news article into good or bad news has been considered a sentiment classification task in the literature. But a piece of news can be good or bad news without being subjective (Pang & Lee 2008)

Compare some clear-cut examples that show the difference between factual and opinionated text:

Objective (factual) comments:

I bought two s10 phones as Christmas presents. Both broke after a few weeks.
I haven’t received my box

Subjective (opinionated) comments:

Terrible quality!!!!!!
Quick delivery and nice features.

A significant proportion of real social media data includes elements of both:

Mixed comments:

Still haven’t received my box, so awful 😔I've had no problem with them. (fact: box not received, no prior problem, opinion: awful)
I’m satisfied with it because the palette is good quality and I needed new eye brushes! (fact: need eye brushes, opinion: satisfactory, good quality)

To sum up, we separate factual information from the sentiment aspect of the text and do not categorize purely factual comments as positive or negative. At the same time, we observe that most comments also contain an element of subjective opinion, and for such comments sentiment can be defined.

2. The “Expressed” aspect: Sentiment being expressed through language in contrast to the author or reader's emotional state of mind. In many materials, there is

“frequently no clear distinction made between affect expressed in a text and the emotional state of a text’s author” (Puschmann, Powell, 2018).

This becomes visible in practical applications, especially when it comes to expressing the negative polarity.

Experience shows that negative sentiment is frequently confused with an attempt to attack a brand or a group of people via text that the reader self-identifies with.

Obviously, many such attacking texts are also expressed negatively, which contributes to the confusion. However, some of them are not expressed negatively, and a writer’s or reader’s emotional state should not be equated with the sentiment of the document.

Below are examples of comments where the expressed sentiment is not explicit, despite the commenter being in a certain emotional state of mind or having somewhat harmful intent. These comments would, from a Context-free sentiment perspective, be classified as neutral — although they reflect a certain emotional state of mind, the sentiment is not expressed explicitly in the comments themselves and is rather inferred from the presumed emotional state of the commenter.

“backed by science” is no longer a promising marketing approach people
old movies
The only selection is that shade? Cool, if everyone wants that shade!
[brand] answer the questions when people ask Wash your eyes, then write.
They should read the comments on the app before paying
Donate some money instead of just making a post!!!
maybe your parents named you after their favorite dog?

Business-contextualized Sentiment: At BrandBastion we realize that sentiment analysis requires further interpretation to be used for commercial purposes.
The examples above with neutral Context-free sentiment comments are good reflections of this need. There is a (rather obvious) intent of uttering a critical negative opinion that is indeed not explicitly expressed but still reasonably inferred.
As such, in circumstances that pertain to the perception of the business or other users, we override the Context-free sentiment perspective based on our categorical tagging rules[1].

In simple terms, we can say that Business-contextualized sentiment follows the academic Context-free rules, but is overridden in case of inferred harmful intent towards the brand or users.
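The override rule can be sketched as a small function. The tag names are taken from footnote [1]; the function name, the label strings, and the choice to let negative override tags take precedence when a comment carries tags of both kinds are assumptions made for illustration, not a description of BrandBastion's actual implementation.

```python
# Tags listed in footnote [1] as overriding sentiment to negative polarity.
NEGATIVE_OVERRIDE_TAGS = {
    "Brand attack", "Brand critique", "Against person featured",
    "Competitor promotion", "Account misuse", "Legitimacy", "Threat",
    "Severe event", "Protest", "Creative not resonating",
    "HR & factory practices", "Complaints", "Technical issues",
    "Personal attacks", "Bullying", "Discrimination",
    "Disturbing/Violent", "Self-harm", "Negative emoji only",
}

# Tags listed in footnote [1] as overriding sentiment to positive polarity.
POSITIVE_OVERRIDE_TAGS = {"Fan community", "Purchase intent", "Positive emoji only"}

def business_contextualized_sentiment(context_free, tags):
    """Return the context-free label unless a categorical tag overrides it."""
    tags = set(tags)
    # Assumption: a negative override wins if both kinds of tag are present.
    if tags & NEGATIVE_OVERRIDE_TAGS:
        return "negative"
    if tags & POSITIVE_OVERRIDE_TAGS:
        return "positive"
    return context_free

# A comment that is neutral in the context-free sense but tagged as a brand critique.
print(business_contextualized_sentiment("neutral", ["Brand critique"]))
```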

Preparing Training Data

With the definitions established, the next step in our process is to apply them to collect and prepare the right data to train BrandBastion’s sentiment analysis algorithms. The main goal here is to minimize the effect of the low inter-annotator agreement described above and to achieve the highest possible consistency within the labelled data.

Recognizing that the assumed intuitiveness of manual sentiment labelling can easily cause extensive quality issues, at BrandBastion we have done the following:

All our content processing specialists, manually labelling data, are in-house, full-time team members. We don’t outsource manual labelling, as we want to control the process and the quality closely.
We’ve invested in creating clear processing guidelines for the data preparation, including extensive examples and guidelines. We also provide our content processing specialists with extensive and ongoing training.
We apply multiple layers of quality control to the labelled data.
We’ve built a proprietary classification platform that our content processing specialists work on when applying manual labelling to maximize quality and accuracy.

All of the above allows us to obtain highly accurately labelled social media training data.

Training Sentiment Analysis Models

The main approaches that exist in the field of conducting sentiment analysis automatically are very well described in various sources and include:

The Linguistic (rule-based) approach performs sentiment analysis based on a set of provisioned rules which could use various techniques from the field of computational linguistics. This approach allows considering very granularly what language features make a document positive or negative and putting this into the context of the domain area.
The Machine Learning (automatic) approach usually models sentiment analysis as a classification problem, which includes documents as inputs and detected sentiment polarity as an output.
The Hybrid approach combines the best elements of those mentioned above. Such systems are usually known to demonstrate the best results while requiring innovative design to ensure the seamless co-existence of the elements and constant improvement.

[1] Tags that override sentiment to negative polarity: Brand attack, Brand critique, Against person featured, Competitor promotion, Account misuse, Legitimacy, Threat, Severe event, Protest, Creative not resonating, HR & factory practices, Complaints, Technical issues, Personal attacks, Bullying, Discrimination, Disturbing/Violent, Self-harm, Negative emoji only. Tags that override sentiment to positive polarity: Fan community, Purchase intent, Positive emoji only

Ecosystem of BrandBastion sentiment models

At BrandBastion, we adopted the hybrid approach, which enables us to achieve better results and to stay flexible when processing various types of data. We have built an ecosystem of multiple sentiment analysis classifiers that complement each other to produce the best results.

To make it work, we expose every piece of data to all the models in real time. The model that demonstrates the highest confidence for a given social media comment marks it with positive or negative polarity. The comment is considered to bear neutral polarity if none of the models can confidently classify it as positive or negative.
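The selection logic described above can be sketched as follows. This is a simplified illustration: the representation of a model output as a (polarity, confidence) pair, the function name, and the 0.8 confidence cut-off are all assumptions, since the article does not specify how confidence is measured or thresholded.

```python
from typing import Iterable, Tuple

Prediction = Tuple[str, float]  # (polarity, confidence in [0, 1])

def combine_predictions(predictions: Iterable[Prediction],
                        min_confidence: float = 0.8) -> str:
    """Pick the polarity of the most confident model, or fall back to neutral."""
    # Keep only polar predictions that clear the (hypothetical) confidence bar.
    confident = [
        (polarity, confidence)
        for polarity, confidence in predictions
        if polarity in ("positive", "negative") and confidence >= min_confidence
    ]
    if not confident:
        return "neutral"  # no model could classify the comment confidently
    # The single most confident model decides the label.
    return max(confident, key=lambda item: item[1])[0]

# Hypothetical outputs of three models for one comment.
outputs = [("positive", 0.55), ("negative", 0.93), ("positive", 0.81)]
print(combine_predictions(outputs))
```

One consequence of this design is that a comment stays neutral by default, so adding a new specialised model can only move comments out of the neutral bucket when that model is confident.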

While Machine Learning models based on Deep Learning are generally better at interpreting longer texts with complex syntax structures and non-literal use of words, linguistic models are naturally better suited to deal with the edge cases of communication typical for social media.

Our team of linguists and data scientists keeps constantly advancing both rule-based and machine-learning sentiment analysis models solving the challenges of applying sentiment analysis techniques to social media language. Although we separate models into linguistic and machine-learning ones, BrandBastion keeps advancing in both directions in parallel. This erases the border between the linguistic approach and the machine learning approach, which are opposed to each other in some sources.

Thus, to enhance the lexicon, our linguistic team uses not only traditional methods of corpus linguistics but also some modern automated methods that include purpose-built Machine Learning models trained specifically to recommend the most efficient additions to the lexicon.

And the deep learning models trained by our Data Science team use a linguistic lexicon for pre-processing data to improve the quality of the model via highlighting typical sentiment triggers in data.
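As a rough illustration of lexicon-based pre-processing, the sketch below wraps lexicon entries in marker tokens before the text would be passed to a model. The tiny lexicon, the marker format, and the function name are hypothetical; the article does not describe how the highlighting is actually encoded.

```python
import re

# Hypothetical miniature sentiment lexicon: term -> polarity marker.
LEXICON = {"terrible": "NEG", "awful": "NEG", "nice": "POS", "quick": "POS"}

# One case-insensitive pattern matching any lexicon term as a whole word.
_TRIGGER_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in sorted(LEXICON, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def highlight_triggers(text):
    """Surround each lexicon term with marker tokens naming its polarity."""
    def mark(match):
        word = match.group(0)
        tag = LEXICON[word.lower()]
        return f"<{tag}> {word} </{tag}>"
    return _TRIGGER_PATTERN.sub(mark, text)

print(highlight_triggers("Quick delivery and nice features."))
```

The marked-up text keeps the original words, so a downstream model still sees the full comment and can learn when a highlighted trigger is misleading in context.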

BrandBastion Deep Learning Models

Various techniques are used in this approach, and they evolve over time with the state of the art in Machine Learning and Natural Language Processing. Over the last few years, Large Language Model architectures in particular have frequently been described as efficient, which is also confirmed by our research at BrandBastion. These deep learning models create a representation of whole comments, based on the sentence structure, and not just of separate words. They evaluate the sentiment based on how words compose the meaning of longer phrases. This way, the models take textual context into account, which helps identify sentiment properly in more complicated cases of natural language text.

See a few examples below where the polarity of some separate words or phrases can be misleading when detecting the whole comment’s polarity:

Yea they look good but only on a skinny bitch
Wow! This isn’t a scam at all! Also, why am I being shown horseshit like this?
This box’s ain’t even a good one 😢
nice to see your proud of all the sex offenders you hired.
Doesn't recognise it working. Lots of nice buzz words though!👍👍👍🤣🤣🤣

In addition to taking the context into account, another important aspect is taking into account other categories than mere sentiment to which a comment can belong. Real-world data is much more complicated than just being positive or negative, and frequently a single text can bear several labels relevant for the client’s business. At BrandBastion, we support a vast set of categories that could be Universally Harmful across all Social Media, or Sensitive for specific industries and businesses, as well as for example, categories related to Customer Engagement.

The model learns these connections between various categories and the sentiment during the training stage and applies this knowledge in production use. This makes BrandBastion sentiment models more reliable compared to those not taking other categories of social media into account while training models to perform sentiment analysis.

BrandBastion Linguistic-Based Models

These types of BrandBastion sentiment models allow considering very granularly what language features make a document positive or negative, and putting this into a context of the domain area. For these purposes, a sentiment lexicon (or an opinion lexicon) is an essential tool.

Through constant augmenting and using various techniques of suggesting and testing new rules, subject matter experts in computational linguistics can implement a very performance efficient set of rules.

As we analyze language through which sentiment is expressed, we consider that especially Social Media language demonstrates a lot of features that have a significant impact on sentiment analysis. Under the umbrella of linguistic models, BrandBastion develops and improves dedicated sentiment classifiers to cover the following scenarios.

Social media documents and especially comments are very short types of textual documents. In practice, it is possible to observe brands with the median length of the user-generated comments equal to 13 characters. 55% of the comments are less than 100 characters long. These cases are harder to handle with the machine learning models, which frequently struggle to apply a polarity to such comments, while the linguistic models show good performance on such content.

Another factor we take into account at BrandBastion is that although most of the sources related to sentiment analysis use terms “language” and “text” as interchangeable, the social media language is highly multimodal and offers many ways to express opinions in addition to the text. Some of the modalities that have to be taken into account when working with social media language are:

Users tagging other users in combination with the text or independently are also frequently considered to be bearing positive polarity because users will usually tag other users to recommend a brand. We observe that different brands, on average, have users tagging other users for 17%-56% of the comments, and up to 24% of comments contain nothing but user tags.
Comments including user tags and being usertag-only may contain no sentiment, but will still be actually positive for the brand. User Tags are not only used when responding to a person, but usually, they are used to recommend the brand/product to another user. This is usually the case with usertag-only comments. Due to the recommendatory nature of these comments, at BrandBastion, we usually include User Tag comments in reports as a separate sentiment bucket so brands can get a more accurate impression of their general sentiment.
Use of emojis as sentiment indicators or as synonyms for meaningful words or whole sentences. Research done in 2017 on a sample of 86 702 Facebook users found that 90% of users used emojis (Karwowski et al. 2017). In some industries such as Beauty, emojis are used extensively, and user-generated content contains just 2.71 times more words than emojis (3.77 words vs. 1.39 emojis on average per social media comment). A practical complication is that the same emoji looks different across social media platforms and end-user devices, and these visual differences lead audiences to use the same emoji in different ways.
Heart emojis typically express a positive sentiment (with the exception of the broken heart 💔), whereas crying emojis and thumbs down are typically used negatively. The same is usually true for the poop emoji (which usually just stands for the actual word it represents and is used in the same context); the trash emoji and the vomiting emoji also tend to be used negatively.
Such usage of emojis seems rather straightforward, but there are exceptions where emojis are used in unexpected ways: laughing emojis that are used more negatively than positively (especially in sarcasm), or vegetable and fruit emojis that have sexual connotations. Furthermore, there are comments where an emoji merely stands for the object it depicts. This is typically the case with most animal emojis, fruit emojis, lipstick, etc.
Stickers, images, and GIFs are frequent and can easily carry polarity. For some brands, 3% of user-generated content contains attachments of some type. Being another visual representation of people’s emotions or concepts, stickers are used much like emojis. The sticker shown here was, for example, used in a similar sarcastic remark as the laughing emoji in the examples above.
Hashtags, which became an essential part of Social Media communication, may serve in some cases as useful hints for sentiment analysis. There are different scenarios of the hashtag usage that we take into account when training the sentiment analysis models at BrandBastion.
In the first scenario, hashtags are detached from the syntax structure of the text, placed in the beginning, or at the end of the comment. In such instances, hashtags are mostly beneficial in providing additional context or specifying the topic of the comment. These types of hashtags are most frequently not bearing any sentiment on their own:
Update much needed especially now. #lockdown
But sometimes it may happen that these hashtags used basically as the subject of the comment are helping to identify sentiment correctly:
My order just gor cancelled because of a bug in your google maps api. #brandsucks
The other scenario is when hashtags are used in the middle of the sentences to replace words or combinations of words. In such instances, hashtags should be considered an essential part of the text important for sentiment analysis because texts become meaningless without them. These may be contextually free (#sucks, #loveit) and also context-dependent (#pulluporshutup) hashtags that could bear certain polarity. Also, there are “focused” hashtags (#hatebrand). Although most of the hashtags start being used as a direct replacement of the corresponding words and can be in the beginning analyzed as such, some of the hashtags can rapidly evolve and become much more than merely a combination of words comprising the hashtag. This is usually observed for the hashtags getting a lot of social resonance: compare the #metoo hashtag vs. just words "me too."
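Two of the modalities above, user tags and hashtag placement, lend themselves to a short sketch. The code below flags usertag-only comments and separates hashtags detached at the beginning or end of a comment from hashtags used inline as part of the sentence. The regular expressions, the punctuation handling, and the function name are simplifying assumptions for illustration; real platform syntax for tags is richer than this.

```python
import re

_USER_TAG = re.compile(r"@[\w.]+")   # simplified pattern for a user tag
_HASHTAG = re.compile(r"#\w+")       # simplified pattern for a hashtag
_TRAILING_PUNCT = ".,!?;:"

def modality_features(comment):
    """Extract user-tag and hashtag-placement features from one comment."""
    # Split on whitespace and drop trailing punctuation from each token.
    tokens = [token.rstrip(_TRAILING_PUNCT) for token in comment.split()]
    tokens = [token for token in tokens if token]
    is_hashtag = [bool(_HASHTAG.fullmatch(token)) for token in tokens]

    # Count the run of hashtags at the start, then the run at the end.
    lead = 0
    while lead < len(tokens) and is_hashtag[lead]:
        lead += 1
    trail = 0
    while trail < len(tokens) - lead and is_hashtag[len(tokens) - 1 - trail]:
        trail += 1
    inner_end = len(tokens) - trail

    return {
        "usertag_only": bool(tokens) and all(_USER_TAG.fullmatch(t) for t in tokens),
        "detached_hashtags": tokens[:lead] + tokens[inner_end:],
        "inline_hashtags": [
            t for t, flag in zip(tokens[lead:inner_end], is_hashtag[lead:inner_end]) if flag
        ],
    }

print(modality_features("Update much needed especially now. #lockdown"))
print(modality_features("I #loveit so much"))
```

Separating the two hashtag positions matters because, as described above, detached hashtags mostly add topic context, while inline hashtags replace words and cannot be dropped without losing the meaning of the sentence.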

Measuring sentiment in production

Applying Quality Measurements

The quality measurement is an extremely important step because due to BrandBastion’s use of a hybrid approach, we have a significant number of sentiment analysis models working simultaneously. Because of it, quality evaluation has to be applied not only on the level of each model but also on the level of the whole ecosystem.

We use the same approach of utilizing the high-quality processed data for the quality evaluation as for the original training of the models to guarantee that what we are measuring is consistent with what we are trying to achieve.

One critical aspect of quality measurements at BrandBastion is related to the problem of the inter-annotator agreement that is described above. We include the agreement level into the measurement of the quality. This makes KPIs we use as the accuracy measure more conservative and stricter than in most of the sentiment evaluation tasks. We use 90% as the minimum acceptable precision level for both positive and negative polarities for any of our clients. This means we ensure that any person who is familiar with the sentiment definition we use would agree with at least 90% of judgments of our algorithm when it marks some comments as positive or negative.
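The 90% criterion can be expressed as a per-polarity precision check. The sketch below computes plain precision against a set of reference labels; it does not model how the inter-annotator agreement level is folded into the measurement, which the article does not detail. The function names and sample data are hypothetical.

```python
def precision_by_polarity(predicted, reference):
    """Precision of the positive and negative labels against reference labels."""
    result = {}
    for polarity in ("positive", "negative"):
        # Reference labels of the comments the algorithm marked with this polarity.
        judged = [ref for pred, ref in zip(predicted, reference) if pred == polarity]
        if judged:
            result[polarity] = sum(ref == polarity for ref in judged) / len(judged)
        else:
            result[polarity] = None  # nothing was marked with this polarity
    return result

def meets_minimum_precision(precisions, minimum=0.90):
    """True only if every polarity has a measured precision at or above the minimum."""
    return all(value is not None and value >= minimum for value in precisions.values())

# Hypothetical evaluation set: 10 positive, 5 negative, and 5 neutral predictions.
predicted = ["positive"] * 10 + ["negative"] * 5 + ["neutral"] * 5
reference = ["positive"] * 9 + ["neutral"] + ["negative"] * 5 + ["neutral"] * 5

precisions = precision_by_polarity(predicted, reference)
print(precisions, meets_minimum_precision(precisions))
```

Note that neutral predictions do not enter the check at all: the criterion only constrains how often a positive or negative judgment is right, not how many comments receive one.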

The last essential element of evaluating the quality of sentiment analysis is to perform it continuously, irrespective of the work on the models themselves. This is directly related to another feature of the social media language: its dynamic nature and fast evolution.

The language used on social media is much more similar to spoken communication rather than the written one (Barton & Lee, 2013: 5; Crystal, 2006: 19-25; Baron, 2003) - it is "condensed" - certain keywords are used (this is also why the trend of hashtags became so popular), often the sentences are not fully elaborated, the context is often outside the text itself - the grammar, syntax as well as vocabulary are simplified. In addition to this, social media language evolves very fast, rapidly inventing and adopting new words or new word meanings, which all affect ways of expressing sentiment by authors. This can easily be seen at unfortunate times of crisis when negative topics start trending.

The quality evaluation of sentiment analysis at BrandBastion is built to detect changes in social media language early and to adapt to them fast. When emerging changes are detected, BrandBastion adapts quickly, updating the linguistic models in particular and adding clarifications to the instructions used by human content processing specialists. In the most critical scenarios, we may also contact our clients to suggest taking action on their Social Media properties to handle the situation more efficiently.

Using Sentiment Analysis in Production

Once sentiment has been defined, training data has been prepared, the models have been trained, and quality measurements have been applied, it’s time for sentiment to be used in production.

At BrandBastion, our clients use our sentiment classifications in many ways such as:

To react to situations that require their attention, such as a new ad receiving a high spike in negative sentiment, which may indicate that either the audience targeting is wrong, the creative doesn’t resonate, or perhaps something has recently happened that has led to increased negativity towards the brand.
To track how a specific audience perceives the brand or specific campaigns over time. Especially when a brand expands to a new target market or launches a new product, sentiment insights can provide valuable feedback on how the launch is going.
To use as part of their KPIs in terms of general sentiment expressed towards their brand and to track how certain branding campaigns impact the general sentiment.

Conclusion

The field of sentiment analysis is extremely popular across multiple research disciplines and has plenty of practical applications. Its popularity grew very fast because of the high availability of user-generated documents provided by the web in general and especially social media. It was easily adopted in multiple areas of commercial use because sentiment analysis brings a context of the audience’s opinions into the existing KPIs and due to sentiment being a very intuitive measure for any decision-makers. At the same time, the fast growth in popularity led to sentiment analysis as a practice that is disconnected from the sentiment analysis as a study. Questions like “What is sentiment?”, “How is the sentiment analysis performed?”, or “What is sentiment analysis accuracy?” are often not answered similarly across vendors and frequently are not answered at all.

In this article, we presented BrandBastion's own experience of how the practical application of sentiment analysis can be brought back to the basics of definitions used in the state of the art of academic research. We showed how each part of the definition matters for performing sentiment analysis and evaluating its quality. We further spoke about the challenges of performing sentiment analysis by humans that are counter-intuitively not limited to scalability, but also affect accuracy. Finally, we covered the specificity of social media user-generated language and what challenges it brings into the definition of sentiment and practicalities of automated sentiment analysis.

References

Puschmann, C., & Powell A. (2018). Turning Words Into Consumer Preferences: How Sentiment Analysis Is Framed in Research and the News Media. https://journals.sagepub.com/doi/full/10.1177/2056305118797724
Liu, B. (2010). Sentiment Analysis and Subjectivity. In Indurkhya, N., & Damerau, F.J. (Eds.), Handbook of Natural Language Processing (pp. 627-666).
Taboada, M. (2016). Sentiment Analysis: An Overview from Linguistics. Annual Review of Linguistics, 2. 10.1146/annurev-linguistics-011415-040518.
Bermingham, A., & Smeaton, A.F. (2009). A Study of Inter-Annotator Agreement for Opinion Retrieval. In: SIGIR 2009 - The 32nd Annual ACM SIGIR Conference, 20-22 July 2009, Boston, USA. ISBN 978-1-60558-483-6. 10.1145/1571941.1572127.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, Vol. 2, No 1-2, 1–135.
Karwowski, M., Pisanski, K., Sorokowski, P., Sobrado, B., & Sorokowska, A. (2017). Who uses emoticons? Data from 86 702 Facebook users. Personality and Individual Differences, 119. 10.1016/j.paid.2017.07.034.
Barton, D., & Lee, C. (2013). Language online: Investigating digital texts and practices. Routledge.
Crystal, D. (2006). Language and the Internet (Second Edition). Cambridge University Press.
Baron, N. S. (2003). Why email looks like speech (pp. 83-89). Routledge.
Zaepernicková, E. (2019). Emojis and Your Brand: How to Interpret them and the Benefits of Using Them. Retrieved from: https://blog.brandbastion.com/emojis-and-your-brand-how-to-interpret-them
Paulus, T., Warren, A., & Lester, J. N. (2016). Applying conversation analysis methods to online talk: A literature review. Discourse, context & media, 12, 1-10.
Liu, B. (2015). Sentiment Analysis: Mining Opinions, Sentiment, and Emotion in Text (pp. xii). Cambridge University Press.