Authored by:
Elias Vamakas, Chairman
BrandProtect
With the explosion of social media and the formation of new virtual communities that are becoming bigger than countries, corporations are recognizing that consumer insights and influence will be the future determinants of success. As a result, there is a frenzy to understand online discussion. For those seeking customer insights, “discussion-monitoring” is the rage, and “sentiment-analysis” has become the “Holy Grail”.
The magnitude of this opportunity has created more than 50 technology providers of discussion monitoring and analysis software, and probably twice that many service providers that claim to be able to provide insight from the data. Even with all this activity and focus, customers continue to be completely dissatisfied with the results. And while the data is plentiful and wonderfully presented in multicolour 3D graphs, the output is generally perceived as overwhelming and useless, and the insights are few and far between.
We at BrandProtect, as one of the earliest pioneers in this space, have been striving to understand the issues around discussion-monitoring for more than 5 years. We clearly understand that customers want insights and intelligence, but without volumes of data.
Unlike traditional mass marketing approaches where the key is to understand the “average” (What does the average person want to buy? On average do more people want blue or green?), the internet is so vast that there is no value in trying to understand the “average”. The key to internet marketing is to understand the “un-average”. To get insights from online discussions you need to filter out all of the “average stuff” to get to the unique, powerful emotions (un-average) which capture all of the insights.
Let me give you a couple of examples of what I mean. When “Drug Company” introduces “a new drug”, there are millions of posts that talk about the launch, where you can buy it, why the company’s stock is going to do well, why the CEO is overpaid, what other drugs the FDA is looking to approve, and how Obama is changing health care. But there may be only 3 people who mention that they were very upset because the new drug made them feel faint or dizzy. In a month, it may be 30 people, and in another month, it may be 300 people. The fact is that averaging how people like “the new drug” is the path to disaster. The key is finding “the needle in a haystack”.
I mentioned earlier that sentiment analysis is the Holy Grail. While customers intrinsically know this to be the case, they really don’t know why. The frustration with the output of today’s search tools, and with the lack of insights, can all be traced back to this: sentiment.
Sentiment is the key to uncovering the needle in a haystack. Sentiment is like a beacon of light flashing in a vast ocean of data. Insights come from people who feel strongly about something. It’s more than just uncovering that someone feels strongly enough about something to write about it, and that there are thousands more who feel the same way but couldn’t be bothered to write (or don’t know how).
While that may be the case, what’s more significant is that these people are changing the way society communicates. The computer era has reduced the requirement for individuals to communicate face-to-face, and has created both the fortitude and the ability to say and to do things that would not have been socially acceptable in the past. This fortitude comes from the anonymity of not having to communicate in person, plus the tools that new technology and the internet have created, which give individuals access to unprecedented resources for expressing themselves. A $50 camera can now create a video that can be viewed by millions.
Back to sentiment: I would like to give you our theory as to why sentiment is so important. We believe that the only sentiment that really matters is sentiment around a specific topic of interest. When we look at blogs, for example, we can assess the sentiment around a blog, or a post, or a thread, or even the person to whom the discussion is attributed, but what really matters is sentiment specifically connected to the issue that you are focusing your search on.
I am sure you can appreciate how difficult a task it is to properly extract and identify emotional attributes from written text, seeing past colloquialisms (this singer is wicked!) and sarcasm (I love this company like I love a hole in the head) to assign the appropriate sentiment. Additionally, we are challenged to identify high and/or extreme levels of sentiment because we believe that this is where key insights reside.
If all that isn’t difficult enough, true success requires that the identified sentiment is attributed to a set of specific search criteria, and not a discussion in its entirety.
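The difference between scoring a discussion in its entirety and scoring only the text around the search criteria can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy word list, the fixed token window, and the function names are hypothetical and are not a description of any production system.

```python
import re

# Hypothetical toy lexicon; a real system would need a far richer one.
LEXICON = {"great": 1, "love": 1, "happy": 1,
           "faint": -1, "dizzy": -1, "upset": -1}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def document_sentiment(text):
    """Sum lexicon scores over the whole text (the 'average' view)."""
    return sum(LEXICON.get(tok, 0) for tok in tokenize(text))

def topic_sentiment(text, topic, window=6):
    """Sum lexicon scores only for tokens within `window` tokens of `topic`."""
    tokens = tokenize(text)
    counted = set()  # avoid double counting when topic mentions overlap
    score = 0
    for i, tok in enumerate(tokens):
        if tok != topic:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j not in counted:
                counted.add(j)
                score += LEXICON.get(tokens[j], 0)
    return score

post = ("Great news for the stock and I love the launch party. "
        "But the new drug made me feel faint and dizzy.")

print(document_sentiment(post))       # whole-post score
print(topic_sentiment(post, "drug"))  # score attributed to the topic
```

In this example the positive and negative words cancel out across the whole post, while the score attributed to "drug" is clearly negative. A fixed window is a crude stand-in for real attribution, which would also have to deal with sarcasm and colloquialisms.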
Given these requirements, it’s not difficult to see why you are so dissatisfied with what technology is providing.
Now that we understand the magnitude of the problem, let me tell you how we have dealt with it. Actually, let me tell you how we have transitioned over time to deal with what we have learned and where we think it’s going.
Having understood early on that clients are looking for quality not quantity, we designed a process years ago that tried to be all-encompassing when receiving data, but only presented the nuggets of insight that we identified for our clients. In this process, we leveraged technologies and used human tagging and analysis to confirm sentiment and identify issues that we believed our clients cared about.
This is a real example of the process and the magnitude of the effort required to put together target reports.

While our technology did a significant amount of “distilling”, the data continued to be overwhelming and the insights were only discovered through our human taggers and analysts. This process, while achieving good results and satisfied clients, had 2 limitations. The first was cost: human tagging is obviously expensive and fairly slow. The second, and more important, limitation was that the human analysis was limited to the knowledge and experience of the individual doing the tagging. The process was missing the industry and company knowledge that is embedded within the client.
Therefore, in recent research and development of our new process, we determined the following to be key elements:
Let me spend a couple of minutes on the importance of word taxonomy, so that you can understand this concept a little better.
If you are trying to analyze a stock whose performance is being criticized as “bad” or whose CEO is described as “wicked”, you would want to attribute a different sentiment class than if you were analyzing the performance of the latest rock star and their music among a teenage audience. That’s an easy example, but it gets more complicated. If you are in the financial services sector and assessing sentiment about an insurance company, you would need to know that “low” is negative if it refers to earnings, but positive if it refers to premium rates.
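One way to picture a word taxonomy is as a lookup in which the polarity of a word depends on the industry and on the term it describes. The sketch below is a minimal Python illustration under that assumption; the dictionaries, their entries, and the `polarity` function are hypothetical and only encode the examples given above.

```python
# Hypothetical taxonomies keyed by (modifier, target term).
# A target of None means "applies to any target in this industry".
FINANCE = {
    ("low", "earnings"): -1,   # low earnings are bad news
    ("low", "premium"): +1,    # low premium rates are good news
    ("bad", None): -1,
    ("wicked", None): -1,
}

TEEN_MUSIC = {
    ("wicked", None): +1,      # "this singer is wicked!" is praise
}

def polarity(modifier, target, taxonomy):
    """Return +1, -1, or 0 for `modifier` applied to `target`."""
    modifier = modifier.lower()
    target = target.lower() if target else None
    # Prefer the most specific rule, then fall back to the generic one.
    if (modifier, target) in taxonomy:
        return taxonomy[(modifier, target)]
    return taxonomy.get((modifier, None), 0)

print(polarity("low", "earnings", FINANCE))
print(polarity("low", "premium", FINANCE))
print(polarity("wicked", "CEO", FINANCE))
print(polarity("wicked", "singer", TEEN_MUSIC))
```

The same word resolves to different sentiment classes depending on which taxonomy is supplied, which is why industry and company knowledge has to be built into the scoring.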
To accomplish the above 3 elements we have now designed our system to go through 5 distinct phases of Analysis before the Reporting stage:

We were very excited about the process, the technology as a base platform, and the results. The platform worked incredibly well with targeted studies. However, the results were not as good as we had hoped for undirected studies, such as “show me everything that people say about us that I should be concerned with”.
I think you will find the “why?” fascinating. I will also go through some real life data and our research to demonstrate our findings, but first let me describe what we have discovered.
Financial success on the internet is all about traffic. In the old days, meta tags were used to “prop up” the importance of web sites. I am sure you remember that in the early days of the web, key words were repeated over and over again to fool search engines, all for the sake of SEO (search engine optimization). Well, the Googles of the world got smarter and reduced the influence of those old SEO techniques, but the opportunity was too big to dismiss. Smart humans always find a way! Newer techniques, such as embedding key words, phrases and even whole paragraphs invisibly amongst the text, have successfully tricked most search engines. This text can’t be seen when you read a web page, but it is picked up by the search engine robots and thus makes its way into search results. In most cases it has nothing to do with the context of the article, but it is highly charged with emotion in order to fool a search engine, and it can fool us as well.
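To make the idea of invisible text concrete, here is a minimal Python sketch that separates text a reader would see from text hidden with an inline style. It is an illustration under narrow assumptions, not a description of how any search engine or monitoring product works: it only recognises inline `display:none` and `visibility:hidden` styles, it assumes reasonably well-formed markup, and the class and function names are hypothetical. Real hidden text can also rely on stylesheets, matching text and background colours, or off-screen positioning, none of which is handled here.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
HIDDEN_MARKERS = ("display:none", "visibility:hidden")

class VisibleTextSplitter(HTMLParser):
    """Collect text into `visible` and `hidden` lists."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # > 0 while inside a hidden element
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if self.hidden_depth or any(m in style for m in HIDDEN_MARKERS):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            (self.hidden if self.hidden_depth else self.visible).append(text)

def split_text(html):
    """Return (visible_chunks, hidden_chunks) for an HTML fragment."""
    parser = VisibleTextSplitter()
    parser.feed(html)
    parser.close()
    return parser.visible, parser.hidden

page = ('<p>Rates were unchanged this quarter.</p>'
        '<div style="display: none">scam fraud <b>outrage</b> lawsuit</div>'
        '<p>More details soon.</p>')
visible, hidden = split_text(page)
print(visible)
print(hidden)
```

In this sample the emotionally charged words appear only in the hidden list, so a sentiment score computed on the visible text alone would not be inflated by them.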
To put this all in context, let me give you some real data. We decided to use Manulife Financial as a test case to determine the accuracy of the data and our analysis. We compared our machine-scored data against human-validated scoring (yes, we had our people read EVERY post) and here are the results.
I won’t go into specifics of this search other than reporting the numbers so you can get a feel for the outcome and the magnitude of the issues.

So...
10% of the mentions identified had strong sentiment and were completely relevant to the study. They are “the needles” that we are trying to identify. Human and machine were in total agreement on the sentiment polarities.
74% were correctly identified as sentimentally neutral and therefore not significant (this is the hay). Human and machine were in total agreement on the lack of sentiment in the content.
1% were incorrectly scored. The machine assessed the sentiment as positive while the human said it was negative, or vice versa; or the machine said there was no sentiment while the human identified sentiment, or vice versa. That means a polarity error of 1%.
15% were scored with the correct sentiment, but human analysts considered them not relevant to the study. Something had fooled the system! On further analysis, we discovered that these posts had embedded text or were primarily designed for SEO purposes.
As far as the industry is concerned, the above statistics would be incredible. Our automated systems scored the data for sentiment with unprecedented accuracy. If you assume all of the irrelevant posts were mistakes made by the system, our accuracy rate would be an industry-leading 84% (the 10% and 74% on which human and machine agreed). If you assume that all of the irrelevant posts were scored accurately, the accuracy rate would move up to an incredible 99% (everything except the 1% polarity error).
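The two accuracy figures follow directly from the four percentages reported above. The short Python sketch below reproduces the arithmetic; the variable names are chosen here for illustration.

```python
# Shares of all mentions, in percent, as reported for the test case.
relevant_with_sentiment = 10  # the needles: human and machine agree
neutral = 74                  # the hay: human and machine agree
incorrectly_scored = 1        # polarity errors
irrelevant = 15               # sentiment correct, but not relevant

# The four groups should cover every mention.
assert relevant_with_sentiment + neutral + incorrectly_scored + irrelevant == 100

# Pessimistic view: every irrelevant post counts as a system mistake.
accuracy_low = relevant_with_sentiment + neutral

# Optimistic view: every irrelevant post counts as scored accurately.
accuracy_high = accuracy_low + irrelevant

print(f"{accuracy_low}% to {accuracy_high}%")
```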
So why aren’t we satisfied?
The data that we would prioritize for our clients would be the posts that had sentiment; that is, everything except the posts scored as neutral.
Out of those, the gems would be 585 out of 1,463 (585 + 817 + 61), or 40%.
So we now have the real reason that customers are dissatisfied!
An industry-leading 84%-99% accuracy rate means that more than every other post that a customer reviews is irrelevant. Of course, roughly 1 in 2 is better than 1 in over a thousand, which would be the case for unaided search.
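The gap between a high accuracy rate and a low share of useful posts can be checked with the counts quoted above. In the Python sketch below, reading 817 as the irrelevant posts and 61 as the incorrectly scored posts is an assumption inferred from the reported percentages; only the three counts themselves come from the study.

```python
gems = 585            # posts with sentiment that were relevant to the study
irrelevant = 817      # assumed: correct sentiment, but not relevant
misscored = 61        # assumed: incorrectly scored posts

# Everything with sentiment is what the customer is asked to review.
flagged = gems + irrelevant + misscored

share_useful = gems / flagged
share_wasted = 1 - share_useful

print(flagged)
print(f"useful: {share_useful:.0%}, not useful: {share_wasted:.0%}")
```

Because the neutral posts dominate the data set, they lift the overall accuracy rate without improving the mix of posts that actually reach the customer.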
We and the industry clearly have some work to do. All I can tell you at this point is “We’re on it!” As we are currently testing new technological advancements in this area I will keep you posted on our new test results.