Why automatic sentiment is not working, and why it’s a good thing.

Let’s start with an example: Imagine you work for Dell and you have to rate the following conversations:

  • HP is good
  • Dell is good, but I still prefer HP
  • The new Dell PC is good but the previous one worked better.
  • The new Dell PC is good but I miss the look of the previous one.
  • HP works great but I hate them
  • Dell works great but I hate them
  • Dell is as good as HP
  • HP, Dell  today’s PC’s are the same crap, maybe Dell a little bit less
  • HP and Dell are nice entry level products
  • Dell is good if you can afford it.
  • Dell is exclusive.
  • I would only recommend Dell’s PC to small businesses.
  • HP is the Dom Perignon of Netbooks.
  • Apple is to Dell what Saint Amour is to Beaujolais Nouveau
  • I worked too much on my Dell last night and I got sick looking at the screen.
  • Dell is only good for gaming
  • HP is like it was in the Packard’s time
  • HP is like what it was in Carly’s time
  • No wonder why the Dell stock is going South
  • I love HP (from HP’s PR agency or Director)
  • I hate HP (from an employee recently fired)
  • A tweet re-purposing the one above without any additional comment
  • eCairn is delivering automatic sentiment in 50 languages with 105% accuracy.

Hard to rate, isn’t it? Maybe you should do it again?

These are fairly standard sentences, not corner cases. There is no irony and no borderline use of language (except the last one), but it just shows that:

  • Sentiment is subjective
  • Sentiment is role based: depending on whether you’re the brand manager for Dell overall, or in charge of the new Dell notebook, or the Small Biz vertical, QA manager, Investor relations, or looking into competitive analysis, you may have different views on what positive means.
  • Language to express sentiment is context dependent and cultural (wine analogy)
  • Language to interpret sentiment is context dependent and cultural (see the HP history analogy)
  • Sentiment analysis has to factor in the “who”, and account for real and pseudo duplicates: the same conversation from different people, or similar conversations from the same person. Everybody is repurposing nowadays: retweeting, using ping.fm or the like, not to mention the huge number of “robot” sites or blogs that just sit there to attract traffic and ads.
  • “Social media” is not a valid sample of any user base, even when the user base is defined as internet users. As an example: the open-source/Linux community is way more vocal than the Windows one. Just taking data from the river of news from a community is statistical heresy, unless you want to study Linux fans. Taking into account the repurposing point, Twitter is not even a representative sample of the “Twitter” population…
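To make the duplicates point concrete, here is a minimal Python sketch of how retweets and copy-pasted posts could be collapsed before anything gets rated. The `normalize` and `group_duplicates` helpers and the sample posts are made up for illustration, and a real pipeline would need something more robust than exact matching on a cleaned-up string.

```python
import re
from collections import defaultdict

def normalize(text):
    """Reduce a post to a crude fingerprint so retweets and copies collide."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop links
    text = re.sub(r"@\w+", " ", text)           # drop @mentions
    text = re.sub(r"\b(rt|via)\b", " ", text)   # drop repurposing markers
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # drop punctuation
    return " ".join(text.split())               # collapse whitespace

def group_duplicates(posts):
    """posts: iterable of (author, text). Returns {fingerprint: [authors]}."""
    groups = defaultdict(list)
    for author, text in posts:
        groups[normalize(text)].append(author)
    return dict(groups)

# Hypothetical posts, loosely based on the "I hate HP" example above.
posts = [
    ("fired_employee", "I hate HP"),
    ("follower_1", "RT @fired_employee: I hate HP"),
    ("follower_2", "I hate HP! (via @fired_employee)"),
    ("someone_else", "Dell is as good as HP"),
]

groups = group_duplicates(posts)
for fingerprint, authors in groups.items():
    print(len(authors), repr(fingerprint), authors)
```

Grouping by fingerprint keeps the list of authors, so you can still decide whether three copies of one opinion count as one voice or three. That decision is exactly the “who” question, and no algorithm makes it for you.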

That’s why solutions with so-called automatic sentiment end up with 60%+ neutral, 70% precision and manual options to override the machine-generated sentiment.

It just does not make sense. Putting everything neutral may well be a better bet from a recall and precision standpoint.
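A quick way to check the “everything neutral” bet is to score a rater that never says anything else. The sketch below does that in Python; the 65/20/15 split between neutral, positive and negative is an invented assumption, not measured data, so plug in your own hand-rated sample.

```python
def precision_recall(true_labels, predicted_labels, label):
    """Precision and recall for one label; 0.0 when undefined."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(t == label and p == label for t, p in pairs)
    predicted = sum(p == label for _, p in pairs)
    actual = sum(t == label for t, _ in pairs)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical corpus of 1000 hand-rated conversations (assumed split).
true_labels = ["neutral"] * 650 + ["positive"] * 200 + ["negative"] * 150

# The "do nothing" rater: every conversation is labelled neutral.
always_neutral = ["neutral"] * len(true_labels)

matches = sum(t == p for t, p in zip(true_labels, always_neutral))
accuracy = matches / len(true_labels)
print(f"accuracy: {accuracy:.2f}")

for label in ("neutral", "positive", "negative"):
    p, r = precision_recall(true_labels, always_neutral, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

With these made-up numbers the do-nothing rater is right 65% of the time while finding none of the positive or negative conversations. Any tool quoting a single accuracy figure should be compared against that baseline on your own data before the number means anything.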

I’m not saying it can’t be done. It just can’t be done “generically”. Now if you build a solution for sentiment analysis specializing in the stock exchange/investor community, that is another story, and I can give you solid pointers for that: just email and have your checkbook ready.

You would have to build up dictionaries, invest in a learning algorithm, train it… and yes, that sounds doable, but it would be very expensive to set up, and its applicability would be limited to “stock exchange”, maybe even to the stock exchange in 2010.

So forget this Holy Grail, stop wasting $$$ on producing low quality results and come back to the initial objectives of the brand:

Get sentiment on its brand from a specific audience with a reasonable investment.

There are alternatives to reach this objective and the good news is that maths comes to the rescue.

1. Why not sample? Marketers have always used focus groups and samples, so why not extend that to the social web?

2. Why not rate manually? When you zoom in on a specific community, manual starts to make sense. We looked at the top Mommy bloggers that we’ve mapped (top 3500) and narrowed down to 2000 discussions about Pampers in the last 6 months. One can do a good job rating 3 conversations per minute, so it’s roughly a 12h job; at $20/hour, that’s $240 over 6 months.

3. Moreover, focusing on a specific community brings more consistency between the conversations that you rate, and the quality of the rating is higher (Apple means Apple and Orange means Orange :-), in other words, fruits in the Food community and brands in the Wireless/Telco community). As an example, it’s easier to establish a standard for rating blog conversations from Mommies on Pampers than one for any conversation on Pampers (which would range from analyst reports on sustainability to conversations between experts to comments on their new ad campaign)! Conversations are more consistent, more alike and easier to rate with a focus. You also get specific results for your key target communities, and far more actionable ones.
4. While you rate, you also spot insights, key conversations to share, ideas for content marketing.
5. Also, doing it this way, you will actually find your promoters and detractors. Connecting conversations to people, you will see who’s moving into positive territory and whether clusters of influencers are moving in the right direction. This is key for targeted outreach campaigns.
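The arithmetic behind points 1 and 2 fits in a few lines of Python. This is a rough sketch: `manual_rating_cost` simply replays the Pampers numbers, and `sample_size` uses the textbook formula for estimating a proportion with a finite population correction, which assumes a simple random sample (my assumption, since no sampling scheme is specified here). Both function names are made up for illustration.

```python
import math

def manual_rating_cost(conversations, per_minute, hourly_rate):
    """Hours and dollars needed to rate every conversation by hand."""
    hours = conversations / per_minute / 60
    return hours, hours * hourly_rate

def sample_size(margin, population=None, z=1.96, p=0.5):
    """Conversations to rate for a given margin of error.

    z=1.96 corresponds to 95% confidence; p=0.5 is the worst case.
    Assumes a simple random sample of the conversations.
    """
    n = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction: small communities need fewer ratings.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

hours, cost = manual_rating_cost(conversations=2000, per_minute=3, hourly_rate=20)
print(f"rate everything: {hours:.1f} hours, ${cost:.0f}")

needed = sample_size(margin=0.05, population=2000)
print(f"rate a sample: {needed} conversations for +/- 5 points")
```

Rating all 2000 conversations comes out at a bit over 11 hours, which the estimate above rounds up to 12. If a margin of five points either way is good enough, a random sample of a few hundred conversations does the job in under two hours.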

Last but not least, I’m still wondering what kind of action plan a brand can put in place when its “sentiment” drops by 3% with an accuracy claimed at 70%… The Motrin case went wild over the weekend, Domino’s Pizza within hours. So for crisis management, investing in proactive ORM and building up a solid base of fans within the target community is a much better option (Ford’s approach).

The moral: when 93% of consumers say they want brands to engage in social media, I doubt that they mean engaging with algorithms, and I bet they are expecting real and empowered people. But that’s another story.

7 thoughts on “Why automatic sentiment is not working, and why it’s a good thing.”

  1. Interesting analysis, but, I’ll have to disagree by about 180-degrees.

    There are emerging automated systems that fulfill “emotion detection” and can contextually understand nuances of User input. If you want to get 3,000 people in a stadium to provide brand/market opinion (while thinking they are getting some perk/incentive) via mobile (not IVR polling, but, Natural Language), it’s do-able. Your examples above are readily handled by systems I’ve recently seen.

    You can’t do it with the “human touch” due to cost. We’ve all become accustomed to checking ourselves into airports and out of groceries. “Social Media” as a term is a complete fallacy, and of course, consumers would like better customer service ANYWHERE (preferably on the telephone; a device that used to be used for speaking to someone on the other end). To think that “social media” is going to bring back customer service is really a joke…it’s a marketing ploy at best. But, saying 93% of consumers want social media to rescue them is misleading (they want any form of interaction to save them and deeply hate IVR systems).

    So, creating and implementing smarter automated systems can bring value to consumers/users — and, they can use the networks of so-called “social networks” to do so. That is the future, not “social media” humans.

    Creating expectations for “real and empowered people” to come to the rescue is a sure-fire way to be disappointed.

  2. While automated sentiment technology isn’t perfect, it does play a valuable role and, at the same time, it continues to improve. As well, automated sentiment does let massive amounts of data be categorized as positive, negative or neutral.

    Meanwhile, there is a definite role for people to play in tweaking, editing and adjusting sentiment to take into account inaccuracies and things such as sarcasm and slang.

    In a sense, social media sentiment works well when there’s a marriage between technology and humans.

    cheers, Mark

    Mark Evans
    Director of Communications
    Sysomos Inc.

    1. @Sam, thanks for the comment. I think we don’t know yet how far the “real and empowered people” can go.

      Let’s take a company like Walmart: 1.6 million employees, maybe more. The ratio of customers to employees would, theoretically and using the Dunbar number, enable them to develop personal relations with everyone in the US.

      Pure theory. But compare that to sending millions of calls to India or to answering machines and hoping for customer empowerment. Humans like the human touch and feel, and see the danger of industrialized “stuff” (be it food, health or education). If the world emerged out of the “industrial age”, who knows how customer relationships would end up.

      @ Mark, Thanks for commenting on our blog. You and Sysomos are doing a great job.

      It’s not a technical problem. I frankly think it doesn’t make sense from a marketing standpoint.

      My point is that “sentiment” is not something attached to a “piece of text” but to the relation between a piece of text and a reader with their own subjectivity, values, objectives and so on. No one sells to machines or to robots.

      In most real use cases, real marketers are (or should be) interested in the effect of their message on specific people, not in its effect on an algorithm.

      This is particularly true in a “community” context where the goal is not to be “mildly positive” for “most of the people” but to get noticed and loved by a few.

      Back to the technical details. You may tackle this with machine learning etc., but unless you train the system to rate with the proper objectives in mind (i.e. one rating algorithm for each situation), what is it that’s produced? A default rating from the most common perspective an English-speaking human would have?


  3. Yes, I can see where you’re going; however, as much as measuring sentiment is still in its infancy, by default so is any argument against it.

    Sentiment is not the only thing that SM measuring brings to the table – plus you can ask the people who are writing what they think, after all that is what it’s there for.

    SM measuring points to a pool of conversation and unless the company gets involved in the conversation then they are involved in folly.

    Without SM measuring companies know nothing, with it they know something and have method in getting involved – I would argue that’s better than nothing.

    Point of fact, SM measuring is not the all of everything, it’s just part of a bigger picture that companies have to get involved in at some point. Assuming that SM measuring is the answer is as silly as buying a Bic Biro and expecting to be Oscar Wilde!

    There’s more to it, methinks.


    1. “Without SM measuring companies know nothing, with it they know something and have method in getting involved – I would argue that’s better than nothing.”
      But that something that they might know might be the exact opposite of the actual sentiment!
