Sentiment Analysis Is Best Done by Humans

I’m working on a reputation analysis for an international training organization that has expressed concerns about its online reputation. After pulling the data from Sysomos MAP and comparing the automated sentiment scores against human scoring, I’ve decided that if you care about sentiment accuracy, it’s best to have humans evaluate sentiment.
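
To make the comparison concrete, here is a minimal sketch of the kind of spot check I mean: take the same batch of documents, keep the tool’s label and a human coder’s label for each, and measure how often they agree. The labels below are made-up placeholders rather than my client’s data; raw percent agreement and Cohen’s kappa (which discounts agreement you’d expect by chance) are standard ways to quantify the gap.

```python
from collections import Counter

# Hypothetical labels for the same 12 documents: one set from the
# automated tool, one from a human coder (placeholder data only).
machine = ["pos", "neg", "neu", "neg", "pos", "neu",
           "pos", "neg", "neu", "pos", "neg", "neu"]
human   = ["pos", "neg", "neu", "pos", "neg", "neu",
           "pos", "neg", "pos", "pos", "neg", "neu"]

n = len(machine)

# Raw percent agreement: how often the tool matches the human coder.
observed = sum(m == h for m, h in zip(machine, human)) / n

# Cohen's kappa corrects for the agreement expected by chance alone.
m_counts, h_counts = Counter(machine), Counter(human)
expected = sum(m_counts[c] * h_counts[c]
               for c in set(machine) | set(human)) / n**2
kappa = (observed - expected) / (1 - expected)

# Share of negative sentiment according to the human coder --
# the kind of figure reported to a client (e.g. "22% negative").
neg_share = human.count("neg") / n

print(f"agreement: {observed:.0%}, kappa: {kappa:.2f}, negative: {neg_share:.0%}")
```

On real data, running a check like this over a few hundred hand-scored documents is a cheap way to decide whether the automated scores can stand on their own; a common rule of thumb treats a kappa much below roughly 0.6 as agreement too weak to trust.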

Also worth noting: human scoring of documents has an additional advantage that never seems to be talked about. The more “we” get involved in the output of our monitoring (i.e., by devising scoring metrics and applying them to the data at hand), the more engaged and satisfied, I feel, we will be with the monitoring programs we devise.

I’ve been seeing this more and more, and I take it to mean that the more community managers and analysts are “invested” in the data they collect (they touch it, in other words), the more satisfied they are with what they are doing.

And, as a result of doing all this sentiment “scoring” ourselves, we have more confidence in the results than if we let a machine program do it.

Which reminds me: not only did I present at the Sentiment Analysis Symposium last week, but I also presented on sentiment analysis in London last month, and put together a deck for it:

How To Monitor Sentiment And Benefit From The Insight This Offers

In the case of my client, I found that about 22% of the sentiment around them was negative, and that was after I looked at everything by hand. That’s actually a fairly large amount of negative sentiment.

By the way, I’m a fan of automated sentiment analysis in principle. The problem is that there are no standards around it, and the current implementations and technologies are still too immature to handle many of the tasks sentiment analysis is used for.
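
To illustrate the kind of immaturity I mean, here is a deliberately naive lexicon-based scorer — a toy sketch of the baseline word-counting approach, not how Sysomos MAP or any particular vendor actually works. It gets straightforward praise right, then stumbles on exactly the cases a human coder catches instantly: negation and sarcasm.

```python
# A toy word-counting sentiment scorer (illustration only; real
# tools are more sophisticated, but share the same blind spots).
POSITIVE = {"great", "excellent", "love", "helpful"}
NEGATIVE = {"terrible", "awful", "hate", "useless"}

def naive_score(text: str) -> str:
    words = text.lower().replace(".", "").replace("!", "").split()
    score = (sum(w in POSITIVE for w in words)
             - sum(w in NEGATIVE for w in words))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Straightforward praise: scored correctly.
print(naive_score("The training was excellent and the staff were helpful"))
# -> positive

# Negation flips the meaning, but not the word counts.
print(naive_score("The course was not helpful at all"))
# -> positive (wrong: 'helpful' outweighs the ignored 'not')

# Sarcasm: every individual word reads as praise.
print(naive_score("Oh great, another excellent week wasted on this"))
# -> positive (wrong: a human reads this as negative)
```

Commercial tools go well beyond raw word counting, but negation, irony, and sarcasm are still where automated scores diverge most from human judgment — which is why a human pass over the results pays off.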


14 replies
  1. Martin Bendig says:

    Assuming that digital media is just another tool for human communication, I would agree that these automated analyses alone cannot really do a good job of interpreting human feelings. They work by analysing the vocabulary in a text, and a text is a human expression that stands at the end of a more or less long chain of reactions (in our nervous system) between reception and the final resulting behaviour. We know that misunderstandings in communication are common because of the many different ways people use language. Too many factors shape the way we use it and the way we understand it: cultural, religious, and social background, education, criteria and beliefs, knowledge and abilities, decisions, memories, and so on — perhaps all of this can be summarized as our personality. There are even some people who have difficulty understanding their own feelings, or distinguishing between their own words (their verbalized feelings or sentiment) and their experience (inner or outer).

    Wouldn’t it be similar to trying to build a theory that infers the sentiment or intention of an artist creating an oil painting from the colour combinations and colour density used, or from a geometrical analysis of the canvas? In some cases the artist may even have changed his intention during the creation process. How does that influence the resulting expression?

    No. Currently I do not see how software could fill in all the gaps created by the deletions, generalizations, and distortions that come into play when we begin to use language. ;-)

    Without a doubt, these tools can do a great job of analyzing according to predefined criteria, and they help you manage this big amount of information in much less time than you could without them. But only in combination with your (human) synthesis can this kind of software-supported analysis produce more useful results that lead to better decisions.

    Best regards.

  2. Martin Bendig says:

    PS:

    In NLP (Neuro-Linguistic Programming) we call it “calibration” when we try to find out and learn as much as possible about a person’s model of the world (cultural, religious, and social background, education, criteria and beliefs, …) and also about their non-verbal and verbal signals as reactions to certain circumstances in different situations. This calibration certainly helps to reduce misunderstandings. You know it, for example, from your friends or family: if one person lifts an eyebrow it could mean one thing, and you can read a totally different “something” into the same reaction when another person lifts her eyebrow. A similar expression does not only have different interpretations; it can also have different meanings from person to person, and vary across contexts. So we calibrate on each other continuously, consciously or unconsciously. If you know each other well, you are already well calibrated. A calibration step should be part of all online sentiment analysis, too. It would be interesting to think about a list of observable patterns and useful assumptions that would make this easier.

  3. Mike Layton says:

    I couldn’t agree more, and sentiment is but one piece of a meaningful measurement program. Sentiment’s greatest value comes from viewing it in relation to other qualitative metrics, such as topics or brand attributes, coded by humans rather than inferred from the mere presence of keywords. A dependence on keywords can cloud an analysis because it sets artificial limits on metric values: they are generated from preconceived notions rather than from the content at hand. There is no substitute for human analysis, and not just from an accuracy standpoint, but also in ensuring that content is qualified in the first place and free of “noise” such as ads, spam sites, and listings. In the end, it’s nice to be able to confidently communicate *why* coverage was negative instead of just stating a percentage.

    Also, I love your point about the satisfaction gained from being “invested” in the data… so true. Thanks for sharing, Mike.

  4. Mark Evans says:

    Marshall,

    Automated sentiment technology does a great job of handling massive amounts of data. While it may not be perfect, there’s little doubt the technology will continue to get better. At the same time, there is a role for humans in determining sentiment. For example, we recently introduced a new way for Heartbeat users to quickly adjust sentiment. In many ways, sentiment analysis works best when there’s a marriage between technology and humans.

    cheers, Mark

    Mark Evans
    Director of Communications
    Sysomos Inc.
    @sysomos

  5. Glueberry Pie says:

    I do quite a lot of buzz monitoring for brands and of course I use automated tools. They really only help to measure the volume of mentions; to be honest, the sentiment is often not accurate, because when people use irony or humour the tool assigns the opposite sentiment. Human checking is an absolute necessity.

  6. Iason Demiros says:

    Nice post, and a nice presentation as well. Far from solved, this is a very difficult problem that requires pragmatics and world knowledge, humour analysis, and the treatment of advanced phenomena such as ellipsis, among other natural language understanding tasks. It is, at least for now, a machine-assisted task rather than a fully automated one. Still, there is plenty of room for improvement, and progress so far gives grounds for optimism, especially at the macroscopic (i.e., document or multi-document) level.

    Iason Demiros
    QUALIA

Trackbacks & Pingbacks

  1. [...] A few weeks ago, Hugo Zunzarren (a colleague and an expert in Competitive Intelligence) shared a post with us in which he pointed to the false expectations created in the market by those who sell this type of software, especially when dealing with qualitative information, now booming for the analysis of social networks. Unless there is explicit evidence, a robot is not able to discern whether a comment carries a positive, neutral, or negative connotation (although algorithms that approximate the true results are being pursued with some success), since sentiment depends not only on the words themselves but also on the context and tone of the comment. Hugo concluded that automated analyses are of limited reliability, citing the categorical statement of Jason Falls and Marshall Sponders, experts in the qualitative measurement of messages: analysis is best done by a human. [...]

  2. [...] new tool can identify sentiment “better than most humans”. Just a few days later, I read a post this week claiming that “sentiment analysis [is] best done by [...]

  3. [...] assumptions and very little code we're able to get almost 73% accuracy. This is somewhat near human accuracy, as apparently people agree on sentiment only around 80% of the time. Future articles in this [...]

