Comparing Social Media Monitoring Platforms on Sentiment Analysis about Social Media Week NYC 10

My first two posts in this series, "Comparing Social Media Monitoring Platforms for coverage of Social Media Week NYC" and "Comparing Social Media Monitoring Platforms on content about Social Media Week NYC 10", examined the volume and media types captured by each platform – with very different results in each case.

But what about Sentiment?

Radian6 – 81% Positive and 19% Negative sentiment

I used "Widget Keywords", which is to say, I told Radian6 to run sentiment analysis on anything that had "social media week" in it. I could have used Topic Profile keywords instead, which would have broken down the results differently, similar to what Techrigy does with "Tone".

Alterian/SM2/Techrigy – 91% Positive and 9% Negative sentiment

Given the number of posts each platform analyzed, the sentiment results from Radian6 and Techrigy are, on the face of it, not as far apart as I thought they would be.

Sysomos – 76% Positive and 1% Negative, with 23% Neutral – again, not as far off as I thought they'd be.

But Brandwatch was really far off from the rest, with only 1% positive and 1% negative – all the rest of the data was "neutral".

It's clear Brandwatch is much stricter about assigning sentiment – maybe that's good; I'm not sure yet.

Biz360 – 16% Positive and 4% Negative, with 81% Neutral – quite a difference from the rest.

Of course, we need to look at what each platform considers negative vs. positive. More to the point: would the same content be judged negative by every platform, or might a post or comment one platform flags as negative be considered positive by another?
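To make that question concrete, here is a minimal sketch of how you might line up each platform's verdict on the same posts and count pairwise agreement. The URLs and labels below are invented for illustration – in practice you would export each platform's scored results and join them on the post URL:

```python
# Sketch: compare sentiment labels from several platforms on the same posts.
# All URLs and labels here are hypothetical, not actual platform exports.
from collections import Counter
from itertools import combinations

labels = {
    "http://example.com/post-1": {"radian6": "positive", "sysomos": "positive", "brandwatch": "neutral"},
    "http://example.com/post-2": {"radian6": "negative", "sysomos": "neutral",  "brandwatch": "neutral"},
    "http://example.com/post-3": {"radian6": "positive", "sysomos": "neutral",  "brandwatch": "neutral"},
}

# Pairwise agreement: for each pair of platforms, how often do their labels match?
agreement = Counter()
total = Counter()
for verdicts in labels.values():
    for a, b in combinations(sorted(verdicts), 2):
        total[(a, b)] += 1
        if verdicts[a] == verdicts[b]:
            agreement[(a, b)] += 1

for pair in sorted(total):
    print(pair, f"{agreement[pair]}/{total[pair]} agree")
```

Even on a toy sample like this, the agreement counts make the disagreement visible at a glance, which is exactly what eyeballing five separate dashboards does not.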

Whew! How far off can these platforms be? Remember – same exact query (Social Media Week New York City), same time period.

Radian6 – none of the 22 tweets that were flagged as negative were actually negative when I read them myself.

Look – Radian6, to their credit, added Sentiment Analysis by keywords just last month. I want to say it's not the platform's fault that people (me) don't know how to configure or use it as intended.

On the other hand, I put in "Social Media Week" as the keyword for sentiment, on a topic profile about Social Media Week NYC – so, you figure it out. Let me put it another way: if you really care about the accuracy of your sentiment results, it's best to review them by hand and reclassify them to what you think they are – Radian6 may not be smart enough to judge sentiment. There, I tried to be nice.
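That hand-review workflow can be sketched in a few lines. The tweet IDs and labels below are made up, not Radian6's actual output – the point is simply: take the automated labels, overlay your manual corrections, and recompute the breakdown:

```python
# Sketch: override automated sentiment labels with manual judgments and
# recompute the percentage breakdown. All data here is hypothetical.
from collections import Counter

automated = {
    "tweet-1": "negative",  # the machine says negative...
    "tweet-2": "negative",
    "tweet-3": "positive",
    "tweet-4": "neutral",
}

# Manual corrections after actually reading each item.
corrections = {
    "tweet-1": "neutral",   # ...but a human reader disagrees
    "tweet-2": "positive",
}

final = {post: corrections.get(post, label) for post, label in automated.items()}
counts = Counter(final.values())
n = len(final)
for sentiment, count in sorted(counts.items()):
    print(f"{sentiment}: {count}/{n} ({100 * count / n:.0f}%)")
```

It's tedious on thousands of records, but on a sample this is exactly the check that exposed the 22 false negatives above.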

Alterian/SM2/Techrigy – a total washout. Most of the results don't make sense; only one or two have any connection with Social Media Week New York City, and they're not negative – at least, not negative about Social Media Week:

Sentiment Permalink
Negative opinion http://lorimacvittie.ulitzer.com/node/1196850
Positive opinion, Negative opinion http://jeremygeelan.ulitzer.com/node/665165
Negative opinion http://yehudaberlinger.ulitzer.com/node/1171610
Negative opinion http://makethelogobigger.blogspot.com/2010/01/one-more-thing-about-detroits-problems.html
Positive opinion, Negative opinion http://blog.taragana.com/sports/2010/01/23/cammalleri-gets-4-points-halak-posts-shutout-in-canadiens-6-0-win-over-rangers-68489
Positive opinion, Negative opinion http://nytm.org/2010/01/22/january-newsletter
Negative opinion http://www.megite.com/technology/1264044453/41#item_1
Negative opinion http://kittenlounge.onsugar.com/Sippin-Saturdays-Reconnect-Yourself-6965071
Positive opinion, Negative opinion http://kittenlounge.onsugar.com/Wearable-Wednesdays-Mind-Games-6944987
Negative opinion http://blog.taragana.com/sports/2010/01/19/record-tying-53-non-seniors-apply-for-nfl-draft-67110

I realize the Sentiment Analysis part of Techrigy is going to be updated on February 1st, and it will be interesting to look at these results then.

People I know at Techrigy have been very nice to me and bend over backwards to please, and I really appreciate their support over the last two years. On the other hand, as an analyst and a blogger whose opinion people respect in this area, I have to say that for this query the results Techrigy returned were less than useless – they are actually a detractor. I'm glad there's an update in a few days; Sentiment Analysis needs it, badly.

Brandwatch – 2 negative results; both were actually negative.

Yes, let's call a spade a spade: these two tweets do have a negative tone – look for yourself. Congratulations Giles Palmer – you and your team won this challenge. Maybe the English got this one right; at least in this case, the sentiment analysis results make sense.

Sysomos – the sentiment results were generally good. Sysomos found no negative blog posts, so I looked at the positive and neutral posts and found their classification of Social Media Week content reasonable. Many of the Mashable posts about Social Media Week were listed, and it's fair to read those as "positive" since they support and give attention to Social Media Week.

Unlike Brandwatch, Sysomos didn't find any negative tweets; it classified them all as neutral. Perhaps that's not far from the truth – I would rather have a result classified as neutral than flagged as positive or negative when, on visual inspection, it clearly isn't. Sysomos treated mainstream media news stories as either neutral or positive; again, its assignment of sentiment was reasonable. So maybe it's a tie between Brandwatch and Sysomos.

What about Biz360? The negative sentiment selections were pretty poor: the same items Sysomos flagged as positive, Biz360 flagged as negative.

It's really hard to imagine that most of what Biz360 picked up could be negative – just look at the titles of the content it selected. I do like that Biz360 qualifies the reach of a piece of content, a feature the other platforms looked at here don't offer today.

Taking this all in, I have to say I seriously question whether a "marketer" who builds a Social Media Monitoring platform is going to come up with a satisfactory solution for many of the things that are really important to measure – and to gauge accurately.

Radian6 is getting better, but as far as sentiment goes, too much has to be done at the configuration level to get it right. I know they will continue to improve, but they also come from a "marketing" background, and the Flash interface that makes their product look better than the others is also, at times, frustrating to work with.

Alterian/Techrigy/SM2 – well, they know sentiment isn't their strong point. They have good technology and have improved a lot over the years, yet they too are built from a marketing perspective, and that can be a problem, because marketers aren't that good at technology. That's the same issue as Biz360, in my opinion – but look, I could be wrong; hell, I hope I am.

Brandwatch – I don't know their origins, but I have met and spoken with Giles Palmer, who founded Brandwatch several years ago. They consider themselves the #1 platform in Europe, and their sentiment analysis seems better than most, based on my experience with it so far. From what Giles tells me, they do a lot of work to make it and keep it that way.

Sysomos – the more I work with it, the more I like it. Sysomos was built by a programmer and a university (the University of Toronto), and its backend can slice and dice data very well. Sysomos is more like a programming think tank that grew into marketing. Yes, the interface could improve, but the sentiment analysis and noise suppression are excellent.

So, there you have it – Sentiment Analysis, buyer beware. For the time being, if you're going to score large amounts of data with automated sentiment analysis, I think you'll be best off with Sysomos or Brandwatch, all things being equal.

And look, I'm open to discussion and to reexamining my results (I don't know a lot of what goes on behind the interface). Clearly, a lot more deep dives into the data of all the platforms are needed – but that's precisely what most Social Media Monitoring Platforms make hard to do. And yes, a couple of analyst types have looked at these platforms, including Forrester and Nathan Gilliatt, who will be at Monitoring Social Media Bootcamp with me in London in two months. A sample of his ebook is encouraging, and the new book he's working on is going to be even better.

Nathan's approach, as far as I can tell, is to treat these platforms more as the subject of a "buyer's guide", similar to what Phil Kemelor does with Web Analytics platforms for CMSWatch.

My approach is hands on: how well do these platforms work for what we use them for?

I don't think anyone else in the field actually does that – and this post alone shows it.

By the way, if you want far more than even what I can write here, hire me to help you – and please join me at Monitoring Social Media Bootcamp in London on March 31st, 2010.

Monitoring Social Media BootCamp

Tickets are on sale now.


25 thoughts on “Comparing Social Media Monitoring Platforms on Sentiment Analysis about Social Media Week NYC 10”

  1. Hi Marshall,
    Thanks for the overview and comparisons.

    We recommend that the sentiment analysis be used as a high level overview.

    Our customers that want the sentiment and tone scored with a high percentage of reliability customize the dictionary. It's a very simple process, and it increases the reliability to a level that depends on the sample size you use. We have global PR agencies doing that for their clients.

    As I mentioned when we met in New York City I would be glad to show you how.

    Connie Bensen
    Community Strategist, Alterian
    @cbensen

  2. Hi Marshall:

    Nice work. I think another interesting angle would be to examine what results get returned (quantity and source) for similar searches.

    SM analysis is hard work – and it takes complex tools + smart people to get good, actionable results. My post about this “Baking a Social Media Cake”:

    http://preview.tinyurl.com/cgzts4

    No question it is easy to get results and jam out reports and fancy looking sentiment charts, but if you are going to use this information to make business decisions – then a lot more rigor is required.

    Tom O’Brien
    MotiveQuest LLC

  3. It is fun to look at individual verbatims to evaluate accuracy and that is how the quality research companies do this. The goal of any company is to achieve accuracy rates in excess of 80%. Higher than 85% is absolutely impossible (even human beings can’t do it) unless you ignore and delete all the maybes and uncertainties.

    What really matters is looking at entire sets of data. When you have thousands of records, the confidence intervals around the findings are very tight and generally lead to extremely accurate results.

    You might also want to have a look at http://www.tweetfeel.com for a twitter solution, or our evolisten product which evaluates the entire internet.

  4. Glad to finally see an Apples to Apples comparison of Sentiment Analysis between the major category offerings. Have used many of the tools you review and would agree with your summation. Techrigy can be fraught with frustration. Stay tuned, SOCIALtality will be releasing our Social Intelligence Engine soon, complete with an automated, multi-spectrum, sentiment analysis.

  5. clearly the right horse won! :)

    seriously, thanks for being thorough Marshall. I’m sure some of the other systems will win in other ways – overall, i think it’s just good to have someone with authority be so thorough.

    Can we resign now?

    giles

  6. Marshall,

    I like to see these qualifications. It would be interesting to add some new tools to the mix, like Crimson Hexagon (which claims a more scientific method for calculating sentiment) and Scout Labs, which has one of the lowest cost solutions. I agree with previous commenters that automated sentiment should be used as a guide to make directional decisions. We find it a useful tool when trending sentiment over time and making comparisons between brands, all using the same methodology. Kudos for starting this conversation.

    X

  7. There is a serious need to establish a set of benchmarks by which we can – as an industry – measure the accuracy of automated sentiment. Great post, clearly highlights the need to set up something far more robust.

    Because as we all know, the claimed levels of accuracy are FAR from the truth.

  8. Hi Marshall,

    This is a great comparison you’ve done of some of the various social media monitoring tools and how they evaluate sentiment. We’re actually planning on being at the Social Media Bootcamp on March 31st and I’d love to chat with you about sentiment analysis. We do sentiment analysis manually at Synthesio as we are not yet satisfied with the automatic tools that are out there, especially when monitoring in multiple languages for many of our clients, and it would be interesting to get your feedback.
    It’s always a pleasure to read your blog and it will be nice to finally be able to meet up with you in person.

    Best,
    Michelle
    @Synthesio

  9. I am glad to see someone finally testing the claims made by companies when it comes to sentiment analysis, as those within the industry know that they are generally greatly inflated.

    That being said, I believe your test was somewhat hampered by choosing a topic for which there is likely to be very little, if any, negative conversation. I would love to see another test of a more contentious topic (the iPad?); I think it would better reveal the problems with automated sentiment analysis, and possibly expose the discrepancies between volume measures as well.

    Furthermore, I’m not sure that most users would consider a tool that scored 99% of entries as neutral to be very valuable, nor a tool that scored casual media mentions as positive. Top-level sentiment is already a difficult metric to use, as it doesn’t specify what the authors are positive about, exactly; to further muddle things by mixing casual mentions as either neutral or positive, or to just score almost everything as neutral leaves very little for a user to work with or to base business decisions upon.

    Per Jon Burg, I think a set of clear measurement standards would be beneficial to the industry. Until we get there, the market will be confused by claims and won’t be able to differentiate a robust solution from one that is merely a coin flip.

  10. Great post, thanks Marshall. I see all these tools as simply being a first-pass filter – before the hard work begins. If we are serious about monitoring conversations, then we simply need human intervention. Sure, they will get better – but as you point out, there is a long way to go.

  11. Despite the unavoidable variations in the systems this type of head to head comparison is extremely valuable. Thanks for doing the leg work and sharing it with us.
    PS it was good seeing you last night at the Media Bistro Tweetup :)
    Chris

  12. It would be great to have ListenLogic in if you ever do this again. Our sentiment is tuned daily by a dedicated analyst, so I don’t know how that would factor in, but we’d love to participate!

  13. Software has contributed to the efficiency and timeliness of media measurement, but I do not believe that asking a machine to “evaluate” the sentiment of an article/blog/tweet/transcript is one of its successes. The variance between your results goes a long way toward illustrating their weaknesses.

    If you cannot trust these automated solutions to make business decisions, then I fail to see the point. In regards to measuring brand reputation and PR efforts, sentiment is just the tip of the iceberg. By relying on a keyword list, you are also assuming that the results are based on content that is comprehensive, qualified and void of “noise.” To a large extent, this is determined by the ability to compile a list of keywords that precisely identifies your coverage, markets, and products and services. Some organizations may be fortunate enough to have distinct keywords, while for others this process may resemble a game of pin the tail on the donkey. And let’s not forget about measures such as key message communication, influencer visibility and the attributes that impact reputation such as “innovation”, “customer experience” and “product performance” when analyzing brand positioning and demonstrating PR efforts.

    Don’t get me wrong, we strongly believe in the use of technology, but technology should serve as a tool not a solution.

    Excellent post, thanks for sharing. Mike.

  14. Marshall, great recap of the silly-state of sentiment analysis. Regrettably I keep hearing from people that WANT this, but only because they want a Report Card of their outreach efforts. Users need to really think about what they need to do, and remember that work requires work. I applaud you for diving in to the data and sharing the lessons with others.

  15. Hi Marshall,
     
    I represent a social media monitoring tool called Simplify360. We also have our own sentiment classifier implemented in our tool. If you are interested, let me know and we can arrange a free demo.
     
    Regards,
    Deep Sherchan
    CMO, Inrev Systems
    http://www.simplify360.com
    deep@in-rev.com
