Using man and machine to understand the web

The worldwide web is an unprecedented source of market intelligence that, in theory, allows businesses to analyse public opinion automatically.

But human communication is complex. Context, slang, varying dialects and other foibles make it extremely difficult for a computer to extract what someone means from what they write online.

Professor Arno Scharl, of MODUL University Vienna’s Department of New Technology, has set out to crack this conundrum with an approach that combines automated analysis with human intuition. And, he claims, his team is making significant strides.

His approach is made up of two components. The first is an automated web content aggregator called the extensible Web Retrieval Toolkit, or eWRT, which uses semantic analysis to identify content, such as social media messages or online news stories, that is relevant to a particular topic and to detect its sentiment.

Scharl claims that eWRT can achieve both 80% recall – i.e. it will identify 80% of the relevant content on the web – and 80% precision in sentiment analysis – i.e. it will correctly identify the sentiment in 80% of the content.
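To make the two figures concrete, the sketch below computes them on a handful of invented documents. It is not eWRT code: the document IDs, labels and function names are all assumptions made for illustration, and "sentiment precision" is calculated the way the article describes it, as the share of scored items whose sentiment was identified correctly.

```python
def recall(retrieved: set[str], relevant: set[str]) -> float:
    """Share of the relevant items that were actually retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def sentiment_precision(predicted: dict[str, str], gold: dict[str, str]) -> float:
    """Share of scored items whose predicted sentiment matches the human label."""
    scored = [doc for doc in predicted if doc in gold]
    if not scored:
        return 0.0
    correct = sum(1 for doc in scored if predicted[doc] == gold[doc])
    return correct / len(scored)


# Invented toy data: five documents are truly relevant to the topic,
# and the tool retrieves four of them plus one irrelevant item (d9).
relevant = {"d1", "d2", "d3", "d4", "d5"}
retrieved = {"d1", "d2", "d3", "d4", "d9"}

# Human-assigned sentiment labels versus the tool's predictions.
gold = {"d1": "positive", "d2": "negative", "d3": "negative",
        "d4": "neutral", "d9": "positive"}
predicted = {"d1": "positive", "d2": "negative", "d3": "negative",
             "d4": "neutral", "d9": "negative"}

print(f"recall: {recall(retrieved, relevant):.0%}")
print(f"sentiment precision: {sentiment_precision(predicted, gold):.0%}")
```

On this toy data both figures come out at 80%: the tool misses one relevant document out of five and mislabels one sentiment out of five.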

This, he says, is a significant improvement on commercial tools that may have high precision but low recall rates, or vice versa.

These are bold claims, but there are users to back them up. One example is the US National Oceanic and Atmospheric Administration (NOAA), which uses a tool built on the eWRT framework to gauge the public’s reaction to its online communications.

According to David Herring, director of communication and education at the NOAA’s Climate Program Office, the tool allows his team not only to measure the volume of reaction but also to understand “when and how people are ‘spinning’ and ‘framing’ information about NOAA or climate science in ways that are not accurate”.

Crowdsourced gamification

The second component of Scharl’s project taps into two modish concepts in information technology: crowdsourcing and gamification. Quite simply, the idea is to build engaging, game-like applications that encourage web users to share their thoughts and feelings voluntarily.

For example, continuing the climate change theme, Scharl and his colleagues have built a Facebook application called “Climate Quiz”. A simple quiz game that simultaneously gauges opinion and informs the player, the app offers incentives in the form of cash prizes and a leaderboard where users can compete with their peers.

The ongoing project to combine these two techniques is called uComp. Doing so, Scharl believes, will offer unrivalled analysis of public opinion.

“Normally these two things are pursued separately,” he says. “But our idea is to combine them.

“Where the automated method cannot produce a result, or you want to validate a result or fill in missing data, you would then turn to the second platform.”
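The routing Scharl describes can be sketched roughly as follows. This is an illustration only, not uComp code: the word lists, the confidence threshold, the canned crowd votes and every function name are assumptions. The automated classifier is tried first, and the text is passed to human contributors only when the classifier returns nothing or is not confident enough.

```python
from collections import Counter
from typing import Callable, Optional

# Hypothetical word lists standing in for a real sentiment model.
POSITIVE = {"good", "great", "hopeful"}
NEGATIVE = {"bad", "alarming", "wrong"}


def toy_classify(text: str) -> Optional[tuple[str, float]]:
    """Return (label, confidence), or None when no known word appears."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return None  # the automated method cannot produce a result
    label = "positive" if pos >= neg else "negative"
    return label, max(pos, neg) / (pos + neg)


def toy_crowd(text: str) -> list[str]:
    """Stand-in for answers gathered from players of a game-like app."""
    return ["negative", "negative", "positive"]


def majority_vote(votes: list[str]) -> str:
    """Pick the label chosen by the most contributors."""
    return Counter(votes).most_common(1)[0][0]


def label_with_fallback(
    text: str,
    classify: Callable[[str], Optional[tuple[str, float]]],
    ask_crowd: Callable[[str], list[str]],
    min_confidence: float = 0.7,
) -> tuple[str, str]:
    """Use the automated label when it is confident, else ask the crowd."""
    prediction = classify(text)
    if prediction is not None and prediction[1] >= min_confidence:
        return prediction[0], "automated"
    return majority_vote(ask_crowd(text)), "crowd"


print(label_with_fallback("great news on emissions", toy_classify, toy_crowd))
print(label_with_fallback("yeah right, sure", toy_classify, toy_crowd))
print(label_with_fallback("good but alarming", toy_classify, toy_crowd))
```

The first message is labelled automatically. The second contains no word the classifier knows, and the third is too evenly balanced to clear the threshold, so both are handed to the crowd.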

The end result is what Scharl calls “Embedded Human Computation”, the combination of human and computer processing to collect and analyse knowledge.

Scharl hopes the uComp project will build a clearer, more complete picture of the “ecosystem” of opinion surrounding environmental issues and climate change. The crowdsourcing application will be available open-source towards the end of the year, with a final report on the uComp project expected to be produced in two and a half years.

Scharl hopes that environmental stakeholders, climate change organisations, environmental NGOs and policy makers will start to make use of the platforms, and of the data already being collected, before then.

Ben Rossi

Ben was Vitesse Media's editorial director, leading content creation and editorial strategy across all Vitesse products, including its market-leading B2B and consumer magazines, websites, research and...