The Dynamics of Micro-Task Crowdsourcing

On 20 May 2015 Dr Gianluca Demartini of the Information School will present a paper on 'The Dynamics of Micro-Task Crowdsourcing' at the 24th World Wide Web Conference in Florence, Italy.

Micro-task crowdsourcing is a modern technique that allows simple data collection tasks to be outsourced to a crowd of individuals online. Tasks such as image annotation, document summarisation, or audio transcription are easy for humans to complete but very challenging for computers. Micro-task crowdsourcing is commonly used to build information systems that combine the scalability of computers over large amounts of data with the quality of human intelligence.

Over the last 10 years, different micro-task crowdsourcing platforms have been created. These platforms are marketplaces where crowd workers complete tasks (usually called Human Intelligence Tasks or HITs) in exchange for small monetary rewards, and where requesters post their data and tasks to quickly obtain large-scale annotations.

As part of research carried out with Difallah, Catasta, Ipeirotis and Cudré-Mauroux, Gianluca analysed logs between 2009 and 2014 from the most popular crowdsourcing platform, Amazon Mechanical Turk, and observed how its usage evolved over time. This research will be presented at the World Wide Web Conference, and the full paper is available here.

The main findings from the research are:

Published tasks:
- The most frequent HIT reward value on Amazon Mechanical Turk has increased over time, reaching $0.05 in 2014.
- HITs about audio transcription have been gaining momentum over recent years and are now the most popular tasks on Amazon Mechanical Turk.
- The popularity of Content Access HITs (like “Visit this website” or “Click this link”) on Amazon Mechanical Turk has decreased over time.
- Surveys are the most popular type of HITs for US-based workers on Amazon Mechanical Turk.

Workers and Requesters:
- HITs on Amazon Mechanical Turk that are exclusively asking for workers based in India have strongly decreased over time
- Most HITs on Amazon Mechanical Turk do not require workers from a specific country; of those that do, most require US-based workers
- New requesters constantly join Amazon Mechanical Turk, so the number of active requesters and the available reward increase over time: over the last 2 years, an average of 1000 new requesters per month joined Amazon Mechanical Turk
- There is a weekly seasonality effect in the amount of rewards assigned to workers and in the HITs available on Amazon Mechanical Turk

Market size and dynamics:
- On Amazon Mechanical Turk 10K new HITs arrive and 7.5K HITs get completed every hour (on average)
- New HITs attract new workers to the Amazon Mechanical Turk website
- New workers arriving on the Amazon Mechanical Turk platform complete both fresh and old HITs
- Workers on Amazon Mechanical Turk prefer to work on fresh, recently posted HITs
- New work is almost 10x more attractive to workers than remaining work on Amazon Mechanical Turk
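
To put the hourly averages above in perspective, here is a minimal arithmetic sketch in Python. The constants are the two averages quoted in the list; the function names are illustrative and do not come from the paper. The gap it computes is only the difference between the two averages. It should not be read as a steadily growing backlog, for example because unfinished HITs can expire or be withdrawn by requesters.

```python
# Hourly averages quoted in the findings above (Amazon Mechanical Turk).
ARRIVALS_PER_HOUR = 10_000     # new HITs posted per hour, on average
COMPLETIONS_PER_HOUR = 7_500   # HITs completed per hour, on average


def hourly_gap(arrivals: int, completions: int) -> int:
    """Number of HITs posted but not completed in an average hour."""
    return arrivals - completions


def completion_ratio(arrivals: int, completions: int) -> float:
    """Completed HITs as a fraction of newly posted HITs."""
    return completions / arrivals


if __name__ == "__main__":
    gap = hourly_gap(ARRIVALS_PER_HOUR, COMPLETIONS_PER_HOUR)
    ratio = completion_ratio(ARRIVALS_PER_HOUR, COMPLETIONS_PER_HOUR)
    print(f"Average gap: {gap} HITs per hour")
    print(f"Completed relative to posted: {ratio:.0%}")
```

On these averages, roughly three HITs are completed for every four posted.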

Work size and speed:
- Very large batches (300K HITs) have recently appeared on Amazon Mechanical Turk
- Throughput of HIT batches on Amazon Mechanical Turk can best be predicted based on the number of HITs in the batch and its freshness
- Large HIT batches can achieve high throughput (thousands of HITs per minute) on Amazon Mechanical Turk


Above: Cumulative HITs (log scale) per country, plotted over time


Above: Micro Reward per year

