At the start of March, Dr Robert Villa presented a plenary talk entitled "Understanding assessors: a forgotten element of big data research?" at the 7th International Conference on Corpus Linguistics.
This conference is run annually by the Spanish Association for Corpus Linguistics and this year took place in Valladolid.
The talk discussed the importance of studying the human assessors who generate the training collections required by supervised training methods, as is typical when applying machine learning techniques. It was based on work carried out in the AHRC-funded project "Understanding the annotation process: annotation for Big data". In this project we're looking in more detail at the behaviour of assessors, and at how gathering more information about assessor behaviour could potentially be applied in machine learning. This was a terrific opportunity to meet up with members of the corpus linguistics community, and to learn more about how big data is influencing work in this area.
Many thanks to Professor Pedro Fuertes-Olivera from the Universidad de Valladolid for the invitation to speak at the conference, and also to the other members of the project team: Martin Halvey, Simon Wakeling and Laura Hasler. This work was funded by the UK Arts and Humanities Research Council (grant AH/L010364/1).