
Do You See What I See? How Google results differ depending on where you are

“We rely so much on Google these days”, says Dr Frank Hopfgartner, Senior Lecturer at the Information School and Investigator on the ‘Do You See What I See’ project. “Google has a search engine market share of over 90% worldwide.”

Undertaken by Dr Hopfgartner with several fellow members of the Cyprus Center for Algorithmic Transparency (CyCAT) - which was profiled in the research magazine Inform II in 2019 (page 29) - this project aimed to discover the differences in the search results that Google provides to users in different parts of the world. Google states that their mission is to “organize the world’s information and make it universally accessible and useful”. This project asked: is that true? Does everyone everywhere have equal access to the same information? And if not, what impact might that have?




“CyCAT for me was very interesting because algorithmic transparency and bias is a very timely topic and one which is receiving a lot of attention”, says Dr Hopfgartner of his own involvement with the Center. “My prior work was based on user profiling and personalisation; I was looking at the methods and now CyCAT is looking at the societal consequences and what we can do about them.”


The COVID-19 pandemic, whilst challenging for all, did create a fortuitous environment for this kind of research. “We didn’t initially think about COVID as a subject for this study”, explains Dr Hopfgartner, “but then we realised that it was the one thing that was on everyone’s minds, which gave us a unique opportunity; it doesn’t happen too often that there’s a single topic that the whole world talks about, apart from maybe the World Cup or the Olympics.”


The project team wanted to compare what Google shows users who search for topics related to COVID-19 from different geographical locations, as well as from the same location but in different languages.

The first issue with gaining any insight from this kind of research is that, without working directly with Google, it’s impossible to know how their algorithms work. For this reason, the search engine had to be treated as a “black box”: a system whose internal workings are hidden. The team could put search queries in and analyse the output, drawing conclusions from there.


The second issue was knowing what people actually search for, without access to Google’s log files. The solution to this lay in crowdsourcing. The team used a crowdsourcing platform to assign two tasks to 400 crowd workers - 100 each from the UK, Italy, Spain and Germany. As one of the crowd workers, your first task was to imagine you had to create a photo diary of what happened during the pandemic; a history book for future generations. Your second task was to create a similar photo diary, this time of what habits were developed during the pandemic. To complete both of these tasks, the crowd workers used Google’s image search function, with the researchers then collecting the search terms that were used.


The search terms were split into five categories which emerged as the most common themes: “stay at home”, “personal protection”, “healthcare”, “pandemic general” and “society impact”. The researchers then put these terms into the Google image search themselves and analysed the images that came back by running them through Clarifai, an AI tool designed to create tags based on what it ‘sees’ in images (for example, given an image of a beach at a holiday resort, Clarifai would return tags like “sand”, “water”, “people”, “outside” etc.). Alongside these tags, the team also looked at the URL of each image that was returned in the search and ascertained where the servers that hosted these sites were based - essentially, from which country the images were being served - to see how many were local to the user and how many were foreign.
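To make the local-versus-foreign comparison concrete, here is a minimal Python sketch of how such a share could be counted. It is only an illustration, not the team's actual pipeline: the function name, the country codes and the sample URLs are hypothetical, and it assumes the hosting country of each image has already been looked up by some other means.

```python
def local_foreign_share(results, search_country):
    """Return (local_share, foreign_share) for one search.

    `results` is a list of (image_url, hosting_country) pairs. The hosting
    country is assumed to be known already; this sketch does no lookups.
    """
    if not results:
        return 0.0, 0.0
    total = len(results)
    local = sum(1 for _, host in results if host == search_country)
    return local / total, (total - local) / total


# Hypothetical results for a search run from the UK.
uk_results = [
    ("https://example.co.uk/a.jpg", "GB"),
    ("https://example.co.uk/b.jpg", "GB"),
    ("https://example.org/c.jpg", "GB"),
    ("https://example.es/d.jpg", "ES"),
]

local, foreign = local_foreign_share(uk_results, "GB")
print(f"local: {local:.0%}, foreign: {foreign:.0%}")
```

Running the same count for each country in the study would give the kind of local-versus-foreign picture described above.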


“The results from different countries varied significantly”, Dr Hopfgartner says. “For example, the results from Spain and Italy were closer than the results from Spain and the UK were.”


Searches conducted in the UK showed almost no results from Spanish, Italian or German servers, but searches conducted in all three other countries did show a fair number of results from the UK.



People from different countries were clearly seeing different results when searching for the same thing, but there was some overlap. Searching location-specific terms like “lockdown protest London” or “thank you NHS” showed upwards of 90% of the same results wherever you searched from. However, searching more general terms like “COVID 2020”, “COVID social” or “COVID lockdown” showed only 2% overlap between countries.
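The article does not say exactly how overlap was calculated, so the following is only one plausible way to express it. This Python sketch assumes a Jaccard-style measure (shared results divided by all distinct results) and uses made-up image identifiers; the function names are hypothetical.

```python
from itertools import combinations


def overlap(results_a, results_b):
    """Jaccard overlap: shared results divided by all distinct results."""
    a, b = set(results_a), set(results_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_overlap(results_by_country):
    """Overlap for every pair of countries, keyed by (country_1, country_2)."""
    return {
        (c1, c2): overlap(results_by_country[c1], results_by_country[c2])
        for c1, c2 in combinations(sorted(results_by_country), 2)
    }


# Hypothetical images returned for the same search term in three countries.
sample = {
    "UK": ["img1", "img2", "img3"],
    "ES": ["img2", "img3", "img4"],
    "IT": ["img5"],
}

for pair, score in pairwise_overlap(sample).items():
    print(pair, f"{score:.0%}")
```

A score near 100% would correspond to the location-specific searches mentioned above, and a score near zero to the more general ones.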


Even where there was overlap, there were some surprises. Searching “how to get taste back” gave an 87% overlap between locations, despite there being far from a universal answer to this question in medical science.


The team also looked at how much overlap there was in results split into the broad categories of search terms, as well as how many local vs foreign results were returned in each country. Again, there were wide variations in both cases.


“Given that there are these differences in how search results are shown in different countries, we concluded that this might actually influence how we see the world”, says Dr Hopfgartner. “For example, if a UK Google user is seeing a vast majority of results that are from the UK, they have no idea what is going on in, say, Italy.”


When we rely on one search engine - or one company - to collate our information for us, it’s easy to assume that they are doing so completely blindly and objectively, but that may not always be the case.


“If we look back on this time in the future we may realise that we had no idea what was going on across the world because we were using Google as a filtering lens that narrows our viewpoint”, says Dr Hopfgartner.


There are also concerns about the creation or exacerbation of a so-called ‘digital divide’, where people who regularly use search engines may develop a certain view of the world, whilst those who don’t (or can’t) use them develop a totally different one.


One of the main aims of the CyCAT project as a whole is to combat some of these issues. The team has developed a system that interfaces with search engines and highlights potential biases to users. For example, the tool might tell you that most of your search results are coming from the UK and give you the option to filter your results to show you only sites hosted in Spain. This greater transparency and control over your search results is what CyCAT is aiming at, and work on perfecting this tool is ongoing.
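As a rough illustration of the idea, and not a description of CyCAT's actual tool, the Python sketch below flags when one hosting country dominates a set of results and lets the user filter by country. The function names, the 50% threshold and the sample data are all assumptions made for this example.

```python
from collections import Counter


def dominant_origin(results, threshold=0.5):
    """Return the hosting country supplying more than `threshold` of the
    results, or None if no single country dominates.

    `results` is a list of (url, hosting_country) pairs; the threshold
    value is an arbitrary choice for this sketch.
    """
    counts = Counter(host for _, host in results)
    if not counts:
        return None
    country, n = counts.most_common(1)[0]
    return country if n / len(results) > threshold else None


def filter_by_origin(results, country):
    """Keep only the results hosted in the given country."""
    return [r for r in results if r[1] == country]


# Hypothetical search results with their hosting countries.
results = [
    ("https://example.co.uk/a", "GB"),
    ("https://example.co.uk/b", "GB"),
    ("https://example.org/c", "GB"),
    ("https://example.es/d", "ES"),
]

country = dominant_origin(results)
if country is not None:
    print(f"Most of your results are hosted in {country}.")
print(filter_by_origin(results, "ES"))
```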


With the ‘Do You See What I See’ project having wrapped up in December 2021, a research paper presenting the team’s findings has now been published. As scrutiny of fairness and transparency in our tech increases, work like this is becoming more and more important, with consequences for the everyday lives of the many of us who use services like Google’s search engine on a daily basis.


“It can be an eye-opener to realise that when we use Google to gather information, they are a gatekeeper”, concludes Dr Hopfgartner. “We need to remember that there may be more information out there that we don’t get if we rely on just one such service.”


- Richard Spencer
