How can you rank the relative popularity of players within a team for free and why would you even want to bother doing this?
As is common with blog posts that ask two things, let’s answer the second first – after all it’s easier and maybe more important – there’s no point learning how to do something if there’s no value to doing it.
What We’re Going To Build
Before all that, a quick look at what we’ll create – we’ll make the tables and charts below (and a whole lot more).
Step by step instructions follow.
Why Relative Popularity Matters
I’m going to use the example of Arsenal throughout – I supported Arsenal as kid (that was when the Arsenal way was to be classy, not to get rid of your mascot and lay off loads of staff whilst simulateously giving your star striker 350k a week…)
You could apply this to any team, football, cricket, US Sports or whatever. It would even work for non-sport settings. Want to know the most popular actor in the Marvel franchise or which members of Boris Johnson’s cabinet are eliciting the most interest? This method will work for that too.
Relative popularity matters because it helps editorial teams decide what to focus on, who is worth planning articles and features around.
If we know that Bukayo Saka is three times as popular as Rob Holding then, all other things being equal, there is more point doing a feature on Saka. More people search for Saka and so, if it ranks well on search engines, more people come to your site.
Clearly, much of the time, we can’t factor popularity into our editorial judgement, there is a need to cover the news as it happens. However, the popularity allows sites to do what I term a necessary evil, to bring people in from search and then expose them to the content we are really proud of, or to move them around the site.
If you want someone to sign up to your newsletter, or read the niche in-depth feature you’re incredibly proud of, you have to first get them into the site.
If we know the relative popularity we can know that that quiz we’re making purely for fun should feature certain players in the headline and not others. It will mean that if we’re looking to do a mid-term report on the form of the players so far, these are the ones people seem to really care about.
I don’t for one minute believe that you can run editorial policy based on player popularity, but it can be an extremely useful extra factor to consider when future planning, especially around some of the more lighthearted content.
Limitation of The Data (That isn’t actually a limitation)
I’m going to compare players over the course of the year – when I run this search within Google trends I get one data point per week so it is not as detailed as it could be in terms of seeing sudden spikes.
If, for instance, an unknown player trends one day because they are linked with a move, if we looked day by day or, better still hour by hour, we would see this in detail. The spike is often very short, but also huge (I cover this in a post on targeting micro moments).
What we see with the trends taken one data point per week is an averaging out – so you can see that search interest rose for that player, but it won’t be as dramatic a rise as if you looked daily.
Personally, I think this is fine – having the weekly data gives us an idea of how volume of search for a player ebbed and flowed, but we don’t get complete granularity. If you want that for any player or term, I suggest refining the search to a very short time frame and drilling in.
How To Create A Table of Relative Popularity
For this, I’m going to assume you are at least mildly proficient in using a spreadsheet – I’m going to use Google Sheets as my passion, at least within sport analysis, is looking at how you can imporve sites without spending. However, the rich kids among you can use your fancy Excel (just make sure you don’t add so much data it fails to update for a few days…)
What we’re going to do is…
Choose a player as a reference – this player (or person or thing as this concept works for any categorisation) has to appear in every set of searches
Set Google Trends up correctly so we are charting the right data
Run the search for five terms at a time (as that is all Google Trends allows)
Import the data to your spreadsheet of choice and align it correctly
Using your reference player as a base, adjust other players’ data by the appropriate ratio (don’t worry, all will be explained)
Produce totals and also produce fancy graphs and further insights
Wow your colleagues
Step 1 – Choose a Reference Point
When you compare terms in Google trends the point with the most search interest always scores 100, this is fine for self-contained searches but creates a problem if we want to compare multiple terms.
To explain what I mean, let’s say we want to compare search popularity for Lionel Messi v Cristiano Ronaldo.
We put both terms into trends, set the location we want to track and the timescale. We then see a graph, it no doubt fluctuates, I suspect is fairly even overall, and the peak will be 100.
Looking at the last year, Messi has one huge surge (when he said he was going to leave Barca) – this the 100 win the graph, Ronaldo’s peak down at 25. Overall though, this short surge aside, they are very even for the year.
What 100 refers to doesn’t really matter, it would be nice to know but Google doesn’t tell us. We can guess both players get a lot of search, it’s generally even and that for a day or two Messi was suddenly in huge search demand.
If I then ran other search for Eden Hazard v Gareth Bale I would learn that Gareth Bale gets a lot more searches than Hazard on average.
Bale’s peak on the graph is 100, so was Messi’s. That means nothing. It doesn’t mean equality, the peaks were in no way even. if I run a comparison of Bale v Messi, Bale’s 100 peak suddenly becomes a mere 17.
That is why we need a constant throughout, something that allows us to apply a weighting to all other data we gather.
I generally try to find a player who will be fairly popular without being so popular they dwarf all other search – if I was doing Barca I might choose Pique rather than Messi.
For Arsenal, my gut instinct is that David Luiz is a good choice.
One important thing to note – if your team has one or two players who eclipse the search interest in all others, never use them as the reference point.
At Barca, if we used Messi we would see that other players hardly register in comparison, much of the data would simply read <1. We need a reference point who is in the same ballpark as the majority of other players in the team.
When you then want to include the Messi-type player, run a final comparison at the end of them against the reference player (as I explain how to do below)
Step 2 – Set up Google Trends
Head over to Google Trends and enter your reference player into the box for entering a topic.
Google may recognise the person – with my David Luiz example you can see it suggests the footballer. If this happens, choose this option as this is Google grouping all search topics linked to David Luiz the footballer but ignoring those that context suggests is for another David Luiz.
Once selected, it will display a graph of search interest over time – it will default to the US, so tweak this and also the time frame, I want UK and for a year, so a lot like…
And that’s it for step 2. That’s better, we’re making progress and the horrible spreadsheet part is still off in the distance…
Step 3 – Compare Five terms At A Time
Trends lets you compare up to five terms at a time, so I’ll do this for my first five Arsenal players (of course keeping David Luiz, my reference point, in there).
I’m just going to run through the players who played a decent number of games, starting from the back, so keepers, defenders etc. You can do it however you want and do this for as many or as few players as you want – do the entire squad if you want complete data.
For five terms, it looks like this.
I’m not actually bothered about the graph, I just want to download the data. You’ll notice the download simple top right of the grab above, use it.
We then repeat for other players MAKING SURE TO KEEP THE REFERENCE IN. My second set of players looks like this…
Again, download and repeat for the next batch of players and so on and so forth.
Right, I’m off to grab a coffee, after which I’ll finish downloading the data for this example and see you in step 4.
A Quick Note – Why Not Just Use The Average in the Mini Bar Chart?
Those with sharp eyes might have noticed that for each of these searches there is an average in a mini bar chart on the left hand side.
Why not just use this?
Yeah, I wondered that too the first time I used this approach, having spent hours working out the long-winded way to find an average.
You could use these averages, note them down and then use the reference player to recalculate.
The problem, is that you then have no context. Some terms will have a very steady level of search, few visible peaks or troughs, others will have very low search traffic bar one or two huge spikes. Cedric Soares at Arsenal very much the latter – a big surge when he signed for Arsenal, a second big surge later (if memory serves he scored a bit of a stunner (relative to what you’d expect from a third-choice right-back).
I like to have this ability to see some of the context, to see a little bit beyond the basic average but without having reams of minute-by-minute data.
So that’s why. Also, this post would never rank if it was only 47 words long.
Step 4 – Let’s Get It Lined Up In The Spreadsheet
Step 3 was brutal, wasn’t it. Oh, how I long for the days when teams would use about 14 players all season – in the title-winning season of 1988/89, Arsenal used 17 players all season and four of those made six starts or fewer (but Niall Quinn still gets all over the 1989 stuff despite two league starts. Glory hunter).
Anyway, we now have our lovely data, time to bring it into the spreadsheet.
I’m going to run through with Google Sheets, if you have Excel now is the time to show that when you said you could use it in that job interview that wasn’t actually a massive lie.
I have set up a sheet that has a tab for each stage of what I describe below so you can check we’re on the same page. Describing actions within a spreadsheet isn’t my strength.
Link to the example spreadsheet of Arsenal data
Create your new sheet and then go to import data.
Navigate to where the data was downloaded to and then select the append to existing sheet option, this is far easier than ending up with multiple sheets and having to combine it all…
Then repeat this for each of the downloads from trends – each time choosing append to the current sheet.
Each new set of data will be added below the end of the previous set – it will look like this and we need to sort it to get everything going neatly across the page.
What we need to do is copy each chunk of data so that they line ups with the first set – so the Leno, Martinez etc in the grab above would be lined up with the first set of players, not below. We don’t need to copy the date info each time.
I also leave a blank column between each data set so I can easily see each group – we’ll delete these blank columns later
I would then tidy up names to remove the extra info (English footballer) and it should all look like the tab entitled stage 2 in my example spreadsheet.
Pro Tip: Every name in my data had : (United Kingdom) after it. Rather than manually deleting all the, use find and replace (under edit in the menu). Enter : (United Kingdom) in the find and leave replace blank as we want it to simply delete this text.
Extra Pro Tip: Add a view freeze of one column and one row – this means that as you scroll you can still always see the week and the player name. Use the view menu to achieve this.
Step 5 – Making Each Minor Group Comparable
We now need to multiply some of the data so that everything is a fair comparison.
First though we need to get rid of all the instances of <1 in the data. Given there are also zeroes, there was clearly enough search traffic to register, but not enough to go down as ‘1’. As such, I use the find and replace method mentioned above and change every <1 to 0.5.
Annoyingly, I cannot use the find and replace as that would alter every sheet and so people still reading step four above would get confused by a tweak that hadn’t yet been mentioned.
In every set of data – each of the groups of five players – there will be a value of 100 (possibly more than one if data levels were tied – if David Luiz got a surge of search interest in March that was exactly matched by a search of interest in Hector Bellerin in June. This rarely happens)
In my first group of five, it is Luiz who has 100.
In the second group, it is also Luiz who has 100. These groups are therefore comparable, I don’t need to do any recalibration.
In set three, it’s the same – I suspect this is because I started by looking at the keepers and defenders, not the sexiest of players from a search POV.
Phew – set four is different. In set four, the data for Luiz that has been 100 in every previous group is only 50 (I promise it’s pure luck that the calculation is so easy!). This is because if we scroll up to an earlier date we see it is Granit Xhaka who has a 100, this for when he went a bit crazy at his own fans. Poor Granit, he’d have been the perfect captain for Covid times with no fans present.
Because Luiz’s peak value is halved, this shows that we need to double the value for everyone within this group of five – this is because Dani Ceballos, for instance, is scoring relative to the Xhaka peak of 100 and not the Luiz peak.
An example would be:
Luiz’s peak is 100,000 searches per week
Xhaka’s is 200,000
Ceballos has 30,000 searches at his peak.
In a group where Luiz’s 100,000 was the highest total, this would get 100 and Ceballos would be given 30.
However, because Ceballos is unlucky enough to be in the group with Xhaka, his score of 30,000 is now recorded as 15 (30,000 / 200,000).
By doubling this, we get back to the fair comparison of 30.
Now, I’m sure there’s a neater way of doing this, but here is my method – when you get used to it, it’s pretty quick.
We can’t just multiply the column by 2 where it sits, you have to select a new cell and tell this cell that you want it to be the contents of another cell multiplied by whatever.
I copy this group of data into a new sheet and then just their names a second time – we don’t need to bother with our reference player.
I then need to apply the formula, so selecting the blank cell under Ceballos I would put = then click on the first Ceballos cell with data (the 2) and then the formula. Let’s do it properly which would be /50 *100 as we’re dividing by the peak Luiz score in this example and then multiplying by 100. My formula in cell E2 reads: =A2/50*100
Obviously in this case we could have just done *2 as the maths is pretty simple…
My formula in cell E2 reads: =A2/50*100
Pro Tip: The little grab handle in the bottom right of the cell is your friend. First drag it across to cover all four players and the formula will copy. Then double click it and it will fill out all the data.
Now, select all this data and copy and go back to the tab with all the data.
Selecting the first data cell we need to update (so the first cell of Ceballos in my case) click in this and then choose edit and paste values only – we want the numbers without the underlying formula.
In this instance, it is Aubamayang who records the 100 and the point where Luiz has tended to hit 100 is just 72.
Using the method as above, we do exactly the same, only this time the formula would be =A2/72*100. See, it was good using the formula, don’t know about you but my mental maths struggles with factors of 72.
I then formatted the data just because I had some horrendous numbers with multiple digits after the decimal point – I think going to one digit post decimal point is fine.
Examples – You can see how I worked with the data by going to the tab called adding formulas… and then it is all put together in the tab ‘data recalibrated’.
Step 6 – Finishing Touches
In the tab entitled ‘Neat and tidy’ I have done just that.
I’ve deleted the spare Luiz columns off to the right and the blank columns.
I’ve also gone into format > number and chose an option without decimal points – now we’ve done our recalibration this is fine IMO – anything that is a 0.5 or above will get rounded up, the concern pre calibration was that a 0.5 could actually be a 2 or 3 for instance once it had been multiplied.
I also love some conditional formatting and so I went into the conditional formatting menu and applied a colour heat map so that I can easily spot the high values and get a visual overview for which players were getting search interest and when.
Pro Tip: If applying conditional formatting, play around with the mid point value so that the visual is usable – for this I find you actually want the mid point very low (I have it at six) so that it shows when a player is getting at least some decent search traffic.
This conditional formatting actually looks great if we zoom out – you can very quickly see which players got consistent search traffic, those who got some and those who rarely, if ever attracted searches.
We can also see, at a glance, the players who had one sudden surge (step forward Cedric Soares).
I added in average, min value and maximum value for everyone ( =average (data range); =min(data range) and =max (data range) where for data range you click and drag for that player’s data. Use the grab handle method to copy across.)
Then we have some graphs – plotting every player on the same graph is an almighty mess, so I went for groups by position, making sure that the Y axis had the same max value in every case.
Via the top right menu in any chart, we can publish the chart and embed – great for an intranet or reports – the forwards chart is below.
A quick visible way to see averages, min and max in order would be nice, one handy table that you can reorder.
How to make this chart
(again, with the caveat many of you are sure to be better at spreadsheets than me).
I copied the names lower down the page – copy them all and then use the paste special>transpose option so they go down the page.
Next we need the headers – name, average, min and max.
If we copy the data straight in and try to transpose it will give us an error. We therefore need to copy the data, paste special as a value and then re-copy this and paste and transpose. Then we can delete the first copy we made just to get the values.
Then use the menu button in the chart and publish if you want to embed it on a page.
What Can We Learn From This Data?
The whole process takes a while to implement (especially if you’re turning it into a guide…) but leads to a tonne of insights.
From my Arsenal data, I can see very quickly that:
Pierre-Emerick Aubamayang had by far the highest search interest on average, he averages 60% more search traffic than anyone else and at least 4x as much search traffic as almost anyone in the squad.
Mesut Ozil is a solid second despite never actually playing, a reminder that every stage of his fallout with the club is extremely newsworthy.
A spike in interest does not necessarily lead to longer term increased searched interest. Granit Xhaka and Cedric Soares had spikes in search, Xhaka’s was huge, but two weeks later it had fallen by 90%. Two weeks after Soares’ own surge, he was barely searched for.
Some rankings surprised me – Kieran Tierney was far more searched for on average than Bukayo Saka, a youngster who had a huge breakthrough season. Tierney was solid in search terms all year (and on the pitch), never higher than 31, no score lower than 2.
Being young, English and emerging as a first-team player doe snot necessarily translate to search interest. Alongside Saka, Joe Willock received few searches, ditto Reiss Nelson and Rob Holding.
Matteo Guendouzi surged in search from the moment he fell out with the club hierarchy. As a first-team regular, he averaged 3.5 in out search data. From the moment he had the run-in, his average jumped to 14.
If we were using this within planning we might think that analysis of Aubamayang is always of interest, even if we have analysed his play before. If you’re doing a feature on Arsenal’s attacking play, he is the player that receives search interest.
Editors might be thinking that people are getting bored of Mesut Ozil, being paid a huge amount to not play. This doesn’t seem to be the case, his average is 30 over the past 18 weeks and a fortnight ago he had his highest score (92) all year.
We can also see that although they might be building an exciting young squad, there is not a huge deal of search interest in those youngsters. This isn’t to say there isn’t value producing content on them, just that it won’t be bringing in huge numbers via search. However, cynically work in Aubamyang too – how Arsenal’s young guns are reducing the pressure on Aubamayang and the feature might just get a few thousand more from Google.
The data helps to give a bit more certainty – it can’t tell you what to write or produce, it can’t predict all future trends, but it gives you a general overview of players standings within a squad relative to each other.
Of course, you can also look at global data and see how that compares – in the case of Arsenal I suspect it will be even more lopsided with Aubamayang, Ozil and Luiz dominating. those young English players who weren’t seeing huge searches in the UK are unlikely to be doing particularly well globally.
Finally, Answering a Few Questions
q: are there better tools
a: Almost certainly, but everything used here is free. If you have a large budget that’s a different matter. However, what we have here is great for finding an overview and some interesting data insights
q: how accurate is the data
a: As we’re getting the data directly from Google, accuracy is assured. One small issue though is that they’re not given us absolutes or using decimal points. Assuming it’s using standard rounding, 0.6 and 1.4 would both be a 1 – if this also has to multiplied it could lead to individual data points that have a bit of a margin for error. However, over the course of a year these will have only a time impact overall.
q: Arsenal are a big team – would this work for a smaller club or topic with less search traffic?
a: yes, absolutely. There may be limits, if you try this for a non league club you might get a lot of zeroes in your data, but for a topic of even reasonable interest you can use this method. Also, it will be obvious very quickly if the data is insufficient.
Below, we have Norwich – the process would work just as well for them as Arsenal.
For Cabinet members it would be fine too. I use popularity loosely here.
Or England cricketers (I excluded Ben Stokes as he dominated)
q: This is great as a moment-in-time snapshot, but what about ongoing trends?
a: Rather than trying to add to this spreadsheet (which will lead to data error unless you are very careful as you won’t be comparing like for like), I would suggest creating new versions every few months, either for a year or just the past few months. This way you can see how data is changing and look for longer term trends – who has dropped in interest, who has emerged?
I don’t personally see huge value in having years worth of data in one spreadsheet, anyone who has huge popularity over that timeframe would be an obvious big name. We don’t need to run a tonne of analysis to find that Roger Federer, Lionel Messi or Virat Kohli have enduring appeal.
Once you have created the comparison, use it to spark ideas and also partially inform editorial decisions.
Share the information within the team – this could potentially be a fun way such as asking people to rank who they think the most popular people will be. How does the data match their preconceptions?
Don’t become a slave to the datas but equally don’t ignore what people are proven to search for.
Also – if this has been useful please let me know in the comments or get me on Twitter @jonbarbuti