Every year, the Eurovision Song Contest brings with it fresh accusations that the results are affected more by politics than music. But how much of the outcome is in fact determined by mathematics?
On Sunday the Independent ran a story reporting on a ‘voting controversy’: the Austrian entry (who won the overall contest), despite being placed first in the UK’s results, received fewer telephone votes from the British public than the Polish entry. This is because a country’s popular vote is taken alongside the deliberations of a jury of music industry insiders to determine their overall rankings. The ‘story’ was triggered by the fact that the Eurovision organisers have this year taken the creditable step of releasing the rankings of each country’s jury and public votes on their website (in universally-loved .xls format). These figures reveal a rather mundane truth behind the ‘news’: Austria came 3rd with the judges and 3rd with the public, while Poland came 1st with the public but dead last with the judges. Since ameliorating the more eccentric tendencies of a public vote is presumably the reason the juries exist in the first place, this ‘controversy’ turns out to be somewhat of a non-story. The story was repeated in The Times and The Guardian.
But a salient fact was missing from all the coverage. Given that this is a story explicitly about how the public’s and the judges’ opinions combine to give the final result, it seems strange that no mention is made of the mechanism by which this is done. Indeed no explanation was given during the contest itself. Some of the news reports and the contest itself spoke merely of a “50% split” between the two methods. Even the jaunty video provided on the Eurovision website (below) mentions nothing except “combining your country’s televote and jury vote”. It’s as if everybody considers it self-evident how this is done, or that it’s not even a question, and ‘combining with a 50% split’ constitutes a full explanation in itself.
A very brief explanation of the contest for anyone unfamiliar with its delights: twenty-six countries compete in the final, and the winner is decided on the judgements of forty-odd countries who vote for which of the 26 they prefer (if they are one of the 26 they vote between the 25 others).
When slightly more detail on the system is provided, it’s mentioned that it’s the rank orders of the panel and public results that are ‘combined’. Getting a ranking from the phone voting is easy, though it throws up some issues: for instance, a bald rank order makes no distinction between a winner who garners 80% of the vote and one who squeezes into first with 10%. Generating a ranking from the five judges’ opinions is more complicated, but we’ll gloss over it for now since it’s just a more complicated version of the main problem: what does it mean to ‘combine’ the two rankings into one final result?
There is no single method for combining two orderings of preference into one list that reflects both sets of preferences equally. Indeed it is in general impossible to come up with a method that will never throw up a troubling result under one of various differingly contrived circumstances.
Having found no mention of a methodology after almost four minutes of dedicated Googling, I decided to have a crack at reverse-engineering the system from what results are detailed on the website. For each country, the rankings from each juror are provided, along with the combined judges’ ranking after the individual rankings have been passed through the Euromatic Patented Ranking Aggregator Machine (EPRAM). The phone rankings are also given, as well as the overall final ranking after EPRAM has combined the final judges’ ranking with that of the phone vote.
From a brief glance at the website it’s easy enough to break through the conspiracy of silence and determine that Eurovision uses, as you may have guessed, what I’ll call the ‘Strictly Come Dancing protocol’. The sub-rankings for the 25 or 26 countries being judged are considered as numbers from 1 to 25 or 26, and for each country the mean of their sub-rank values is taken — the overall rankings are taken from these averages. So a country placing 3rd with the judges and 8th with the public has an “average rank” of 5.5 and is beaten by a country taking both the fifth-place spots.
Country | UK Judges’ Ranking | UK Phone Vote Ranking | “Mean Rank” | Overall UK Ranking |
---|---|---|---|---|
Austria | 3 | 3 | 3 | 1 |
Malta | 1 | 5 | 3 | 2 |
The Netherlands | 7 | 2 | 4.5 | 3 |
Sweden | 5 | 8 | 6.5 | 4 |
Finland | 2 | 11 | 6.5 | 5 |
Spain | 4 | 10 | 7 | 6 |
Iceland | 15 | 4 | 9.5 | 7 |
Denmark | 13 | 6 | 9.5 | 8 |
Greece | 14 | 7 | 10.5 | 9 |
Switzerland | 9 | 13 | 11 | 10 |
Poland | 25 | 1 | 13 | 11 |
Hungary | 12 | 14 | 13 | 12 |
Norway | 11 | 17 | 14 | 13 |
Russia | 10 | 18 | 14 | 14 |
Slovenia | 6 | 23 | 14.5 | 15 |
Ukraine | 18 | 12 | 15 | 16 |
Romania | 22 | 9 | 15.5 | 17 |
Azerbaijan | 8 | 24 | 16 | 18 |
Germany | 16 | 20 | 18 | 19 |
France | 23 | 15 | 19 | 20 |
Belarus | 19 | 19 | 19 | 21 |
Italy | 17 | 22 | 19.5 | 22 |
Armenia | 24 | 16 | 20 | 23 |
San Marino | 20 | 21 | 20.5 | 24 |
Montenegro | 21 | 25 | 23 | 25 |
This is, I would guess, the system that most people would naturally implement if they were asked to combine two sets of rankings, since it starts with the intuitively obvious step of adding together the two numbers you’re given. I speculate that the Eurovision organisers, and the producers of Strictly Come Dancing, did not even consider that alternative systems were possible.
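As a rough check on that reverse-engineering, the mean-rank rule can be sketched in a few lines of Python. The data is copied from the UK table above; the function name `mean_rank_order` is made up for illustration, and the phone-vote tie-break is only what the table appears to imply, not a documented rule.

```python
# (judges' rank, phone-vote rank) for the UK, copied from the table above.
UK_RANKS = {
    "Austria": (3, 3), "Malta": (1, 5), "The Netherlands": (7, 2),
    "Sweden": (5, 8), "Finland": (2, 11), "Spain": (4, 10),
    "Iceland": (15, 4), "Denmark": (13, 6), "Greece": (14, 7),
    "Switzerland": (9, 13), "Poland": (25, 1), "Hungary": (12, 14),
    "Norway": (11, 17), "Russia": (10, 18), "Slovenia": (6, 23),
    "Ukraine": (18, 12), "Romania": (22, 9), "Azerbaijan": (8, 24),
    "Germany": (16, 20), "France": (23, 15), "Belarus": (19, 19),
    "Italy": (17, 22), "Armenia": (24, 16), "San Marino": (20, 21),
    "Montenegro": (21, 25),
}


def mean_rank_order(ranks):
    """Order countries by mean rank; ties go to the better phone rank (assumed)."""
    def key(country):
        judges, phone = ranks[country]
        return ((judges + phone) / 2, phone)
    return sorted(ranks, key=key)


overall = mean_rank_order(UK_RANKS)
print(overall[:5])
```

With the table's figures, this ordering matches the “Overall UK Ranking” column row for row, which is at least consistent with the guess about how ties are broken.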
The SCD protocol might sound reasonable enough, but it’s worth considering in a bit more detail. Taking the mean is something you do with numbers that denote some quantity. It’s not really meaningful to average a set of ordinal numbers. Certainly Stanley Smith Stevens would not be happy with you if you did.
It also fails to account for the importance we place on the different placings within an ordering. Ask a man who won £10 yesterday and £70 today if he’d have been equally happy getting £40 both days and I’m sure he probably would. Ask a runner who came 1st yesterday and 7th today if she’d be happy with two fourth-place finishes and I doubt you’d get the same answer. First place means you’re the best. Fourth and seventh are both just also-rans.
We can see this at work in the UK results. Austria wins the UK vote with a 3rd and a 3rd, while Malta gets the runner-up spot with a 1st and 5th. (This is a tie by the “mean ranks”; ties are evidently broken by the phone vote. So much for a 50/50 split.) Likewise Finland came 4th among the panel of judges despite gaining four top-3 placements from the five-strong group, more than any other country. I challenge you to find a person who would say that 1st-2nd-3rd-3rd-13th is not a better set of results than 2nd-2nd-3rd-4th-9th.
Compare this with the system I thought up in thirty seconds while reading the Independent’s story. The countries are ranked according to their best rank out of the public and judges’ orderings, with ties then broken by whose lower rank is best. So Malta and Poland come top for each winning one of the sub-contests, but Malta’s 1st/5th pips Poland’s 1st/25th. Austria finishes 5th in the UK ranking, instead of first: a big difference from changing a system you may not have realised even needed to be chosen.
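For anyone who wants to try the best-rank idea on the UK table, here is a minimal sketch. The function name `best_rank_order` is hypothetical, and the data is again copied from the table above.

```python
# (judges' rank, phone-vote rank) for the UK, copied from the table above.
UK_RANKS = {
    "Austria": (3, 3), "Malta": (1, 5), "The Netherlands": (7, 2),
    "Sweden": (5, 8), "Finland": (2, 11), "Spain": (4, 10),
    "Iceland": (15, 4), "Denmark": (13, 6), "Greece": (14, 7),
    "Switzerland": (9, 13), "Poland": (25, 1), "Hungary": (12, 14),
    "Norway": (11, 17), "Russia": (10, 18), "Slovenia": (6, 23),
    "Ukraine": (18, 12), "Romania": (22, 9), "Azerbaijan": (8, 24),
    "Germany": (16, 20), "France": (23, 15), "Belarus": (19, 19),
    "Italy": (17, 22), "Armenia": (24, 16), "San Marino": (20, 21),
    "Montenegro": (21, 25),
}


def best_rank_order(ranks):
    """Order by each country's better placing, then by its worse placing."""
    def key(country):
        pair = ranks[country]
        return (min(pair), max(pair))
    # Python's sort is stable, so any remaining ties keep their input order.
    return sorted(ranks, key=key)


order = best_rank_order(UK_RANKS)
print(order[:5])
print("Austria finishes in place", order.index("Austria") + 1)
```

Sorting on the pair (better placing, worse placing) treats the judges and the public symmetrically, since it never looks at which of the two a placing came from.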
Note that this system is also a ‘50/50’ split of the judges’ and public opinions. It also all but eliminates ties (only two countries with mirror-image placings, such as 1st/5th and 5th/1st, could still draw level) and better reflects the cachet placed on finishing first. It certainly has its own problems, and the existing system could be repaired to iron out some of its flaws. You could, for instance, assign points values to the different ranks, with bigger points differences at the top than the bottom, and add up these scores. You might recognise this idea, since this is what Eurovision does to combine the individual country rankings into the final result. Indeed, the irony is that the scoring system at this final stage, with its douze points and nul points, is perhaps the most famous aspect of the contest. But elsewhere, this little piece of mathematics is either glossed over or assumed not to exist in the first place. And it could have a big impact on the final outcome.
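To make the points idea concrete, here is a sketch that borrows the familiar 12, 10, 8, 7, 6, 5, 4, 3, 2, 1 scale and applies it to the two UK sub-rankings. This is purely hypothetical: Eurovision only uses that scale at the final stage, and both the function names and the phone-vote tie-break are assumptions made for the example.

```python
# (judges' rank, phone-vote rank) for the UK, copied from the table above.
UK_RANKS = {
    "Austria": (3, 3), "Malta": (1, 5), "The Netherlands": (7, 2),
    "Sweden": (5, 8), "Finland": (2, 11), "Spain": (4, 10),
    "Iceland": (15, 4), "Denmark": (13, 6), "Greece": (14, 7),
    "Switzerland": (9, 13), "Poland": (25, 1), "Hungary": (12, 14),
    "Norway": (11, 17), "Russia": (10, 18), "Slovenia": (6, 23),
    "Ukraine": (18, 12), "Romania": (22, 9), "Azerbaijan": (8, 24),
    "Germany": (16, 20), "France": (23, 15), "Belarus": (19, 19),
    "Italy": (17, 22), "Armenia": (24, 16), "San Marino": (20, 21),
    "Montenegro": (21, 25),
}

# Points for 1st to 10th place; anything lower scores nothing.
POINTS = [12, 10, 8, 7, 6, 5, 4, 3, 2, 1]


def points_for(rank):
    """Convert a 1-based rank into points on the scale above."""
    return POINTS[rank - 1] if rank <= len(POINTS) else 0


def points_order(ranks):
    """Order by total points (highest first); ties go to the better phone rank."""
    def key(country):
        judges, phone = ranks[country]
        return (-(points_for(judges) + points_for(phone)), phone)
    return sorted(ranks, key=key)


by_points = points_order(UK_RANKS)
print(by_points[:5])
```

On these figures Malta's 1st and 5th (18 points) edges out Austria's pair of 3rds (16 points), so even this modest repair changes the UK's winner.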
Since there was a spreadsheet just sitting there, I decided to calculate the revised rankings if the system I outlined above were to be used for aggregating the individual judges’ rankings and then for combining that with the public vote. Disappointingly for my incipient career as an investigative journalist, the top 4 remains unchanged. Unsurprisingly Poland are the biggest beneficiary, their strong showing with the public shooting them up from 14th place to 5th.
Here is my spreadsheet, in case you fancy a fun hour of checking my SUMIFS.
“So a country placing 3rd with the judges and 8th with the public has an “average rank” of 5.5 and beats a country taking both the fifth-place spots.”
I have read this sentence five times and still I don’t understand. If a country takes both the fifth-place slots then its average rank must be 5. The order is based on ascending average rank, so the country taking both the fifth-place spots must beat the country with an average rank of 5.5, mustn’t it?
You’re right, this should have read “is beaten by”, now fixed.
Hi,
I disagree with you. You are only taking into account the positive marks. Why not the negatives? Why should Poland (and not Finland, as you write in the twelfth paragraph) rank first (and not second, as you write, because ties are broken by televote, meaning your first table is wrong) if the jury thinks it’s the worst performance?
Furthermore, diaspora voting wouldn’t be reduced with your method, in contrast with the current one.
I can see your point, and it’s interesting. Perhaps the best and worst marks from the jury should be eliminated, as in some Olympic competitions such as diving.
In my opinion, the simpler, the better.
regards
Indeed, I didn’t mean to imply that my suggested method is necessarily superior, merely that it’s a valid alternative and that the current method should in no way be considered the ‘default’ or only option.
The table and commentary beneath did indeed have the phone and jury votes transposed, this should be fixed now.