"It's like an OkCupid for voting" - the Finnish election engines

Posted on 2015-05-11 in General

Have I ever told you about the time I built an "OkCupid for elections" for the communists?

No? That's strange, I tend to get good mileage out of that story during election season. Unfortunately, for the story to make any sense, you'll need a bit of absolutely fascinating background information on how elections work in Finland, and especially how websites that tell people whom to vote for became an integral part of them.

The Finnish electoral system

Like many other countries, Finland has a representative electoral system. The country is divided into 13 districts, and each district elects some number (~15 on average) of representatives to the parliament. The number of representatives in each district is determined based on population, while the distribution of the seats in each district is based on the proportion of the vote that each party got there. The proportional distribution of seats is done using the D'Hondt method.

That's pretty standard fare, at least for continental Europe. The odd bit is in how the distribution of seats is done within each party. There is no predetermined party list at all; the ordering of candidates within a party is determined purely by votes. And what's more, it's not even possible to give a generic vote to a party as a whole; each vote is equally a statement on which party you'd like to win the seats, and on which specific person in that party you'd like to get one of the seats that the party wins.
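To make the mechanics concrete, here is a minimal Python sketch of one district: every ballot names a candidate, party totals are just the sums of those personal votes, seats are split between parties with D'Hondt, and each party's seats go to its candidates with the most personal votes. The party labels, candidate names and vote counts are made up for illustration, and ties are resolved arbitrarily by dictionary order, which a real election would not do.

```python
from collections import defaultdict


def dhondt(party_votes, seats):
    """Split `seats` between parties with the D'Hondt highest-averages method."""
    allocation = {party: 0 for party in party_votes}
    for _ in range(seats):
        # A party's quotient is its votes divided by (seats won so far + 1);
        # the next seat goes to the party with the largest quotient.
        winner = max(party_votes, key=lambda p: party_votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


def elect(ballots, seats):
    """`ballots` maps (party, candidate) -> personal votes in one district."""
    # There is no separate party vote: a party's total is the sum of the
    # personal votes of its candidates.
    party_votes = defaultdict(int)
    for (party, _candidate), votes in ballots.items():
        party_votes[party] += votes

    elected = []
    for party, won in dhondt(party_votes, seats).items():
        # No predetermined list: candidates are ordered purely by personal votes.
        own = {c: v for (p, c), v in ballots.items() if p == party}
        ranked = sorted(own, key=own.get, reverse=True)
        elected.extend((party, c) for c in ranked[:won])
    return elected


# Hypothetical district with three seats (all names and numbers are invented).
ballots = {
    ("A", "Aalto"): 50, ("A", "Berg"): 30, ("A", "Cajander"): 20,
    ("B", "Dahl"): 60, ("B", "Ek"): 10,
    ("C", "Falk"): 35,
}
winners = elect(ballots, 3)
print(winners)
```

In this made-up district Berg gets a seat with 30 personal votes while Falk is left out with 35, because Falk's party as a whole did not earn a seat. That is the "party first, person second" effect of a single vote carrying both meanings.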

This is fundamentally different from systems where voting for a person rather than just the list requires extra effort. It has a couple of sweet advantages, but also a bunch of disadvantages.

The advantages of the system I can think of are:

It decreases the power of party elites by removing from them the decision on how to order a list, and instead giving it to the people who voted for the party.
It encourages demographic diversity in the candidate pool. It makes a lot of tactical sense for every party to have candidates of all ages, of both genders, of different educational backgrounds, and different ethnic groups. Some people won't vote purely based on issues, but for example specifically want to vote for some group they have an affinity for, or feel is under-represented. Each party will want to have a good spread of candidates to make sure that they don't lose any voters due to a lack of a compatible candidate.
This is of course pure identity politics, but somehow it feels a lot less objectionable when the effect is progressive. It's worth noting that while maximizing the demographic diversity of candidates makes sense for each party, the same might not apply to ideology / opinions. Candidates that don't fit into any of the wings of a party will just make it unclear what the party really stands for.

The downsides of the system come mostly from overloading a single vote to mean two things. You vote for a candidate, and the party gets a vote as a side effect.

The meaning of a vote is obfuscated. If your ideal candidate is not in your ideal party, what's the right action? Should you try to ensure that they get elected at the cost of helping the wrong party, or should you vote for a suboptimal candidate in the right party?
I'm pretty sure that the rational choice there is to vote for the party first. The party distribution of the parliament has more impact than the exact distribution of people. And unless you're pretty certain that your preferred candidate is "on the bubble" within the party, it's more likely that a single vote will swing a seat between two parties than it is to swing a seat between two individuals in the same party.
If the more important choice is the party, why is the candidate at the forefront of the decision?
Since some people will ignore the maxim to vote for the party first and person second, it makes sense for parties to run candidates with good name recognition in an attempt to hoover in some of these voters.
So the candidate list will have TV celebrities, beauty contest winners, Olympic medalists and so on. And occasionally they'll even get elected, bumping off "serious" politicians from the list. (Please note that I don't want to imply that the celebrities would automatically be incompetent at that job, that's not true at all. But their actual or perceived level of competence is not what gets them elected, it's their fame.)
I'm pretty sure that there's even some level of "no publicity is bad publicity" in the system as a result. The most obvious recent example is the most notorious corrupt politician in the country (guilty of soliciting for bribes as a minister, and I think the only person expelled from the parliament during my lifetime) getting re-elected once again. That's just the kind of cronyism you'd hope to get rid of by eliminating the party list, but it clearly doesn't always work.
For a candidate who is serious about getting elected, the main competition is the other people in that party in the same district. If as a candidate you don't have any natural name recognition, you have to get it by advertising. This creates a situation where even a nominally party-dominated system can still have an uncomfortably large level of either personal fundraising, or election campaigns being self-funded by the rich.
The problem with campaign fundraising is of course corruption, or the perception of corruption. I give 10k € to your personal election campaign, you after getting elected help in solving some nasty zoning issues that my company has. The last large scale election funding scandal happened only 8 years ago, and implicated a disturbing number of high profile politicians. And the problem with funding a large campaign from your personal wealth is that it ensures that the rich have a much higher chance of getting elected than the poor.
The issue with competition within the party being potentially more important than the competition between parties can also have other odd consequences. Like the bizarre case of two candidates from opposing (and not particularly compatible) parties doing a joint TV advertising campaign. They were in different districts, so neither was personally hurt by increasing the success chances of the other one. But they were directly undermining their own party in the other district.
Even if you're trying to be a "good" voter and make your decision based on something other than fame or budget, how are you actually going to decide which of the 200 candidates in the district or even the 30 candidates in one party to vote for? Nobody has the time to do deep research on all of them.

You might notice that a lot of these issues boil down to what's essentially a discovery problem. If you could somehow solve that and make people aware of candidates that they might want to vote for based on some criteria other than name recognition, surely this would be an absolutely awesome election system?

The election engines

The solution to all these discovery problems that Finland arrived at in the mid-90s was the vaalikone, which literally translates to "the election machine", or a bit less literally to "the election engine". The first one was made by the national broadcasting corporation YLE; in later elections other news outlets added their own versions, followed by the political parties and various kinds of trade / lobbying organizations. There's probably at least a couple of dozen active ones for any given election.

The core concept is simple. The site has a bunch of multiple choice questions. 20 questions is a typical amount. Some questions are related to the hot political topics of the day, some to stale / evergreen topics (why yes, let's ask people about NATO once again), others to cultural values, and yet others to economic ones. A month or two before the elections, the candidates answer the questions, possibly tell how important they consider each issue, and they might even write a bit of text explaining their answer.

During the campaign the voters will use the site to answer the same questions, and get a match percentage with all the candidates in the voter's district. For the really picky users, some of the modern implementations even allow you to restrict the results e.g. only to candidates in a given age range. You can now perhaps see why I use the quote in the title of this post when explaining the idea ;-)

And do these sites matter? Apparently over half the voters use one or more of them. In the 20-30 year old age group it's up to three quarters. And the results aren't just ignored. One study says 40% of the users had the results from the site affect their voting behavior in some way. Another says that a sixth outright voted for the top rated candidate. These numbers are not insignificant.

(Note: I am aware that similar sites exist in other countries. But my impression is that they are nowhere near as central to the process. If I'm wrong about it, I'd love to hear about counterexamples.)

The problem with the concept

So the issues have been solved, right? Use the recommendations from the website to narrow the candidate pool down to a handful, do deeper research on those candidates, and select the best one. Informed democracy has been saved!

Unfortunately no. While the concept is simple, I don't know if it really should be used for anything more than entertainment.

The most critical problem is that the algorithm, the question selection, the question phrasing, and even the reporting UI have a huge effect on the results. It's no different from e.g. the way polling results could be distorted by these kinds of methodological things. But unlike polls, these tools are directly intended to affect voting behavior.
One party or candidate can easily be the best match on one site and a horrible match on another. Someone from the Green party can be branded as the most right-wing MP in the country based on the results. Filtering the results to just show extremely left-wing politicians might show up several people from the conservative party due to (presumably) some kind of uninitialized data. These are not hypothetical examples, but events that actually happened.
As an example on the power of the UI, a site that specially highlights candidates from the best matching party (no matter how small the margin is to the 2nd best match) is going to give a dramatically different view from a site that has a single sorted list of candidates.
Reliance on these kinds of tools puts quite a lot of unauditable power into very few hands, and feels distinctly anti-democratic even if you assume that no malice is involved. Some people might argue that a similar if not greater amount of power is placed in the hands of e.g. the people moderating televised election debates. The difference is that at the end of a debate nobody gives you a personalized recommendation on what your opinion should be. The viewer would need to apply at least some thought to the process, rather than trust a number with too many significant digits that was spit out by an ostensibly neutral website.
The results are often flaky and inconsistent. False positives are fine; you shouldn't be just blindly voting for the single top candidate with no thought at all (even if some people are). But that strategy doesn't work with false negatives.
While a voter has an incentive to answer the questions honestly, the candidates do not. Nobody will get raked over the coals if their views happen to evolve between answering, and actually having to vote on an issue, except for a handful of the most contentious issues. Answering a multiple choice question simply does not appear to count as an election promise. (Note: This year one of the sites phrased a few of their questions in the form of explicit yes / no votes on some potential bills; that seemed like a clever idea).
It is an interesting question how exactly an unscrupulous candidate should "optimize" their answers to maximize their chances of getting elected, or if they could do it at all. Most likely the problem is completely intractable without insider knowledge of the algorithms and the answers of other candidates. If you assume that can't happen, the worst you have to deal with is just plain political dishonesty and trying to avoid unnecessary controversy.
The whole idea of trying to compute a match between a voter and a party based on the candidates of that party is fundamentally flawed. The candidates are not the party. The party-wide match is computed based on the aggregate matches of the candidates with the voter. But the full pool of candidates is irrelevant. What actually matters is the pool of people who get elected, and that is obviously not yet known at this time!
A cynic might say that even that's not important, and what really matters are the opinions of the party leader and their coterie, but if you go down that road you might as well just vote randomly.
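The point about party-wide matches can be sketched in a few lines of Python. This assumes the aggregate is a plain average, which is only a guess at what the sites do; the candidate names, match percentages and election outcome are all invented.

```python
from statistics import mean

# Hypothetical match percentages between one voter and a party's candidates.
candidate_match = {"Aalto": 82, "Berg": 78, "Cajander": 41, "Dahl": 75}

# What an engine can compute before the election: an aggregate over everyone
# who is running (assumed here to be a simple mean).
party_match_all = mean(candidate_match.values())

# What the voter ends up with: only the candidates who actually win seats.
# This set is unknown before election day; one possible outcome is assumed.
elected = ["Cajander"]
party_match_elected = mean(candidate_match[name] for name in elected)

print(party_match_all, party_match_elected)
```

With these invented numbers the party looks like a 69% match on the results page, while the one person it sends to parliament is a 41% match.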

All this is perhaps better illustrated by a real world example.

This year one of the major election engine sites produced a 72-77% match for me with six of the parties, and significantly lower ratings for the remaining two "serious" parties. When I redid the test a few weeks later, the results for those parties were 70-79% in roughly the same order; so the margin of error just from the uncertainty in my answers was at least a couple of percentage points.

Now, this list of six top matches included the conservatives, the xenophobes, the greens, the social democrats, the agrarian centrists, and the language-centric Swedish people's party. All of them were rated as basically equally good within the margin of error, despite some of them being polar opposites. It's just absurd that all these parties spanning most of the ideological spectrum could be equally acceptable to me. I have no idea of what algorithm was used, but it was certainly a good way to give the impression that my vote doesn't matter since every party is the same.

And the UI issue I mentioned earlier is relevant here too. One of those six almost equally good matches was promoted an order of magnitude more than the others on the result page, based on a difference of just a couple of percentage points. And as it happens, that party was reported as an abysmal 50% match on another major site.

Anecdotally, this isn't just a useless idea. It's dangerously deceptive at least as currently implemented.

How the sausage was once made

Oh yes, I promised to tell a story!

The year is 2003. Our first startup had been recently acquihired for what was effectively peanuts compared to our big dotcom bubble dreams. But hey, anything beats not making payroll next month.

One of our customers was the 4th largest party in the country, created in the early 90s from the merger of two struggling communist parties. At that time it was expected for each party to have an election engine on their website, with answers just from their candidates. Yes, a party-specific election engine is a pretty stupid idea. Doesn't matter, since everyone else will have one too. And someone had sold them one for way too cheap, maybe as a sweetener for a website redesign done a bit earlier.

Especially given how rough web development was in 2003 compared to now, the amount of time budgeted was pitiful. I don't remember the totals, but when broken down it'd be like half a day for a CRUD app to gather answers from the candidates, a couple of hours to make some kind of ranking algorithm, a couple of hours for design, and so on.

As was often the case with these underbudget one-off projects, it was given to me and I rolled a quick flat-file Perl special for the data collection and polling. But I had no good ideas on the ranking algorithm. So I explained the problem in a mangled way to a statistician friend, probably stopped listening to the answer too soon, and ended up with the impression that a normal (Pearson) correlation between the answers of a candidate and voter would be appropriate; just map the answers from strongly disagree / disagree / don't care / agree / strongly agree to a 1-5 linear scale, and normalize the -1 to 1 results to a 0 - 100 range instead.
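The scheme described above can be reconstructed roughly as follows. This is a Python sketch rather than the original Perl, and the function names, the exact answer strings and the sample answers are my own placeholders.

```python
from math import sqrt

# Map the five answer options to the 1-5 linear scale.
SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "don't care": 3,
    "agree": 4,
    "strongly agree": 5,
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError when either side gave the same answer everywhere.
    return cov / sqrt(var_x * var_y)


def match_percentage(voter_answers, candidate_answers):
    """Rescale the correlation from the -1..1 range to 0..100."""
    xs = [SCALE[a] for a in voter_answers]
    ys = [SCALE[a] for a in candidate_answers]
    return (pearson(xs, ys) + 1) / 2 * 100


voter = ["agree", "strongly agree", "disagree", "don't care"]
mirror = ["disagree", "strongly disagree", "agree", "don't care"]

print(match_percentage(voter, voter))   # identical answers
print(match_percentage(voter, mirror))  # exactly mirrored answers
```

Identical answer sets land at the top of the range and exactly mirrored ones at the bottom; everything interesting (and everything wrong with the approach) happens in between.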

In reality it wasn't appropriate for a number of reasons. The answers were ordinal and non-linear. The results would be undefined if someone answered every question the same way. Due to the linear transformation property of the correlation computation, someone who answered every question with various degrees of "disagree" could be a pretty decent match with someone who answered everything with various degrees of "agree". And it'd be susceptible to overweighting certain issues if the set of questions wasn't constructed very carefully, but included many questions that correlated closely with each other. (IIRC the questions were supplied by the party HQ, so the chances of them having been carefully constructed to be non-correlating are pretty damn low).

But never mind that. I was young, stupid, ignorant of statistics, and just happy that I had something working so quickly. Time to test the system! So I took it out for a spin in the Helsinki voting district. Top match 45%. Try another district. And another, and another. The results were horrible everywhere. But at least it worked superficially, so I sent an email to my coworkers and asked them to kick the tires a bit.

Everyone in the same office, my old startup friends, reported similar results.

We can't possibly ship something like this! They're going to cancel their contract if we give them an election engine that gives at best 50% matches and just loses them votes. So we started whiteboarding ways to nudge the numbers in the right direction without being too obvious. Maybe normalize to a 0 to 1 range first, take a square root, and then normalize to the 0 to 100 range? So 50% becomes around 70%. Is that too aggressive?
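For the record, the whiteboard idea amounts to the following transformation (a Python sketch of something that was only discussed, with a made-up function name):

```python
from math import sqrt


def nudge(match_percent):
    """Inflate a 0-100 score: rescale to 0-1, take the square root, rescale back."""
    return sqrt(match_percent / 100) * 100


for raw in (0, 25, 45, 50, 100):
    print(f"{raw:3d}% -> {nudge(raw):5.1f}%")
```

The endpoints stay where they are and everything in between is pulled upwards, so the 45% top match from Helsinki would have been displayed as roughly 67%.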

But in the middle of this brainstorming, before we made any changes, the project manager walked into our office. "Guys, you need to change the site. The match percentages are way too high. This will be really embarrassing, it looks like the numbers have been cooked."

Turns out that even if we weren't exactly Ayn Rand-quoting liberalists, entrepreneurial 20-somethings weren't quite the right people to use as test users for the ex-communist party website, while the middle aged mother of three in the office next door had a slightly different viewpoint. After we stopped laughing about the situation, we concluded that the algorithm must be about right, and shipped it. (There might have been a couple of kludges added on top, like inserting a non-integer dummy vote into each answer set, to guarantee some variance so that the results would at least always be defined).

I guess it must have worked ok, since we never heard any complaints. And it being a party-specific election engine, the odds are that it did not materially affect the voting behavior of anyone let alone the final election results. It was only a few years later that I properly understood what a shoddy system we'd created, and how irresponsible building it was.

Even if I might not really like the results I get from the major election engine sites, they must surely be doing a better job than this. But it's still hard to shake off the feeling that automating away a basic human right is the wrong solution to the problem.
