Python Governance Electoral System

taleinat · October 25, 2018, 9:27pm

At Brett’s suggestion, I’m breaking out the discussion about which electoral system / vote tallying algorithm to use into this separate topic.

A few of us, myself included, have voiced concerns over the use of IRV for selecting a governance model. In the big picture of the process of choosing a new governance model, this is just a small technical detail, yet some feel strongly that it should be properly considered and chosen. I also suspect that many simply don’t have the time/energy to do the required reading and take part in this discussion, but would have a preference if given a clear choice.

Let’s try to come to a concrete agreed-upon suggestion quickly so that we may propose it instead of IRV; otherwise IRV will remain the chosen method due to there not being another widely supported alternative. Brett has suggested a deadline of October 30th for a poll on this subject to have been completed.

My reservations regarding IRV are not theoretical. The results of the 2009 Burlington mayoral election differed from what either a plurality or a Condorcet electoral system would have produced. There was much controversy about the results, and soon after, Burlington replaced the IRV method.
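To make that kind of divergence concrete, here is a minimal sketch in Python. The ballot counts are invented for illustration and are not the Burlington figures; the option names and the three helper functions are likewise hypothetical. It assumes every ballot ranks every option and breaks elimination ties arbitrarily.

```python
def plurality_winner(ballots):
    """Most first-choice votes wins."""
    tally = {}
    for count, ranking in ballots:
        tally[ranking[0]] = tally.get(ranking[0], 0) + count
    return max(tally, key=tally.get)

def irv_winner(ballots):
    """Instant runoff: drop the option with the fewest first choices
    until some option holds a majority of the ballots."""
    remaining = {opt for _, ranking in ballots for opt in ranking}
    while True:
        tally = dict.fromkeys(remaining, 0)
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked surviving option.
            top = next(opt for opt in ranking if opt in remaining)
            tally[top] += count
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > sum(tally.values()):
            return leader
        # Ties for last place are broken arbitrarily in this sketch.
        remaining.remove(min(tally, key=tally.get))

def condorcet_winner(ballots):
    """Option that beats every other option head-to-head, or None."""
    options = {opt for _, ranking in ballots for opt in ranking}

    def prefers(x, y):
        return sum(c for c, r in ballots if r.index(x) < r.index(y))

    for x in options:
        if all(prefers(x, y) > prefers(y, x) for y in options - {x}):
            return x
    return None

# Hypothetical electorate of 100 voters: (number of voters, ranking).
ballots = [
    (40, ["Right", "Center", "Left"]),
    (35, ["Left", "Center", "Right"]),
    (25, ["Center", "Left", "Right"]),
]

print(plurality_winner(ballots))  # Right
print(irv_winner(ballots))        # Left
print(condorcet_winner(ballots))  # Center
```

In this made-up profile, "Center" would win any one-on-one contest but is eliminated first under IRV because it has the fewest first-choice votes.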

Additionally, I’ll quote @dstufft from the main governance voting process topic:

Over a decade ago, Ka-Ping Yee (who used to be very active in Python development) ran some visual voting simulations on 5 popular systems, which scared him (& me) away from IRV forever:
http://zesty.ca/voting/sim/
The following images visually demonstrate how Plurality penalizes centrist candidates and Borda favours them; how Approval and Condorcet yield nearly identical results; and how the Hare method yields extremely strange behaviour. Alarmingly, the Hare method (also known as “IRV”) is gaining momentum as the most popular type of election-method reform in the United States (in Berkeley, Oakland, and just last November in San Francisco, for example).

Personally, I don’t feel strongly about which alternative method is chosen. I’m also far from an expert on the subject.

That being said, I like @njs’s suggestion to “Copy whatever Debian does.”, which means using the Schulze method. I find it very convincing that it is used by many prominent open-source projects and organizations, including Ubuntu, Gentoo, Haskell, KDE and OpenStack in addition to Debian.

The only negative about the Schulze method is that it is hard to understand. The Wikipedia page just includes a complex algorithm described in mathematical notation (relatively simple math, but certainly above a “high-school level”), and a hard-to-follow example. We should consider, though, that Brett specifically wrote:

I personally would strongly suggest something simple to explain, e.g. Borda

If we can present a simpler explanation of the rationale and workings of the Schulze method, I think it could be a good option.

pitrou · October 25, 2018, 9:39pm

Borda, Approval and Condorcet all look fine to me. I agree simplicity is a benefit, which would advantage Borda and Approval (IMHO).

njs · October 26, 2018, 5:17am

The only negative about the Schulze method is that it is hard to understand. The Wikipedia page just includes a complex algorithm described in mathematical notation (relatively simple math, but certainly above a “high-school level”), and a hard-to-follow example. We should consider, though, that Brett specifically wrote:

The core idea of all Condorcet-based methods is very simple: voters rank their choices like A > B > C. The algorithm then runs all the pairwise elections (“What if we had a vote between A and B? What if we had a vote between A and C? What if we had a vote between B and C?”). It assumes that if you rank A higher than B, then it means that in an election between A and B, you’d vote for A. So that part’s very simple. And the winner is whichever option would win every possible pairwise election. (E.g. A wins if a majority of people prefer A to B, and a majority also prefer A to C.)

However, in theory, a pathological thing can happen: a majority prefers A to B, and a majority prefers B to C, and a majority prefers C to A. So the majority’s preferences are: A > B > C > A.

This is kind of mind-bending, but it can actually happen with reasonable-looking individual votes. For example:

One third of people prefer A > B > C
One third of people prefer B > C > A
One third of people prefer C > A > B
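The pairwise-election idea and the cycle above can be checked with a few lines of Python. This is only a sketch: the function names and the `(count, ranking)` ballot format are assumptions made for the example, and every ballot is assumed to rank every option.

```python
def pairwise_wins(ballots):
    """For each ordered pair (x, y), count the voters who rank x above y.

    `ballots` is a list of (count, ranking) pairs; `ranking` lists the
    options from most to least preferred.
    """
    wins = {}
    for count, ranking in ballots:
        for i, x in enumerate(ranking):
            for y in ranking[i + 1:]:
                wins[(x, y)] = wins.get((x, y), 0) + count
    return wins

def condorcet_winner(ballots):
    """Return the option that wins every pairwise election, or None."""
    options = sorted({opt for _, ranking in ballots for opt in ranking})
    wins = pairwise_wins(ballots)
    for x in options:
        if all(wins.get((x, y), 0) > wins.get((y, x), 0)
               for y in options if y != x):
            return x
    return None

# Hypothetical electorate with a clear collective preference.
clear = [(5, "ABC"), (3, "BCA"), (1, "CAB")]
# The three-way split listed above, with one voter standing in for each third.
cycle = [(1, "ABC"), (1, "BCA"), (1, "CAB")]

print(condorcet_winner(clear))  # A
print(condorcet_winner(cycle))  # None
print(pairwise_wins(cycle))
```

For the cyclic electorate, every option wins one pairwise contest 2 to 1 and loses another 2 to 1, so no option beats all the others and the function reports `None`.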

A few observations:

I can’t find any evidence that one of these pathological cycles has ever occurred in any real election.
In the pathological cycle case, there is a very real sense in which the electorate simply does not have a single overall preference. It’s not exactly a tied vote, but it’s conceptually similar.
All the complexity on that Wikipedia page comes from the method that Schulze uses to try to produce a single winner even when there is a cycle like this.
AFAIK, everyone agrees that so long as you don’t hit a pathological cycle, Condorcet methods are great. If you don’t hit a cycle, then they pass every voting criterion, and there’s no incentive to do tactical voting.
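For readers curious about what the extra Schulze machinery does, here is a rough Python sketch of the strongest-path computation. It assumes complete rankings and uses the number of winning votes as the strength of each pairwise link, which is one common choice rather than the only one. Real deployments such as Debian's add further rules that are not modeled here, and the function name and ballot format are invented for the example.

```python
def schulze_winners(ballots):
    """Return the options that are unbeaten on strongest paths.

    `ballots` is a list of (count, ranking) pairs. More than one name in
    the result means the method itself cannot separate them.
    """
    options = sorted({opt for _, ranking in ballots for opt in ranking})

    # d[x][y]: number of voters who rank x above y.
    d = {x: {y: 0 for y in options} for x in options}
    for count, ranking in ballots:
        for i, x in enumerate(ranking):
            for y in ranking[i + 1:]:
                d[x][y] += count

    # p[x][y]: strength of the strongest path from x to y.
    # Start from direct pairwise wins only.
    p = {x: {y: d[x][y] if d[x][y] > d[y][x] else 0 for y in options}
         for x in options}
    # Widest-path variant of Floyd-Warshall: a path is as strong as its
    # weakest link, and we keep the strongest path found so far.
    for i in options:
        for j in options:
            if j == i:
                continue
            for k in options:
                if k != i and k != j:
                    p[j][k] = max(p[j][k], min(p[j][i], p[i][k]))

    return [x for x in options
            if all(p[x][y] >= p[y][x] for y in options if y != x)]

# Hypothetical profiles.
no_cycle = [(5, "ABC"), (3, "BCA"), (1, "CAB")]      # A beats B and C directly
uneven_cycle = [(4, "ABC"), (3, "BCA"), (2, "CAB")]  # A>B 6-3, B>C 7-2, C>A 5-4
even_cycle = [(1, "ABC"), (1, "BCA"), (1, "CAB")]    # perfectly symmetric cycle

print(schulze_winners(no_cycle))      # ['A']
print(schulze_winners(uneven_cycle))  # ['A']
print(schulze_winners(even_cycle))    # ['A', 'B', 'C']
```

Without a cycle the result is just the Condorcet winner. With an uneven cycle the weakest link (here C over A, 5 to 4) is effectively overridden, and with a perfectly even cycle all three options remain tied.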

So… in a sense the only time Condorcet methods are complicated or have any theoretical problems is when they violate “In the face of ambiguity, refuse the temptation to guess”.

So personally:

I’d be fine with using pure Condorcet, and simply declaring that in the unlikely event we hit the pathological case, we treat that as a tie and do another round of discussion or flip a coin or whatever our preferred tiebreaker is. I don’t think it will happen. And if it does, then at least the vote will have accurately revealed that we don’t have a well-defined collective preference, rather than picking one arbitrarily.

I’d be fine with using Schulze, because I don’t think we’ll hit the pathological case, and then it’s equivalent to pure Condorcet.

I’m not a fan of Borda. It ends up acting very like first-past-the-post: if there are two front-runners, then the rational thing to do is to hold your nose and rank the one you like better as first, and the one you like least as last, regardless of your actual preferences.

I’m also fine with approval voting. Conceptually, it’s doing something slightly different than ranked methods: it’s trying to find an option that the most people can live with, rather than trying to find an option that’s most-liked by a majority.

At the sprint, I brought up approval voting, and Raymond was against it on the grounds that it will tend to compromise and he thinks that’s bad. OTOH the reason Donald suggested SCORE in the other thread was because it tends to compromise like approval voting, and he thinks that’s good ¯\(ツ)/¯. Personally I don’t really have a preference; I think either general approach is fine, and it’s unlikely to make much difference. My main preference is just that we use a system where we don’t feel obliged to do weird tactical calculations in order to get a good result, and both approval and Condorcet methods seem OK to me on those grounds. I guess some people dislike approval because for options where they’re wishy-washy they find it hard to set their threshold, though, which I can sympathize with; it’s similar to my desire not to have to do weird tactical calculations.

An advantage of approval voting (and the reason I brought it up in the first place at the sprint), is that in addition to producing a winner, it also lets you say “and look, 70% of devs approved on this winner”, i.e. you can measure absolute support as well as relative support. I think this is a nice property, since it increases the overall legitimacy of the vote and it means that when reporters come asking we can cite this number to prove how unified we all are now that we’re past the election.
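As a small illustration of that "absolute support" number, here is a Python sketch of an approval tally. The ballots are invented and the function name is hypothetical; each ballot is simply the set of options a voter approves of.

```python
def approval_tally(ballots):
    """Count how many ballots approve of each option."""
    tally = {}
    for approved in ballots:
        for option in approved:
            tally[option] = tally.get(option, 0) + 1
    return tally

# Ten hypothetical voters.
ballots = (
    [{"A", "B"}] * 4
    + [{"B"}] * 3
    + [{"A"}] * 2
    + [{"C"}]
)

tally = approval_tally(ballots)
winner = max(tally, key=tally.get)
share = 100 * tally[winner] // len(ballots)

print(f"{winner} wins, approved by {share}% of voters")
```

Here the winner is approved on 7 of 10 ballots, so the same tally yields both the winner and a headline support figure.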

Debian has a clever trick that lets them get a similar advantage in their elections, even though they don’t use approval voting: they put a “Further discussion” item on the ballot, and it always loses by some huge landslide margin. This is probably even better at producing a nice number you can quote at journalists (“look, not everyone voted for this option as their first choice, but 95% of devs like it better than continuing to discuss”). It’s also nice because if there’s an option you think is really bad, like it would actively harm the project to pick, then you can express that by ranking it below “Further discussion” (and in Debian, sometimes individual options do end up below “Further discussion”). So if we don’t use approval voting, then I suggest we add a “Further discussion” option to the ballot.

brettcannon · October 26, 2018, 10:06pm

We specifically came to the conclusion at the dev sprints that we did not want a tie which would lead to more discussion. We all want this resolved and another discussion will just drag this out, especially in the face of two choices being so evenly split among us. We need a way to break ties and it needs to not be dragged out. And this becomes more crucial the higher the probability that a tie occurs …

… like with approval voting. This is why I don’t love this approach as I personally want to minimize any tie-breaking as it is simply going to get messy if people’s preferred choice isn’t chosen in the tie-break (which could happen if e.g. trio loses but commons supporters vote consistently for two or more of the four options). IOW I would potentially strategically not vote for ones I think might bring out a tie (which is why potential strategic voting with Borda doesn’t bother me as it’s going to happen regardless of the system we choose).

I have the exact opposite reaction. I don’t want to see this drag out. With our current timelines we have a chance to have this all settled with (potentially) elected positions all settled by Feb 1. If we have to go through this all again then we can only make that happen if we really time-box a second round of discussion plus vote to only a month (which I don’t know would happen in the face of a tie), otherwise we will go past March which seemed like a date people were happy with trying not to go past. So for me avoiding a tie is a plus in a voting system.

njs · October 27, 2018, 4:08am

Sure, any system can have ties, so we should have some plan for what to do. I am not nearly as worried about having a second round of discussion as everyone else seems to be (what else do we ever do in OSS except have long discussions?), but whatever, my point here is just that having a pathological cycle in a Condorcet vote is pretty similar to a tie, so if we have a plan for ties then one option is to extend it to cover this case too. As far as I can tell, genuine ties are more common than pathological cycles (I’ve heard of examples of the former, but not the latter), so I don’t think this would have any material effect on the probabilities.

I’m afraid I don’t understand what you’re saying here. In Condorcet methods, leaving options off your ballot is like voting for them equally, which seems like it will increase the chance of ties. For approval voting, it seems like the situation is symmetric – if there’s a tie, then flipping your vote from approve → disapprove will break it, and flipping your vote from disapprove → approve will also break it, whichever one you started with.

Like I explained, I think “Further discussion” is good for reasons that have nothing to do with it actually winning :-).

Anyway… “we should leave this option off the ballot because I don’t want it to win” is kind of anti-democratic, isn’t it? I can’t imagine it winning, since AFAICT an overwhelming majority of devs want this to be finished ASAP. But in the hypothetical world where an actual majority of Python devs think that all the available options are so bad that it’s better to keep discussing further, then surely you don’t want to unilaterally override the majority? At some point we have to trust each other to do the right thing.

pitrou · October 27, 2018, 9:47am

+1 for a “Further discussion” option. This will allow us to measure how certain the community is of its choice.

steve.dower · October 27, 2018, 2:03pm

I’m +0 on a further discussion option.

If we were having any discussion prior to taking the vote, I’d be +1, but since we’re basically faced with silence right now I think a “further discussion” option would only result in further delays with no benefit.

tim.one · October 27, 2018, 6:04pm

From what I’ve seen of people who earnestly simulate millions of elections across dozens of election methods and sets of assumptions, STAR does best overall in maximizing most desirable measures (nothing beats “plurality” for simplicity, but it’s hard not to beat plurality on any other measure - although Borda manages to do far worse if everyone is “strategic”).
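The point about Borda and strategic voters can be illustrated with a tiny Python sketch. The electorate below is made up, the function name is hypothetical, and the scoring (n-1 points for first place down to 0 for last) is the usual textbook Borda count.

```python
def borda_scores(ballots):
    """Borda count: with n options, a ballot gives n-1 points to its first
    choice, n-2 to its second, and so on down to 0 for its last."""
    scores = {}
    for count, ranking in ballots:
        n = len(ranking)
        for position, option in enumerate(ranking):
            scores[option] = scores.get(option, 0) + count * (n - 1 - position)
    return scores

# Hypothetical electorate: A and B are the front-runners, C is a weak option.
sincere = [(51, "ABC"), (49, "BAC")]
# Same voters, but B's supporters insincerely rank A last ("burying").
strategic = [(51, "ABC"), (49, "BCA")]

for name, ballots in [("sincere", sincere), ("strategic", strategic)]:
    scores = borda_scores(ballots)
    print(name, scores, "winner:", max(scores, key=scores.get))
```

With sincere ballots A wins 151 to 149. Once B's supporters move A to the bottom, A drops to 102 points and B wins with an unchanged 149, even though a majority still ranks A first.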

IRV beats plurality in general, but in turn is beat by just about everything else.

Since I doubt anything will change here, I’ll just plant a seed: the little-known newish 3-2-1 system is easy to explain yet remarkably robust against manipulation. You give each candidate one of “good”, “OK - meh”, “bad” (like nerdy +1, 0, -1 but with friendlier names).

Stage 1: Retain only the three candidates with the highest number of “good” votes.

Stage 2: Of the three remaining, eliminate the one with the highest number of “bad” votes.

Stage 3: Of the two remaining, the winner is the one ranked higher on more ballots.

It’s fiendishly clever, and partly for the same reasons as STAR (whose last step is the same): “strategic” voters can lie to influence the first two stages, but, if they do, they’re likely to lose their true preferences in the last stage. Each ballot is used in each stage, but for a different purpose in each (“keep the most loved”, “toss the most hated”, “pick the most preferred”).
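The three stages can be sketched in Python as follows. This follows only the description given in this post; the function name, the ballot format, and the sample ballots are invented for illustration. It assumes at least three options, that every ballot rates every option, and that ties are broken by alphabetical order, which the description above does not specify.

```python
RANK = {"good": 2, "ok": 1, "bad": 0}

def three_two_one(ballots):
    """Run the three stages on ballots that map each option to a rating."""
    options = sorted(ballots[0])

    def count(option, rating):
        return sum(1 for b in ballots if b[option] == rating)

    # Stage 1: keep the three options with the most "good" ratings.
    semifinalists = sorted(options, key=lambda o: count(o, "good"),
                           reverse=True)[:3]

    # Stage 2: of those three, drop the one with the most "bad" ratings.
    most_rejected = max(semifinalists, key=lambda o: count(o, "bad"))
    finalists = [o for o in semifinalists if o != most_rejected]

    # Stage 3: of the two left, pick the one rated higher on more ballots.
    x, y = finalists
    x_pref = sum(1 for b in ballots if RANK[b[x]] > RANK[b[y]])
    y_pref = sum(1 for b in ballots if RANK[b[y]] > RANK[b[x]])
    return x if x_pref >= y_pref else y

# Ten hypothetical ballots over four options.
ballots = (
    [{"W": "good", "X": "ok", "Y": "bad", "Z": "bad"}] * 4
    + [{"W": "bad", "X": "good", "Y": "good", "Z": "ok"}] * 3
    + [{"W": "bad", "X": "good", "Y": "ok", "Z": "bad"}] * 2
    + [{"W": "bad", "X": "ok", "Y": "bad", "Z": "good"}]
)

print(three_two_one(ballots))  # X
```

In this example Z is cut in stage 1 (one "good" rating), W is cut in stage 2 (six "bad" ratings), and X beats Y in stage 3 because seven ballots rate X above Y and none rate Y above X.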

And part of the cleverness is that a clear motivation for each stage makes intuitive sense at first glance. Which is a desirable criterion I’m not sure I’ve ever seen spelled out: that people intuitively expect a method to do well after minimal effort to digest what “the rules” are. For example, Schulze is horrible on that count.

barry · October 28, 2018, 12:36am

I vote for letting Tim decide

willingc · October 28, 2018, 7:41pm

Just to clarify @tim.one, your proposed system has one vote, not three? The three stages are completed using the results from the one vote. Correct?

willingc · October 28, 2018, 7:54pm

That works for me.

I would prefer to skip the “further discussion” option. We’ve already had a long period for discussion (3 months at this point). It’s important to try to decide the new governance model in December so we enter the new year with the goal of implementing the governance. Entering 2019 with undecided governance calls into question the dev community’s ability to make a decision and move forward.

All of the options on the table are reasonable and could work with commitment from the committers to do so. Also, none will be a perfect governance since humans have a hard time coming to unanimous viewpoints. Let’s do our best on selecting one of those on the table and move forward iteratively on implementation.

tim.one · October 28, 2018, 9:40pm

Oh yes! In 3-2-1, everyone fills out one ballot, once. They rate each proposal “Good”, “OK”, or “Bad”. Then they’re done.

The 3 “stages” just describe how the completed ballots are analyzed.

In general, all schemes that try to mitigate the distorting effects of “strategic” voters only allow voters to speak once, at the start, before any tallying is done or partial results are revealed. Cuz manipulators gonna manipulate. That is, if there were three stages of voting, that would give strategic voters three chances to skew the results.

taleinat · October 29, 2018, 7:00am

I like the proposed 3-2-1 method. Barry and Carol seem to as well.

Since we should be bringing this discussion to a close, let’s see if we can agree on 3-2-1 as our suggested electoral system:

3-2-1
Further discussion (of which electoral system to use)


pitrou · October 29, 2018, 9:13am

I don’t know why we would use a “little-known newish” voting system while there are perfectly acceptable tried and tested voting systems. Obviously the new system looks perfect because it wasn’t tested in the wild (or not much), so its defects and caveats are not known. That doesn’t sound like an advantage to me. Would you prefer a “little-known newish” cryptographic mechanism over a tried and tested one with known caveats?

pitrou · October 29, 2018, 10:30am

(OT, but the way this discussion is currently going makes me a bit skeptical about the claimed advantages of Discourse as decision-making aid)

ambv · October 29, 2018, 12:47pm

Tal’s vote only has two options: take the 3-2-1 method or discuss further. This is missing the default of “let’s keep things as they are” and the only concrete option proposed in the poll is not a notable voting process. As @brettcannon said, it looks like the people concerned about the voting process are outnumbered by the silent majority.

As such, I consider Tal’s poll to be a way for opponents of IRV to rally around a single counter-proposal (3-2-1).

I strongly suggest that if support for that single counter-proposal is not clear by end of October 30 AoE, we run the vote as currently expressed in PEP 8001.

3-2-1

While I am very curious about the 3-2-1 method myself, it is alarming that the best source of information on it is currently a Quora answer. It does not sound like this is a notable, proven method (yet). Singling it out as the sole concrete option in the poll is surprising to me, especially given the Arrow theorem brought up on python-committers by @Alex_Martelli. But if this is the counter-proposal you’re rallying around, that’s fine by me.

Tick tock

I think it is already risky to fiddle with PEP 8001 this close to the vote. But it would be unacceptable to make changes to the voting process within two weeks of the vote. That means we would have to push the vote to 2019.

Keep in mind this is also distracting from discussing the actual proposals.

Q: “But I want this poll to decide the voting method, period!”

A: Well, the poll as stated above is not compatible with this want. At the sprint, out of 25+ committers there, we’ve had a self-selected group of nine that decided on IRV. Thus, a late vote to overturn the selected voting mechanism should identify first and foremost whether there is enough consensus to do it. So at the very least the vote should have three options:

Keep PEP 8001 as is;
Replace IRV with 3-2-1;
Postpone the vote to 2019 and discuss PEP 8001 further.

Stating it in such terms would make it more clear how big the opposition actually is. And that “keep bikeshedding” has a price.

tim.one · October 29, 2018, 4:49pm

As I said, I didn’t expect anything to change - I was “planting a seed”. I take large-scale voting simulations seriously, and 3-2-1 has done very well on those. The number of wholly transparent real-life elections whose details are accessible is tiny, so if you want “real life experience” you’re going even more on faith. What the world has most experience with is “plurality”, which sucks just as much as voting simulations predict. On the rangevoting site, you can dig to find real-life examples showing that IRV actually does, at times, deliver results as bizarre as Ka-Ping Yee’s voting simulations (or any number of others’ simulations) said it would. Etc.

The seed I’m planting is the idea that it’s possible to have a robust scheme whose rules are easy to grasp. I bet you understood every step of 3-2-1 on first (at worst, second) reading. That’s worth a whole lot. In my experience, people who aren’t “election nerds” generally can’t do an IRV calculation themselves, not because it’s particularly difficult, but because they don’t actually understand what “the rules” are. They just think they do. Schulze? Forget it.

STAR is also easy to grasp, but you first have to digest the idea of “adding scores” instead of just “counting votes”.

But we do, every time a new crypto standard is released. There is no “real world experience” with them at first, but there’s a world of theory that’s been vetted by crypto experts. For voting systems, there are simulations, and a world of formal properties that can be analyzed in advance (Arrow’s theorem - depending on which version you bump into first - names only about 5 of them - there are many more that can be analyzed).

Only election nerds care about the latter, though. For whatever reason, I’m attracted to systems that people really do understand with minimal effort.

tim.one · October 29, 2018, 5:03pm

It’s the only source I happened to link to. Here’s more

There are many “impossibility theorems” concerning voting systems. Arrow’s is one. No voting system ever mentioned here evades even one of them - that’s why they’re called “impossibility” theorems. The link I just gave above spells out how 3-2-1 fares on 7 of the commonly analyzed formal properties.

But who cares? Election nerds. Given the impossibility of satisfying all intuitively desirable properties simultaneously, I’m more interested in how simple a method is to understand and how well it does in large-scale simulations.

I said at the start I didn’t expect anything to change. I’m not pushing that at all for this election. Just “planting a seed”.

taleinat · October 29, 2018, 10:20pm

To be frank, I feel out of my league here. I’m new in this community, far from an expert on the topic, and find myself unsure how to proceed.

I created the poll here after seeing support from two members for the “3-2-1” suggestion. If there was a consensus here I meant to create a poll “IRV vs. 3-2-1” by sending to python-committers in addition to posting on Discourse, but that’s not happening.

It doesn’t seem that there could be a method we agree upon to suggest in place of IRV, at least within the short given time frame. I guess this means IRV remains the chosen method by default, which makes me sad.

I can only hope the vote results will be so clear-cut that the voting method ends up having been a non-issue in hindsight.

tim.one · October 29, 2018, 11:49pm

Tal, you did fine! I didn’t expect anything to change, but the poll was a good idea just in case there was a hidden groundswell of support. I might have started one myself, but had no idea it was possible to create a poll.

I said before (probably not here) that I’m fine with using IRV for this election. Problems start in earnest when political factions start manipulating the system. Since that’s not the case here, I’d even be fine with using straight plurality here (“pick one - most votes wins”). We’d probably get the same result regardless.