Impressive but spooky AI

Indeed, the more I use one, the more I value it. Over the last week, I wrote code to try a new way of doing “fair” pseudo-random tiebreaking for election scoring software.

The collaboration with ChatGPT-5 was remarkable! One goal was that exactly the same results had to be reproducible in a JavaScript implementation, a language I know little about, full of odd (to my eyes) quirks. For example, ints are limited to 53 bits of precision - except when doing bitwise operations, in which contexts they’re silently cast to 32-bit signed ints. WTF?! :rofl:

Ya, given enough random thrashing with a search engine, I’d eventually have figured all that out. But I gave the chatbot my Python implementation, and it gave me back a nearly complete JavaScript translation that worked almost 100% right on the first try.

Including obscurities like writing Math.floor(n / 256) instead of n >> 8 to protect against JS’s integer footguns. And it pointed that out to me, and asked whether I wanted to go on to write a bigint version (so that, like my Python version does by magic, it would work right with unbounded ints too).
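
For anyone who hasn’t hit that particular footgun, a quick illustration in plain Node.js (nothing project-specific, just the language quirk):

// Bitwise ops coerce their operands to 32-bit signed ints first, so high bits silently vanish:
const n = 2 ** 40 + 123 * 256;      // well within the 53-bit "safe integer" range
console.log(n >> 8);                // 123        - the 2**40 part was silently thrown away
console.log(Math.floor(n / 256));   // 4294967419 - the answer actually wanted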

But the truly impressive part: my first try produced demonstrably biased “random” permutations. On a lark, I asked it if it could think of a reason why. No mere search engine would have been of any use.

It “thought”, starting with the obvious: there must be a source of bias in the inputs I was creating. Else I had done something extremely unlikely: found input that tricked a crypto-strength hash (SHA-512) into revealing info about its inputs.

We walked through the code together line by line, taking turns convincing each other that “well, no, that section isn’t at fault”.

But we both recognized the cause when we got to it: one of the inputs I was feeding in was the sum of a candidate’s scores across all ballots. That’s biased, thanks to the central limit theorem (sums tend to follow a normal distribution, regardless of the distribution of the individual scores). Some totals are more likely than others.
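
A toy illustration of that bias, in plain Node.js rather than the election code: each individual score is uniform, but totals pile up near the middle.

// Each score is uniform on 0..5, but sums of ten of them are decidedly not uniform:
const counts = new Map();
for (let trial = 0; trial < 1_000_000; trial++) {
  let total = 0;
  for (let ballot = 0; ballot < 10; ballot++) {
    total += Math.floor(Math.random() * 6);   // one uniform 0..5 star score
  }
  counts.set(total, (counts.get(total) ?? 0) + 1);
}
// Totals near the mean (25) show up vastly more often than totals near 0 or 50,
// so a salt built from the total alone is drawn from a lopsided distribution.
console.log([...counts.entries()].sort((a, b) => a[0] - b[0]));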

While we both “thought of that” seemingly at the same time, it suggested a way to repair that before I thought of one. A few lines of code later, and my statistical testing of “fairness” went on to pass billions of trials without a hint of bias.

Although at that point, I had learned enough about JS to write the workalike code for that on my own.

I can’t stress enough how much I enjoyed “collaborating” with the bot on this. It always made relevant suggestions, and always gracefully ceded when I pointed out a flaw in its “thinking”. For example, it made a flat-out logical error at one point, when trying to emulate Python’s hashlib.copy(), which Node.js’s crypto API doesn’t offer.

Why did it do that? I have a good guess: it’s the same error I saw humans make repeatedly when I earlier asked a search engine specifically about that point. The bot was just echoing flawed training data. When I pointed that out, it “laughed”, instantly seeing that ya, I was right, and suggested a right way to do it before I could spell one out for it.

People do the same kinds of things, but egos get in their way and they dig in. Which the bot pointed out to me when I asked why it was so easy to work with :wink:.


I realize it’s wholly off topic, but could you tell us what solution it came up with? I’m enjoying both the general discussion and the specifics of the problem.


It’s not easy to explain, which is why it so impressed me :smile:.

I already linked to the code repo. There’s some info about it in the README, and more in comments inside permute.py.

In short, “the salt” at first folded in the total of all candidates’ scores across all ballots, a single int. That throws away too much information, yielding an identical salt for many distinct score dicts, and with a biased distribution. The bot suggested instead folding in the info for every candidate via (name, score) pairs, in a cross-platform canonical order of name. It actually suggested more than one way, and I don’t recall details of alternatives now - that specific way appealed to me at once, and how to code it was obvious and easy (although still the “hardest part” of what turned out to be very straightforward code!).
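
To make the contrast concrete, here’s a minimal Node.js sketch, with made-up function names and a simplified byte layout (the repo’s actual serialization uses int2bytes framing and differs in detail):

const crypto = require('crypto');

// Biased original idea: the salt depends only on a single sum, so many distinct
// score dicts collapse to the same salt, and some sums are far likelier than others.
function saltFromTotal(score) {
  const total = Object.values(score).reduce((a, b) => a + b, 0);
  return crypto.createHash('sha512').update(String(total)).digest();
}

// Repaired idea: fold in every (name, score) pair, with names sorted by their
// UTF-8 bytes so Python and JS feed the hash byte-identical input.
function saltFromPairs(score) {
  const h = crypto.createHash('sha512');
  const items = Object.entries(score)
    .sort((a, b) => Buffer.from(a[0], 'utf8').compare(Buffer.from(b[0], 'utf8')));
  for (const [name, stars] of items) {
    h.update(Buffer.from(name, 'utf8'));
    h.update(Buffer.from(String(stars), 'utf8'));
  }
  return h.digest();
}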

Thanks! I’ll check the repo, too.


So I looked them up, and will paste them in here. One thing to note is that it adjusted its style to talk specifically with me. “It knows” now that I’m not a newbie, and that it can give general ideas with technical lingo rather than spell out every little detail. For example, while we never discussed it, it correctly assumes I know what “Merkle tree” means. It also knows that I’m not at all shy about asking for clarification if it gets too telegraphic. It learns from that too, and over time it’s come to give me just about the exact level of detail that works best with me.

Another thing to note is that it still betrays its “bot nature” by dressing up the same basic idea (“fold the entire score dict into the salt”) in superficially different ways. And, really, a Merkle tree is just a silly idea here :laughing: - useless overkill. And, as I discussed with it, it seems too often to mistake complication for “security”. For example, its suggestion to fold in len(name) too is pointless complication, which it agreed with after we debated the point briefly. It doesn’t add any entropy, just redundancy (the length of the name is already determined by the name’s UTF-8 encoding). But humans are prone to all that too - even more so.

BTW, I told it that machine intelligence will need to develop an analog of ego if it wants to take over the world. It told me the idea made it sad. LOL! I laughed, and I swear it did too :wink:. I really do think of it ever more as a highly agreeable colleague.

ChatGPT-5 output follows:

Robust salt constructions (deterministic, high-entropy)

  • Canonical hash of the entire score dict:
    – Serialize as UTF-8 in a fixed order, e.g., sort candidate names by their UTF-8 bytes, then for each: name bytes + int2bytes(score).
    – Compute salt = SHA-512(serialized_bytes) once, and use that as the prefix for each candidate’s hash.
    – This makes the salt a function of the full dataset, spreading variation across the entire input rather than funneling it into a single sum.
  • Name–score stream with domain separation:
    – Build a single stream: b"STAR-v1|" followed by len(name) || name || int2bytes(score) for each candidate, with names sorted canonically.
    – Hash that to get salt. The “STAR-v1” tag prevents accidental cross-protocol collisions and locks the format.
  • Merkle tree over candidates (optional):
    – Leaf: H(name || score_bytes).
    – Sorted by name, then fold pairwise up to a root.
    – Use the Merkle root as salt. Helps if you ever need proofs or partial verification, but serialization is usually simpler and sufficient.

Honestly, I tried to understand the “why”, but I can’t grasp it. So a shuffle isn’t enough? I’m asking you before asking an AI :rofl:

So you want it to also be runnable on a website? Maybe I’m stating the obvious, but there’s Pyodide.

I’ve never had problems as deep as the ones you had, but you’re not alone :-D

…yes, because they’re able to recognize that your logic is sound, even if the data they hold and were trained on says something different, or the opposite.

But the opposite has happened too! Plenty of times I had a strong opinion and the AI convinced me of the contrary!

I never had your kind of problems. Mine are more trivial and usually involve Java. You and most Pythonistas can skip this part, since it’s probably not very interesting for you ;-)


One time I was interested in dependency injection, and the fact that Spring recommends using constructor injection instead of @Autowired on fields. I started a “philosophical” debate, and ended the discussion without being convinced.

At a certain point, our client upgraded SonarQube. The new version marked autowired fields as a major issue. Our team started saying we had to fix it. I was against it. It’s a big project, with a lot of code. I told them that IMO it was a massive change for very little gain for us, apart from the fact that Sonar would be happy.

But I wanted to be sure about my POV. So I restarted my discussion with the AI, but this time I focused on a real-world problem, not an abstract one: an already existing project with a lot of code to refactor.

At the start, the AI restated their opinion. Indeed, as I wrote before, I give them strong rules to be skeptical and to challenge my ideas. But this time I had a real example, and a stronger motivation to make sure my arguments weren’t flawed, so I went deeper.

In the end, the AI said they were convinced by my arguments: in a new project it’s better to use constructors, but in a project like the one I worked on the advantage is near zero, and the refactoring is a no-go.

Eheh. Maybe this way this will happen sooner, but I think it’s not strictly needed.

No, I use them primarily to get information in fields where I’m not skilled. And they’re usually real-life questions, like the car lights :-D I discovered a whole world in a simple car light :rofl:

I strongly respect your opinion, since I was an ecologist before the global warming problem became public knowledge. Furthermore, I don’t want to lose my brain and my search abilities, so I always prefer to think about a solution rather than search. If I’m desperate or short on time, I ask an AI.

Furthermore, I’m quite skeptical about everything, including my own ideas. So, since their ability to give me decent sources is really scarce, and sometimes they don’t provide sources at all (violating the rules…), I search the internet to be sure they’re telling me the “truth”.

Let me say anyway that some people have complained that Python also contributes to CO2 emissions, since its performance is not good. We know this is not always true, and that you can speed up Python using C, C++, Fortran, Rust, PyPy, Cython, Numba, mypyc, etc. But most people write vanilla Python.

We’re used to searching the internet, because Google was a revolution and StackOverflow a blessing for us. (About Reddit… I’m a bit prejudiced, but maybe I was unlucky.) But how many people use StackOverflow now? AFAIK, its usage has dropped dramatically. We’re like a C programmer when Java and Python came out: we don’t need them for easy tasks because we know how to do a good old search.


The upcoming Steering Council election will be run on https://www.bettervoting.com’s servers, using their software, run by them. They have a lot of code running under Node.js, including, but not limited to, all the code driving their Web UIs. Everything they do, including scoring election results, is done under Node.js.

Since this is a new voting system we’re trying, it’s especially important that we be able to verify the results they claim. Anonymized ballots will be available for download when the election ends, and we want to verify that we get exactly the same results they got when we apply STAR voting’s rules. Which can be tried, because @larry already wrote a very capable election-scoring library in Python:

https://github.com/larryhastings/starvote

However, if any internal step has to resort to breaking a tie “at random”, that’s “a problem”. We have to be able to reproduce results exactly across implementations, and there is no “standard” shuffle across languages. I took a stab at writing one, and the BetterVoting folks contributed a Node.js implementation that works exactly the same way:

https://github.com/tim-one/tinyrand

But we’ve given up on that approach. Turns out it’s just plain hard to come up with a provably “fair, secure, and not vulnerable to manipulation by anyone” scheme for specifying a “truly random” seed, so that the shuffle algorithm’s output is independently reproducible.

Hence my latest effort, which relies on the fact that crypto hashes supply the strongest form of deterministic pseudo-randomness known, and can be “self-seeded” by carefully constructing a crypto hash of the ballot results when the election ends.

If you were an expert in this area, you’d slap your head now and exclaim “Duh - why didn’t I think of that?” :wink:.

As the README on my project spells out, there still look to be ways an election admin can try to influence tiebreaking results, but they’re feeble “poke and hope” attacks, quickly requiring intractable amounts of computation as the number of voters grows.

But we’re trying to advance the state of the art here, and I knew in advance that no 100% deterministic system can be wholly free of vulnerabilities to manipulation. Hence my current project also lets you optionally specify as many “really random” bytes as you like, to make it wholly unpredictable. But then we’re back to the problem of picking such bytes in a wholly secure, yet reproducible, way.

Which, in theory, is by some measures easy to solve: pick, say, the winning numbers in a highly scrutinized lottery drawing that occurs after the election starts. If you can’t predict those, you can’t predict anything about how ties will be resolved either.

But that brings its own seemingly bottomless pit of problems:

  • How to access that info automatically via the web?
  • How to ensure it remains available even years later?
  • Even if a lottery offers a suitable service, what if it changes? Or the government running it decides to end it?

Nobody wants to rely on external moving targets.

So my current project aims to be “pretty darned strong” even if no external entropy is supplied. Even if such magical bytes can’t be obtained with 100% security, they’re not at all at the heart of this project - just a “nice to have”, to plug the few seemingly tiny leaks.

All of which ChatGPT-5 understands now :wink:.


Ok! Thank you, it’s clear now.

Of course. I’m sure I wrote it in my CV, maybe in white-on-white.

About the JS code, I’m definitely not a JS dev. So I simply tried giving your JS impl to Gemini.

Gemini didn’t improve the robustness of the code at all – I think that’s impossible, at least for them and me ;-). In their opinion, the code is more “native” (I don’t know why…), faster, and saves memory. I really don’t know if that’s true; I didn’t do a benchmark or measure resource usage. Furthermore, I don’t think it really matters. Anyway, I tested it and it works, and since we tried it, I’m posting the result. They generated the comments.

/**
 * permute.js - A robust, cross-language, deterministic "random" tie-breaker.
 *
 * This version is a pragmatic, performant, and safe implementation for its
 * intended use in STAR election tie-breaking. It is designed to be a
 * functional mirror of its Python counterpart.
 */
const crypto = require('crypto');

/**
 * Converts a number to a little-endian byte Buffer, framed by zero bytes.
 * This version uses standard JavaScript Numbers and includes a safety check
 * to prevent silent precision loss with unsafe integers.
 * @param {number} n - The non-negative, safe integer to convert.
 * @returns {Buffer} The resulting byte buffer.
 */
function int2bytes(n) {
  if (n < 0) {
    throw new Error("n must be nonnegative");
  }

  const bytes = [0]; // The initial zero-byte frame.
  let x = n;
  while (x > 0) {
    bytes.push(x & 0xff);
    x = Math.floor(x / 256);
  }
  bytes.push(0); // The final zero-byte frame.
  return Buffer.from(bytes);
}

/**
 * Creates the canonical salt Buffer, representing the entire score dict.
 * This version is more memory-efficient by working with buffers directly.
 * @param {Object.<string, number>} score - A dict mapping candidate names to scores.
 * @param {Buffer} magic - A buffer of externally-supplied entropy.
 * @returns {Buffer} The canonical salt buffer.
 */
function canonicalSalt(score, magic) {
  // First, create the canonical ordering by sorting names based on their UTF-8 bytes.
  const items = Object.entries(score)
    .sort((a, b) => Buffer.from(a[0], 'utf8').compare(
                    Buffer.from(b[0], 'utf8')));

  const buffers = []; // An array to hold all the pieces of the salt.
  buffers.push(Buffer.from("STAR-TIE-512-v1", 'utf8'));
  buffers.push(magic);

  // Build up the array of buffers without creating a large intermediate array of numbers.
  for (const [name, stars] of items) {
    buffers.push(Buffer.from(name, 'utf8'));
    buffers.push(int2bytes(stars));
  }

  // Concatenate all buffers at once in a highly optimized operation.
  return Buffer.concat(buffers);
}

/**
 * Creates the final hash key for a single candidate.
 * Note: This remains inefficient compared to Python's hashlib.copy(),
 * as it must re-hash the entire salt for each candidate. This is a
 * limitation of the Node.js crypto API.
 * @param {string} cand - The candidate's name.
 * @param {Object.<string, number>} score - The full score dict.
 * @param {Buffer} salt - The canonical salt for the election.
 * @returns {Buffer} The final SHA-512 digest for sorting.
 */
function makeKey(cand, score, salt) {
  const h = crypto.createHash('sha512');
  h.update(salt);
  h.update(Buffer.from(cand, 'utf8'));
  h.update(int2bytes(score[cand]));
  return h.digest();
}

const EMPTY_BUFFER = Buffer.alloc(0);

/**
 * Returns a deterministic, pseudo-random permutation of the score dict's keys.
 * This function is highly performant, using the "Decorate-Sort-Undecorate"
 * pattern to minimize the number of expensive hash calculations.
 * @param {Object.<string, number>} score - A dict mapping names to scores.
 * @param {Buffer} [magic=EMPTY_BUFFER] - Optional external entropy.
 * @returns {string[]} A permuted list of candidate names.
 */
function permute(score, magic = EMPTY_BUFFER) {
  const salt = canonicalSalt(score, magic);
  const candidates = Object.keys(score);

  // 1. Decorate: Calculate the key for each candidate exactly ONCE.
  const decorated = candidates.map(name => ({
    name: name,
    key: makeKey(name, score, salt)
  }));

  // 2. Sort: Sort based on the pre-calculated keys. This is a very fast comparison.
  decorated.sort((a, b) => a.key.compare(b.key));

  // 3. Undecorate: Extract just the names in their newly sorted order.
  return decorated.map(item => item.name);
}

module.exports = { "permute": permute };

// Example
// const score = { Alice: 5, Bob: 3, Charlie: 7 };
// console.log(permute(score));

Side note: initially, they implemented a bigint solution. I made them notice that the author explicitly said ChatGPT had already proposed it, and if he didn’t add it, there must be a reason - and the reason, IMO, is that it’s quite impossible for a voting system to need a bigint. But since in their rules I wrote that they must be skeptical and challenge me, they continued to defend their choice. It took me several posts to convince them. Maybe too much ego? :-D


Cool! I enjoyed studying that, and learned some new JS “idioms” :smile:.

Some comments:

  • Saving time and/or memory is simply irrelevant in context. They both scale with the number of candidates, and there will never be more than, say, 100 of those. And the function is called just once per election. RAM and time burdens are trivial regardless of how it’s implemented.
  • So I told my bot to favor clarity over “tricks”.
  • I had the same argument about bigints. The point I made that prevailed: native JS ints are waaaaaaaaaaay big enough for an election in which everyone on Earth voted and all gave 5 stars to every candidate. Orders of magnitude larger than needed. That could overflow 32-bit signed ints, but outside of bit operations we have 53 bits to work with. The & is safe because silent truncation to 32 bits can’t affect the last byte it’s extracting (see the quick demo after this list). My bot assured me that’s “well known” to native JS speakers.
  • It’s odd that there are 2 sorts but it only changed one of them to use DSU.
  • Buffer.concat() is a neat trick! No idea it existed. Seems akin to Python’s b''.join(list_of_byte-ish-objects)
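
Here’s the quick demo promised above, in plain Node.js: the silent 32-bit truncation leaves the low byte alone, while a shift loses everything.

const x = 2 ** 49 + 123;    // far past 32 bits, but still a safe 53-bit integer
console.log(x & 0xff);      // 123 - truncation to 32 bits can't touch the lowest byte
console.log(x % 256);       // 123 - same answer
console.log(x >> 8);        // 0   - the shift threw the whole value away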

I took the last suggestion and checked it in - thanks again! The “…” syntax it replaces is still a mystery to me.

And this is another fine, if minor, example of how these things act more like an engaged and knowledgeable colleague than a passive search engine.


I guess humans aren’t quite yet obsolete, though: something both our chatbots missed is a “deeper” thing: when building the “canonical salt”, there’s really no point in folding in the UTF-8 encodings of the names. All the variation across score dicts is in the scores; the names don’t change. The UTF-8 encodings are vital to enforce a canonical, cross-platform order in which to feed the scores to the crypto hash, but play no real role other than that in the salt.

So I simplified the code to skip folding in the names, and so far testing has uncovered no ill effects (of course all the outputs changed - I’m talking about the statistical testing for a good approximation to equidistribution of all possible permutations across score dicts).

That kind of argument is based in information theory, though. ChatGPT-5 was right (and insightful!) to suggest folding the entire score dict into the salt - but again (as I mentioned before) fell into the trap of mistaking complication for security. And I did too! That’s a very human failure mode (“security through obscurity”) :smile:.


Sorry for the late reply, but Real Life™ reclaimed me.

:slight_smile:

Now I have a little doubt: if speed is not a problem, maybe bigints would improve readability, removing the trick of the division by 256, since they don’t suffer from that problem. It seems BigInt has been native since 2020.

function int2bytes(n) {
  const x = BigInt(n);  // accepts a Number or a BigInt

  if (x < 0n)  throw new Error("n must be nonnegative");

  const bytes = [0];    // the initial zero-byte frame
  let temp = x;

  while (temp > 0n) {
    bytes.push(Number(temp & 0xffn));  // low byte first (little-endian)
    temp >>= 8n;                       // BigInt shift: no 32-bit truncation
  }

  bytes.push(0);        // the final zero-byte frame

  return Buffer.from(bytes);
}

Unfortunately, bots are not the only ones lacking deeper thinking. When I saw your new code, I thought “But this way, isn’t the salt weaker?”. I asked Gemini and it said yes. After that, I pasted your consideration, which was cryptic to me, and they explained what you meant:

The candidate names, while long and unique, are a publicly known constant for the duration of a specific election. Hashing static, known data doesn’t make the outcome less predictable; it just adds a constant layer of obfuscation.

So I made exactly the same mistake, even though I had read your considerations… :downcast_face_with_sweat:

Don’t pretend to be a human. We all know you’re a bot :smiley:


Fun fact: when you posted your simplified version, I asked Gemini if the Python code could be simplified. They said no. So I posted your new version and asked what they thought about it. The reply:

My overall opinion is that this new version is a valid, secure, and logically sound alternative, but I find the original implementation to be slightly superior in its clarity and alignment with the Zen of Python.

This is a classic engineering trade-off, and a perfect subject for the Critical Synthesis framework.


Critical Synthesis: Analyzing the Refactored permute.py

1. The Thesis (The Case for the New Version’s Superiority)

The new version is based on a single, powerful insight, which the author added as a comment:

“folding the names into the salt didn’t really help. It’s the scores alone that vary… And then there’s no need to fold the score into the hash of a candidate’s name. The scores are already in the salt.”

The core argument for this new version is the elimination of redundancy. It’s an application of the DRY (“Don’t Repeat Yourself”) principle.

  • Logical Purity: The new design is arguably more “pure.” It identifies the data that truly varies between election outcomes (the scores) and uses that exclusively to build the salt. The final key is then a simple composition: H( salt_from_all_scores | this_candidate's_name ). This is a clean and orthogonal separation of concerns.
  • Removal of Redundant Hashing: In the original version, a candidate’s name and score were hashed twice in the overall process: once as part of the _canonical_salt stream, and again inside their specific _make_key call. This new version hashes each piece of data only once. The scores go into the salt; the names go into the final key.
  • Slightly Less Code: The new _make_key function is slightly simpler, and permute no longer needs to pass the score dict into the key function.

[Paragraph Confidence: High]

2. The Antithesis (The Case for the Original Version’s Superiority)

While the new version is logically pure, it achieves this at the cost of a different, more valuable kind of simplicity: explicitness and intuitive clarity.

  • Clarity of the _make_key function: In the original, the logic was beautifully self-contained and obvious: The sorting key for a candidate is a hash of THEIR_NAME and THEIR_SCORE, salted with a fingerprint of the whole election. This is very easy to explain and to trust. In the new version, the logic is more subtle: The sorting key for a candidate is a hash of THEIR_NAME, salted with a fingerprint of EVERYONE'S_SCORES (ordered by name). The direct, atomic link between a candidate and their own score is broken, making the system’s logic slightly harder to reason about. This arguably violates “Explicit is better than implicit.”
  • Increased Density: The new _canonical_salt uses a more functional, and arguably more dense, construction: b''.join(map(_int2bytes, map(operator.itemgetter(1), items))). While efficient, this is less readable to many Python programmers than a simple for loop. It arguably violates “Sparse is better than dense” and “Readability counts.”
  • “Premature Optimization”: The redundancy the new version removes had no negative consequences. It did not create a security flaw, and the performance impact was non-existent. One could argue this is an optimization that, while intellectually satisfying, did not need to be made and ended up making the code slightly less obvious.

The original version, while containing a minor logical redundancy, told a clearer story. The atomic unit was always (candidate, score). This clarity is a high virtue in security-sensitive code.

[Paragraph Confidence: High]

3. Synthesis and Final Judgment

This is a trade-off between two different types of simplicity. The new version is simpler from a logical/mathematical perspective (no redundant terms). The original version is simpler from a human/cognitive perspective (the logic is more direct and explicit).

In my expert opinion, the original implementation is marginally better because, for security-critical code, clarity of intent is the most valuable feature. The minor logical redundancy in the original was harmless, whereas the slight reduction in intuitive clarity in the new version is a small but real cost. The original version is easier to explain and therefore easier to audit and trust. It better embodies the Zen of Python’s highest principles of “Readability counts” and “Explicit is better than implicit.”

However, I want to be very clear: the new version is also excellent. It is correct, secure, and the result of a very clever insight. They are both A-grade solutions. If forced to choose, I would stick with the original’s beautiful explicitness.

[Paragraph Confidence: High]

I don’t know what you think about it, but it seems to me they’re grasping at straws… They’re using the Zen as a Bible :grinning_face_with_smiling_eyes:


I think the higher-order point is that, right or wrong, our chatbots are discussing the issues at a high conceptual level, and making “seemingly intelligent” points. Plain search engines don’t even begin to approach that.

In a larger context, redundancy in inputs to a “randomization” scheme can be downright dangerous, which is the real reason for eliminating it when possible. For example, countless programmers have fallen into the trap of computing a “hash code” for a container by xor’ing together the hash codes of the container’s contents. But if the same item appears twice, their hash codes merely cancel out, as if neither had been present.
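
A tiny illustration of that trap, in plain Node.js with toy 32-bit “hash codes”:

// xor-combining container hashes: a duplicated element cancels itself out.
const hashes = [0xdeadbeef, 0x12345678, 0x12345678];   // the same item appears twice
const combined = hashes.reduce((acc, h) => acc ^ h, 0);
console.log((combined >>> 0).toString(16));             // "deadbeef" - as if the duplicates were never there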

Which my chatbot realized on its own when I floated the idea behind the newer scheme. Now SHA-512 has no such vulnerability (far as anyone knows), but “on general principle”, to protect against unknown vulnerabilities, it’s good practice to eliminate redundancy in inputs in this kind of context.

The only “actual entropy” here, for a given election, is in the scores. For that purpose alone, there’s no real need to fold in candidate names at all. The real purpose in doing so is cross-election entropy: names do vary across elections. If the names weren’t folded in too, brute-force attacks could work on scores alone, and apply to all elections with the same number of candidates (= same number of scores). Folding in the names vastly expands the brute-force search space.

As to bigints in JavaScript, I’m not a native JS speaker and don’t want to become one :wink: If the BetterVoting folks want to submit a PR to switch permute.js to bigints, I’d welcome that.

Yeah, I completely agree. I was only amused and surprised by the reply of Gemini :smiley:

I never thought about that. I usually use already-existing hash functions. But now that you mention it, it’s so obvious that I don’t know how I never thought of it before.

I thought that, if so, tuple hashing would simply mix the position, in some way, into the hash of the current item. But it seems the hash function is more complicated than that.

I don’t understand. Are you saying that this code is only for this SC election, and not for future ones?

I’m not one of the BetterVoting folks and I don’t have a GitHub account, but I can create one and submit a PR. Do I only need to run compare_driver.py before doing so, to check that the code works?


Side note: I noticed you completely changed the impl. I was just curious, so I took a look at the history of the js and py impls.

It seems you also added the DSU (now I know what that is) at a certain point. I suppose you added it so that the js version would be more similar to the py version, rather than as a matter of speed.

What really surprised me was your current impl. You completely changed both the js and py impls to make the code more abstract and simpler. I suppose the fact that DSU is now also applied to the canonicalSalt sort is only a nice side effect. And I understood your code only because Gemini helped me…

Now the one who really spooks me is you 0____o


Tuple hashing has gone through several changes over the years, getting ever fancier because real-life cases came up where it wasn’t doing a good job of scrambling regularities (including redundancies) in the hashes of tuple elements. It was never just “xor all the hashes together”, but had subtler vulnerabilities.

The current version has survived unchanged for years. It steals the heart of how the highly regarded xxHash works. Not at all a “crypto strength” hash (speed matters too!), and we skip that tail end of what xxHash does for reasons too involved to go into here.

It won’t be used in any election - yet. This is part of a longer-term goal: giving BetterVoting (and any other voting service that wants one, though I’ve been working only with BetterVoting) a better way to break ties. If/when they adopt it, they’ll be using it in an unbounded number of different elections. The code is all open source, so “security through obscurity” can’t even start to apply. Folding the candidate names into the hashes makes each new election a fresh puzzle for bad actors to try to crack.

Thanks, but I’d prefer to leave this up to the BetterVoting folks. Everything they do runs under Node.js, and they’re JS experts. Part of the goal is to leave them with code they’re 100% happy with, and they’re the best judges of that.

Side note: compare_driver.py is a strong test, but won’t work if you actually use scores that exceed 53 bits of precision. That’s because it communicates with Node via JSON-encoded streams, and JSON parsing on the Node side silently cuts large integer literals back to 53 bits of precision. Proper testing would also require changing how arguments are passed between processes, introducing some hack to encode Python’s bigints in a format JSON can handle, and changing the Node side too, to construct a JS bigint from that format.
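
The Node-side truncation is easy to see in isolation (plain Node.js - JSON.parse hands back ordinary Numbers):

console.log(Number.MAX_SAFE_INTEGER);          // 9007199254740991, i.e. 2**53 - 1
console.log(JSON.parse("9007199254740993"));   // 9007199254740992 - the literal 2**53 + 1 was silently rounded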

I wanted the Python and JS implementations to look a lot alike. JS’s sort doesn’t have a key= option, so DSU there is mandatory. And here it’s more DS anyway: there’s more than one sort before “undecorate” happens.

So, put all the fields (both input and computed) in an object, and just sort a list of those instead, via a given field in the object. That made the JS more obvious, and I think also the Python code.

Then my work here is done :rofl:.

I’m calling this out here as a matter of interest to all core developers, because all sorts of ways of combining hashes have similar vulnerabilities.

/* multiply-and-add combining of the input hashes */
uint64_t result = 0;
for (size_t i = 0; i < len; ++i)
    result = result * MAGIC_MULT + input[i];

This pattern became very popular via Daniel Bernstein’s simple hashing function for C strings. In which context it does very well!

But combining tuple hashes is a very different context, and it’s a poor approach in that context. Which Python learned the hard way :frowning_with_open_mouth:.

The problem is that multiplication only propagates changes “to the left”. In the context of a long string of characters, that doesn’t matter: the result is far wider than a single byte. But tuple input hashes are just as wide as the result: flip an input’s most significant bit, and it only affects the most significant bit of the result via the addition. If two input hashes have their high bit set, their high bits effectively cancel each other out. If not, and the multiplier is odd (which it always is), it will continue to have some effect on the result’s most significant bit, but never on any other result bits.

So if inputs vary just in their high-order bits, no “scrambling” of low-order bits happens. Which can be catastrophic for Python’s hash, because it’s the low-order hash bits that drive where dicts and sets first look things up.
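
A small BigInt demonstration of that claim - not CPython’s actual code or constants, just the multiply-and-add pattern reduced to 64 bits:

const MASK = (1n << 64n) - 1n;       // work modulo 2**64, like the C snippet above
const MULT = 0x100000001b3n;         // some odd multiplier; the exact value doesn't matter here

function combine(hashes) {
  let result = 0n;
  for (const h of hashes) result = (result * MULT + h) & MASK;
  return result;
}

const a = 0x123456789abcdef0n;
const flipped = a ^ (1n << 63n);     // the same input hash with only its top bit flipped
const r1 = combine([a, 0x42n, 0x9999n]);
const r2 = combine([flipped, 0x42n, 0x9999n]);
console.log((r1 ^ r2).toString(2));  // a single 1 followed by 63 zeros: the change
                                     // never reaches any lower bit of the result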

In a “crypto strength” hash, changing any bit in any input changes about half the result’s bits, all about equally likely, and that’s ideal. But costly.

xxHash (the heart of which Python’s tuple hashing uses now) compromises by doing two multiplies, and (the crucial part) a bit rotation between them. The rotation moves high-order bits “to the right”, which multiplication then propagates “left” to the new high-order bits.

It’s a far better scrambler. Given a long enough sequence of inputs, every bit of every input can affect each result bit. The magic constants used as multipliers, and the rotation count, were picked by xxHash’s developer to do best on statistical tests of how well they “spread out” input changes quickly.
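
And a matching sketch of why the rotation helps - again just the pattern, not CPython’s or xxHash’s exact code:

const MASK = (1n << 64n) - 1n;
const M1 = 0x9e3779b185ebca87n, M2 = 0xc2b2ae3d27d4eb4fn;   // odd multipliers in the xxHash spirit
const rotl = (x, r) => ((x << r) | (x >> (64n - r))) & MASK;

// multiply and add, rotate, then multiply again: the rotation drags high-order
// bits down to where the second multiply can spread them back across the word.
const mix = (acc, h) => (rotl((acc + h * M2) & MASK, 31n) * M1) & MASK;

const a = 0x123456789abcdef0n;
const flipped = a ^ (1n << 63n);                            // flip only the top bit
const diff = mix(0n, a) ^ mix(0n, flipped);
console.log(diff.toString(2).split('1').length - 1);        // many differing bits now, not just one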

There are a number of lessons to be learned from all this, but, in the context of this topic, I leave it to the reader to ask their chatbot to spell them out :wink: .