Odd order of the results on docs.python.org for builtins

If you search for the str builtin on docs.python.org, it appears buried below inexact results. The same is true for other builtins, such as map or zip, although to a lesser extent.

Shouldn’t they appear #1 given that the match is exact and the namespace is the most direct?

The doc search box is somewhat like a web search limited to the doc site. I don’t know how the ordering works, but its oddness has been noted before. For ‘python str function’, Google gave me 3 other sites and 4 “People also ask” questions before the 3.11 doc definition. Duck Duck Go (Bing?) gave me a right-hand sidebar box linking to the 3.8 doc definition. The doc index (upper right) takes one there directly. I always use that when I want the obvious main entry.

It’s not quite a web search limited to the doc site; there are actually three types of results. First, you have glossary hits, if any, which appear in gold above the normal search results and are driven by a custom local Sphinx extension.

Second, you have the results from the Sphinx object index, which appear first in the main search results as {object.name} ( {Domain} {object type}, in {modulename} {Page Name} ). In Sphinx 4.5.0, which the docs currently run on, that includes functions, classes, methods, etc., but not manual index entries, though based on our feedback, @AA-Turner added that for Sphinx 5.2(.3, IIRC).

That’s what you’re seeing here; the issue is that Sphinx is not prioritizing fully qualified exact matches over prefixed matches (e.g. str before locale.str), and prefixed matches over fuzzy non-whole-word matches (e.g. locale.str before stringprep). And perhaps even matches at the start of a string component over inner matches (e.g. stringprep over email.headerregistry…it took me a minute to find the str).
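To make that ordering concrete, here is a rough sketch of the kind of tiered ranking I mean. It is not how Sphinx actually scores results; `match_rank` and `sort_results` are hypothetical names made up for illustration, and treating any dotted component equal to the query as a "prefixed match" is my own simplifying assumption.

```python
def match_rank(query: str, qualified_name: str) -> int:
    """Return a tier for a result; lower sorts first.

    Hypothetical ranking for illustration, not Sphinx's real scoring.
    """
    q = query.lower()
    name = qualified_name.lower()
    parts = name.split(".")
    if name == q:
        return 0  # fully qualified exact match, e.g. "str"
    if q in parts:
        return 1  # a whole dotted component matches, e.g. "locale.str"
    if any(part.startswith(q) for part in parts):
        return 2  # match at the start of a component, e.g. "stringprep"
    if q in name:
        return 3  # inner match, e.g. "email.headerregistry"
    return 4      # no match in the name at all


def sort_results(query: str, names: list[str]) -> list[str]:
    # Sort by tier first, then alphabetically within a tier.
    return sorted(names, key=lambda n: (match_rank(query, n), n))


names = ["email.headerregistry", "stringprep", "locale.str", "str"]
print(sort_results("str", names))
```

With the four names from this thread, that puts str first, then locale.str, then stringprep, then email.headerregistry.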

Finally, you have the full page content matches, which work more or less like a web search, with page titles and then snippets from the page with the text highlighted. It should rarely get to that point, but it’s possible those could be sorted better too. Right now, they are just alphabetical, whereas they could perhaps be sorted by whole word matches first, then start matches, then inner matches, perhaps by match count…though that’s much lower priority than the above.
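For the page content matches, the alternative ordering could look something like the sketch below. Again, this is only an illustration under my own assumptions: `page_sort_key` is a hypothetical helper, the regex-based word boundary checks are a stand-in for whatever tokenization the real search index uses, and the sample pages are made up.

```python
import re


def page_sort_key(query: str, title: str, text: str) -> tuple:
    """Hypothetical sort key: whole-word hits, then start hits, then inner hits."""
    q = re.escape(query.lower())
    body = text.lower()
    whole = len(re.findall(rf"\b{q}\b", body))  # "str" as a whole word
    start = len(re.findall(rf"\b{q}\w", body))  # "str" at the start of a longer word
    inner = len(re.findall(rf"\w{q}", body))    # "str" inside or at the end of a word
    if whole:
        tier, count = 0, whole
    elif start:
        tier, count = 1, start
    elif inner:
        tier, count = 2, inner
    else:
        tier, count = 3, 0
    # Within a tier, more hits sort first; the title breaks remaining ties.
    return (tier, -count, title.lower())


# Made-up sample pages as (title, text) pairs.
pages = [
    ("Alpha", "registry of things"),
    ("Beta", "string string"),
    ("Gamma", "str and str and str"),
    ("Delta", "the str type"),
]
ordered = sorted(pages, key=lambda p: page_sort_key("str", p[0], p[1]))
print([title for title, _ in ordered])
```

The current behaviour would be equivalent to sorting on the title alone.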

Is there a plan to upgrade to the improved Sphinx to get the improvement?

Just to be clear, that particular improvement doesn’t directly solve the prioritization/ordering issue, but we’d need to upgrade anyway to get that and other fixes to search, as well as a future hypothetical version where the prioritization is improved (and I’m not sure if there are other improvements in that vein in 5.x or not; it’s been discussed but I’m not sure if concrete changes have been made yet).

But yes, there are plans to upgrade; @AA-Turner has a PR that upgrades the Python docs theme to support Sphinx 5.x and with some infra changes I plan to do, we can safely build the docs website with the latest Sphinx version while still testing and retaining support for earlier versions, so we can backport the changes to stable docs branches. Furthermore, there are current preliminary plans to propose migrating to a much more modern docs theme that supports Sphinx 5.x+ for Python 3.12+.

Currently blocked on JS review, from memory.

A

That was my understanding as well—though, I thought I saw you state elsewhere that jQuery support was planned to be removed in Sphinx 6, not Sphinx 5? If that’s what’s blocking the migration (which seems to be the lion’s share of that PR), and it isn’t required for Sphinx 5, would it be better just to drop it? Realistically, if Lutra is adopted for Python 3.12, it seems unlikely that upgrading the old theme to Sphinx 6 would be that critical of a concern for the near term, unless other docs sites really want to keep using it on that version.

After investigation, p-d-t 2022.1 works as-is for Sphinx 5.0.1 and newer, so I have opened gh-99380: Update to Sphinx 5.3.0 (python/cpython pull request #99381).

A