Miss-islington hits GitHub API rate limits

@Mariatta: “Automerge feature caused miss-islington to reach rate limit. Need to figure out what to do about it.”

What are our options? Abandon the bot? Pay for a higher rate limit?

GitHub documentation:

“This means that all OAuth applications authorized by a user share the same quota of 5,000 requests per hour when they authenticate with different tokens owned by the same user.”

“For unauthenticated requests, the rate limit allows for up to 60 requests per hour. Unauthenticated requests are associated with the originating IP address, and not the user making requests.”

I guess that the bot uses unauthenticated requests, since 60 requests/hour sounds more likely to explain the issue than a limit of 5000 requests/hour.
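One way to check which limit is actually being hit is to inspect the `X-RateLimit-Limit` / `X-RateLimit-Remaining` headers GitHub attaches to every API response. A minimal sketch (the `classify_rate_limit` helper and its 60-request threshold are my own illustration, not part of miss-islington):

```python
def classify_rate_limit(headers):
    """Guess whether a GitHub API response came from an authenticated
    or unauthenticated request, based on the X-RateLimit-* headers
    GitHub sends with every response."""
    limit = int(headers.get("X-RateLimit-Limit", 0))
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    # Unauthenticated requests get at most 60/hour per IP.
    kind = "unauthenticated" if limit <= 60 else "authenticated"
    return kind, limit, remaining

# Example with headers as they'd look for an unauthenticated request:
print(classify_rate_limit({"X-RateLimit-Limit": "60",
                           "X-RateLimit-Remaining": "17"}))
```

Running this against an actual response from the bot's own HTTP client would settle the 60-vs-5000 question immediately.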

To register, I found this doc: https://developer.github.com/v3/guides/basics-of-authentication/ which points to https://github.com/settings/applications/new

The bot could not merge PRs if it were using unauthenticated requests.


Does it mean that the bot sends 5000 requests per hour?

That seems to be the case given the information we have, yep. I assume that this is a spike, not a constant 5000 requests/hour.


5000 requests/hour across the account being used to do the authenticating. And with the REST API this isn’t actually hard to hit, since fetching each different piece of data is a separate request (you have to hope that the relevant data was included in the webhook that triggered things).
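To illustrate the “hope the data was in the webhook” point: a handler can save a request whenever the payload it was triggered with already carries what it needs. A hypothetical sketch (`pr_number_from_event` and the fallback callable are made-up names for illustration, not miss-islington code):

```python
def pr_number_from_event(event_payload, fallback_fetch):
    """Prefer data already present in the webhook payload over an
    extra REST call; only hit the API when the payload lacks it."""
    pr = event_payload.get("pull_request")
    if pr is not None:
        return pr["number"]   # free: no API request spent
    return fallback_fetch()   # costs one request from the hourly quota
```

Auditing the bot's handlers for spots where a GET re-fetches data the triggering event already contained is probably the cheapest way to shave requests.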

My guess is we have the following options:

  1. See if miss-islington can cut out some requests
  2. Convert it to a full-on GitHub App and see if that gets us the request quota we need
  3. Convert miss-islington to GitHub Actions (Rust got increased minutes from GitHub when they moved their CI over)

This would only work if you’re okay with the created backport PRs not triggering any CI jobs, because actions performed with the GITHUB_TOKEN that is pre-provisioned in GHA jobs don’t emit events on the GH platform.

This may be okay for the merging feature, though, provided you’re fine with missing the merge event and nothing depends on it.

Another workaround is to register a GitHub App and put its credentials into GitHub Actions secrets, then retrieve an installation token in each job and use that instead of the built-in token.
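For reference, that token exchange is a single REST call: `POST /app/installations/{installation_id}/access_tokens`, authenticated with the App’s JWT as a bearer token. A minimal sketch that only builds the request (signing the RS256 JWT itself needs a crypto library such as PyJWT, so the signed JWT is taken as input here; the helper name is mine):

```python
def installation_token_request(installation_id, app_jwt):
    """Build the REST request that exchanges a GitHub App JWT for a
    short-lived installation access token.  The caller is expected
    to have signed `app_jwt` (RS256) with the App's private key."""
    url = (f"https://api.github.com/app/installations/"
           f"{installation_id}/access_tokens")
    headers = {
        "Authorization": f"Bearer {app_jwt}",
        "Accept": "application/vnd.github.v3+json",
    }
    return "POST", url, headers
```

The returned token can then replace the built-in GITHUB_TOKEN in subsequent API calls, so events fired by those calls are seen by the platform.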

FWIW I have an example backporting GitHub App that I’m currently running in Ansible community repos (example backport PR: https://github.com/ansible-collections/community.general/pull/882 + Checks API reporting: https://github.com/ansible-collections/community.general/pull/545/checks?check_run_id=1107947382). The source is dead simple: https://github.com/sanitizers/patchback-github-app/blob/master/patchback/event_handlers.py. It doesn’t have an auto-merge feature (it could be added easily, though). It uses gidgethub+aiohttp (indirectly) under the hood, so some bits are similar to miss-islington.
If anybody wants to reuse it as prior art, or even just use this App, feel free. And I’m open to providing any help necessary, so reach out if needed.

Another obvious workaround would be to separate the automerge feature from the backport creation into a separate bot, but I truly believe that using a GitHub App would be superior.

P.S. It is important to understand that a GitHub App installed into an org on GitHub will still get 5000 requests/hour per org. But depending on the number of repos and users, it may get bonus requests: https://docs.github.com/en/free-pro-team@latest/developers/apps/rate-limits-for-github-apps#normal-server-to-server-rate-limits.

“Organization installations with more than 20 users receive another 50 requests per hour for each user. Installations that have more than 20 repositories receive another 50 requests per hour for each repository. The maximum rate limit for an installation is 12,500 requests per hour.”

So this basically means a GitHub App installation will get another 50×78 = 3,900 requests for the repos (although I’m not sure this applies if the install is limited to just one repo), plus 50×103 = 5,150 more according to the public member count I can see in the org; note that the total is capped at 12,500 requests/hour.
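The arithmetic above, including the 12,500 cap from the quoted docs, can be sketched as follows (this reflects my reading of the “more than 20” wording; the helper is purely illustrative):

```python
def app_installation_limit(repos, users, base=5000, per_unit=50,
                           threshold=20, cap=12_500):
    """Rough model of the server-to-server rate limit quoted above:
    base quota, plus 50 req/h per repo and per user once the org has
    more than 20 of each, capped at 12,500 req/h."""
    extra = 0
    if repos > threshold:
        extra += per_unit * repos
    if users > threshold:
        extra += per_unit * users
    return min(base + extra, cap)

print(app_installation_limit(repos=78, users=103))  # hits the 12,500 cap
```

With 78 repos and 103 users the uncapped total would be 5000 + 3,900 + 5,150 = 14,050, so the cap is what actually applies here.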

In order to have an educated strategy, it would be good to know which API requests are most common, to check whether we could reduce them without significantly changing the infrastructure.

Yeah… I think that things like https://github.com/python/miss-islington/blob/main/miss_islington/delete_branch.py could be replaced by a toggle in the GitHub repo settings (“Automatically delete head branches”).

It should be fairly easy to count the API calls per request by exploring the bot repo. From what I saw, it’s quite efficient. The only unobvious thing is that cherry-picker adds one extra query deep under the hood: https://github.com/python/cherry-picker/blob/39e8d6d/cherry_picker/cherry_picker.py#L321

One wild suggestion would be to replace posting comments via the API with sending them to GitHub over email. But I think that a GH App solution would be way cleaner and easier to implement.