Add timeout management in pyperformance

Before creating the issue/PR, I’d like to understand if there is interest for having timeout management in pyperformance.

If a benchmark run hangs, pyperformance doesn’t have a mechanism to time out and terminate the underlying process. We’ve had a few cases where a benchmark was hanging literally forever (deadlock) and the job was terminated by the CI system eventually (after many hours!). This is not ideal because:

  • it’s hard to set an appropriate timeout for the whole process as it might vary from machine to machine
  • sometimes folks might not have full control of the CI system hence impossible to set a sensible timeout
  • having an early failure by pyperformance is preferred than having a timeout at process (pyperformance) level. This allow better usage of CI resources.

Ideally the timeout would be applied at benchmark run level and not for the whole pyperformance process allowing a more fine grained control in case of timeout.

I’ve been having this idea for some time but before doing it, I wanted to double check that it makes sense.

4 Likes

I like this idea a lot.

It’s worth pointing out that anything that happens outside of pyperformance (e.g. Github Action’s maximum job length timeout) means that a single runaway benchmark means no data is recorded for any of the other benchmarks.

I think there’s probably a reasonable default maximum for each benchmark we can agree on, and it will probably be significantly less than Github Action’s global maximum of 6 hours :wink: And an ability to override it for when that’s not enough will probably need to exist (though we should aim for most cases not needing to care).

Handling it at the pyperformance level rather than the pyperf level seems easier (since I think it doesn’t run the benchmark in the same process as the pyperformance orchestrator).

3 Likes

Since pyperf spawns worker processes, the main process should be able to implement a timeout easily.

2 Likes

Looks good to me! and I also concur the Michael and Victor’s comments :slight_smile:

1 Like

Thanks folks for the feedback. It looks like a good idea to implement. I’ll do it :slight_smile:

UPDATE: Implement timeout mechanism · Issue #353 · python/pyperformance · GitHub

And you all have the PR: Implement timeout mechanism for a benchmark run by diegorusso · Pull Request #354 · python/pyperformance · GitHub

Feel free to review and comment.

1 Like

The timeout was implemented in pyperf eventually: Implement --timeout when running benchmarks by diegorusso · Pull Request #205 · psf/pyperf · GitHub Thanks to @vstinner for the help!

The PR in pyperformance has been adapted to pass the --timeout flag to pyperf but this cannot be merged until we have a release in pyperf (otherwise it won’t work).

Who does that?