Detect when a package's behavior drastically changes

This is more of a thought I entertained a few days ago than a formal proposal. Some supply-chain attacks have shared a common pattern: a package suddenly started uploading data. I think that if we could detect that a package that previously made no network calls started making them in a new release, we could use that information for security purposes.

While I don’t think there’s a foolproof way to statically detect network/HTTP calls, I think such a metric would be useful to “flag” a package for further inspection. The problem is that HTTP calls come in all shapes and sizes, from raw OS calls all the way up to a high-level library like requests. Making an efficient heuristic for this may be challenging.
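To make the "flag for further inspection" idea concrete, here is a minimal sketch of one possible static heuristic: compare which network-related modules two releases import. The module list and the function names `network_imports` and `newly_networked` are hypothetical choices for illustration, and the check is easy to evade (dynamic imports, `ctypes`, shelling out), so it is a rough signal at best.

```python
import ast

# Hypothetical, deliberately incomplete list of modules that suggest network use.
NETWORK_MODULES = {
    "socket", "ssl", "http", "urllib", "ftplib", "smtplib",
    "requests", "urllib3", "httpx", "aiohttp",
}


def network_imports(source):
    """Return the top-level network-related modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in NETWORK_MODULES:
                found.add(top)
    return found


def newly_networked(old_source, new_source):
    """Network-related modules imported by the new release but not the old one."""
    return network_imports(new_source) - network_imports(old_source)


old = "import json\n"
new = "import json\nimport requests\nfrom urllib.request import urlopen\n"
print(sorted(newly_networked(old, new)))
```

A real tool would have to run this over every file in a release and would still miss calls made without a recognizable import, which is why it can only flag a package, not clear it.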

This kind of behavioral analysis could be extended beyond network activity.

If anyone has any thoughts on this, I’d be glad to discuss the matter further. I think that this project is of utmost importance to our community and I am looking forward to contributing to it.

This is one of the intended use cases for audit hooks, so with Python 3.8 or later you could easily collect an activity profile and compare it over time.
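As a rough sketch of what collecting and comparing an activity profile could look like, the snippet below uses `sys.addaudithook` to count audit events while a function runs, then diffs the event names seen for two "releases". The helpers `profile` and `new_events` and the `demo.*` event names are hypothetical; the custom events raised with `sys.audit` stand in for real ones such as `socket.connect` or `subprocess.Popen`.

```python
import sys
from collections import Counter

_events = Counter()
_recording = False


def _hook(event, args):
    # Audit hooks cannot be removed once added, so recording is gated by a flag.
    if _recording:
        _events[event] += 1


sys.addaudithook(_hook)


def profile(func):
    """Run `func` and return the set of audit event names raised meanwhile."""
    global _recording
    _events.clear()
    _recording = True
    try:
        func()
    finally:
        _recording = False
    return set(_events)


def new_events(baseline, current):
    """Event names seen in the current profile but not in the baseline."""
    return current - baseline


# Stand-ins for exercising two releases of the same package.
def old_release():
    sys.audit("demo.file_read")


def new_release():
    sys.audit("demo.file_read")
    sys.audit("demo.network_call")  # behavior that the old release lacked


diff = new_events(profile(old_release), profile(new_release))
print(sorted(diff))
```

The profile only covers code paths that actually run, so how much it shows depends on what is exercised while recording.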

Very interesting! I hadn’t heard of the audit events feature before.

It does seem like they’d be used for dynamic analysis though, right? I agree with @fbidu’s point that trying to detect some of this behavior statically would be challenging.

When it comes to dynamic analysis, would there be benefits to using audit hooks vs just using something like strace to capture all syscalls that occur when, say, a given package is installed?

In that context, probably not. Audit hooks are mostly just easier to configure and, unlike strace, work cross-platform. They might be more efficient, depending on how the hook is implemented, as they aren’t going to be triggered as often as all syscalls. Apart from the cross-platform support, none of that really matters for testing an install script.

I was thinking this sort of approach could be extended beyond the installer by incentivizing test coverage, possibly by ranking similar projects by coverage in the search results.