Chiming in here to describe the system we (Mantis) have built, as I believe it more or less satisfies the requirements and constraints you’re describing (and, more importantly, we’re steadily working toward an open-source codebase).
We aggressively scan all new uploads to PyPI and use an internally developed set of YARA rules to detect a variety of malicious behaviors within packages. In our experience so far, false positives are common enough that such a system, if made official in its entirety without individual review prior to publishing, could seriously degrade trust in numerous legitimate packages.
My team and I are in fairly constant communication with other third-party package security providers, and false positives remain a continually evolving issue that, frankly, I’m not sure we can get a sufficient handle on without requiring manual review in all circumstances.
From our perspective, tooling that supports third-party solutions (REF: Malware detection and reporting infrastructure to support 3rd party reports · Issue #12612 · pypi/warehouse · GitHub) is the most practical way to implement a scoring or warning system: verdicting/scoring of packages can be distributed across multiple services to reduce the likelihood of false positives, while also spreading out the tremendous workload of manually reviewing packages before rendering any sort of verdict.
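To make the distribution idea concrete, here’s a minimal sketch (all service names, the quorum size, and the confidence cutoff are hypothetical, not how any existing service actually works) of combining verdicts from multiple independent scanners so that no single service’s false positive triggers a warning on its own:

```python
from dataclasses import dataclass

# Hypothetical sketch: only surface a warning once a quorum of independent
# scanning services agrees, reducing the impact of any one service's
# false positives.

@dataclass
class Verdict:
    service: str       # name of the third-party scanner (illustrative)
    malicious: bool    # that service's binary verdict
    confidence: float  # self-reported confidence, 0.0-1.0

def should_warn(verdicts, quorum=2, min_confidence=0.7):
    """Warn only if at least `quorum` services independently flag the
    package with confidence at or above `min_confidence`."""
    flags = [v for v in verdicts
             if v.malicious and v.confidence >= min_confidence]
    return len(flags) >= quorum

verdicts = [
    Verdict("scanner-a", True, 0.9),
    Verdict("scanner-b", True, 0.8),
    Verdict("scanner-c", False, 0.95),
]
print(should_warn(verdicts))  # → True: two high-confidence flags meet the quorum
```

One lone scanner flagging a package would not clear the quorum, which is exactly the property we want from distributing verdicts.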
We extended our YARA rules to carry a ‘weighting’ parameter, which is set according to how likely a given behavior is to be unique to malware. I would propose that this kind of scoring could make for a significantly more effective warning system, with the caveat that YARA is essentially pattern matching at the end of the day and suffers from many of the same drawbacks as regex.
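As an illustration of the weighting idea (the rule names, weights, and threshold below are all made up for this sketch, not our production values), the scoring could look roughly like:

```python
# Hypothetical sketch: each rule carries a weight reflecting how likely the
# matched behavior is to be unique to malware, and a package's score is the
# sum of the weights of its matched rules.

RULE_WEIGHTS = {
    "obfuscated_exec": 8,        # exec() of a decoded/obfuscated payload
    "setup_py_network_call": 6,  # network I/O at install time
    "base64_blob": 2,            # large base64 literal (common in benign code too)
    "env_var_read": 1,           # reads environment variables (very common)
}

WARN_THRESHOLD = 10  # illustrative cutoff for surfacing a warning

def score_package(matched_rules):
    """Sum the weights of the rules that matched this package."""
    return sum(RULE_WEIGHTS.get(name, 0) for name in matched_rules)

matches = ["obfuscated_exec", "setup_py_network_call"]
score = score_package(matches)
print(score, score >= WARN_THRESHOLD)  # → 14 True
```

A package matching only low-weight, everyday behaviors stays below the threshold, while a handful of genuinely suspicious behaviors quickly pushes it over.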
That being said, certain packages that flag on our service are undoubtedly malicious based purely on the number of behaviors they exhibit, and I believe that at least that base level of functionality could be implemented with little to no impact on legitimate developers.
TL;DR: I can’t imagine we ever get away from human review, but issue #12612 broadly captures how we could craft an effective ‘warning’ for potentially malicious packages by “crowdsourcing” security intelligence.