MSan availability on XUbuntu 20.04?

I saw msan mentioned in configure output, but can’t seem to find it on my XUbuntu 20.04 machine. I have clang and llvm tools installed, but msan doesn’t turn up on PATH. Clang sits at version 10.0.0-4ubuntu1 and I have 140+ llvm-prefixed commands on PATH, none of which appear to have “msan” in their names. apt search msan doesn’t turn up anything promising. Where might I look for it short of grabbing it from GitHub?

Ah, never mind. I was thinking because “msan” was quoted in the configure help output there would be a command. It’s just clang compiler flag fiddling I think. I can see it spewing complaints during build now.

Probably MemorySanitizer — Clang 18.0.0git documentation

MemorySanitizer is a detector of uninitialized reads. It consists of a compiler instrumentation module and a run-time library.

To be honest, while AddressSanitizer and UndefinedBehaviorSanitizer are widely used (and ThreadSanitizer is mildly popular as well), I’ve never seen a project using MemorySanitizer.

This is because MemorySanitizer requires that everything be built with msan enabled: all libraries ever loaded by the process, shared or not, ideally including libc itself. MSAN has some hacks so it can pretend to know the effects of some library and system calls, but they are incomplete; the more you build with msan, the better it’ll work.

“MemorySanitizer is known to work on large real-world programs (like Clang/LLVM itself) that can be recompiled from source, including all dependent libraries.” - MemorySanitizer — Clang 18.0.0git documentation

Most people are not in environments where they can easily do this. I wanted to get an MSAN buildbot set up, but getting it to produce meaningful output that isn’t dominated by noise is too much work on a normal OS distro and build. We use MSAN all the time to verify builds at work within our hermetic build system that does build everything. Any issue we eventually find there in the CPython internals we’ll get upstream. We’re not to the point of being able to test CPython from the main branch though.

Terminology: MSAN, TSAN, UBSAN, ASAN … they’re just low-syllable shorthand for Memory, Thread, Undefined Behavior, and Address Sanitizer respectively. All clang features, though I believe gcc offers its own versions of some of these today; I haven’t looked.

You’ll find a nice selection of GCC sanitisers here: Instrumentation Options (Using the GNU Compiler Collection (GCC)). Quick and dirty summary:

  • -fsanitize=address: AddressSanitizer – fast memory error detection
  • -fsanitize=kernel-address: AddressSanitizer for Linux kernel
  • -fsanitize=hwaddress: hardware-assisted AddressSanitizer
  • -fsanitize=kernel-hwaddress: hardware-assisted AddressSanitizer for Linux kernel
  • -fsanitize=pointer-compare: instrument comparison operation (<, <=, >, >=) with pointer operands
  • -fsanitize=pointer-subtract: instrument subtraction with pointer operands
  • -fsanitize=thread: ThreadSanitizer – a fast data race detector
  • -fsanitize=leak: LeakSanitizer – a memory leak detector
  • -fsanitize=undefined: UndefinedBehaviorSanitizer – a fast undefined behavior detector, with a bunch of suboptions
  • … and more