Great thread, thanks to all for the constructive discussion. I’ve upgraded my build Dockerfile for SciPy and NumPy, which was modelled on the same article (the one from 5 years ago) and targets Amazon Linux 2 (as a “layer” for a packaged AWS Lambda microservice). Building from source lets it fit within the AWS Lambda size constraints, which I appreciate is a bit of a misuse of the purpose, but you’ve got to do what you’ve got to do to deploy!
Recent version updates brought a bit of a cascade of build dependency requirements (cmake, gcc, etc.), so a significant portion of the dependencies had to be built from source in the Docker image.
I hope nobody minds if I share this here, where it might be visible to others facing a similar challenge, or for comparison against the solution above. There were a few other places describing this problem (e.g. here) which trailed off without a clear indication of whether they solved it or not.
I haven’t found the magic combination of options that works yet, but at least I think you’ve set me on the right path now, thanks.
Basic setup and Yum packages installed before build:
FROM mlupin/docker-lambda:python3.10-build AS build
USER root
WORKDIR /var/task
# https://towardsdatascience.com/how-to-shrink-numpy-scipy-pandas-and-matplotlib-for-your-data-product-4ec8d7e86ee4
ENV CFLAGS="-g0 -Wl,--strip-all -DNDEBUG -Os -I/usr/include:/usr/local/include -L/usr/lib64:/usr/local/lib64:/usr/lib:/usr/local/lib"
RUN yum install -y wget curl git nasm openblas-devel.x86_64 lapack-devel.x86_64 python-dev file-devel make Cython libgfortran10.x86_64 openssl-devel
# Download and install CMake
WORKDIR /tmp
ENV CMAKE_VERSION=3.26.4
RUN wget https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}.tar.gz
RUN tar -xvzf cmake-${CMAKE_VERSION}.tar.gz
RUN cd cmake-${CMAKE_VERSION} && ./bootstrap && make -j4 && make install
# Clean up temporary files
RUN rm -rf /tmp/cmake-${CMAKE_VERSION}
RUN rm /tmp/cmake-${CMAKE_VERSION}.tar.gz
WORKDIR /var/task
RUN pip install --upgrade pip
RUN pip --version
# Specify the version to use for numpy and scipy
ENV NUMPY_VERSION=1.24.3
ENV SCIPY_VERSION=1.10.1
# Download the numpy source distribution (the SciPy source is cloned from git further down)
RUN pip download --no-binary=:all: numpy==$NUMPY_VERSION
# Upgrade GCC to version 8 for SciPy Meson build system
RUN wget https://ftp.gnu.org/gnu/gcc/gcc-8.4.0/gcc-8.4.0.tar.gz && \
tar xf gcc-8.4.0.tar.gz && \
rm gcc-8.4.0.tar.gz && \
cd gcc-8.4.0 && \
./contrib/download_prerequisites && \
mkdir build && \
cd build && \
../configure --disable-multilib && \
make -j$(nproc) && \
make install && \
cd /var/task && \
rm -rf gcc-8.4.0
# Set environment variables
ENV CC=/usr/local/bin/gcc
ENV CXX=/usr/local/bin/g++
ENV FC=/usr/local/bin/gfortran
# Verify GCC version
RUN gcc --version
RUN /usr/local/bin/gfortran --version
# Extract the numpy package and build the wheel
RUN pip install Cython
RUN ls && tar xzf numpy-$NUMPY_VERSION.tar.gz
RUN ls && cd numpy-$NUMPY_VERSION && python setup.py bdist_wheel build_ext -j 4
ENV BUILT_NUMPY_WHEEL=numpy-$NUMPY_VERSION/dist/numpy-$NUMPY_VERSION-*.whl
RUN ls $BUILT_NUMPY_WHEEL
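Because `BUILT_NUMPY_WHEEL` is a glob, it can be worth confirming which wheel it actually matched (Python version, ABI, platform) before that wheel goes into the layer. Here is a minimal sketch, assuming the standard wheel filename layout `name-version[-build]-python-abi-platform.whl`. The helper name `parse_wheel_filename` and the example filename are hypothetical and not part of the Dockerfile above.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    name: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return WheelTags(name, version, build, python_tag, abi_tag, platform_tag)


# Hypothetical filename of the kind a local Python 3.10 build on x86_64 might produce.
tags = parse_wheel_filename("numpy-1.24.3-cp310-cp310-linux_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

If the Python tag does not match the Lambda runtime (here Python 3.10), the compiled extension modules will most likely fail to import once deployed.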
NumPy and SciPy build. For simplicity I installed a prebuilt NumPy wheel of the same version I was building from source; that wheel is used purely for building SciPy:
# Don't install NumPy from the built wheel but use same version (it's a SciPy dependency)
RUN pip install numpy==$NUMPY_VERSION
RUN python -c "import numpy"
# Install build dependencies for the SciPy wheel
RUN pip install pybind11 pythran
# Extract the SciPy package and build the wheel
# RUN wget https://github.com/scipy/scipy/archive/refs/tags/v$SCIPY_VERSION.tar.gz -O scipy-$SCIPY_VERSION.tar.gz
RUN git clone --recursive https://github.com/scipy/scipy.git scipy-$SCIPY_VERSION && \
cd scipy-$SCIPY_VERSION && \
git checkout v$SCIPY_VERSION && \
git submodule update --init
RUN cd scipy-$SCIPY_VERSION && python setup.py bdist_wheel build_ext -j 4
ENV BUILT_SCIPY_WHEEL=scipy-$SCIPY_VERSION/dist/SciPy-*.whl
RUN ls $BUILT_SCIPY_WHEEL
# Install the wheels with pip
# (Note: previously this used --compile but now we already did the wheel compilation)
RUN pip install --no-compile --no-cache-dir \
-t /var/task/np_scipy_layer/python \
$BUILT_NUMPY_WHEEL \
$BUILT_SCIPY_WHEEL
RUN ls /var/task/np_scipy_layer/python
# Clean up the sdists and wheels
RUN rm numpy-$NUMPY_VERSION.tar.gz
RUN rm -r numpy-$NUMPY_VERSION scipy-$SCIPY_VERSION
# Uninstall non-built numpy after building the SciPy wheel
RUN pip uninstall numpy -y
RUN cp /var/task/libav/avprobe /var/task/np_scipy_layer/ \
&& cp /var/task/libav/avconv /var/task/np_scipy_layer/
RUN mkdir -p /var/task/np_scipy_layer/lib \
&& cp /usr/lib64/libblas.so.3.4.2 /var/task/np_scipy_layer/lib/libblas.so.3 \
&& cp /usr/lib64/libgfortran.so.4.0.0 /var/task/np_scipy_layer/lib/libgfortran.so.4 \
&& cp /usr/lib64/libgfortran.so.5.0.0 /var/task/np_scipy_layer/lib/libgfortran.so.5 \
&& cp /usr/lib64/liblapack.so.3.4.2 /var/task/np_scipy_layer/lib/liblapack.so.3 \
&& cp /usr/lib64/libquadmath.so.0.0.0 /var/task/np_scipy_layer/lib/libquadmath.so.0 \
&& cp /usr/lib64/libmagic.so.1.0.0 /var/task/np_scipy_layer/lib/libmagic.so.1 \
&& cp /usr/local/lib/libmp3lame*.so* /var/task/np_scipy_layer/lib \
&& cd /var/task/np_scipy_layer \
&& zip -j9 np_scipy_layer.zip /var/task/np_scipy_layer/avconv \
&& zip -j9 np_scipy_layer.zip /var/task/np_scipy_layer/avprobe \
&& zip -r9 np_scipy_layer.zip magic \
&& zip -r9 np_scipy_layer.zip python \
&& zip -r9 np_scipy_layer.zip lib
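Since the whole point of the source build is to stay inside the Lambda size limits, a quick check on the finished archive may be useful. The sketch below sums the uncompressed sizes of the zip entries and compares the total with the 250 MB unzipped quota that AWS documents for a function plus its layers. The function names are my own illustration, and the limit constant is an assumption worth re-checking against the current AWS documentation.

```python
import zipfile

# Assumed Lambda quota for the unzipped deployment package, layers included
# (250 MB, i.e. 262144000 bytes).
LAMBDA_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def unzipped_size(zip_path: str) -> int:
    """Return the total uncompressed size in bytes of all entries in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_lambda(zip_path: str, limit: int = LAMBDA_UNZIPPED_LIMIT_BYTES) -> bool:
    """True if the archive's uncompressed contents are within the given limit."""
    return unzipped_size(zip_path) <= limit
```

For example, `fits_in_lambda("np_scipy_layer.zip")` could be run in the build container after the zip step. Bear in mind that the quota covers the function code and all attached layers together, so this layer needs to leave headroom for the rest.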