Why does Python seem to slow down with larger data sets?

I’ve been working on a Python project that processes large amounts of data, but I’m noticing a significant slowdown when the dataset gets bigger. My code runs fine with smaller sets, but as the size increases, it feels like everything crawls to a halt. Has anyone experienced this with Python? Is it a limitation of the language itself or am I missing some optimization techniques?
Here is the [pdf](file:///C:/Users/Admin/Downloads/New%20year%20wishes%202025.pdf)

What’s the code?

It sounds like you might be running out of memory; have you checked that?

Depending on what you’re doing, there might be ways to avoid this, but we don’t know because you haven’t shared any code.
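
For example (purely a sketch, since we haven’t seen your workload, and the file name is a placeholder): if you’re reading a large file, streaming it lazily keeps memory roughly flat instead of growing with the input:

```python
# Hypothetical sketch: the file name and the processing are placeholders.

def load_all(path):
    """Reads the whole file into memory at once; memory grows with file size."""
    with open(path) as f:
        return f.readlines()

def stream_lines(path):
    """Yields one line at a time; memory stays roughly constant."""
    with open(path) as f:
        for line in f:
            yield line.strip()

# Usage sketch:
# total_chars = sum(len(line) for line in stream_lines("big_data.txt"))
```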

Another possibility is that your algorithm scales poorly with lots of data[1], and so it gets much slower as you increase the size of the input. Again, it might be fixable, but we’d need to see the code to know.
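
To illustrate what “scales poorly” can look like in practice (a generic example, not a guess at your code): checking membership in a list inside a loop costs O(n) per check, so the whole loop becomes O(n²), while a set makes each check O(1) on average:

```python
# Generic illustration of quadratic vs. near-linear scaling.

def find_duplicates_quadratic(items):
    seen = []
    dupes = []
    for x in items:
        if x in seen:        # list membership scan: O(n) per check -> O(n^2) overall
            dupes.append(x)
        else:
            seen.append(x)
    return dupes

def find_duplicates_linear(items):
    seen = set()
    dupes = []
    for x in items:
        if x in seen:        # set membership: O(1) on average -> O(n) overall
            dupes.append(x)
        else:
            seen.add(x)
    return dupes
```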

edited for grammar and to add a link.


  1. This is a huge topic, but worst-case complexity is a good place to start.

We’ve encountered this with C, Fortran, and Assembly. It’s not about the language used; it’s about the time complexity of the algorithms.

  1. What are you having Python do with the PDF file? Just reading it?
  2. If you are having an AI model summarize the file, that can take a lot of memory. I have 16GB of RAM on my machine plus an RTX 3060 GPU. Ollama works fine for most things but cannot handle the mistral-large LLM, as it requires 56GB of RAM!
  3. What is the size of the PDF file in megabytes?
  4. How much free RAM do you have just before you run the Python program? (The snippet after this list shows one way to check 3 and 4.)
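
If it helps, here is a rough sketch for checking points 3 and 4 yourself. The path is a placeholder, and psutil is a third-party package you would need to install (`pip install psutil`):

```python
# Rough sketch for checking the PDF size and available RAM.
# "your_file.pdf" is a placeholder path.
import os
import psutil  # third-party: pip install psutil

pdf_path = "your_file.pdf"
size_mb = os.path.getsize(pdf_path) / (1024 * 1024)
print(f"PDF size: {size_mb:.1f} MB")

free_gb = psutil.virtual_memory().available / (1024 ** 3)
print(f"Available RAM: {free_gb:.1f} GB")
```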

Without more details, it’s not possible to help you much more.
