The Silent Killer of Serverless Performance: Avoiding the “Warm Start” Memory Trap
Why your Cloud Functions run out of memory on the 100th request, and the best practices to fix it.
Introduction: The “It Worked Locally” Mystery
You’ve just deployed a new serverless function. It’s designed to do some heavy lifting—perhaps parsing a large CSV file, resizing an image, or performing complex calculations on a dataset.
You tested it locally: Success. You ran a few test invocations in the cloud console: Success.
You push it to production. An hour later, your monitoring dashboard is flashing red. You are seeing “Out of Memory” (OOM) errors, increased latency, and your cloud bill is ticking upward faster than expected.
What happened? You didn’t change the code.
The problem likely isn’t the complexity of your task; it’s a misunderstanding of the serverless lifecycle. It’s the “Warm Start” trap, where your “stateless” function is secretly holding onto the ghosts of past requests.
Understanding the Serverless Execution Environment
There is a common misconception that “serverless” means every single request spins up a brand new, pristine environment that is immediately destroyed afterward.
If cloud providers (AWS Lambda, Azure Functions, Google Cloud Functions) did this, the latency (known as a “Cold Start”) would be unbearable for most applications.
To solve cold starts, providers reuse execution environments. Once a container spins up and handles a request, the provider “freezes” it. If another request comes in shortly after, the provider “thaws” that existing container to handle it instantly. This is a Warm Start.
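You can see this reuse for yourself with a few lines of instrumentation. Below is a minimal Python sketch in the AWS Lambda handler style (the `handler` name and the response shape are my own choices here, not a fixed API): the module scope runs once per cold start, so a stable instance ID paired with a climbing counter means your requests are landing on the same warm container.

```python
import uuid

# Module scope runs once per cold start and survives every warm start.
INSTANCE_ID = uuid.uuid4().hex
invocation_count = 0

def handler(event, context):
    global invocation_count
    invocation_count += 1
    # Same INSTANCE_ID with a growing count => this container was reused (warm start).
    return {"instance": INSTANCE_ID, "invocation": invocation_count}
```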
The Problem: When “Frozen” Memory Doesn’t Melt
The danger lies in what gets preserved during that freeze.
Generally, variables defined outside your specific function handler (in the global scope or as static members of a class) persist between warm invocations.
The Scenario: Imagine you have a global list variable meant to temporarily hold data during processing.
- Request A arrives: Your function reads 100MB of data and appends it to the global list. The request finishes, and the container freezes. The 100MB is still in RAM.
- Request B arrives (Warm Start): It hits the same container. It reads another 100MB and appends it to that same global list. The container now holds 200MB.
- Request C arrives…
You can see where this ends. Eventually, the container hits its configured memory limit (e.g., 1GB) and crashes hard. The next request will face a slow “cold start,” and the cycle begins again.
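Here is that failure mode as a minimal Python sketch (`load_data` and `summarize` are hypothetical helpers standing in for your real work). Because the list lives at module scope, each warm invocation appends to the leftovers of the last one:

```python
# BAD: module-scope state is preserved across warm starts.
buffer = []

def handler(event, context):
    data = load_data(event)   # hypothetical: reads ~100MB per request
    buffer.append(data)       # Request A: 100MB held, Request B: 200MB, ...
    return summarize(buffer)  # hypothetical: nothing ever clears the buffer
```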
4 Best Practices to Master Serverless Memory
To build robust, scalable serverless functions that handle heavy processing without crashing, you need to code defensively against memory persistence.
1. Scope is Everything: Avoid Globals
The golden rule of serverless development: If data is specific to a single request, define it inside the request handler.
Only use global or static scope for things intended to be shared across invocations, such as:
- Database connection pools.
- Authenticated API clients.
- Loaded configuration data or small lookup tables.
If you are processing a user’s file, the variable holding that file data must be local to the function execution so it gets garbage collected the moment the request exits.
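As a sketch of the right split, assuming a Postgres-backed function (psycopg2, a `DATABASE_URL` environment variable, and a hypothetical `normalize` helper are all my assumptions here): the connection is global because it is meant to outlive requests, while the per-request data stays local and is collectable the moment the handler returns.

```python
import os
import psycopg2  # assumption: a Postgres-backed function

# Intentionally global: the connection is expensive to open and safe to
# reuse across warm invocations.
conn = psycopg2.connect(os.environ["DATABASE_URL"])

def handler(event, context):
    # Request-scoped: this list is garbage-collected once the handler returns.
    rows = [normalize(record) for record in event["records"]]  # hypothetical helper
    with conn.cursor() as cur:
        cur.executemany("INSERT INTO items (payload) VALUES (%s)",
                        [(row,) for row in rows])
    conn.commit()
    return {"inserted": len(rows)}
```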
2. Stream, Don't Load
A common mistake in data processing is loading an entire file into memory before acting on it.
If your function needs to process a 500MB CSV file from Cloud Storage/S3, do not download the whole file into a byte array. You will immediately spike your memory usage to at least 500MB.
Instead, use Streams.
Streaming allows your code to process the file in small chunks (e.g., line by line). You read a chunk, process it, send the result elsewhere, and discard the chunk. With streaming, processing a 500MB file might never use more than 50MB of RAM at any given second.
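Here's one hedged way to do that in Python with boto3 (the bucket, key, and `process_row` are placeholders for your own setup): `iter_lines()` reads the response body in chunks instead of materializing the whole object, so peak memory stays roughly flat regardless of file size.

```python
import csv
import boto3

s3 = boto3.client("s3")  # global on purpose: reused across warm starts

def handler(event, context):
    obj = s3.get_object(Bucket="my-bucket", Key="big-file.csv")  # placeholders
    # iter_lines() streams the body chunk by chunk; the full 500MB never
    # sits in memory at once.
    lines = (line.decode("utf-8") for line in obj["Body"].iter_lines())
    rows_processed = 0
    for row in csv.reader(lines):
        process_row(row)  # hypothetical: your per-row logic
        rows_processed += 1
    return {"rows_processed": rows_processed}
```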
3. Beware the /tmp Directory Trap
Most serverless environments provide a temporary scratch space on the disk (usually /tmp).
It’s crucial to remember two things about /tmp:
- Its size is often limited (sometimes as low as 512MB, though configurable on some platforms).
- It does not automatically clear between warm invocations.
If you download a file to /tmp to process it, and you don’t delete it when you’re done, the next request will find less available space. Eventually, your function will fail with “No space left on device” errors. Always use a finally block to ensure temporary files are deleted, regardless of success or failure.
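A minimal sketch of that discipline in Python (`download_to` and `process_file` are hypothetical stand-ins): the `finally` block runs whether processing succeeds or raises, so the scratch file can't accumulate across warm starts.

```python
import os
import tempfile

def handler(event, context):
    # Create a scratch file in the platform's writable /tmp.
    fd, path = tempfile.mkstemp(dir="/tmp", suffix=".csv")
    os.close(fd)  # we only need the path; the helpers below open it themselves
    try:
        download_to(path, event)   # hypothetical: fetch the object to disk
        return process_file(path)  # hypothetical: the actual processing
    finally:
        # Runs on success *and* failure, so /tmp is clean for the next request.
        if os.path.exists(path):
            os.remove(path)
```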
4. Explicit Disposal (For Managed Languages)
In languages with garbage collection (like C#, Java, or Node.js), the runtime decides when to free memory. In a constrained serverless environment, that collection doesn't always happen fast enough on a container that is being rapidly frozen and thawed between requests.
If you are dealing with exceptionally large objects, it can be beneficial to explicitly help the GC. In C#, ensure you use using statements for disposable resources. In other languages, explicitly setting large variables to null at the very end of your handler can be a signal to the runtime that the memory is ready to be reclaimed immediately.
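The Python equivalent of those habits, as a sketch (with `build_report` and `report_totals` as hypothetical stand-ins for a memory-hungry step): keep only the small result you need and drop the reference to the big object before the container freezes. The explicit `gc.collect()` is optional and worth measuring before you keep it.

```python
import gc

def handler(event, context):
    report = build_report(event)     # hypothetical: allocates a very large object
    summary = report_totals(report)  # hypothetical: extract the small part you need
    del report     # drop the only reference so the object is reclaimable now
    gc.collect()   # optional nudge: reclaim before the container is frozen
    return summary
```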
Conclusion
Serverless offers incredible power and scalability, but “serverless” does not mean “memory-less.”
By understanding the lifecycle of your execution container and treating memory as a scarce, reusable resource, you can avoid embarrassing production crashes and keep your cloud costs predictable. Code your functions as if they will live forever, and they will serve you well.
Happy coding!