Showing posts with the label Streams

Node.js Streams: Handling Backpressure to Prevent OOM Crashes in Large ETL Jobs

You have built an ETL pipeline. It reads a 5GB CSV file, transforms the rows, and inserts them into a database or writes to a new format. In your local development environment with a sample dataset, it runs perfectly. You deploy to production, feed it the real dataset, and 45 seconds later, the process dies.

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

This is the classic "Fast Producer, Slow Consumer" problem. In Node.js, if you read data faster than you can write it, the excess data has to go somewhere. Without flow control, that "somewhere" is your RAM. Here is the root cause of the crash and the exact pattern to manage backpressure manually using modern Node.js APIs.

The Root Cause: The Internal Buffer and `highWaterMark`

Node.js streams are not just event emitters; they are buffer managers. Every `Writable` stream has an internal buffer (`writableBuffer`). The size limit of this buffer is determined b...
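
The excerpt stops before the post's own code, so here is a minimal sketch of the general idea rather than the author's pattern. It assumes a hypothetical slow sink (`createSlowSink`, standing in for a database writer) and a hypothetical fast producer (`generateRows`). The only stream behavior it relies on is that `write()` returns `false` once the buffered amount reaches `highWaterMark`, and that `'drain'` fires when the buffer has emptied.

```javascript
const { Writable } = require('node:stream');
const { once } = require('node:events');

// Hypothetical slow sink standing in for a database writer.
function createSlowSink(rows) {
  return new Writable({
    objectMode: true,
    highWaterMark: 4, // write() returns false once 4 objects are buffered
    write(row, _encoding, callback) {
      // Simulate a slow insert.
      setTimeout(() => {
        rows.push(row);
        callback();
      }, 1);
    },
  });
}

// Hypothetical fast producer.
function* generateRows(count) {
  for (let i = 0; i < count; i += 1) {
    yield { id: i };
  }
}

// Writes every row, pausing whenever the sink reports a full buffer.
// Resolves with the number of times the producer had to wait.
async function writeAll(source, sink) {
  let pauses = 0;
  for await (const row of source) {
    if (!sink.write(row)) {
      pauses += 1;
      // Stop producing until the sink has flushed its buffer.
      await once(sink, 'drain');
    }
  }
  sink.end();
  await once(sink, 'finish');
  return pauses;
}
```

Calling `writeAll(generateRows(20), createSlowSink(rows))` keeps at most a handful of rows in memory at any moment, because the loop does not pull the next row until the sink has drained. Ignoring the return value of `write()` would let all rows pile up in the sink's buffer instead.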

Node.js Streams: Solving Heap Out of Memory with `stream.pipeline`

You have deployed a standard ETL job or a file upload service. It works flawlessly with 100MB files on your local machine. But in production, under load, or when processing a 5GB CSV, the pod crashes. You check the logs and see the dreaded V8 signature:

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

The immediate knee-jerk reaction is to increase `--max-old-space-size`. However, if your logic relies on `source.pipe(dest)`, throwing RAM at the problem is a band-aid, not a fix. The issue isn't just the file size; it is likely how your stream pipeline handles error propagation and lifecycle management when things go wrong.

The Root Cause: Why `.pipe()` Leaks Memory

The classic `readable.pipe(writable)` method manages backpressure: it pauses the readable stream when the writable stream’s high-water mark is reached. However, `.pipe()` has a fatal flaw regarding error propagation ...
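
The excerpt is cut off before the post's solution, so the following is only a sketch of what the title points at, not the author's code. It uses `pipeline` from `node:stream/promises`, which forwards an error from any stage to the returned promise and destroys the streams involved. The function name `runPipeline` and the uppercase transform are assumptions made for illustration.

```javascript
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Runs source -> transform -> sink.
// The returned promise rejects if any stage fails, and pipeline()
// destroys the streams so nothing is left half-open.
async function runPipeline(source, sink) {
  // Illustrative transform: uppercase each chunk.
  const toUpper = new Transform({
    transform(chunk, _encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });

  await pipeline(source, toUpper, sink);
}
```

With plain `.pipe()`, an error in the sink would need its own `'error'` handler on every stream plus manual cleanup. Here a single `try`/`catch` (or `.catch()`) around `runPipeline(...)` is enough to observe the failure.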