Multi-threaded Audio Processing in Node.js: Lessons from Building a High-throughput Voice AI System

How we scaled realtime audio processing to handle 80ms chunks with sub-500ms latency using Bun runtime and worker threads.

By Anthony · July 2025

The Event Loop Problem: Why Single-threaded Fails for Audio

JavaScript's event loop is well-designed for I/O-intensive applications, but it becomes a bottleneck for CPU-intensive workloads, such as realtime audio processing. Let's understand why:

The Event Loop Constraint

Node.js runs on a single main thread with an event loop that processes callbacks from various phases:

┌───────────────────────────┐
┌─>│           timers          │  ← setTimeout, setInterval
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │  ← I/O callbacks deferred to next iteration
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │  ← internal use only
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           poll            │  ← fetch new I/O events
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           check           │  ← setImmediate callbacks
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │  ← socket.destroy(), etc.
   └───────────────────────────┘

The Audio Processing Challenge

Our voice AI system needs to simultaneously:

Process incoming audio streams at 16kHz/16-bit (32KB/second per call)
Handle multiple audio tracks (human voice, AI responses, background effects)
Perform realtime operations like Fast Fourier Transforms, volume mixing, and format conversion
Maintain microsecond-precise audio synchronization across tracks
Scale to hundreds of concurrent calls without audio dropouts

The math is unforgiving. For a single phone call processing 80ms audio chunks:

// Per-call audio processing requirements
const SAMPLE_RATE = 16000; // Hz
const CHUNK_DURATION_MS = 80; // milliseconds
const BITS_PER_SAMPLE = 16;
const AUDIO_TRACKS = 4; // human, AI, background, effects

// Calculations
const samplesPerChunk = SAMPLE_RATE * (CHUNK_DURATION_MS / 1000); // 1,280 samples
const bytesPerChunk = samplesPerChunk * (BITS_PER_SAMPLE / 8); // 2,560 bytes
const totalSamplesPerChunk = samplesPerChunk * AUDIO_TRACKS; // 5,120 samples

// Scale to 100 concurrent calls
const totalSamplesPerTick = totalSamplesPerChunk * 100; // 512,000 samples every 80ms

// Processing budget: Must complete in <80ms to avoid audio dropouts
const processingBudget = 80; // milliseconds - hard realtime constraint
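
To make the constraint concrete, it helps to divide the 80ms window by the number of calls a thread has to serve. The sketch below does that arithmetic; `perCallBudgetMs` and its `threads` parameter are illustrative names, and it assumes calls are spread evenly across threads.

```typescript
// Sketch: how much CPU time each call gets per chunk window.
// perCallBudgetMs and its parameters are illustrative names.
function perCallBudgetMs(
  chunkDurationMs: number,
  concurrentCalls: number,
  threads: number = 1
): number {
  // Each thread must finish its share of calls within one chunk window.
  const callsPerThread = Math.ceil(concurrentCalls / threads);
  return chunkDurationMs / callsPerThread;
}

console.log(perCallBudgetMs(80, 100));    // 0.8 (ms per call on a single thread)
console.log(perCallBudgetMs(80, 100, 4)); // 3.2 (ms per call across 4 threads)
```

On a single thread, 100 concurrent calls leave well under a millisecond of processing time per call per chunk.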

Why the Event Loop Fails

When CPU-intensive audio processing blocks the event loop:

// This blocks the event loop for ~50ms
function processAudioChunk(audioBuffer: Buffer): Buffer {
  // Complex DSP operations
  const fftResult = performFFT(audioBuffer);           // ~15ms
  const filtered = applyNoiseReduction(fftResult);     // ~10ms
  const normalized = normalizeAudio(filtered);         // ~5ms
  const mixed = mixWithBackground(normalized);         // ~10ms
  const compressed = applyCompression(mixed);          // ~10ms
  return compressed;
}

// Meanwhile, other operations are starved:
setTimeout(() => {
  console.log("This timer is delayed!"); // Fires ~40ms late (at ~50ms instead of 10ms)
}, 10);

// WebSocket responses are delayed
websocket.on('message', (data) => {
  // This handler waits 50ms before processing
  handleIncomingMessage(data);
});

The event loop's single-threaded nature means audio processing blocks all other operations, causing:

Dropped audio frames when processing takes >80ms
Delayed WebSocket responses to telephony providers
Missed timer callbacks for realtime synchronization
Cascading latency across the entire system
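
This starvation is easy to reproduce. The sketch below schedules a 10ms timer and then spins the main thread for 50ms; `busyWait` and `measureTimerDelay` are illustrative names, and the spin loop stands in for synchronous DSP work. We would expect the timer to fire roughly 40ms late, though the exact figure depends on the machine.

```typescript
// Sketch: measure how late a timer fires when the main thread is busy.
// busyWait stands in for synchronous DSP work; both names are illustrative.
function busyWait(ms: number): number {
  const start = performance.now();
  while (performance.now() - start < ms) {
    // Spin: nothing else on this thread can run until the loop exits.
  }
  return performance.now() - start;
}

function measureTimerDelay(timerMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => {
      // Lateness = actual elapsed time minus the requested delay.
      resolve(performance.now() - scheduledAt - timerMs);
    }, timerMs);
    busyWait(blockMs); // Blocks before the timers phase gets a chance to run.
  });
}

measureTimerDelay(10, 50).then((lateMs) => {
  console.log(`10 ms timer fired ~${lateMs.toFixed(0)} ms late`);
});
```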

Threads vs Processes: The Architecture Decision

When moving beyond the single-threaded event loop, we had two options: child processes or worker threads. Let's examine why we chose threads.

Child Processes: The Traditional Approach

// Child process approach
import { spawn, ChildProcess } from 'child_process';

class AudioProcessorWithProcesses {
  private processes: Map<string, ChildProcess> = new Map();

  async processAudio(audioData: Buffer): Promise<Buffer> {
    // Spawn a new process for each audio chunk
    const child = spawn('node', ['./audio-processor.js']);

    return new Promise((resolve, reject) => {
      const chunks: Buffer[] = [];
      child.stdout.on('data', (chunk: Buffer) => chunks.push(chunk));
      // Surface spawn failures (e.g. missing binary) instead of hanging
      child.on('error', reject);
      child.on('close', (code) => {
        if (code === 0) {
          resolve(Buffer.concat(chunks));
        } else {
          reject(new Error(`Process exited with code ${code}`));
        }
      });

      // Send data via stdin, then close it so the child sees end-of-input
      child.stdin.end(audioData);
    });
  }
}

Process Advantages

Complete isolation: Process crashes don't affect the main application
Separate memory space: No shared memory corruption risk
Platform stability: Mature IPC mechanisms

Process Disadvantages

High overhead: Process creation takes 10–50ms
Memory duplication: Each process duplicates the V8 heap
IPC serialization: Data must be serialized/deserialized across process boundaries
Resource limits: OS limits on process count (typically ~1000s)

Worker Threads: The Modern Solution

// Worker thread approach
import { Worker, isMainThread, parentPort } from 'worker_threads';

class AudioProcessorWithThreads {
  private worker: Worker;
  private nextId = 1;
  private promises = new Map<number, { resolve: Function; reject: Function }>();

  constructor() {
    // Create a persistent worker thread
    this.worker = new Worker(__filename);
    // Handle responses from worker
    this.worker.on('message', (msg: any) => {
      const { id, result, error } = msg;
      const promise = this.promises.get(id);
      if (promise) {
        if (error) {
          promise.reject(new Error(error));
        } else {
          promise.resolve(result);
        }
        this.promises.delete(id);
      }
    });
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const id = this.nextId++;
      this.promises.set(id, { resolve, reject });
      // Send to worker thread. postMessage copies the Buffer via structured clone;
      // pass a transfer list or use a SharedArrayBuffer to avoid that copy
      this.worker.postMessage({
        id,
        method: 'processAudio',
        audioData
      });
    });
  }
}

// Worker thread code (same file, different execution context)
if (!isMainThread && parentPort) {
  parentPort.on('message', async (msg) => {
    const { id, method, audioData } = msg;
    try {
      // CPU-intensive processing happens here
      const result = await performAudioProcessing(audioData);
      parentPort.postMessage({ id, result });
    } catch (error) {
      parentPort.postMessage({ id, error: error.message });
    }
  });
}
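
A note on data transfer: `postMessage` serializes its payload with the structured clone algorithm, so a plain `Buffer` is copied unless its underlying `ArrayBuffer` is listed in the transfer list. The sketch below shows the difference using the global `structuredClone`, which applies the same algorithm; `toStandaloneArrayBuffer` is an illustrative helper name, not part of our codebase.

```typescript
// Sketch: copy vs. transfer under the structured clone algorithm,
// which is what worker.postMessage() uses for its payload.
// toStandaloneArrayBuffer is an illustrative helper name.
function toStandaloneArrayBuffer(view: Uint8Array): ArrayBuffer {
  // Node.js Buffers are often views into a larger shared pool, so copy
  // exactly this chunk's bytes into an ArrayBuffer we fully own.
  const out = new ArrayBuffer(view.byteLength);
  new Uint8Array(out).set(view);
  return out;
}

const chunk = new Uint8Array(2560); // one 80 ms chunk at 16 kHz / 16-bit
const owned = toStandaloneArrayBuffer(chunk);

// Default behaviour: the receiver gets a copy, the sender keeps its data.
const copied = structuredClone(owned);
console.log(owned.byteLength, copied.byteLength); // 2560 2560

// With a transfer list: ownership moves and the sender's buffer is detached.
const moved = structuredClone(owned, { transfer: [owned] });
console.log(owned.byteLength, moved.byteLength); // 0 2560

// The same transfer list works with a worker (no worker is created here):
// worker.postMessage({ id, audioData: owned }, [owned]);
```

Transferring suits one-way handoffs. When both threads need the same bytes, a SharedArrayBuffer is the better fit.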

Thread Advantages

Shared memory space: Direct buffer sharing without serialization
Low creation overhead: Thread creation takes <1ms
Efficient communication: SharedArrayBuffer for zero-copy data transfer
Resource efficiency: Threads live in one OS process; each worker still gets its own JavaScript heap and event loop, but avoids a separate process and IPC pipes

Thread Disadvantages

Shared memory risks: Potential for memory corruption
Limited isolation: Thread crashes can affect the main process
Debugging complexity: Race conditions and deadlocks

Why Threads Won for Audio Processing

For our realtime audio workload, threads provide critical advantages:

Shared Memory Performance: Audio buffers can be shared directly via SharedArrayBuffer
Low Latency: Messaging a persistent worker costs <1ms, versus the 10–50ms it takes to create a process
Memory Efficiency: No extra OS process per audio stream, and audio data can live in shared buffers instead of being copied between heaps
Resource Scaling: Can create thousands of threads vs hundreds of processes

Why We Chose Bun Over Node.js

Before diving into our multi-threading implementation, let's examine why we chose Bun as our runtime foundation, particularly for CPU-intensive workloads.

JavaScript Engine Performance

Node.js uses V8 (Google's JavaScript engine)

Optimized for Chrome's web workloads
Excellent JIT compilation for typical web applications
Heavy garbage collection pressure under high throughput

Bun uses JavaScriptCore (Apple's JavaScript engine)

Optimized for performance-critical applications
Lower memory overhead per JavaScript context
Better performance for CPU-intensive operations

Benchmark Results for Our Workload

Based on our internal benchmarks and this public data:

// Audio processing benchmark (1000 iterations)
const audioBuffer = Buffer.alloc(1280 * 2); // 80ms @ 16kHz

// Node.js performance
console.time('node-audio-processing');
for (let i = 0; i < 1000; i++) {
  processAudioChunk(audioBuffer);
}
console.timeEnd('node-audio-processing');
// Result: ~116 seconds

// Bun performance
console.time('bun-audio-processing');
for (let i = 0; i < 1000; i++) {
  processAudioChunk(audioBuffer);
}
console.timeEnd('bun-audio-processing');
// Result: ~46 seconds (2.5x faster)
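
A single process cannot measure both runtimes, so in practice the same script has to be run once under each. A minimal harness for that might look like the sketch below; `benchmark`, its warmup count, and the stand-in workload are illustrative assumptions rather than our actual benchmark code.

```typescript
// Sketch of a small benchmark helper; `benchmark` and its parameters are
// illustrative names, not part of Node.js or Bun.
interface BenchmarkResult {
  iterations: number;
  totalMs: number;
  meanMs: number;
}

function benchmark(
  fn: () => void,
  iterations: number,
  warmupIterations: number = 50
): BenchmarkResult {
  // Warm up so the JIT has compiled the hot path before timing starts.
  for (let i = 0; i < warmupIterations; i++) fn();

  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const totalMs = performance.now() - start;

  return { iterations, totalMs, meanMs: totalMs / iterations };
}

// Stand-in workload: scale every sample of one 80 ms chunk.
const benchSamples = new Int16Array(1280);
const benchResult = benchmark(() => {
  for (let i = 0; i < benchSamples.length; i++) {
    benchSamples[i] = (benchSamples[i] * 0.8) | 0;
  }
}, 1000);
console.log(`${benchResult.meanMs.toFixed(4)} ms per chunk`);
```

Running the identical file under each runtime and comparing `meanMs` keeps the comparison like-for-like.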

HTTP Server Performance (Critical for WebSocket connections)

Bun: 52,000+ requests/second
Node.js: ~14,000 requests/second
Improvement: roughly 3.7x the throughput (about 270% higher) for realtime telephony connections

Startup Performance

Bun: ~100ms cold start
Node.js: ~300ms cold start
Impact: Faster auto-scaling in our microservice architecture

The Memory Trade-off

Bun consumes more memory (+75% in our testing):

// Memory usage comparison for 100 concurrent audio streams
const memoryUsage = process.memoryUsage();

// Node.js
console.log(memoryUsage);
// { rss: 40MB, heapUsed: 25MB, heapTotal: 30MB }

// Bun
console.log(memoryUsage);
// { rss: 70MB, heapUsed: 44MB, heapTotal: 53MB }

This trade-off is acceptable for our use case because:

Memory is cheaper than CPU cycles in a distributed infrastructure
CPU-bound workloads benefit more from execution speed than memory efficiency
Horizontal scaling allows us to add more instances rather than optimize memory

The Architecture: Worker Thread Proxy Pattern

Our solution employs a worker thread proxy pattern that isolates CPU-intensive audio processing while maintaining a clean API for the main thread, leveraging JavaScript's event loop for I/O operations.

System Architecture

Main Thread (Event Loop)
├── WebSocket I/O (telephony providers)
├── HTTP Server (REST API)
├── Timer Management (synchronization)
├── AudioProcessorProxy (thread coordination)
│   ├── Worker Thread 1 (audio processing)
│   ├── Worker Thread 2 (audio processing)
│   └── Worker Thread N (audio processing)
└── Database I/O (call metadata)

Core Implementation

1. Main Thread Proxy (Event Loop Optimized)


import { Worker, isMainThread, parentPort } from 'worker_threads';

class AudioProcessorProxy {
  private worker: Worker;
  private nextRequestId = 1;
  private pendingRequests = new Map<number, {
    resolve: Function;
    reject: Function;
    timestamp: number;
  }>();

  constructor() {
    this.worker = new Worker(__filename);
    this.setupMessageHandling();
    this.setupErrorHandling();
  }

  private setupMessageHandling(): void {
    this.worker.on('message', (msg: WorkerResponse) => {
      const { requestId, result, error } = msg;
      const request = this.pendingRequests.get(requestId);

      if (request) {
        // Clean up request tracking
        this.pendingRequests.delete(requestId);

        // Track processing time
        const processingTime = Date.now() - request.timestamp;
        this.metrics.histogram('audio.processing.duration', processingTime);

        if (error) {
          request.reject(new Error(error));
        } else {
          request.resolve(result);
        }
      }
    });
  }

  // Non-blocking method that returns immediately
  async processAudioChunk(audioData: Buffer, options: ProcessingOptions): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const requestId = this.nextRequestId++;

      // Track request for response correlation
      this.pendingRequests.set(requestId, {
        resolve,
        reject,
        timestamp: Date.now()
      });

      // Send to worker thread - this is async and non-blocking
      this.worker.postMessage({
        requestId,
        method: 'processAudioChunk',
        audioData,
        options
      });
    });
  }

  async mixAudioTracks(tracks: AudioTrack[]): Promise<Buffer> {
    return this.sendToWorker('mixAudioTracks', { tracks });
  }

  async applyEffects(audioData: Buffer, effects: AudioEffect[]): Promise<Buffer> {
    return this.sendToWorker('applyEffects', { audioData, effects });
  }

  private async sendToWorker(method: string, args: any): Promise<any> {
    return new Promise((resolve, reject) => {
      const requestId = this.nextRequestId++;
      this.pendingRequests.set(requestId, {
        resolve,
        reject,
        timestamp: Date.now()
      });
      this.worker.postMessage({
        requestId,
        method,
        ...args
      });
    });
  }
}
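
The constructor above calls `setupErrorHandling()`, whose body is not shown. One plausible shape, assuming the goal is that no caller is left awaiting a promise after the worker dies, is to reject every in-flight request when the worker emits `error` or exits with a non-zero code. The sketch below isolates that bookkeeping; `PendingRequestTracker` is an illustrative name, not the actual implementation.

```typescript
// Sketch: bookkeeping that a setupErrorHandling() could rely on.
// PendingRequestTracker is an illustrative name.
type Settle = {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
};

class PendingRequestTracker {
  private pending = new Map<number, Settle>();
  private nextId = 1;

  add(settle: Settle): number {
    const id = this.nextId++;
    this.pending.set(id, settle);
    return id;
  }

  resolve(id: number, value: unknown): void {
    const entry = this.pending.get(id);
    if (!entry) return; // Unknown or already-settled request.
    this.pending.delete(id);
    entry.resolve(value);
  }

  // Called when the worker emits 'error' or exits unexpectedly, so that
  // callers are not left awaiting promises that can never settle.
  rejectAll(reason: Error): number {
    const count = this.pending.size;
    for (const entry of this.pending.values()) entry.reject(reason);
    this.pending.clear();
    return count;
  }

  get size(): number {
    return this.pending.size;
  }
}

// Wiring it to a worker (not executed here):
// worker.on('error', (err) => tracker.rejectAll(err));
// worker.on('exit', (code) => {
//   if (code !== 0) tracker.rejectAll(new Error(`Worker exited with code ${code}`));
// });
```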

2. Worker Thread Implementation (CPU-Intensive Processing)

// This code runs in the worker thread context
if (!isMainThread && parentPort) {
  class AudioProcessor {
    private audioBuffer: SharedArrayBuffer;
    private processingMetrics: Map<string, number> = new Map();

    constructor() {
      // Pre-allocate shared memory for audio processing
      this.audioBuffer = new SharedArrayBuffer(1024 * 1024 * 10); // 10MB
    }

    async processAudioChunk(audioData: Buffer, options: ProcessingOptions): Promise<Buffer> {
      // This is CPU-intensive work that would block the event loop
      const startTime = performance.now();
      try {
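        // Convert to Int16Array for processing. Respect byteOffset and length,
        // because audioData may be a view into a larger ArrayBuffer
        const samples = new Int16Array(
          audioData.buffer, audioData.byteOffset, Math.floor(audioData.byteLength / 2)
        );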

        // Apply processing pipeline
        let processedSamples = samples;
        if (options.enableNoiseReduction) {
          processedSamples = await this.applyNoiseReduction(processedSamples);
        }
        if (options.enableVolumeNormalization) {
          processedSamples = await this.normalizeVolume(processedSamples);
        }
        if (options.enableCompression) {
          processedSamples = await this.applyCompression(processedSamples);
        }

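        // Convert back to Buffer, covering only the bytes of this sample view
        const result = Buffer.from(
          processedSamples.buffer, processedSamples.byteOffset, processedSamples.byteLength
        );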

        // Track processing time
        const processingTime = performance.now() - startTime;
        this.processingMetrics.set('lastProcessingTime', processingTime);

        return result;
      } catch (error) {
        throw new Error(`Audio processing failed: ${error.message}`);
      }
    }

    private async applyNoiseReduction(samples: Int16Array): Promise<Int16Array> {
      // CPU-intensive FFT-based noise reduction
      const fftSize = 1024;
      const result = new Int16Array(samples.length);

      for (let i = 0; i < samples.length; i += fftSize) {
        const chunk = samples.slice(i, i + fftSize);
        // Apply FFT
        const frequencyDomain = this.fft(chunk);
        // Apply noise reduction filter
        const filtered = this.applyNoiseFilter(frequencyDomain);
        // Apply inverse FFT
        const timeDomain = this.ifft(filtered);
        // Copy back to result
        result.set(timeDomain, i);
      }

      return result;
    }

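    // Public, because the message handler below calls it on the processor
    async mixAudioTracks(tracks: AudioTrack[]): Promise<Buffer> {
      // Mix multiple audio tracks sample by sample
      if (tracks.length === 0) return Buffer.alloc(0);
      const maxLength = Math.max(...tracks.map(t => t.audioData.length));
      const mixedSamples = new Int16Array(Math.floor(maxLength / 2));

      for (const track of tracks) {
        const { audioData } = track;
        // Respect byteOffset and length: audioData may be a view into a larger ArrayBuffer
        const trackSamples = new Int16Array(
          audioData.buffer, audioData.byteOffset, Math.floor(audioData.byteLength / 2)
        );
        // ?? keeps an explicit volume of 0 (muted); || would turn it into 1.0
        const volume = track.volume ?? 1.0;

        for (let i = 0; i < trackSamples.length; i++) {
          // Mix with volume control and clipping prevention
          mixedSamples[i] = Math.max(-32768, Math.min(32767,
            mixedSamples[i] + (trackSamples[i] * volume)
          ));
        }
      }

      return Buffer.from(mixedSamples.buffer);
    }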

    // Placeholder for FFT implementation
    private fft(samples: Int16Array): Complex[] {
      // Implement Fast Fourier Transform
      // This is CPU-intensive and would block the event loop
      return [];
    }

    private ifft(frequencies: Complex[]): Int16Array {
      // Implement Inverse Fast Fourier Transform
      return new Int16Array(0);
    }
  }

  // Message handling in worker thread
  const processor = new AudioProcessor();
  parentPort.on('message', async (msg: WorkerRequest) => {
    const { requestId, method, ...args } = msg;

    try {
      // Route to appropriate processing method
      let result;
      switch (method) {
        case 'processAudioChunk':
          result = await processor.processAudioChunk(args.audioData, args.options);
          break;
        case 'mixAudioTracks':
          result = await processor.mixAudioTracks(args.tracks);
          break;
        case 'applyEffects':
          result = await processor.applyEffects(args.audioData, args.effects);
          break;
        default:
          throw new Error(`Unknown method: ${method}`);
      }

      // Send result back to main thread
      parentPort.postMessage({
        requestId,
        result
      });
    } catch (error) {
      // Send error back to main thread
      parentPort.postMessage({
        requestId,
        error: error.message
      });
    }
  });
}
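
The `fft` and `ifft` methods above are placeholders. For readers who want something concrete to start from, here is a sketch of an iterative radix-2 Cooley-Tukey transform. It is a textbook formulation rather than our production DSP code, the name `fftInPlace` is illustrative, and it assumes the input length is a power of two (as with `fftSize = 1024`).

```typescript
// Sketch: iterative radix-2 Cooley-Tukey FFT, operating in place on separate
// real/imaginary arrays. Length must be a power of two (e.g. 1024).
function fftInPlace(re: Float64Array, im: Float64Array, inverse = false): void {
  const n = re.length;
  if (n !== im.length || n === 0 || (n & (n - 1)) !== 0) {
    throw new Error("FFT length must be a non-zero power of two");
  }

  // Bit-reversal permutation.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      [re[i], re[j]] = [re[j], re[i]];
      [im[i], im[j]] = [im[j], im[i]];
    }
  }

  // Butterfly stages: combine sub-transforms of size len/2 into size len.
  const sign = inverse ? 1 : -1;
  for (let len = 2; len <= n; len <<= 1) {
    const angle = (sign * 2 * Math.PI) / len;
    const wLenRe = Math.cos(angle);
    const wLenIm = Math.sin(angle);
    for (let start = 0; start < n; start += len) {
      let wRe = 1;
      let wIm = 0;
      for (let k = 0; k < len / 2; k++) {
        const a = start + k;
        const b = a + len / 2;
        const tRe = re[b] * wRe - im[b] * wIm;
        const tIm = re[b] * wIm + im[b] * wRe;
        re[b] = re[a] - tRe;
        im[b] = im[a] - tIm;
        re[a] += tRe;
        im[a] += tIm;
        const nextRe = wRe * wLenRe - wIm * wLenIm;
        wIm = wRe * wLenIm + wIm * wLenRe;
        wRe = nextRe;
      }
    }
  }

  // Scale the inverse transform by 1/n so a round trip restores the input.
  if (inverse) {
    for (let i = 0; i < n; i++) {
      re[i] /= n;
      im[i] /= n;
    }
  }
}

// Usage: transform a short block of 16-bit samples and back again.
const spectrumRe = Float64Array.from(new Int16Array([1000, 0, -1000, 0, 1000, 0, -1000, 0]));
const spectrumIm = new Float64Array(spectrumRe.length);
fftInPlace(spectrumRe, spectrumIm);       // time domain -> frequency domain
fftInPlace(spectrumRe, spectrumIm, true); // frequency domain -> time domain
```

Working on `Float64Array` pairs avoids allocating one object per complex value, which matters in a hot loop.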

Advanced Threading Optimizations

1. SharedArrayBuffer for Zero-Copy Operations

For maximum performance, we use SharedArrayBuffer to eliminate serialization overhead:

class OptimizedAudioProcessor {
  private sharedBuffer: SharedArrayBuffer;
  private sharedView: Int16Array;

  constructor() {
    // Create shared memory accessible by both threads
    this.sharedBuffer = new SharedArrayBuffer(1024 * 1024 * 4); // 4MB
    this.sharedView = new Int16Array(this.sharedBuffer);
  }

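  async processWithSharedMemory(audioData: Buffer): Promise<Buffer> {
    // Copy audio data to shared memory (one copy in; the worker then edits in place)
    const audioSamples = new Int16Array(
      audioData.buffer, audioData.byteOffset, Math.floor(audioData.byteLength / 2)
    );
    this.sharedView.set(audioSamples, 0);

    // Send the shared buffer handle plus offset and length instead of the samples.
    // sendToWorker is the request/response helper shown in AudioProcessorProxy
    await this.sendToWorker('processSharedAudio', {
      sharedBuffer: this.sharedBuffer,
      offset: 0,
      length: audioSamples.length
    });

    // The worker wrote its result in place; copy it out of shared memory
    const processed = this.sharedView.slice(0, audioSamples.length);
    return Buffer.from(processed.buffer, processed.byteOffset, processed.byteLength);
  }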
}

// Worker thread processes shared memory directly
if (!isMainThread && parentPort) {
  parentPort.on('message', async (msg) => {
    if (msg.method === 'processSharedAudio') {
      const { offset, length } = msg;

      // Access shared memory directly - no serialization!
      const sharedView = new Int16Array(msg.sharedBuffer);
      const audioSamples = sharedView.subarray(offset, offset + length);

      // Process in-place
      for (let i = 0; i < audioSamples.length; i++) {
        audioSamples[i] = audioSamples[i] * 0.8; // Apply volume
      }

      // Result is already in shared memory
      parentPort.postMessage({
        requestId: msg.requestId,
        result: 'processed'
      });
    }
  });
}
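
The example above always writes at offset 0, so nothing stops the main thread from overwriting a chunk the worker is still processing. One common way to coordinate access is a ring buffer whose read and write indices are updated with `Atomics`. The sketch below assumes a single producer and a single consumer; `SharedSampleRing` and its memory layout are illustrative, not our production design.

```typescript
// Sketch: single-producer / single-consumer ring buffer over one
// SharedArrayBuffer. SharedSampleRing and its layout are illustrative.
// Layout: [writeIndex: Int32][readIndex: Int32][samples: Int16 * capacity]
class SharedSampleRing {
  readonly buffer: SharedArrayBuffer;
  private readonly indices: Int32Array; // [0] = write index, [1] = read index
  private readonly samples: Int16Array;

  constructor(buffer: SharedArrayBuffer) {
    this.buffer = buffer;
    this.indices = new Int32Array(buffer, 0, 2);
    this.samples = new Int16Array(buffer, 8);
  }

  static create(capacitySamples: number): SharedSampleRing {
    return new SharedSampleRing(new SharedArrayBuffer(8 + capacitySamples * 2));
  }

  // Producer side. Returns false (writing nothing) if the chunk does not fit.
  write(chunk: Int16Array): boolean {
    const capacity = this.samples.length;
    const w = Atomics.load(this.indices, 0);
    const r = Atomics.load(this.indices, 1);
    const used = (w - r + capacity) % capacity;
    // One slot stays empty so that w === r always means "empty".
    if (chunk.length > capacity - 1 - used) return false;
    for (let i = 0; i < chunk.length; i++) {
      this.samples[(w + i) % capacity] = chunk[i];
    }
    // Publish the new write index only after the samples are in place.
    Atomics.store(this.indices, 0, (w + chunk.length) % capacity);
    return true;
  }

  // Consumer side. Copies up to out.length samples and returns the count.
  read(out: Int16Array): number {
    const capacity = this.samples.length;
    const w = Atomics.load(this.indices, 0);
    const r = Atomics.load(this.indices, 1);
    const count = Math.min((w - r + capacity) % capacity, out.length);
    for (let i = 0; i < count; i++) {
      out[i] = this.samples[(r + i) % capacity];
    }
    // Release the consumed slots back to the producer.
    Atomics.store(this.indices, 1, (r + count) % capacity);
    return count;
  }
}

const ring = SharedSampleRing.create(16000); // one second of audio at 16 kHz
ring.write(new Int16Array([100, -200, 300]));
// In the worker: new SharedSampleRing(msg.sharedBuffer).read(scratch)
```

The producer only advances the write index and the consumer only advances the read index, so neither side needs a lock.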

2. Thread Pool Management

For handling multiple concurrent audio streams:

class AudioProcessorPool {
  private workers: Worker[] = [];
  private roundRobinIndex = 0;
  private workerLoad = new Map<Worker, number>();

  constructor(poolSize: number = 4) {
    for (let i = 0; i < poolSize; i++) {
      const worker = new Worker(__filename);
      this.workers.push(worker);
      this.workerLoad.set(worker, 0);
      worker.on('message', (msg) => {
        // Decrease load count when work completes
        const currentLoad = this.workerLoad.get(worker) || 0;
        this.workerLoad.set(worker, Math.max(0, currentLoad - 1));
      });
    }
  }

  private getNextWorker(): Worker {
    // Load balancing: choose worker with lowest load
    let minLoad = Infinity;
    let selectedWorker = this.workers[0];

    for (const worker of this.workers) {
      const load = this.workerLoad.get(worker) || 0;
      if (load < minLoad) {
        minLoad = load;
        selectedWorker = worker;
      }
    }

    // Increase load count
    this.workerLoad.set(selectedWorker, minLoad + 1);

    return selectedWorker;
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const worker = this.getNextWorker();

    return new Promise((resolve, reject) => {
      const requestId = Date.now() + Math.random();
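      const timeout = setTimeout(() => {
        // Detach the listener so timed-out requests do not leak handlers
        worker.off('message', messageHandler);
        reject(new Error('Worker processing timeout'));
      }, 5000);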

      const messageHandler = (msg: any) => {
        if (msg.requestId === requestId) {
          clearTimeout(timeout);
          worker.off('message', messageHandler);
          if (msg.error) {
            reject(new Error(msg.error));
          } else {
            resolve(msg.result);
          }
        }
      };

      worker.on('message', messageHandler);
      worker.postMessage({
        requestId,
        method: 'processAudio',
        audioData
      });
    });
  }
}
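
The default `poolSize` of 4 above is a fixed guess. A common alternative is to derive it from the number of CPU cores while leaving headroom for the main thread. The sketch below uses `os.cpus()` from Node's `os` module; `defaultPoolSize` and `reservedCores` are illustrative names, and the right reservation depends on the deployment.

```typescript
import * as os from "node:os";

// Sketch: derive a default pool size from the machine's core count.
// defaultPoolSize and reservedCores are illustrative names.
function defaultPoolSize(
  reservedCores: number = 1,
  cores: number = os.cpus().length
): number {
  // Leave headroom for the main thread's event loop; never go below 1 worker.
  return Math.max(1, cores - reservedCores);
}

console.log(`Using ${defaultPoolSize()} audio workers`);
// e.g. new AudioProcessorPool(defaultPoolSize())
```

More workers than cores rarely helps a CPU-bound workload, because the extra threads only compete for the same cores.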

3. Memory Management and Garbage Collection

Worker threads require careful memory management:

class MemoryManagedAudioProcessor {
  // null marks a slot whose buffer is currently checked out
  private bufferPool: (Buffer | null)[] = [];
  private maxPoolSize = 100;
  private gcInterval: NodeJS.Timeout;

  constructor() {
    // Pre-allocate buffer pool
    for (let i = 0; i < this.maxPoolSize; i++) {
      this.bufferPool.push(Buffer.alloc(1024 * 4)); // 4KB buffers
    }

    // Periodic garbage collection. In Node.js, global.gc is only defined when
    // the process is started with --expose-gc; otherwise this is a no-op
    this.gcInterval = setInterval(() => {
      if (global.gc) {
        global.gc();
      }
    }, 10000); // Every 10 seconds
  }

  private getBuffer(size: number): Buffer {
    // Reuse existing buffers when possible
    for (let i = 0; i < this.bufferPool.length; i++) {
      const buffer = this.bufferPool[i];
      if (buffer && buffer.length >= size) {
        this.bufferPool[i] = null; // Mark as used
        // Hand out the whole buffer so it returns to the pool at full size
        return buffer;
      }
    }

    // Allocate a new buffer if no pooled buffer is large enough
    return Buffer.alloc(size);
  }

  private returnBuffer(buffer: Buffer): void {
    // Return buffer to pool for reuse
    for (let i = 0; i < this.bufferPool.length; i++) {
      if (!this.bufferPool[i]) {
        this.bufferPool[i] = buffer;
        return;
      }
    }
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const pooled = this.getBuffer(audioData.length);
    // View over just the bytes this chunk needs; shares memory with pooled
    const workBuffer = pooled.subarray(0, audioData.length);
    try {
      // Process audio using pooled buffer
      audioData.copy(workBuffer);

      // Perform processing...
      const result = await this.performProcessing(workBuffer);

      // Copy the result out: it may alias the pooled buffer, which is about
      // to be handed to another request
      return Buffer.from(result);
    } finally {
      // Always return buffer to pool
      this.returnBuffer(pooled);
    }
  }

  destroy(): void {
    clearInterval(this.gcInterval);
    this.bufferPool = [];
  }
}

Integration with Realtime Voice AI Pipeline

Our worker thread architecture integrates seamlessly with the broader voice AI pipeline, leveraging the event loop for I/O and threads for CPU-intensive work:

// Main thread handles I/O and coordination

class VoiceCallSession { // illustrative class name
  private audioProcessor: AudioProcessorProxy;
  private websocket: WebSocket;
  private llmClient: LLMClient;
  private ttsClient: TTSClient;

  constructor() {
    this.audioProcessor = new AudioProcessorProxy();
    this.setupWebSocketHandling();
  }

  private setupWebSocketHandling(): void {
    // Event loop handles WebSocket I/O efficiently
    this.websocket.on('message', async (audioData: Buffer) => {
      // This is non-blocking - audio processing happens in worker thread
      const processedAudio = await this.audioProcessor.processAudioChunk(audioData, {
        enableNoiseReduction: true,
        enableVolumeNormalization: true
      });

      // Run transcription and voice-activity detection concurrently on the processed audio
      const [transcription, vadResult] = await Promise.all([
        this.transcribeAudio(processedAudio),
        this.analyzeVoiceActivity(processedAudio)
      ]);

      if (vadResult.isSpeaking) {
        // Generate AI response
        const aiResponse = await this.llmClient.generateResponse(transcription);

        // Convert to audio (also uses worker thread)
        const aiAudio = await this.ttsClient.generateAudio(aiResponse);

        // Mix with background audio
        const mixedAudio = await this.audioProcessor.mixAudioTracks([
          { audioData: aiAudio, volume: 0.8 },
          { audioData: this.backgroundAudio, volume: 0.2 }
        ]);

        // Send back via WebSocket (event loop I/O)
        this.websocket.send(mixedAudio);
      }
    });
  }

  private async transcribeAudio(audioData: Buffer): Promise<string> {
    // I/O operation - handled by event loop
    return await this.makeHTTPRequest('/transcribe', audioData);
  }

  private async analyzeVoiceActivity(audioData: Buffer): Promise<VADResult> {
    // CPU-intensive operation - delegated to worker thread
    return await this.audioProcessor.analyzeVoiceActivity(audioData);
  }
}

Performance Monitoring and Observability

We instrument our worker threads with comprehensive metrics to understand performance characteristics using Datadog. Below is a provider-agnostic implementation:

class InstrumentedAudioProcessor {
  private metrics = {
    processedChunks: 0,
    averageProcessingTime: 0,
    memoryUsage: 0,
    errorRate: 0
  };

  constructor() {
    // Monitor worker thread performance
    setInterval(() => {
      this.reportMetrics();
    }, 1000);
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const startTime = performance.now();
    const initialMemory = process.memoryUsage().heapUsed;
    try {
      const result = await this.performProcessing(audioData);

      // Track success metrics
      const processingTime = performance.now() - startTime;
      this.updateMetrics(processingTime, true);
      return result;
    } catch (error) {
      // Track error metrics
      this.updateMetrics(0, false);
      throw error;
    } finally {
      // Monitor memory usage
      const finalMemory = process.memoryUsage().heapUsed;
      this.metrics.memoryUsage = finalMemory - initialMemory;
    }
  }

  private updateMetrics(processingTime: number, success: boolean): void {
    this.metrics.processedChunks++;

    if (success) {
      // Exponential moving average for processing time
      this.metrics.averageProcessingTime =
        (this.metrics.averageProcessingTime * 0.9) + (processingTime * 0.1);
    }

    // Exponential moving average of failures (1) and successes (0), updated on
    // every chunk so the rate decays again once errors stop
    this.metrics.errorRate =
      (this.metrics.errorRate * 0.9) + ((success ? 0 : 1) * 0.1);
  }

  private reportMetrics(): void {
    console.log('Worker Thread Metrics:', {
      processedChunks: this.metrics.processedChunks,
      averageProcessingTime: `${this.metrics.averageProcessingTime.toFixed(2)}ms`,
      memoryUsage: `${(this.metrics.memoryUsage / 1024 / 1024).toFixed(2)}MB`,
      errorRate: `${(this.metrics.errorRate * 100).toFixed(2)}%`
    });
  }
}
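
A moving average hides tail latency, and for a hard 80ms budget the slowest chunks matter most. A small sliding window with percentile lookup is one way to expose p95 or p99 alongside the average. The sketch below uses the nearest-rank method; `LatencyWindow` is an illustrative name and not a Datadog or Node.js API.

```typescript
// Sketch: fixed-size window of recent latencies with percentile lookup.
// LatencyWindow is an illustrative name, not a Datadog or Node.js API.
class LatencyWindow {
  private readonly values: number[] = [];
  private readonly maxSize: number;

  constructor(maxSize: number = 1000) {
    this.maxSize = maxSize;
  }

  record(ms: number): void {
    this.values.push(ms);
    // Drop the oldest sample once the window is full.
    if (this.values.length > this.maxSize) this.values.shift();
  }

  // Nearest-rank percentile for p in (0, 100]. Returns NaN when empty.
  percentile(p: number): number {
    if (this.values.length === 0) return NaN;
    const sorted = [...this.values].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
  }
}

const latencies = new LatencyWindow();
for (const ms of [12, 15, 11, 14, 95, 13, 12, 16, 14, 13]) latencies.record(ms);
console.log(latencies.percentile(50), latencies.percentile(99)); // 13 95
```

In this sample the median looks healthy at 13ms, while the p99 of 95ms reveals a chunk that blew the 80ms budget.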

Production Results and Lessons Learned

Performance Achievements

Our multi-threaded audio processing system now handles:

500+ concurrent voice calls across our infrastructure
<80ms processing latency for realtime audio chunks
99.9% uptime for the audio processing pipeline
<1% CPU usage per call on our worker threads
2.5x performance improvement over single-threaded Node.js

Key Lessons Learned

1. Event Loop + Worker Threads = Optimal Architecture

The combination of JavaScript's event loop for I/O operations and worker threads for CPU-intensive processing provides the best of both worlds:

Non-blocking I/O for WebSocket connections and database operations
Parallel processing for audio DSP operations
Resource efficiency through proper separation of concerns

2. Bun's Performance Gains Are Real and Significant

Our switch to Bun provided measurable improvements:

2.5x faster execution for audio processing workloads
Roughly 3.7x the HTTP throughput for WebSocket connections
Faster startup times for our microservice architecture
Lower garbage collection pressure under high load

3. SharedArrayBuffer Is a Game-Changer

Using SharedArrayBuffer for audio data eliminates serialization overhead:

Zero-copy data transfer between threads
Predictable performance characteristics
Reduced memory allocation pressure

4. Thread Pools Require Careful Management

Worker thread pools need sophisticated load balancing:

Round-robin scheduling can cause hotspots when chunks vary in processing cost
Load-based distribution provides better balance
Graceful degradation when workers fail

5. Memory Management Is Critical

Pre-allocating buffers and using object pools prevents:

Garbage collection pauses during audio processing
Memory leaks in long-running audio streams
Buffer allocation overhead in hot paths

Debugging and Troubleshooting

Common issues we encountered and solutions:

// Issue: Worker thread deadlocks
class DeadlockSafeProcessor {
  private requestTimeout = 5000; // 5 second timeout

  async processWithTimeout(audioData: Buffer): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        reject(new Error('Worker thread timeout - possible deadlock'));
      }, this.requestTimeout);
      this.processAudio(audioData)
        .then(resolve)
        .catch(reject)
        .finally(() => clearTimeout(timer));
    });
  }
}

// Issue: Memory leaks in worker threads
class MemoryLeakSafeProcessor {
  private cleanupInterval: NodeJS.Timeout;

  constructor() {
    // Periodic cleanup
    this.cleanupInterval = setInterval(() => {
      this.cleanup();
    }, 30000); // Every 30 seconds
  }

  private cleanup(): void {
    // Force garbage collection
    if (global.gc) {
      global.gc();
    }

    // Clear any lingering references
    this.clearCaches();
  }

  destroy(): void {
    clearInterval(this.cleanupInterval);
    this.cleanup();
  }
}

Conclusion

Multi-threaded audio processing with worker threads represents a significant architectural evolution from single-threaded Node.js applications. By combining JavaScript's excellent event loop for I/O operations with worker threads for CPU-intensive tasks, and choosing Bun for its superior performance characteristics, we've built a system that scales to handle realtime voice AI at production scale.

The key insights from our implementation:

Understand your workload: CPU-intensive tasks need threads, I/O tasks need the event loop
Choose the right runtime: Bun's performance advantages compound at scale
Optimize data transfer: SharedArrayBuffer eliminates serialization overhead
Monitor aggressively: Worker threads require different monitoring strategies
Plan for failure: Graceful degradation and timeouts are essential

The patterns shown here are broadly applicable to any high-throughput, low-latency system that combines I/O operations with CPU-intensive processing.


Disclaimer: This post represents our engineering approach as of 2025. Technologies, benchmarks, and implementation details may evolve as we continue to optimize our systems.