
Multi-threaded Audio Processing in Node.js: Lessons from Building a High-throughput Voice AI System

How we scaled realtime audio processing to handle 80ms chunks with sub-500ms latency using the Bun runtime and worker threads.

By Anthony · July 2025

The Event Loop Problem: Why Single-threaded Fails for Audio

JavaScript's event loop is well-designed for I/O-intensive applications, but it becomes a bottleneck for CPU-intensive workloads, such as realtime audio processing. Let's understand why:

The Event Loop Constraint

Node.js runs on a single main thread with an event loop that processes callbacks from various phases:

   ┌───────────────────────────┐
┌─>│           timers          │  ← setTimeout, setInterval
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │  ← I/O callbacks deferred to next iteration
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │  ← internal use only
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           poll            │  ← fetch new I/O events
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           check           │  ← setImmediate callbacks
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │  ← socket.destroy(), etc.
   └───────────────────────────┘

The Audio Processing Challenge

Our voice AI system needs to simultaneously:

The math is unforgiving. For a single phone call processing 80ms audio chunks:

// Per-call audio processing requirements
const SAMPLE_RATE = 16000; // Hz
const CHUNK_DURATION_MS = 80; // milliseconds
const BITS_PER_SAMPLE = 16;
const AUDIO_TRACKS = 4; // human, AI, background, effects

// Calculations
const samplesPerChunk = SAMPLE_RATE * (CHUNK_DURATION_MS / 1000); // 1,280 samples
const bytesPerChunk = samplesPerChunk * (BITS_PER_SAMPLE / 8); // 2,560 bytes
const totalSamplesPerChunk = samplesPerChunk * AUDIO_TRACKS; // 5,120 samples

// Scale to 100 concurrent calls
const totalSamplesPerTick = totalSamplesPerChunk * 100; // 512,000 samples every 80ms

// Processing budget: Must complete in <80ms to avoid audio dropouts
const processingBudget = 80; // milliseconds - hard realtime constraint
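To see how tight that budget is, the figures above can be turned into a rough capacity estimate. The sketch below is only an illustration: `costPerChunkMs` is an assumed, separately measured CPU cost per chunk (the 50ms used here is a stand-in, not a measured constant), and both helper functions are hypothetical names.

```typescript
// Rough capacity estimate for a fixed-interval audio pipeline.
// costPerChunkMs is an assumption: measure it for your own DSP chain.

/** How many calls one thread can serve if each chunk costs costPerChunkMs of CPU. */
function maxCallsPerThread(chunkIntervalMs: number, costPerChunkMs: number): number {
  return Math.floor(chunkIntervalMs / costPerChunkMs);
}

/** How many fully busy threads are needed to finish every chunk within one interval. */
function threadsNeeded(calls: number, chunkIntervalMs: number, costPerChunkMs: number): number {
  const cpuMsPerInterval = calls * costPerChunkMs;
  return Math.ceil(cpuMsPerInterval / chunkIntervalMs);
}

// With an illustrative 50ms of DSP per 80ms chunk:
console.log(maxCallsPerThread(80, 50)); // 1
console.log(threadsNeeded(100, 80, 50)); // 63
```

Under that assumption a single thread keeps up with exactly one call, and 100 calls need 5,000ms of CPU time every 80ms, which no single event loop can provide.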

Why the Event Loop Fails

When CPU-intensive audio processing blocks the event loop:

// This blocks the event loop for ~50ms
function processAudioChunk(audioBuffer: Buffer): Buffer {
  // Complex DSP operations
  const fftResult = performFFT(audioBuffer);           // ~15ms
  const filtered = applyNoiseReduction(fftResult);     // ~10ms
  const normalized = normalizeAudio(filtered);         // ~5ms
  const mixed = mixWithBackground(normalized);         // ~10ms
  const compressed = applyCompression(mixed);          // ~10ms
  return compressed;
}

// Meanwhile, other operations are starved:
setTimeout(() => {
  console.log("This timer is delayed!"); // Due at 10ms, but cannot fire until the ~50ms block ends
}, 10);

// WebSocket responses are delayed
websocket.on('message', (data) => {
  // This handler waits 50ms before processing
  handleIncomingMessage(data);
});
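The starvation is easy to reproduce. This minimal sketch uses a busy-wait as a stand-in for the DSP work (both `blockFor` and `measureTimerLateness` are hypothetical helpers) and reports how far past its due time a timer fires.

```typescript
// Measures how late a timer fires when synchronous work blocks the event loop.

/** Busy-waits for roughly `ms` milliseconds, standing in for CPU-bound DSP. */
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread in the meantime
  }
}

/** Resolves with how many ms past its due time the timer actually fired. */
function measureTimerLateness(timerMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      resolve(Date.now() - scheduledAt - timerMs);
    }, timerMs);
    blockFor(blockMs); // runs to completion before the timer callback can be dispatched
  });
}

measureTimerLateness(10, 50).then((lateMs) => {
  console.log(`10ms timer fired about ${lateMs}ms late`);
});
```

With a 50ms block and a 10ms timer, the reported lateness should land at roughly 40ms or more, depending on what else the machine is doing.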

The event loop's single-threaded nature means audio processing blocks all other operations, causing:

Threads vs Processes: The Architecture Decision

When moving beyond the single-threaded event loop, we had two options: child processes or worker threads. Let's examine why we chose threads.

Child Processes: The Traditional Approach

// Child process approach
import { spawn, ChildProcess } from 'child_process';

class AudioProcessorWithProcesses {
  private processes: Map<string, ChildProcess> = new Map();

  async processAudio(audioData: Buffer): Promise<Buffer> {
    // Spawn a new process for each audio chunk
    const child = spawn('node', ['./audio-processor.js']);

    // Send data via stdin, then close it so the child sees end-of-input
    child.stdin.end(audioData);

    return new Promise((resolve, reject) => {
      const chunks: Buffer[] = [];
      // Collect the processed audio from stdout
      child.stdout.on('data', (chunk: Buffer) => chunks.push(chunk));
      // Spawn failures (e.g. a missing binary) surface as 'error'
      child.on('error', reject);
      child.on('close', (code) => {
        if (code === 0) {
          resolve(Buffer.concat(chunks));
        } else {
          reject(new Error(`Process exited with code ${code}`));
        }
      });
    });
  }
}

Process Advantages

Process Disadvantages

Worker Threads: The Modern Solution

// Worker thread approach
import { Worker, isMainThread, parentPort } from 'worker_threads';

class AudioProcessorWithThreads {
  private worker: Worker;
  private nextId = 1;
  private promises = new Map<number, { resolve: Function; reject: Function }>();

  constructor() {
    // Create a persistent worker thread
    this.worker = new Worker(__filename);
    // Handle responses from worker
    this.worker.on('message', (msg: any) => {
      const { id, result, error } = msg;
      const promise = this.promises.get(id);
      if (promise) {
        if (error) {
          promise.reject(new Error(error));
        } else {
          promise.resolve(result);
        }
        this.promises.delete(id);
      }
    });
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const id = this.nextId++;
      this.promises.set(id, { resolve, reject });
      // Send to worker thread. postMessage copies the Buffer via structured clone;
      // pass a transfer list or use SharedArrayBuffer to avoid the copy.
      this.worker.postMessage({
        id,
        method: 'processAudio',
        audioData
      });
    });
  }
}

// Worker thread code (same file, different execution context)
if (!isMainThread && parentPort) {
  parentPort.on('message', async (msg) => {
    const { id, method, audioData } = msg;
    try {
      // CPU-intensive processing happens here
      const result = await performAudioProcessing(audioData);
      parentPort.postMessage({ id, result });
    } catch (error) {
      parentPort.postMessage({ id, error: (error as Error).message });
    }
  });
}

Thread Advantages

Thread Disadvantages

Why Threads Won for Audio Processing

For our realtime audio workload, threads provide critical advantages:

  1. Shared Memory Performance: Audio buffers can be shared directly via SharedArrayBuffer
  2. Low Latency: Thread communication overhead is <1ms vs 10–50ms for processes
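  3. Memory Efficiency: Workers live in one process and can share audio data through SharedArrayBuffer instead of each process holding its own copy (each worker still has its own JavaScript heap)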
  4. Resource Scaling: Can create thousands of threads vs hundreds of processes
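How a buffer crosses the thread boundary matters as much as the thread itself. The following sketch compares the three options using a `MessageChannel` from `worker_threads`, so it runs without a separate worker file; `worker.postMessage` follows the same rules. The function name and the returned field names are hypothetical.

```typescript
import { MessageChannel } from 'node:worker_threads';

// Compares the three ways a PCM buffer can cross a thread boundary.
function compareTransferModes(byteLength: number) {
  const { port1, port2 } = new MessageChannel();

  // 1. Plain ArrayBuffer: structured clone copies the bytes; the sender keeps its own.
  const cloned = new ArrayBuffer(byteLength);
  port1.postMessage({ pcm: cloned });

  // 2. Transfer list: ownership moves without a copy; the sender's buffer is detached.
  const transferred = new ArrayBuffer(byteLength);
  port1.postMessage({ pcm: transferred }, [transferred]);

  // 3. SharedArrayBuffer: both sides see the same memory; nothing is copied or detached.
  const shared = new SharedArrayBuffer(byteLength);
  port1.postMessage({ pcm: shared });

  port1.close();
  port2.close();

  return {
    clonedBytesLeft: cloned.byteLength,
    transferredBytesLeft: transferred.byteLength,
    sharedBytesLeft: shared.byteLength,
  };
}

console.log(compareTransferModes(2560));
```

The transferred buffer should report 0 bytes on the sending side afterwards, while the cloned and shared buffers keep their 2,560 bytes. Transfer suits one-way handoff of a chunk; shared memory suits in-place processing where both threads need access.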

Why We Chose Bun Over Node.js

Before diving into our multi-threading implementation, let's examine why we chose Bun as our runtime foundation, particularly for CPU-intensive workloads.

JavaScript Engine Performance

Node.js uses V8 (Google's JavaScript engine)

Bun uses JavaScriptCore (Apple's JavaScript engine)

Benchmark Results for Our Workload

Based on our internal benchmarks and this public data:

// Audio processing benchmark (1000 iterations)
const audioBuffer = Buffer.alloc(1280 * 2); // 80ms @ 16kHz

// Node.js performance
console.time('node-audio-processing');
for (let i = 0; i < 1000; i++) {
  processAudioChunk(audioBuffer);
}
console.timeEnd('node-audio-processing');
// Result: ~116 seconds

// Bun performance
console.time('bun-audio-processing');
for (let i = 0; i < 1000; i++) {
  processAudioChunk(audioBuffer);
}
console.timeEnd('bun-audio-processing');
// Result: ~46 seconds (2.5x faster)
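When comparing runtimes, the measurement method matters. A minimal harness, assuming nothing beyond the global `performance.now()`, might warm up the JIT before timing; the `benchmark` helper and the stand-in workload below are hypothetical and are not the benchmark that produced the numbers above.

```typescript
// Minimal benchmark helper: warm up first so the JIT has compiled the hot path,
// then time a fixed number of iterations.
function benchmark(fn: () => void, iterations: number, warmupIterations = 100) {
  for (let i = 0; i < warmupIterations; i++) fn();

  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const totalMs = performance.now() - start;

  return { iterations, totalMs, meanMs: totalMs / iterations };
}

// Stand-in workload (hypothetical): scale every 16-bit sample in an 80ms chunk.
const chunk = new Int16Array(1280).fill(1000);
const stats = benchmark(() => {
  for (let i = 0; i < chunk.length; i++) chunk[i] = (chunk[i] * 0.8) | 0;
}, 1000);

console.log(`${stats.meanMs.toFixed(4)}ms per chunk over ${stats.iterations} iterations`);
```

Running the same file under both runtimes keeps the comparison like for like.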

HTTP Server Performance (Critical for WebSocket connections)

Startup Performance

The Memory Trade-off

Bun consumes more memory (+75% in our testing)

// Memory usage comparison for 100 concurrent audio streams
const memoryUsage = process.memoryUsage();

// Node.js
console.log(memoryUsage);
// { rss: 40MB, heapUsed: 25MB, heapTotal: 30MB }

// Bun
console.log(memoryUsage);
// { rss: 70MB, heapUsed: 44MB, heapTotal: 53MB }

This trade-off is acceptable for our use case because:

The Architecture: Worker Thread Proxy Pattern

Our solution employs a worker thread proxy pattern that isolates CPU-intensive audio processing while maintaining a clean API for the main thread, leveraging JavaScript's event loop for I/O operations.

System Architecture

Main Thread (Event Loop)
├── WebSocket I/O (telephony providers)
├── HTTP Server (REST API)
├── Timer Management (synchronization)
├── AudioProcessorProxy (thread coordination)
│   ├── Worker Thread 1 (audio processing)
│   ├── Worker Thread 2 (audio processing)
│   └── Worker Thread N (audio processing)
└── Database I/O (call metadata)

Core Implementation

1. Main Thread Proxy (Event Loop Optimized)


import { Worker, isMainThread, parentPort } from 'worker_threads';

class AudioProcessorProxy {
  private worker: Worker;
  private nextRequestId = 1;
  private pendingRequests = new Map<number, {
    resolve: Function;
    reject: Function;
    timestamp: number;
  }>();

  constructor() {
    this.worker = new Worker(__filename);
    this.setupMessageHandling();
    this.setupErrorHandling();
  }

  private setupMessageHandling(): void {
    this.worker.on('message', (msg: WorkerResponse) => {
      const { requestId, result, error } = msg;
      const request = this.pendingRequests.get(requestId);

      if (request) {
        // Clean up request tracking
        this.pendingRequests.delete(requestId);

        // Track processing time
        const processingTime = Date.now() - request.timestamp;
        this.metrics.histogram('audio.processing.duration', processingTime);

        if (error) {
          request.reject(new Error(error));
        } else {
          request.resolve(result);
        }
      }
    });
  }

  // Non-blocking method that returns immediately
  async processAudioChunk(audioData: Buffer, options: ProcessingOptions): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const requestId = this.nextRequestId++;

      // Track request for response correlation
      this.pendingRequests.set(requestId, {
        resolve,
        reject,
        timestamp: Date.now()
      });

      // Send to worker thread - this is async and non-blocking
      this.worker.postMessage({
        requestId,
        method: 'processAudioChunk',
        audioData,
        options
      });
    });
  }

  async mixAudioTracks(tracks: AudioTrack[]): Promise<Buffer> {
    return this.sendToWorker('mixAudioTracks', { tracks });
  }

  async applyEffects(audioData: Buffer, effects: AudioEffect[]): Promise<Buffer> {
    return this.sendToWorker('applyEffects', { audioData, effects });
  }

  private async sendToWorker(method: string, args: any): Promise<any> {
    return new Promise((resolve, reject) => {
      const requestId = this.nextRequestId++;
      this.pendingRequests.set(requestId, {
        resolve,
        reject,
        timestamp: Date.now()
      });
      this.worker.postMessage({
        requestId,
        method,
        ...args
      });
    });
  }
}
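The proxy calls `setupErrorHandling()` without showing it. One plausible job for it is to fail every in-flight request when the worker crashes or exits, so callers do not wait forever. The sketch below isolates that bookkeeping in a hypothetical `PendingRequests` helper; it is an assumption about what such handling could look like, not the original implementation.

```typescript
// Hypothetical helper: tracks in-flight requests so they can be settled
// one at a time (normal responses) or all at once (worker crash or exit).
type Settler<T> = { resolve: (value: T) => void; reject: (reason: Error) => void };

class PendingRequests<T> {
  private nextId = 1;
  private pending = new Map<number, Settler<T>>();

  /** Registers a request and returns its id plus the promise the caller awaits. */
  create(): { id: number; promise: Promise<T> } {
    const id = this.nextId++;
    const promise = new Promise<T>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { id, promise };
  }

  /** Resolves one request; returns false if the id is unknown or already settled. */
  resolve(id: number, value: T): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    entry.resolve(value);
    return true;
  }

  /** Rejects one request; returns false if the id is unknown or already settled. */
  reject(id: number, reason: Error): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    entry.reject(reason);
    return true;
  }

  /** Rejects everything in flight and returns how many requests were failed. */
  rejectAll(reason: Error): number {
    const count = this.pending.size;
    this.pending.forEach((entry) => entry.reject(reason));
    this.pending.clear();
    return count;
  }

  get size(): number {
    return this.pending.size;
  }
}

// Wiring sketch for the proxy ('error' and 'exit' are real Worker events):
// worker.on('error', (err) => pending.rejectAll(err));
// worker.on('exit', (code) => pending.rejectAll(new Error(`Worker exited with code ${code}`)));
```

Keeping the map behind one small class also gives a single place to add per-request timeouts later.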

2. Worker Thread Implementation (CPU-Intensive Processing)

// This code runs in the worker thread context
if (!isMainThread && parentPort) {
  class AudioProcessor {
    private audioBuffer: SharedArrayBuffer;
    private processingMetrics: Map<string, number> = new Map();

    constructor() {
      // Pre-allocate shared memory for audio processing
      this.audioBuffer = new SharedArrayBuffer(1024 * 1024 * 10); // 10MB
    }

    async processAudioChunk(audioData: Buffer, options: ProcessingOptions): Promise<Buffer> {
      // This is CPU-intensive work that would block the event loop
      const startTime = performance.now();
      try {
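        // View the bytes as 16-bit samples, honoring the view's offset and length
        // (a Buffer is often a slice of a larger pooled ArrayBuffer)
        const samples = new Int16Array(
          audioData.buffer,
          audioData.byteOffset,
          Math.floor(audioData.byteLength / 2)
        );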

        // Apply processing pipeline
        let processedSamples = samples;
        if (options.enableNoiseReduction) {
          processedSamples = await this.applyNoiseReduction(processedSamples);
        }
        if (options.enableVolumeNormalization) {
          processedSamples = await this.normalizeVolume(processedSamples);
        }
        if (options.enableCompression) {
          processedSamples = await this.applyCompression(processedSamples);
        }

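        // Convert back to Buffer (shares memory with processedSamples, no copy)
        const result = Buffer.from(
          processedSamples.buffer,
          processedSamples.byteOffset,
          processedSamples.byteLength
        );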

        // Track processing time
        const processingTime = performance.now() - startTime;
        this.processingMetrics.set('lastProcessingTime', processingTime);

        return result;
      } catch (error) {
        throw new Error(`Audio processing failed: ${(error as Error).message}`);
      }
    }

    private async applyNoiseReduction(samples: Int16Array): Promise<Int16Array> {
      // CPU-intensive FFT-based noise reduction
      const fftSize = 1024;
      const result = new Int16Array(samples.length);

      for (let i = 0; i < samples.length; i += fftSize) {
        const chunk = samples.slice(i, i + fftSize);
        // Apply FFT
        const frequencyDomain = this.fft(chunk);
        // Apply noise reduction filter
        const filtered = this.applyNoiseFilter(frequencyDomain);
        // Apply inverse FFT
        const timeDomain = this.ifft(filtered);
        // Copy back to result
        result.set(timeDomain, i);
      }

      return result;
    }

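    // Not private: the message handler below calls this method
    async mixAudioTracks(tracks: AudioTrack[]): Promise<Buffer> {
      // Mix multiple audio tracks; the output is as long as the longest track
      const maxLength = Math.max(...tracks.map(t => t.audioData.length));
      const mixedSamples = new Int16Array(Math.floor(maxLength / 2));

      for (const track of tracks) {
        const { audioData } = track;
        // Restrict the view to this track's bytes, not the whole backing ArrayBuffer
        const trackSamples = new Int16Array(
          audioData.buffer,
          audioData.byteOffset,
          Math.floor(audioData.byteLength / 2)
        );
        // ?? keeps an explicit volume of 0 (muted); || would turn it into 1.0
        const volume = track.volume ?? 1.0;

        for (let i = 0; i < trackSamples.length; i++) {
          // Mix with volume control and clipping prevention
          mixedSamples[i] = Math.max(-32768, Math.min(32767,
            mixedSamples[i] + (trackSamples[i] * volume)
          ));
        }
      }

      return Buffer.from(mixedSamples.buffer);
    }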

    // Placeholder for FFT implementation
    private fft(samples: Int16Array): Complex[] {
      // Implement Fast Fourier Transform
      // This is CPU-intensive and would block the event loop
      return [];
    }

    private ifft(frequencies: Complex[]): Int16Array {
      // Implement Inverse Fast Fourier Transform
      return new Int16Array(0);
    }
  }

  // Message handling in worker thread
  const processor = new AudioProcessor();
  parentPort.on('message', async (msg: WorkerRequest) => {
    const { requestId, method, ...args } = msg;

    try {
      // Route to appropriate processing method
      let result;
      switch (method) {
        case 'processAudioChunk':
          result = await processor.processAudioChunk(args.audioData, args.options);
          break;
        case 'mixAudioTracks':
          result = await processor.mixAudioTracks(args.tracks);
          break;
        case 'applyEffects':
          result = await processor.applyEffects(args.audioData, args.effects);
          break;
        default:
          throw new Error(`Unknown method: ${method}`);
      }

      // Send result back to main thread
      parentPort.postMessage({
        requestId,
        result
      });
    } catch (error) {
      // Send error back to main thread
      parentPort.postMessage({
        requestId,
        error: (error as Error).message
      });
    }
  });
}
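A `Buffer` is often a window onto a larger `ArrayBuffer`, so an `Int16Array` built from `.buffer` alone can include bytes that belong to other data. A small conversion helper avoids that. The sketch below assumes little-endian 16-bit PCM on a little-endian host, and `toInt16View` is a hypothetical name.

```typescript
// Builds an Int16Array over exactly the bytes of a Buffer/Uint8Array,
// without reading neighbouring data from the underlying ArrayBuffer.
function toInt16View(bytes: Uint8Array): Int16Array {
  const sampleCount = Math.floor(bytes.byteLength / 2);
  if (bytes.byteOffset % 2 === 0) {
    // Aligned: create a zero-copy view restricted to this slice.
    return new Int16Array(bytes.buffer, bytes.byteOffset, sampleCount);
  }
  // Int16Array requires an even byte offset, so copy misaligned input first.
  const aligned = new Uint8Array(sampleCount * 2);
  aligned.set(bytes.subarray(0, sampleCount * 2));
  return new Int16Array(aligned.buffer);
}

// A 4-byte slice taken from the middle of a larger 8-byte allocation.
const backing = Buffer.alloc(8);
backing.writeInt16LE(100, 2);
backing.writeInt16LE(-200, 4);
const slice = backing.subarray(2, 6);

console.log(Array.from(toInt16View(slice)));      // the two samples: 100 and -200
console.log(new Int16Array(slice.buffer).length); // 4: the naive view spans the whole allocation
```

The same helper works on the worker side, where a `Buffer` sent through `postMessage` arrives as a plain `Uint8Array`.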

Advanced Threading Optimizations

1. SharedArrayBuffer for Zero-Copy Operations

For maximum performance, we use SharedArrayBuffer to eliminate serialization overhead:

class OptimizedAudioProcessor {
  private sharedBuffer: SharedArrayBuffer;
  private sharedView: Int16Array;

  constructor() {
    // Create shared memory accessible by both threads
    this.sharedBuffer = new SharedArrayBuffer(1024 * 1024 * 4); // 4MB
    this.sharedView = new Int16Array(this.sharedBuffer);
  }

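  async processWithSharedMemory(audioData: Buffer): Promise<Buffer> {
    // Copy audio data into shared memory (the only copy on the way in)
    const audioSamples = new Int16Array(
      audioData.buffer,
      audioData.byteOffset,
      Math.floor(audioData.byteLength / 2)
    );
    this.sharedView.set(audioSamples, 0);

    // Send the shared buffer handle plus offset and length; the samples are not cloned.
    // sendToWorker is the same helper as in AudioProcessorProxy.
    await this.sendToWorker('processSharedAudio', {
      sharedBuffer: this.sharedBuffer,
      offset: 0,
      length: audioSamples.length
    });

    // The worker processed the samples in place; copy them out for the caller
    return Buffer.from(Buffer.from(this.sharedBuffer, 0, audioSamples.length * 2));
  }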
}

// Worker thread processes shared memory directly
if (!isMainThread && parentPort) {
  parentPort.on('message', async (msg) => {
    if (msg.method === 'processSharedAudio') {
      const { offset, length } = msg;

      // Access shared memory directly - no serialization!
      const sharedView = new Int16Array(msg.sharedBuffer);
      const audioSamples = sharedView.subarray(offset, offset + length);

      // Process in-place
      for (let i = 0; i < audioSamples.length; i++) {
        audioSamples[i] = audioSamples[i] * 0.8; // Apply volume
      }

      // Result is already in shared memory
      parentPort.postMessage({
        requestId: msg.requestId,
        result: 'processed'
      });
    }
  });
}
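The shared-memory example writes every chunk at offset 0, so two requests in flight at the same time would overwrite each other. One way around that, assuming chunks have a fixed maximum size, is to divide the shared buffer into slots. `SharedSlotAllocator` below is a hypothetical sketch of that idea, with the bookkeeping kept on the main thread.

```typescript
// Hypothetical fixed-size slot allocator for one SharedArrayBuffer.
// Each in-flight chunk gets its own region, so concurrent requests
// never overwrite each other.
class SharedSlotAllocator {
  readonly buffer: SharedArrayBuffer;
  private free: number[] = [];

  constructor(private samplesPerSlot: number, slotCount: number) {
    this.buffer = new SharedArrayBuffer(samplesPerSlot * 2 * slotCount);
    // Push in reverse so that pop() hands out slot 0 first.
    for (let slot = slotCount - 1; slot >= 0; slot--) this.free.push(slot);
  }

  /** Copies samples into a free slot; returns null when every slot is in flight. */
  write(samples: Int16Array): { slot: number; offset: number; length: number } | null {
    if (samples.length > this.samplesPerSlot) {
      throw new RangeError('chunk is larger than a slot');
    }
    const slot = this.free.pop();
    if (slot === undefined) return null;
    const offset = slot * this.samplesPerSlot; // offset in samples, not bytes
    new Int16Array(this.buffer).set(samples, offset);
    return { slot, offset, length: samples.length };
  }

  /** Copies a processed slot out and returns the slot to the free list. */
  readAndRelease(slot: number, length: number): Int16Array {
    const offset = slot * this.samplesPerSlot;
    const out = new Int16Array(this.buffer).slice(offset, offset + length);
    this.free.push(slot);
    return out;
  }
}

const allocator = new SharedSlotAllocator(1280, 8); // eight 80ms chunks at 16kHz
const ticket = allocator.write(new Int16Array(1280));
// ticket.offset and ticket.length would go into the postMessage payload
```

A `null` from `write` is a natural backpressure signal: the caller can queue the chunk or fall back to a cloned message.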

2. Thread Pool Management

For handling multiple concurrent audio streams:

class AudioProcessorPool {
  private workers: Worker[] = [];
  private roundRobinIndex = 0;
  private workerLoad = new Map<Worker, number>();

  constructor(poolSize: number = 4) {
    for (let i = 0; i < poolSize; i++) {
      const worker = new Worker(__filename);
      this.workers.push(worker);
      this.workerLoad.set(worker, 0);
      worker.on('message', (msg) => {
        // Decrease load count when work completes
        const currentLoad = this.workerLoad.get(worker) || 0;
        this.workerLoad.set(worker, Math.max(0, currentLoad - 1));
      });
    }
  }

  private getNextWorker(): Worker {
    // Load balancing: choose worker with lowest load
    let minLoad = Infinity;
    let selectedWorker = this.workers[0];

    for (const worker of this.workers) {
      const load = this.workerLoad.get(worker) || 0;
      if (load < minLoad) {
        minLoad = load;
        selectedWorker = worker;
      }
    }

    // Increase load count
    this.workerLoad.set(selectedWorker, minLoad + 1);

    return selectedWorker;
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const worker = this.getNextWorker();

    return new Promise((resolve, reject) => {
      const requestId = Date.now() + Math.random();
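      const timeout = setTimeout(() => {
        // Stop listening, otherwise every timed-out request leaks a handler
        worker.off('message', messageHandler);
        reject(new Error('Worker processing timeout'));
      }, 5000);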

      const messageHandler = (msg: any) => {
        if (msg.requestId === requestId) {
          clearTimeout(timeout);
          worker.off('message', messageHandler);
          if (msg.error) {
            reject(new Error(msg.error));
          } else {
            resolve(msg.result);
          }
        }
      };

      worker.on('message', messageHandler);
      worker.postMessage({
        requestId,
        method: 'processAudio',
        audioData
      });
    });
  }
}
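Two details of a pool are worth isolating: how many workers to start, and how load is counted. The sketch below is a hedged illustration; `suggestedPoolSize` and `LeastLoadedPicker` are hypothetical names, and reserving one core for the main thread is an assumption rather than a measured optimum.

```typescript
import * as os from 'node:os';

// Suggests a worker count: one per core, keeping one core for the main thread's I/O.
function suggestedPoolSize(coreCount: number = os.cpus().length): number {
  return Math.max(1, coreCount - 1);
}

// Tracks in-flight work per worker. release() must be called on every outcome
// (success, error, and timeout), or a worker will look busy forever.
class LeastLoadedPicker<W> {
  private loads: number[];

  constructor(private workers: W[]) {
    if (workers.length === 0) throw new Error('pool needs at least one worker');
    this.loads = workers.map(() => 0);
  }

  /** Returns the worker with the fewest requests in flight and counts the new one. */
  acquire(): W {
    let best = 0;
    for (let i = 1; i < this.loads.length; i++) {
      if (this.loads[i] < this.loads[best]) best = i;
    }
    this.loads[best]++;
    return this.workers[best];
  }

  /** Marks one request on the given worker as finished. */
  release(worker: W): void {
    const i = this.workers.indexOf(worker);
    if (i >= 0 && this.loads[i] > 0) this.loads[i]--;
  }
}

console.log(`suggested pool size on this machine: ${suggestedPoolSize()}`);
```

Because CPU-bound workers cannot run faster than the cores they are scheduled on, adding workers beyond the core count mostly adds memory use and context switching.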

3. Memory Management and Garbage Collection

Worker threads require careful memory management:

class MemoryManagedAudioProcessor {
  private bufferPool: (Buffer | null)[] = []; // null marks a slot whose buffer is in use
  private maxPoolSize = 100;
  private gcInterval: NodeJS.Timeout;

  constructor() {
    // Pre-allocate buffer pool
    for (let i = 0; i < this.maxPoolSize; i++) {
      this.bufferPool.push(Buffer.alloc(1024 * 4)); // 4KB buffers
    }

    // Periodic garbage collection (global.gc is defined in Node.js only when started with --expose-gc)
    this.gcInterval = setInterval(() => {
      if (global.gc) {
        global.gc();
      }
    }, 10000); // Every 10 seconds
  }

  private getBuffer(size: number): Buffer {
    // Reuse existing buffers when possible
    for (let i = 0; i < this.bufferPool.length; i++) {
      const buffer = this.bufferPool[i];
      if (buffer && buffer.length >= size) {
        this.bufferPool[i] = null; // Mark as used
        return buffer.subarray(0, size); // View into the pooled buffer (no copy)
      }
    }

    // Allocate a new buffer if no pooled buffer is free and large enough
    return Buffer.alloc(size);
  }

  private returnBuffer(buffer: Buffer): void {
    // Return buffer to pool for reuse
    for (let i = 0; i < this.bufferPool.length; i++) {
      if (!this.bufferPool[i]) {
        this.bufferPool[i] = buffer;
        return;
      }
    }
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const workBuffer = this.getBuffer(audioData.length);
    try {
      // Process audio using pooled buffer
      audioData.copy(workBuffer);

      // Perform processing...
      const result = await this.performProcessing(workBuffer);

      return result;
    } finally {
      // Always return buffer to pool
      this.returnBuffer(workBuffer);
    }
  }

  destroy(): void {
    clearInterval(this.gcInterval);
    this.bufferPool = [];
  }
}
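A pool that stores whatever view it handed out ends up holding progressively smaller buffers, because the returned view can be shorter than the buffer it came from. One alternative, sketched here under the assumption that all pooled buffers share one size, is for the pool to keep the full-size buffers and lend out views with a release callback. `FixedBufferPool` is a hypothetical name.

```typescript
// Hypothetical fixed-capacity pool of equally sized buffers.
// The pool keeps ownership of the full-size buffers and lends out views,
// so pooled buffers never shrink.
class FixedBufferPool {
  private free: Buffer[] = [];

  constructor(private bufferSize: number, count: number) {
    for (let i = 0; i < count; i++) this.free.push(Buffer.alloc(bufferSize));
  }

  /** Lends a zeroed view of `size` bytes plus a function that returns it to the pool. */
  acquire(size: number): { view: Buffer; release: () => void } {
    const pooled = size <= this.bufferSize ? this.free.pop() : undefined;
    if (!pooled) {
      // Pool exhausted or request too large: fall back to a one-off allocation.
      return { view: Buffer.alloc(size), release: () => {} };
    }
    pooled.fill(0, 0, size); // do not leak a previous chunk's samples
    let released = false;
    return {
      view: pooled.subarray(0, size),
      release: () => {
        if (released) return; // guard against double release
        released = true;
        this.free.push(pooled); // the full-size buffer goes back, not the view
      },
    };
  }

  get available(): number {
    return this.free.length;
  }
}

const pool = new FixedBufferPool(4096, 100);
const lease = pool.acquire(2560); // one 80ms chunk
try {
  // ... fill and process lease.view ...
} finally {
  lease.release();
}
```

Any result handed back to the caller should be copied out of the view before `release()` runs, since the next `acquire` will overwrite that memory.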

Integration with Realtime Voice AI Pipeline

Our worker thread architecture integrates seamlessly with the broader voice AI pipeline, leveraging the event loop for I/O and threads for CPU-intensive work:

// Main thread handles I/O and coordination

class VoiceAIPipeline { // hypothetical class name
  private audioProcessor: AudioProcessorProxy;
  private websocket: WebSocket;
  private llmClient: LLMClient;
  private ttsClient: TTSClient;

  constructor() {
    this.audioProcessor = new AudioProcessorProxy();
    this.setupWebSocketHandling();
  }

  private setupWebSocketHandling(): void {
    // Event loop handles WebSocket I/O efficiently
    this.websocket.on('message', async (audioData: Buffer) => {
      // This is non-blocking - audio processing happens in worker thread
      const processedAudio = await this.audioProcessor.processAudioChunk(audioData, {
        enableNoiseReduction: true,
        enableVolumeNormalization: true
      });

      // Run transcription and voice-activity detection concurrently on the processed audio
      const [transcription, vadResult] = await Promise.all([
        this.transcribeAudio(processedAudio),
        this.analyzeVoiceActivity(processedAudio)
      ]);

      if (vadResult.isSpeaking) {
        // Generate AI response
        const aiResponse = await this.llmClient.generateResponse(transcription);

        // Convert to audio (also uses worker thread)
        const aiAudio = await this.ttsClient.generateAudio(aiResponse);

        // Mix with background audio
        const mixedAudio = await this.audioProcessor.mixAudioTracks([
          { audioData: aiAudio, volume: 0.8 },
          { audioData: this.backgroundAudio, volume: 0.2 }
        ]);

        // Send back via WebSocket (event loop I/O)
        this.websocket.send(mixedAudio);
      }
    });
  }

  private async transcribeAudio(audioData: Buffer): Promise<string> {
    // I/O operation - handled by event loop
    return await this.makeHTTPRequest('/transcribe', audioData);
  }

  private async analyzeVoiceActivity(audioData: Buffer): Promise<VADResult> {
    // CPU-intensive operation - delegated to worker thread
    return await this.audioProcessor.analyzeVoiceActivity(audioData);
  }
}

Performance Monitoring and Observability

We instrument our worker threads with comprehensive metrics, reported to Datadog, to understand their performance characteristics. Below is a provider-agnostic implementation:

class InstrumentedAudioProcessor {
  private metrics = {
    processedChunks: 0,
    averageProcessingTime: 0,
    memoryUsage: 0,
    errorRate: 0
  };

  constructor() {
    // Monitor worker thread performance
    setInterval(() => {
      this.reportMetrics();
    }, 1000);
  }

  async processAudio(audioData: Buffer): Promise<Buffer> {
    const startTime = performance.now();
    const initialMemory = process.memoryUsage().heapUsed;
    try {
      const result = await this.performProcessing(audioData);

      // Track success metrics
      const processingTime = performance.now() - startTime;
      this.updateMetrics(processingTime, true);
      return result;
    } catch (error) {
      // Track error metrics
      this.updateMetrics(0, false);
      throw error;
    } finally {
      // Monitor memory usage
      const finalMemory = process.memoryUsage().heapUsed;
      this.metrics.memoryUsage = finalMemory - initialMemory;
    }
  }

  private updateMetrics(processingTime: number, success: boolean): void {
    this.metrics.processedChunks++;

    if (success) {
      // Moving average for processing time
      this.metrics.averageProcessingTime =
        (this.metrics.averageProcessingTime * 0.9) + (processingTime * 0.1);
    }

    // Update the error rate on every chunk so it decays after successes
    // instead of only ever rising
    this.metrics.errorRate =
      (this.metrics.errorRate * 0.9) + ((success ? 0 : 1) * 0.1);
  }

  private reportMetrics(): void {
    console.log('Worker Thread Metrics:', {
      processedChunks: this.metrics.processedChunks,
      averageProcessingTime: `${this.metrics.averageProcessingTime.toFixed(2)}ms`,
      memoryUsage: `${(this.metrics.memoryUsage / 1024 / 1024).toFixed(2)}MB`,
      errorRate: `${(this.metrics.errorRate * 100).toFixed(2)}%`
    });
  }
}

Production Results and Lessons Learned

Performance Achievements

Our multi-threaded audio processing system now handles:

Key Lessons Learned

1. Event Loop + Worker Threads = Optimal Architecture

The combination of JavaScript's event loop for I/O operations and worker threads for CPU-intensive processing provides the best of both worlds:

2. Bun's Performance Gains Are Real and Significant

Our switch to Bun provided measurable improvements:

3. SharedArrayBuffer Is a Game-Changer

Using SharedArrayBuffer for audio data eliminates serialization overhead:

4. Thread Pools Require Careful Management

Worker thread pools need sophisticated load balancing:

5. Memory Management Is Critical

Pre-allocating buffers and using object pools prevents:

Debugging and Troubleshooting

Common issues we encountered and solutions:

// Issue: Worker thread deadlocks
class DeadlockSafeProcessor {
  private requestTimeout = 5000; // 5 second timeout

  async processWithTimeout(audioData: Buffer): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        reject(new Error('Worker thread timeout - possible deadlock'));
      }, this.requestTimeout);
      this.processAudio(audioData)
        .then(resolve)
        .catch(reject)
        .finally(() => clearTimeout(timer));
    });
  }
}
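Rejecting the promise tells the caller to move on, but it does not stop a worker that is stuck in a CPU-bound loop. A sketch of a timeout wrapper that also gives the caller a hook to reclaim the worker could look like this; `withTimeout` and the names in the usage comment are hypothetical.

```typescript
// Rejects if `work` does not settle within `ms`, and runs `onTimeout` so the
// caller can reclaim the stuck resource (for a worker: terminate and replace it).
function withTimeout<T>(work: Promise<T>, ms: number, onTimeout: () => void): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      onTimeout();
      reject(new Error(`Timed out after ${ms}ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Usage sketch (proxy, worker, and replaceWorker are hypothetical):
// await withTimeout(proxy.processAudioChunk(chunk, options), 5000, () => {
//   void worker.terminate(); // Worker.terminate() is the only way to stop a busy worker
//   replaceWorker();
// });
```

Terminating the worker discards its in-flight requests, so those requests need to be rejected as well.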

// Issue: Memory leaks in worker threads
class MemoryLeakSafeProcessor {
  private cleanupInterval: NodeJS.Timeout;

  constructor() {
    // Periodic cleanup
    this.cleanupInterval = setInterval(() => {
      this.cleanup();
    }, 30000); // Every 30 seconds
  }

  private cleanup(): void {
    // Force garbage collection (global.gc is defined in Node.js only when started with --expose-gc)
    if (global.gc) {
      global.gc();
    }

    // Clear any lingering references
    this.clearCaches();
  }

  destroy(): void {
    clearInterval(this.cleanupInterval);
    this.cleanup();
  }
}
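Forcing collection can hide a leak without explaining it. A complementary approach, sketched here as an assumption rather than something the system above is known to use, is to sample heap usage and flag sustained growth. `HeapGrowthMonitor` is a hypothetical helper with the sampling source injected so it can be tested.

```typescript
// Hypothetical leak detector: flags sustained heap growth across a window
// of samples instead of forcing garbage collection on a timer.
class HeapGrowthMonitor {
  private samples: number[] = [];

  constructor(private windowSize: number, private thresholdBytes: number) {}

  /** Records one heap reading; returns true when the window shows a likely leak. */
  record(heapUsedBytes: number): boolean {
    this.samples.push(heapUsedBytes);
    if (this.samples.length > this.windowSize) this.samples.shift();
    if (this.samples.length < this.windowSize) return false;

    // A likely leak: every sample grew over the previous one
    // and the total growth across the window exceeds the threshold.
    let monotonic = true;
    for (let i = 1; i < this.samples.length; i++) {
      if (this.samples[i] <= this.samples[i - 1]) monotonic = false;
    }
    const growth = this.samples[this.samples.length - 1] - this.samples[0];
    return monotonic && growth > this.thresholdBytes;
  }
}

// Usage sketch: sample inside each worker and report to the main thread.
// const monitor = new HeapGrowthMonitor(10, 50 * 1024 * 1024);
// setInterval(() => {
//   if (monitor.record(process.memoryUsage().heapUsed)) {
//     /* alert, or recycle the worker */
//   }
// }, 30000);
```

The window size and threshold are tuning parameters; a healthy worker's heap rises and falls with collection cycles, which breaks the monotonic run.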

Conclusion

Multi-threaded audio processing with worker threads represents a significant architectural evolution from single-threaded Node.js applications. By combining JavaScript's excellent event loop for I/O operations with worker threads for CPU-intensive tasks, and choosing Bun for its superior performance characteristics, we've built a system that scales to handle realtime voice AI at production scale.

The key insights from our implementation:

  1. Understand your workload: CPU-intensive tasks need threads, I/O tasks need the event loop
  2. Choose the right runtime: Bun's performance advantages compound at scale
  3. Optimize data transfer: SharedArrayBuffer eliminates serialization overhead
  4. Monitor aggressively: Worker threads require different monitoring strategies
  5. Plan for failure: Graceful degradation and timeouts are essential

The patterns shown here are broadly applicable to any high-throughput, low-latency system that combines I/O operations with CPU-intensive processing.

Want to work on problems like this? Toma is disrupting the $2T automotive industry with agentic AI. We're looking for senior engineers who love solving complex challenges in realtime systems. Check out our jobs page to learn more.

Disclaimer: This post represents our engineering approach as of 2025. Technologies, benchmarks, and implementation details may evolve as we continue to optimize our systems.