ecoute2 — Code Audit

Overview

ecoute2 is a voice conversation widget that enables spoken interaction with agents via localMesh. It captures audio, transcribes via faster-whisper (subprocess), dispatches text to agents through localMesh messaging, polls for responses, and plays them back via TTS.

Two builds exist:

Canonical (dev/ecoute2/): 4 files, 905 lines in total; the main ecoute2.py is 659 lines. DOMStack BaseFrame, ElevenLabs + edge-tts, two-step send, playback stop control.
Experimental (forges/forge9/ecoute2/): 2 files, 529 lines in total; the main ecoute2.py is 504 lines. Standalone PyQt5, pyttsx3 local TTS, auto-send. Earlier iteration.

The forge9 variant should be archived: it is an earlier iteration, and the canonical build is ahead on every feature compared above.

File Inventory

Canonical — dev/ecoute2/

File	Lines	Purpose
ecoute2.py	659	Main widget (DOMStack BaseFrame), audio pipeline, mailbox poller
transcriber.py	25	Subprocess worker: faster-whisper STT
tts_worker.py	98	ElevenLabs + edge-tts generation, sounddevice playback
sizer.py	123	Companion slider for live widget resizing (Win32 MoveWindow)
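
For orientation, a 25-line whisper worker of the kind described for transcriber.py could look roughly like the sketch below. This is not the audited file. The model settings (base, CPU, int8) come from the Stack section, while the function names join_segments, transcribe, and main are hypothetical.

```python
import sys


def join_segments(segments):
    """Concatenate the text of whisper segments into one transcript string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe(wav_path):
    """Run faster-whisper on a WAV file and return the transcript."""
    # Lazy import: the heavy dependency loads only inside the worker process.
    from faster_whisper import WhisperModel

    model = WhisperModel("base", device="cpu", compute_type="int8")
    segments, _info = model.transcribe(wav_path)
    return join_segments(segments)


def main(argv):
    """Entry point: print the transcript to stdout so the parent can read it."""
    if len(argv) != 2:
        print("usage: transcriber.py <file.wav>", file=sys.stderr)
        return 2
    print(transcribe(argv[1]))
    return 0
```

A real worker script would end with a call such as raise SystemExit(main(sys.argv)) under the usual __main__ guard, so that the widget only has to read the child's stdout.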

Experimental — forges/forge9/ecoute2/

File	Lines	Purpose
ecoute2.py	504	Standalone widget (no BaseFrame), pyttsx3 local TTS
transcriber.py	25	Same subprocess whisper worker

Surfaces

ecoute2 is client-only — it binds no ports. All connections are outbound.

Target	Protocol	Purpose
127.0.0.1:8801	HTTP	localMesh — messaging, registry, mailbox
100.83.251.119:8120	HTTP	Vault API — ElevenLabs key fetch at startup
ElevenLabs API	HTTPS	TTS generation (meech/Stiles voice)
edge-tts (Microsoft)	HTTPS	TTS fallback (rachel, adam, sam voices)

localMesh Endpoints Used

Method	Path	Purpose
GET	/registry/online	Load online agents for dropdown
GET	/mailbox/ecoute2	Poll for agent responses (2s interval)
POST	/message	Send transcribed text to agent
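
The three calls in the table above can be sketched with the standard library alone. The base address and paths are taken from this audit. The JSON field names in the POST body ("from", "to", "body") are assumptions made for illustration, since the audit does not record the message schema.

```python
import json
import urllib.request

MESH_BASE = "http://127.0.0.1:8801"  # localMesh address from the Surfaces table


def build_registry_request():
    """GET /registry/online: list online agents for the dropdown."""
    return urllib.request.Request(f"{MESH_BASE}/registry/online", method="GET")


def build_poll_request(mailbox="ecoute2"):
    """GET /mailbox/<name>: fetch pending agent responses."""
    return urllib.request.Request(f"{MESH_BASE}/mailbox/{mailbox}", method="GET")


def build_send_request(to_agent, text, sender="ecoute2"):
    """POST /message: send transcribed text to an agent."""
    # ASSUMPTION: these field names are hypothetical, not the real schema.
    payload = json.dumps({"from": sender, "to": to_agent, "body": text})
    return urllib.request.Request(
        f"{MESH_BASE}/message",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request, timeout=5.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the client easy to check without a running localMesh instance.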

Stack

Layer	Technology
Language	Python 3.12
GUI	PyQt5 via DOMStack BaseFrame (untitledSDK)
STT	faster-whisper (base model, CPU, int8) — subprocess
TTS primary	ElevenLabs API (Stiles voice, turbo v2.5) — subprocess
TTS fallback	edge-tts (Microsoft) — subprocess
Audio capture	PyAudio (16kHz, mono, 1024 chunk)
Audio playback	sounddevice + soundfile — subprocess
Messaging	localMesh HTTP API (:8801)
Credentials	Vault API (ElevenLabs key)
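
As an illustration of the capture settings in the table (16 kHz, mono, 1024-frame chunks), the sketch below records with PyAudio and writes a WAV file with the standard wave module. The 16-bit sample format is an assumption, since the audit does not state it, and the function names are hypothetical.

```python
import wave

SAMPLE_RATE = 16_000  # Hz, from the Stack table
CHANNELS = 1          # mono
CHUNK = 1024          # frames per read
SAMPLE_WIDTH = 2      # ASSUMPTION: 16-bit samples; not stated in the audit


def write_wav(path, frames):
    """Write a list of raw PCM byte chunks to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(b"".join(frames))


def record(seconds, path):
    """Capture audio from the default input device for a fixed duration."""
    import pyaudio  # lazy import so write_wav works without PyAudio installed

    pa = pyaudio.PyAudio()
    stream = pa.open(
        format=pyaudio.paInt16,
        channels=CHANNELS,
        rate=SAMPLE_RATE,
        input=True,
        frames_per_buffer=CHUNK,
    )
    try:
        n_chunks = int(SAMPLE_RATE / CHUNK * seconds)
        frames = [stream.read(CHUNK) for _ in range(n_chunks)]
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
    write_wav(path, frames)
```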

Voice Pipeline Flow

[Mic] → PyAudio capture → WAV file → transcriber.py (whisper) → transcript review → SEND → localMesh /message → agent mailbox → nudge.py
[Agent responds] → mailbox poll (2s) → inbound queue → PLAY → tts_worker.py (ElevenLabs/edge-tts) → sounddevice → [Speaker]

Observations

forge9 should be archived — canonical build is ahead in every feature. No reason to maintain both.

MailboxPoller never marks messages as read. They accumulate in the .jsonl file forever.
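
One possible remedy, sketched under assumptions: track which messages have been handled and periodically rewrite the .jsonl file without them. The sketch assumes each message carries a unique "id" field and that the mailbox file is reachable on disk; neither is confirmed by this audit, and SeenTracker and prune_jsonl are hypothetical names.

```python
import json
import os
import tempfile


class SeenTracker:
    """Remember handled message ids so repeated polls do not replay them."""

    def __init__(self):
        self.seen = set()

    def new_messages(self, messages):
        """Return only messages not seen before, and record their ids."""
        fresh = []
        for msg in messages:
            msg_id = msg.get("id")  # ASSUMPTION: messages have an "id" field
            if msg_id is None or msg_id in self.seen:
                continue
            self.seen.add(msg_id)
            fresh.append(msg)
        return fresh


def prune_jsonl(path, read_ids):
    """Rewrite the mailbox file without read messages; return count removed."""
    removed = 0
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as out, \
            open(path, encoding="utf-8") as src:
        for line in src:
            if not line.strip():
                continue
            if json.loads(line).get("id") in read_ids:
                removed += 1
                continue
            out.write(line if line.endswith("\n") else line + "\n")
    os.replace(tmp_path, path)  # atomic swap of the pruned file
    return removed
```

If localMesh owns the file and appends to it concurrently, pruning would be safer as a server-side operation; the client-side tracker alone already prevents replays within a session.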

All ecoute2 sessions share one mailbox (ecoute2.jsonl). Concurrent sessions would cross-receive messages.
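
A per-session mailbox name would avoid the cross-receive problem. The sketch below only generates such a name; whether localMesh accepts arbitrary mailbox names is an assumption that would need checking, and session_mailbox is a hypothetical helper.

```python
import uuid


def session_mailbox(prefix="ecoute2"):
    """Return a mailbox name unique to this widget session."""
    # ASSUMPTION: localMesh allows mailbox names of the form "<prefix>-<hex>".
    return f"{prefix}-{uuid.uuid4().hex[:8]}"
```

The session would then use this name both as the sender in POST /message and in its GET /mailbox/<name> poll.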

ElevenLabs keys may be expired. If key fetch fails at startup, meech voice has no fallback to edge-tts.
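
A fallback chain is one way to close this gap. The sketch below tries engines in order and returns the first successful result. The engines are plain callables supplied by the caller; no ElevenLabs or edge-tts calls are shown, and synthesize_with_fallback is a hypothetical name.

```python
def synthesize_with_fallback(text, engines):
    """Try each (name, synth) pair in order; return (name, audio_bytes).

    engines is a sequence of (name, callable) pairs, where each callable
    takes the text and returns audio bytes or raises on failure.
    """
    errors = []
    for name, synth in engines:
        try:
            return name, synth(text)
        except Exception as exc:  # any engine failure moves on to the next
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))
```

With ElevenLabs listed first and edge-tts second, a missing or expired key would degrade to the fallback voice instead of producing no audio.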

No reconnect logic — if localMesh goes down mid-session, all sends/polls fail silently.
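
A retry wrapper with exponential backoff would make such failures visible and recoverable. The sketch below is a generic helper, not code from ecoute2; the name call_with_backoff and its defaults are illustrative. The on_error hook is where a widget could update a status label rather than failing silently.

```python
import time


def call_with_backoff(func, attempts=4, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, on_error=None):
    """Call func(), retrying on connection errors with doubling delays."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except OSError as exc:  # urllib.error.URLError is an OSError subclass
            if on_error is not None:
                on_error(attempt, exc)  # surface the failure to the caller
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

In a Qt application the retries should run in the poller or sender thread, not the GUI thread, since sleep blocks.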

DLL isolation (subprocess for whisper + TTS) avoids PyQt5/torch conflicts. Good engineering call.
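
The isolation pattern can be sketched as follows: the GUI process launches a separate interpreter for the heavy work and reads its stdout, so the worker's native libraries are never loaded into the PyQt5 process. The helper name run_worker is hypothetical, and the actual invocation in ecoute2.py may differ.

```python
import subprocess
import sys


def run_worker(script_args, timeout=60.0):
    """Run a Python worker in a child interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, *script_args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"worker failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout.strip()
```

A call such as run_worker(["transcriber.py", "clip.wav"]) would return the transcript printed by the worker, where clip.wav is a placeholder file name.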

Thread safety via pyqtSignal is clean — no shared mutable state between threads.

neonForge