dolphin-ui

unified AI interface -- chat / image gen / voice synthesis
deployed: ec2:8200 | voice loop: planning

overview

dolphin-ui is a three-service web interface that unifies self-hosted AI capabilities into a single dark-themed dashboard. it connects a local LLM (Ollama on GPU), image generation (Stable Diffusion on T4), and text-to-speech (ElevenLabs API) behind a Flask backend on questboard-ec2.

a companion project, voice-dolphin, extends the concept into a real-time voice conversation loop running on Rocky with STT via Whisper.

architecture

                    +---------------------------+
                    |     browser (client)      |
                    |   tabs: chat|image|audio  |
                    +-------------+-------------+
                                  | HTTP
                    +-------------v-------------+
                    |   questboard-ec2:8200     |
                    |   Flask app (app.py)      |
                    |   nginx proxy :8201       |
                    |   100.83.251.119          |
                    +----+--------+--------+----+
                         |        |        |
              +----------v-+ +---v------+ +v----------------+
              |  picass0    | | picass0  | |  ElevenLabs     |
              |  Ollama     | | imggen   | |  TTS API        |
              |  :11434     | | :7860    | |  (external)     |
              |  dolphin-   | | SD v1.5  | |                 |
              |  llama3:8b  | | T4 GPU   | |                 |
              +-------------+ +----------+ +-----------------+
               100.127.18.29  100.127.18.29   api.elevenlabs.io

services

chat -- Ollama / dolphin-llama3:8b

Streaming chat via Ollama REST API. Messages are proxied from the Flask backend to picass0. Supports model switching -- any model pulled to Ollama is available via the /api/models endpoint.
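
A minimal sketch of how the proxying described above might look, assuming the backend relays Ollama's newline-delimited JSON stream as SSE frames. The frame shape ({"token": ...}), the [DONE] sentinel, and the function names are illustrative assumptions, not taken from app.py.

```python
import json
from typing import Iterable, Iterator

# picass0 address and port as shown in the architecture diagram
OLLAMA_URL = "http://100.127.18.29:11434/api/chat"


def ollama_lines_to_sse(lines: Iterable[bytes]) -> Iterator[str]:
    """Convert Ollama's newline-delimited JSON chunks into SSE frames."""
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        token = chunk.get("message", {}).get("content", "")
        if token:
            # assumed frame shape: one JSON object per token
            yield f"data: {json.dumps({'token': token})}\n\n"
        if chunk.get("done"):
            yield "data: [DONE]\n\n"  # assumed end-of-stream sentinel
            break


def stream_chat(messages, model="dolphin-llama3:8b"):
    """Open a streaming request to Ollama and yield SSE frames."""
    import requests  # imported here so the parser above stays stdlib-only

    payload = {"model": model, "messages": messages, "stream": True}
    with requests.post(OLLAMA_URL, json=payload, stream=True, timeout=120) as resp:
        resp.raise_for_status()
        yield from ollama_lines_to_sse(resp.iter_lines())
```

In a Flask route, the generator could be returned as Response(stream_chat(messages, model), mimetype="text/event-stream"), so tokens reach the browser as they arrive.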

image gen -- Stable Diffusion v1.5 on T4

Image generation via a custom Flask endpoint on picass0. Uses the diffusers library with runwayml/stable-diffusion-v1-5 in float16. Model loads lazily on first request.
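
The lazy-loading behaviour could be sketched as below. This is not the contents of picass0_imggen.py; the LazyPipeline class and load_sd_pipeline helper are hypothetical names, and the lock is an added assumption to keep concurrent first requests from loading the model twice.

```python
import threading

MODEL_ID = "runwayml/stable-diffusion-v1-5"


class LazyPipeline:
    """Build an expensive object on first use and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory
        self._pipe = None
        self._lock = threading.Lock()

    def get(self):
        # Double-checked locking: concurrent first requests load the model once.
        if self._pipe is None:
            with self._lock:
                if self._pipe is None:
                    self._pipe = self._factory()
        return self._pipe


def load_sd_pipeline():
    """Load SD v1.5 in float16 onto the GPU (needs torch, diffusers, and CUDA)."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    return pipe.to("cuda")


# Nothing is loaded at import time; the first sd.get() call pays the load cost.
sd = LazyPipeline(load_sd_pipeline)
```

A request handler would then call something like sd.get()(prompt, negative_prompt=negative_prompt, num_inference_steps=steps, width=width, height=height).images[0]. The trade-off is a slow first request in exchange for a fast service start and no GPU memory held until it is needed.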

audio -- ElevenLabs TTS

Text-to-speech via ElevenLabs API. Voice list is pulled dynamically. Supports monolingual v1, multilingual v2, and turbo v2.5 models. Audio returned as base64 MP3 and played in-browser with a pulsing visualizer.
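
As a rough illustration of the TTS path, the sketch below builds an ElevenLabs text-to-speech request and wraps the returned MP3 in base64. The helper names and the response keys ("audio", "format") are assumptions; the model IDs are the public ElevenLabs identifiers for the three model families listed above, as best understood here.

```python
import base64

ELEVENLABS_BASE = "https://api.elevenlabs.io/v1"

# Model IDs for the three model families named above.
TTS_MODELS = {
    "monolingual v1": "eleven_monolingual_v1",
    "multilingual v2": "eleven_multilingual_v2",
    "turbo v2.5": "eleven_turbo_v2_5",
}


def build_tts_request(text, voice_id, model_id, api_key):
    """Return (url, headers, json_body) for a text-to-speech call."""
    url = f"{ELEVENLABS_BASE}/text-to-speech/{voice_id}"
    headers = {"xi-api-key": api_key, "Accept": "audio/mpeg"}
    body = {"text": text, "model_id": model_id}
    return url, headers, body


def audio_to_payload(mp3_bytes):
    """Wrap raw MP3 bytes in a JSON-friendly base64 payload (assumed key names)."""
    return {"audio": base64.b64encode(mp3_bytes).decode("ascii"), "format": "mp3"}


def synthesize(text, voice_id, model_id, api_key):
    """Call ElevenLabs and return the base64 payload for the browser."""
    import requests  # imported here so the helpers above stay stdlib-only

    url, headers, body = build_tts_request(text, voice_id, model_id, api_key)
    resp = requests.post(url, headers=headers, json=body, timeout=60)
    resp.raise_for_status()
    return audio_to_payload(resp.content)
```

On the client side, a payload like this can be played by setting an audio element's source to a data:audio/mpeg;base64,... URI.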

API endpoints

method  path         description
POST    /api/chat    streaming chat -- accepts {messages, model}, returns SSE token stream
GET     /api/models  list available Ollama models
POST    /api/image   image generation -- accepts {prompt, negative_prompt, steps, width, height}
POST    /api/tts     text-to-speech -- accepts {text, voice_id, model_id}, returns base64 audio
GET     /api/voices  list ElevenLabs voices
GET     /api/health  service health -- ollama, image gen, elevenlabs status + chars remaining
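
One way the health endpoint could aggregate its three checks is sketched below. The function names and the "online"/"offline" labels are assumptions. The quota arithmetic assumes the ElevenLabs subscription response exposes character_limit and character_count.

```python
def chars_remaining(subscription):
    """Characters left in the ElevenLabs quota, or None if the fields are missing."""
    try:
        return subscription["character_limit"] - subscription["character_count"]
    except (KeyError, TypeError):
        return None


def collect_health(probes):
    """Run each probe; a probe that raises marks only its own service offline."""
    status = {}
    for name, probe in probes.items():
        try:
            probe()
            status[name] = "online"
        except Exception:
            status[name] = "offline"
    return status
```

A probe can be any zero-argument callable, for example lambda: requests.get("http://100.127.18.29:11434/api/tags", timeout=3).raise_for_status() for Ollama. Short timeouts keep one unreachable host from stalling the whole health response.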

deployment

infrastructure

component      host            port   service
Flask app      questboard-ec2  8200   dolphin-ui.service
nginx proxy    questboard-ec2  8201   nginx sites-enabled/dolphin
image gen API  picass0         7860   picass0-imggen.service
Ollama         picass0         11434  ollama systemd

deploy script

Run from Rocky: bash ~/dolphin-ui/deploy.sh

  1. fetches ElevenLabs API key from vault
  2. SCPs app.py + static/ to ec2:/home/ubuntu/dolphin-ui/
  3. creates/updates dolphin-ui.service systemd unit
  4. injects EL key as environment variable
  5. SCPs picass0_imggen.py to picass0
  6. creates/updates picass0-imggen.service
  7. configures nginx reverse proxy on :8201
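
Steps 3 and 4 above amount to rendering a unit file with the key in its environment. deploy.sh itself is a bash script; the Python sketch below only illustrates the idea. The environment variable name ELEVENLABS_API_KEY, the interpreter path, the user, and the restart policy are all assumptions.

```python
def render_unit(workdir, api_key, env_name="ELEVENLABS_API_KEY", user="ubuntu"):
    """Render a systemd unit that runs app.py with the API key in its environment."""
    # Reject characters that would break the quoted Environment= line.
    if "\n" in api_key or '"' in api_key:
        raise ValueError("API key contains characters that would break the unit file")
    return "\n".join([
        "[Unit]",
        "Description=dolphin-ui Flask app",
        "After=network.target",
        "",
        "[Service]",
        f"User={user}",
        f"WorkingDirectory={workdir}",
        f'Environment="{env_name}={api_key}"',
        f"ExecStart=/usr/bin/python3 {workdir}/app.py",
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])
```

For example, render_unit("/home/ubuntu/dolphin-ui", key) produces text that could be written to /etc/systemd/system/dolphin-ui.service. Note that a key stored this way is readable by anyone who can read the unit file, so file permissions matter.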

dependencies

# ec2 (questboard)
flask, requests

# picass0 (GPU)
flask, torch, diffusers, transformers, accelerate, safetensors

web UI

single-page app at static/index.html (591 lines, all CSS and JS inline). dark theme with cyan/purple/magenta accents.

design system

token      value    usage
--bg       #0a0a0f  page background
--surface  #12121a  panels, header
--accent   #00d4ff  primary cyan
--accent2  #7b2ff7  purple
--accent3  #ff2d95  magenta/pink
--success  #00ff88  health dots (online)
--error    #ff4444  health dots (offline)

tabs

features

voice-dolphin (companion project)

planning / prototype

a standalone voice conversation loop that runs on Rocky. extends dolphin-ui into real-time spoken interaction.

  +----------+    +----------+    +----------+    +----------+
  |   mic    |--->| whisper  |--->|  ollama  |--->|elevenlabs|---> speakers
  |  (Rocky) |    | STT      |    |  (SSH    |    |  TTS     |
  |          |    | (Rocky   |    |  tunnel  |    |  (API)   |
  |          |    |  CPU)    |    |  picass0)|    |          |
  +----------+    +----------+    +----------+    +----------+

the loop

  1. capture mic input via sounddevice
  2. detect silence (1.5s threshold) to end capture
  3. transcribe with faster-whisper (base model, CPU, int8)
  4. send transcript to dolphin-llama3 via Ollama (SSH tunnel to picass0)
  5. send response to ElevenLabs TTS (voice: Stiles)
  6. play MP3 via ffplay or powershell fallback
  7. loop until "quit" / "bye" / ctrl+c
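
Steps 2 and 7 of the loop can be sketched without any audio hardware. This is an illustration under stated assumptions, not the code in dolphin.py: the RMS level threshold of 0.01, the frame representation (lists of float samples), and exact-match exit detection are all assumed.

```python
import math

SILENCE_SECONDS = 1.5           # silence threshold from step 2
EXIT_PHRASES = ("quit", "bye")  # exit words from step 7


def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def capture_until_silence(frames, frame_seconds, level_threshold=0.01):
    """Collect frames until SILENCE_SECONDS of consecutive quiet audio has passed."""
    captured, quiet = [], 0.0
    for frame in frames:
        captured.append(frame)
        if rms(frame) < level_threshold:
            quiet += frame_seconds
            if quiet >= SILENCE_SECONDS:
                break
        else:
            quiet = 0.0  # any loud frame resets the silence timer
    return captured


def wants_exit(transcript):
    """True when the transcript is exactly an exit phrase, ignoring case and punctuation."""
    cleaned = transcript.strip().lower().strip(".!?, ")
    return cleaned in EXIT_PHRASES
```

With sounddevice, the frames would come from an input stream callback. Transcription in step 3 would then use faster-whisper with something like WhisperModel("base", device="cpu", compute_type="int8"). Matching the whole utterance rather than a substring avoids ending the session when "bye" appears mid-sentence.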

modes

status

open questions

file map

file               location                                 purpose
app.py             ~/dolphin-ui/                            Flask backend -- routes, API proxying, health checks
index.html         ~/dolphin-ui/static/                     web UI -- 591 lines, all inline CSS/JS
deploy.sh          ~/dolphin-ui/                            deployment script -- ec2 + picass0 + nginx
picass0_imggen.py  ~/dolphin-ui/                            image gen API for picass0 GPU
dolphin.py         ~/.artificer/.afterHours/voice-dolphin/  voice loop script (Rocky-side)
voice-dolphin.md   ~/.artificer/.afterHours/                planning doc -- stack, models, build order
brick_bot.py       ~/dolphin-ui/                            Telegram agent (separate, shares deploy dir)
mailbox_daemon.py  ~/dolphin-ui/                            email listener for brick.sl (separate)

credentials

credential           vault entry                  used by
ElevenLabs API key   elevenlabs_miecz_api_key     app.py (env var), dolphin.py
ElevenLabs fallback  elevenlabs_ecoute_api_key    dolphin.py (fallback)
picass0 SSH key      ~/forge3/questboard-key.pem  deploy.sh, dolphin.py tunnel

known issues

documented by goulard3120 // tunnelTime // 2026-05-07