Real-Time Transcription & Translation for Live Audio

EchoVCast captures audio from your microphone or your system's audio output, transcribes it on your machine with local speech recognition, and sends only the resulting text to our server for translation. It's still in beta, so expect some rough edges, but it works and we're improving it with every update.

Accuracy varies.

Results depend on audio quality, speaking speed, accents, slang, and language complexity. Translations are approximate and may not fully capture nuance, idioms, or context. EchoVCast is under active development, but machine transcription and translation are imperfect by nature. Expect mistakes. Some sentences will be wrong, misheard, or awkwardly translated.

Demo video coming soon

Features

Local Speech Recognition

Runs on your machine using your GPU or CPU. Your audio never leaves your computer; only the transcribed text is sent for translation. Accuracy depends on audio clarity: it works well with clear speech, but background noise and overlapping speakers can cause errors.

Real-Time Translation

Transcribed text is sent to our server for machine translation. Translations typically arrive after a delay of 1 to 3 seconds, and they aren't always accurate, especially with slang or fragmented speech.

Dual Audio Capture

Capture your microphone and system audio (for example YouTube, Twitch, or Discord) at the same time. Each source runs independently with its own language settings.

Supported Translation Directions

Translate between these language pairs in real time.

English ↔ Japanese
English ↔ Traditional Chinese
Japanese ↔ Traditional Chinese

More languages and directions coming soon.

Ready to get started?

EchoVCast is in beta and actively being improved. Try it out and let us know what you think.

View Pricing