Transcription
Persyk uses speech-to-text AI to transcribe your voice recordings into text. After recording a voice memo, the audio is sent to a transcription service and the resulting text is copied to your clipboard.
Any OpenAI-compatible transcription endpoint can be used, giving you flexibility to choose between cloud services or self-hosted solutions based on your privacy, latency, and cost requirements.
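To make "OpenAI-compatible" concrete, here is a minimal sketch of the request such an endpoint expects: a multipart POST to the `/audio/transcriptions` path under the base URL, carrying a `model` field and a `file` field. It is not Persyk's actual implementation; the function name `build_transcription_request` and the default filename are hypothetical, and the request is only built, not sent.

```python
import uuid
import urllib.request


def build_transcription_request(base_url, api_key, model, audio_bytes, filename="memo.wav"):
    """Build (but do not send) a multipart POST for an OpenAI-compatible
    transcription endpoint. Hypothetical helper, for illustration only."""
    boundary = uuid.uuid4().hex

    # "model" form field: the transcription model to use.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # "file" form field: the recorded audio.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio_bytes + b"\r\n"

    body = model_part + file_part + f"--{boundary}--\r\n".encode()

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:
        # Servers that do not require a key can be called without this header.
        headers["Authorization"] = f"Bearer {api_key}"

    url = base_url.rstrip("/") + "/audio/transcriptions"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned request to `urllib.request.urlopen` would send it; with the default JSON response format, OpenAI's API returns the transcript in a `text` field, and compatible servers are expected to do the same.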
Configuration
Configure transcription in your settings by specifying a provider and model:
    {
      "providers": {
        "speaches": {
          "type": "openai-compatible",
          "baseUrl": "http://localhost:8000/v1",
          "apiKey": "sk-..."
        }
      },
      "transcription": {
        "enabled": true,
        "provider": "speaches",
        "model": "Systran/faster-distil-whisper-large-v3"
      }
    }

Provider Options
Speaches (Recommended)
Speaches is another project of mine: an OpenAI-compatible inference server with support for transcription, translation, text-to-speech, voice activity detection (VAD), speaker embedding, the Realtime API, and more. See the GitHub repository for setup instructions.
Best for users who want full control over their data and are willing to run a local server. With the right hardware, you can achieve significantly lower latency than cloud alternatives.
Example configuration:
    {
      "providers": {
        "speaches": {
          "type": "openai-compatible",
          "baseUrl": "http://localhost:8000/v1",
          "apiKey": null,
        },
      },
      "transcription": {
        "enabled": true,
        "provider": "speaches",
        "model": "Systran/faster-whisper-small.en",
      },
    }

OpenAI
Use OpenAI’s API directly. This is the quickest way to get started: add your API key and you’re ready to go.
Example configuration:
    {
      "providers": {
        "openai": {
          "type": "openai-compatible",
          "baseUrl": "https://api.openai.com/v1",
          "apiKey": "sk-...",
        },
      },
      "transcription": {
        "enabled": true,
        "provider": "openai",
        "model": "gpt-4o-mini-transcribe", // or "whisper-1"
      },
    }
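
In the provider examples above, `transcription.provider` names a key under `providers`, while `transcription.model` is the model ID sent to that provider. The sketch below shows one way that lookup could work on already-parsed settings. It is an illustration only: the function name `resolve_transcription` is hypothetical, and returning `None` when transcription is disabled or raising on an unknown provider are assumptions, not documented Persyk behavior.

```python
def resolve_transcription(settings):
    """Return (base_url, api_key, model) for the configured transcription
    provider, or None if transcription is disabled.

    Hypothetical helper; the error handling here is an assumption.
    """
    transcription = settings.get("transcription", {})
    if not transcription.get("enabled", False):
        return None

    providers = settings.get("providers", {})
    name = transcription.get("provider")
    if name not in providers:
        # A common mistake is putting a model ID where the provider name belongs.
        raise ValueError(
            f"transcription.provider {name!r} is not defined under 'providers' "
            f"(known providers: {sorted(providers)})"
        )

    provider = providers[name]
    return provider["baseUrl"], provider.get("apiKey"), transcription["model"]
```

For the Speaches example, this would yield the local base URL, no API key, and `Systran/faster-whisper-small.en` as the model.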