Transcription

Persyk uses speech-to-text AI to transcribe your voice recordings into text. After recording a voice memo, the audio is sent to a transcription service and the resulting text is copied to your clipboard.

Any OpenAI-compatible transcription endpoint can be used, giving you flexibility to choose between cloud services or self-hosted solutions based on your privacy, latency, and cost requirements.

Configure transcription in your settings by specifying a provider and model:

{
  "providers": {
    "speaches": {
      "type": "openai-compatible",
      "baseUrl": "http://localhost:8000/v1",
      "apiKey": "sk-..."
    }
  },
  "transcription": {
    "enabled": true,
    "provider": "speaches",
    "model": "Systran/faster-distil-whisper-large-v3"
  }
}
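
Under the hood, this is a request to the endpoint's OpenAI-compatible /audio/transcriptions route. As a rough illustration (not Persyk's actual code), here is a minimal sketch of the equivalent call with the openai Python SDK, reusing the baseUrl, apiKey, and model values from the config above; the audio file name is a placeholder:

# A minimal sketch of the request the config above describes.
# The file name is a placeholder; baseUrl, apiKey, and model come
# from the example config.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # providers.speaches.baseUrl
    api_key="sk-...",                     # providers.speaches.apiKey
)

with open("voice-memo.wav", "rb") as audio_file:
    result = client.audio.transcriptions.create(
        model="Systran/faster-distil-whisper-large-v3",  # transcription.model
        file=audio_file,
    )

print(result.text)  # the text Persyk copies to your clipboard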

Speaches

Speaches is another project of mine: an OpenAI-compatible inference server with support for transcription, translation, text-to-speech, voice activity detection (VAD), speaker embedding, the Realtime API, and more. See the GitHub repository for setup instructions.

Speaches is best for users who want full control over their data and are willing to run a local server. With the right hardware, it can deliver significantly lower latency than cloud alternatives.

Example configuration:

{
  "providers": {
    "speaches": {
      "type": "openai-compatible",
      "baseUrl": "http://localhost:8000/v1",
      "apiKey": null
    }
  },
  "transcription": {
    "enabled": true,
    "provider": "speaches",
    "model": "Systran/faster-whisper-small.en"
  }
}
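
Before pointing Persyk at a local instance, you can confirm the server is reachable and that the configured model is available. A quick sketch with the openai Python SDK, assuming Speaches is running on localhost:8000 as in the config above and exposes the standard /models listing route:

# A quick connectivity check, assuming Speaches is running at
# http://localhost:8000 as configured above. A local instance needs
# no real API key, but the SDK requires a non-empty value.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="ignored")

# List the models the server exposes and check that the one set under
# transcription.model is among them.
available = [m.id for m in client.models.list()]
print("Systran/faster-whisper-small.en" in available)
print(available)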

OpenAI

Use OpenAI’s API directly. This is the quickest way to get started: just add your API key and you’re ready to go.

Example configuration:

{
  "providers": {
    "openai": {
      "type": "openai-compatible",
      "baseUrl": "https://api.openai.com/v1",
      "apiKey": "sk-..."
    }
  },
  "transcription": {
    "enabled": true,
    "provider": "openai",
    "model": "gpt-4o-mini-transcribe" // or "whisper-1"
  }
}
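
For reference, calling OpenAI's transcription API directly takes only a few lines. A hedged sketch with the openai Python SDK (the file name and API key are placeholders); note that whisper-1 also supports subtitle output formats such as srt and vtt, while the gpt-4o transcription models return JSON or plain text:

# A sketch of calling OpenAI's transcription API directly; the file
# name and API key are placeholders, not values Persyk manages for you.
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # or rely on the OPENAI_API_KEY env var

with open("voice-memo.wav", "rb") as audio_file:
    text = client.audio.transcriptions.create(
        model="gpt-4o-mini-transcribe",
        file=audio_file,
        response_format="text",  # return plain text instead of a JSON object
    )

print(text)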