Transcription Formatting
Persyk can automatically format raw transcriptions using a Large Language Model (LLM) to produce cleaner, more readable text. This feature removes filler words, fixes grammar, adds punctuation, and organizes content into logical paragraphs.
Here are some examples showing raw transcriptions and their formatted results:
Example 1: Quick reminder
Raw:
um so I need to remember to uh call the dentist tomorrow morning and also pick up groceries on the way home
Formatted:
I need to remember to call the dentist tomorrow morning and also pick up groceries on the way home.
Example 2: Technical note
Raw:
okay so the bug is in the like authentication module I think it’s because we’re not handling the uh the refresh token expiry correctly so when the token expires the user gets logged out instead of like automatically refreshing
Formatted:
The bug is in the authentication module. I think it’s because we’re not handling the refresh token expiry correctly. When the token expires, the user gets logged out instead of automatically refreshing.
Example 3: Meeting notes
Raw:
so we talked about the um the new feature launch and sarah said she needs like two more weeks for testing and uh john mentioned that the marketing materials aren’t ready yet so we’re pushing the launch to march fifteenth
Formatted:
We talked about the new feature launch. Sarah said she needs two more weeks for testing, and John mentioned that the marketing materials aren’t ready yet. We’re pushing the launch to March 15th.
Requirements
This feature requires an LLM endpoint that supports structured outputs. Structured outputs ensure the LLM returns properly formatted responses that Persyk can reliably process.
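With OpenAI-compatible endpoints, structured outputs are typically requested by attaching a JSON Schema via the `response_format` field. As a rough sketch of what such a request body looks like (the schema name and `formatted_text` field here are illustrative assumptions, not Persyk's actual payload):

```json
{
  "model": "your-model-name",
  "messages": [
    { "role": "system", "content": "..." },
    { "role": "user", "content": "..." }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "formatted_transcription",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "formatted_text": { "type": "string" }
        },
        "required": ["formatted_text"],
        "additionalProperties": false
      }
    }
  }
}
```

An endpoint that rejects or ignores `response_format` will not work reliably with this feature.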
Configuration
Enable formatting in your settings:
```json
{
  "transcription": {
    "formatting": {
      "enabled": true,
      "provider": "your-provider-name",
      "model": "your-model-name",
    },
  },
}
```

System Prompt
A default system prompt is provided that handles common transcription cleanup tasks like removing filler words, fixing grammar, and preserving the original meaning and tone. The raw transcription is automatically wrapped in <transcription> tags and sent as the user message.
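For example, the raw transcription from Example 1 would reach the model as a user message shaped roughly like this (illustrative, based on the wrapping behavior described above):

```json
{
  "role": "user",
  "content": "<transcription>um so I need to remember to uh call the dentist tomorrow morning and also pick up groceries on the way home</transcription>"
}
```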
You can customize the system prompt in your settings if needed:
```json
{
  "transcription": {
    "formatting": {
      "prompt": "Your custom system prompt here...",
    },
  },
}
```

Provider Options
Local Inference Server (vLLM / Ollama / Any other OpenAI-compatible server)
Run an LLM locally on your own hardware. Best for users who prioritize privacy and have suitable hardware available.
Example configuration:
```json
{
  "providers": {
    "local": {
      "type": "openai-compatible",
      "baseUrl": "http://localhost:11434/v1",
      "apiKey": null,
    },
  },
  "transcription": {
    "formatting": {
      "enabled": true,
      "provider": "local",
      "model": "llama3.2:3b",
    },
  },
}
```

OpenRouter
OpenRouter is an LLM gateway that provides access to many different model providers and models through a single API. The large variety of smaller, faster models makes it easy to optimize for low latency without running your own infrastructure.
Example configuration:
```json
{
  "providers": {
    "openrouter": {
      "type": "openai-compatible",
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "sk-or-...",
    },
  },
  "transcription": {
    "formatting": {
      "enabled": true,
      "provider": "openrouter",
      "model": "meta-llama/llama-3.2-3b-instruct",
      "requestOptions": {
        "provider": {
          "require_parameters": true,
          "sort": "latency",
          "data_collection": "deny",
          "zdr": true,
        },
      },
    },
  },
}
```

OpenAI
Use OpenAI’s API directly. The quickest way to get started, but the limited selection of smaller models means higher latency compared to other options.
Example configuration:
```json
{
  "providers": {
    "openai": {
      "type": "openai-compatible",
      "baseUrl": "https://api.openai.com/v1",
      "apiKey": "sk-...",
    },
  },
  "transcription": {
    "formatting": {
      "enabled": true,
      "provider": "openai",
      "model": "gpt-4o-mini",
    },
  },
}
```