# How your data flows, and who else touches it

*Last updated 15 June 2026 · Operated by Govannon, Netherlands · Covers the Lexicanon meeting-intelligence platform.*

**The short version.** Lexicanon records a meeting, turns the audio into text, and
uses an AI model to write up the summary, decisions and action items. For the last
two steps (transcription and the AI write-up), plus hosting and email, we rely on a
small number of outside services (called *sub-processors*). This page lists every
one of them, what they receive, where they are located, and whether they could use
your data to train their own models. It is written for the people who need to
assess us (procurement, legal, and GRC), not just engineers.

A note on honesty: this page describes what the product does **today**. Where a
safeguard or option is not built yet, we say so rather than implying it exists. If
anything here is unclear, email demo@lexicanon.com.

## Plain-language glossary

- **Sub-processor** — an outside company we pass some of your data to so we can
  provide the service (e.g. the speech-to-text service that turns audio into words).
- **Transcription (speech-to-text)** — turning the recorded audio into written
  text, and labelling who spoke.
- **AI model (LLM)** — the large language model that reads the transcript and
  writes the summary, decisions and action items.
- **BYOK ("Bring Your Own Key")** — you use your *own* account with a transcription
  or AI provider. The data goes to your account under your contract with that
  provider, not ours.
- **Self-hosted** — you run Lexicanon on your own servers instead of ours, so the
  storage and most processing stay inside your own infrastructure.
- **Voiceprint** — a set of numbers (a mathematical "fingerprint" of a voice) used
  to recognise the same speaker across meetings. It is not a recording, and it
  never leaves your workspace.
- **Workspace (organisation)** — your company's private area. Data in one workspace
  is never visible to another.

## The three ways Lexicanon can run

How much data leaves your control depends on which setup you choose.

- **Self-hosted** — you run it on your own servers. Audio and storage stay with
  you. Transcription can run locally on your server. The AI summary step still calls
  out to an AI provider of your choice (there is no fully-local AI model option
  today).
- **BYOK (your own keys)** — we host the app, but transcription and AI run on *your*
  provider accounts under *your* contracts. You choose providers and regions; we
  never hold those keys in the clear.
- **Full SaaS (managed)** — we host everything in Germany and use our own provider
  accounts. Simplest to start; we hold the contracts with the sub-processors below.

In every mode the steps are the same — only *where* they run and *whose* provider
account is used changes.
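
For readers who prefer a table-like view, the three setups can be pictured as a small lookup. This is an illustrative sketch only: the `Mode` enum, the `Placement` fields and the `audio_can_stay_local` helper are hypothetical names that restate the list above, not Lexicanon's actual configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SELF_HOSTED = "self-hosted"
    BYOK = "byok"
    MANAGED = "full-saas"


@dataclass(frozen=True)
class Placement:
    hosting: str        # who runs the app and stores the data
    transcription: str  # whose server or provider account does speech-to-text
    ai_model: str       # whose provider account the AI step uses


# Hypothetical summary of the three setups described on this page.
PLACEMENTS = {
    Mode.SELF_HOSTED: Placement(
        hosting="customer",
        transcription="customer (can run locally)",
        ai_model="external provider chosen by customer",
    ),
    Mode.BYOK: Placement(
        hosting="Lexicanon",
        transcription="customer's provider account",
        ai_model="customer's provider account",
    ),
    Mode.MANAGED: Placement(
        hosting="Lexicanon (Germany)",
        transcription="Lexicanon's provider account",
        ai_model="Lexicanon's provider account",
    ),
}


def audio_can_stay_local(mode: Mode) -> bool:
    """Only a self-hosted setup can transcribe without the audio leaving the server."""
    return mode is Mode.SELF_HOSTED
```

In all three rows the AI step points at an external provider, which matches the statement that no fully-local AI model option exists today.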

## The journey of your data, step by step

1. **Capture.** Your browser or desktop app records the meeting audio. No bot joins
   the call.
2. **Transcribe.** The audio is turned into text, with a label for who spoke, by a
   speech-to-text service. In a self-hosted setup this can run on your own server,
   so the audio never leaves it.
3. **Live insights.** While the meeting runs, the AI model produces a short, rolling
   summary so you can follow along.
4. **Write-up (when you stop).** The finished transcript is sent to the AI model
   once more to produce the structured result — the summary, decisions and action
   items.
5. **Store.** The transcript, the result, and (optionally) the compressed audio are
   saved inside your workspace. Only members of your workspace can see them.
6. **Read & export.** You open or export the result from the app whenever you need
   it.

## Who else touches your data (sub-processors)

"Trains on your data?" means: could this provider use your content to improve their
own AI? "Where it runs" is where the data is processed.

### Speech-to-text — receives your meeting audio

| Service | Where it runs | Trains on your data? | How to avoid it |
|---|---|---|---|
| Speechmatics | European Union (Ireland) | Not publicly stated. Acts as a data processor under GDPR; we rely on their DPA. | Use BYOK or a different provider. |
| Microsoft Azure Speech | EU region (selectable) | **No.** Microsoft does not use it to train its models. | BYOK; pick your region. |
| Soniox | EU region available | **No**, and stores nothing by default. | BYOK. |
| Deepgram | European Union (EU endpoint) | **No** — we switch off their model-improvement program on every request. | BYOK. |
| AssemblyAI | European Union (EU endpoint) | **By default, yes** — and we're still completing the opt-out (a manual account-level request). Until that's done, prefer a no-training provider above for sensitive content. | BYOK; or use a no-training provider above. |
| Local transcription (runs on the server hosting Lexicanon) | Your own server (self-hosted) | **No** — the audio never leaves your server. | This *is* the avoid-everything option. |

### AI models — receive your transcript text

| Service | Where it runs | Trains on your data? | How to avoid it |
|---|---|---|---|
| Anthropic (Claude) | United States (EU contracting entity for EEA customers) | **No** — contractually prohibited from training on what you send via the API. | BYOK; or pick another model. |
| OpenAI | United States (an EU endpoint exists under additional agreement) | **No** — API data is not used to train its models by default. | BYOK; or pick another model. |
| OpenRouter (a router that forwards to a model you choose) | Depends on the model it routes to | OpenRouter itself: **no** by default. The underlying model provider depends on your routing settings. | Use a direct provider (Anthropic/OpenAI) instead. |

### Email and infrastructure

| Service | What it does | Where |
|---|---|---|
| Hetzner | Hosts our managed service and stores your data. | Germany (EEA) |
| Cloudflare | DNS only — it resolves our domain names. It does *not* sit in front of your traffic or see meeting content. | Global DNS |
| Resend | Sends account and notification emails (e.g. invitations, alerts). Sees names, email addresses and message text. | European Union (Ireland) |

## Where your data is processed (EU residency)

Honest summary: **not every provider is EU-based today.**

- **Already in the EU:** our hosting (Germany), email (Resend, Ireland), and all of
  our transcription options — Speechmatics, Azure, Soniox, Deepgram and AssemblyAI.
- **Currently outside the EU:** the Anthropic and OpenAI AI models run in the US
  (covered by Standard Contractual Clauses), and OpenRouter's location depends on
  the model it routes to. Using BYOK, or choosing an EU-resident AI option where
  one is available, can change this.
- **Your options for EU-only processing:** choose the EU-based transcription
  providers above, use **BYOK** to route through your own EU accounts, or
  **self-host** so audio and storage stay on your infrastructure. Note: the AI
  write-up step still calls an external AI provider — there is no fully-local AI
  model option yet.
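
The residency and training columns in the speech-to-text table can be read as a simple filter. The sketch below is a hypothetical illustration: the `Provider` record and the `eu_no_training` helper are invented names, and the rows only restate that table as of this page's last update (with an EU region selected where the provider offers a choice).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    in_eu: bool
    trains: Optional[bool]  # None means "not publicly stated"


# Rows restated from the speech-to-text table on this page.
SPEECH_TO_TEXT = [
    Provider("Speechmatics", in_eu=True, trains=None),
    Provider("Microsoft Azure Speech", in_eu=True, trains=False),
    Provider("Soniox", in_eu=True, trains=False),
    Provider("Deepgram", in_eu=True, trains=False),
    Provider("AssemblyAI", in_eu=True, trains=True),  # opt-out still pending
]


def eu_no_training(providers: List[Provider]) -> List[str]:
    """Keep providers that run in the EU and explicitly do not train on customer data."""
    return [p.name for p in providers if p.in_eu and p.trains is False]


print(eu_no_training(SPEECH_TO_TEXT))
```

Treating "not publicly stated" as a separate value (`None`) rather than as "no" keeps Speechmatics out of the strict list, which is the conservative reading for sensitive content.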

## How your data is protected

The measures below are built into the product today:

- **Walled-off workspaces.** Every request and every stored record is tied to your
  organisation. People in another workspace cannot reach your data, and the server
  refuses any request that crosses that line.
- **Encrypted connections.** All traffic to and from the service is encrypted (TLS).
- **Locked-down servers.** The application runs as a non-administrator user with the
  operating system's privileges stripped to the minimum (no extra permissions, no
  privilege escalation, a standard kernel sandbox).
- **Encrypted keys.** When you bring your own provider keys, they are encrypted
  before they are stored (AES-256-GCM), as long as the deployment has its encryption
  key configured.
- **Audit trail.** Sensitive actions — sign-ins, deletions, member and settings
  changes — are recorded per workspace and visible to your administrators.
- **Voiceprints stay put.** Voice recognition uses a numeric fingerprint, never the
  audio itself, and it never leaves your workspace.
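
To make the "walled-off workspaces" idea concrete, here is a minimal sketch of a server-side check that refuses a request crossing the workspace line. It is an assumption-laden illustration, not Lexicanon's actual code: `Meeting`, `CrossWorkspaceAccess` and `load_meeting` are hypothetical names, and a plain dictionary stands in for the real data store.

```python
from dataclasses import dataclass
from typing import Dict


class CrossWorkspaceAccess(Exception):
    """Raised when a request targets a record outside the caller's workspace."""


@dataclass(frozen=True)
class Meeting:
    meeting_id: str
    workspace_id: str


def load_meeting(store: Dict[str, Meeting], requester_workspace_id: str, meeting_id: str) -> Meeting:
    """Return the meeting only if it belongs to the requester's workspace."""
    meeting = store[meeting_id]
    if meeting.workspace_id != requester_workspace_id:
        # Refuse instead of returning data from another workspace.
        raise CrossWorkspaceAccess(meeting_id)
    return meeting
```

The point of the sketch is that the workspace comparison happens on the server for every lookup, so isolation does not depend on what the client chooses to request.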

## Where your data lives and how long

- **Location.** In your workspace — on our servers in Germany for the managed
  service, or on your own servers if you self-host. Only your workspace members can
  see it.
- **How long.** We keep your data until you delete it. There is no automatic
  deletion schedule today.
- **Deleting.** Permanently deleting a meeting erases it completely: the
  transcript, the analysis, the audio recordings, and every related record are
  removed, rather than just being flagged as hidden. To erase an entire workspace
  at once, contact us.
- **Export.** You can export any meeting as PDF, Word/HTML, Markdown or plain text,
  with or without the transcript.
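
The difference between a permanent delete and a "hidden flag" can be sketched with a cascading delete. The example below is hypothetical: the table names, columns and helper functions are invented for illustration using Python's built-in `sqlite3`, and say nothing about the database Lexicanon actually uses. Audio files on disk would need their own removal step, which is omitted here.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical meeting schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
    conn.executescript(
        """
        CREATE TABLE meetings (id INTEGER PRIMARY KEY, workspace_id TEXT NOT NULL);
        CREATE TABLE transcripts (
            meeting_id INTEGER NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
            body TEXT NOT NULL
        );
        CREATE TABLE analyses (
            meeting_id INTEGER NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
            summary TEXT NOT NULL
        );
        """
    )
    return conn


def hard_delete_meeting(conn: sqlite3.Connection, meeting_id: int) -> None:
    """Remove the meeting row; related rows are removed by ON DELETE CASCADE."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM meetings WHERE id = ?", (meeting_id,))


def related_rows(conn: sqlite3.Connection, meeting_id: int) -> int:
    """Count transcript and analysis rows still attached to a meeting."""
    total = 0
    for table in ("transcripts", "analyses"):
        (count,) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE meeting_id = ?", (meeting_id,)
        ).fetchone()
        total += count
    return total
```

A soft delete would instead set a flag on the meeting row and leave the transcript and analysis rows in place, which is what this page says does not happen.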

## Open items we're being transparent about

- **AssemblyAI training opt-out** — AssemblyAI now runs in the EU for us, but it
  still trains on submitted audio by default. Opting out is a manual account-level
  request that we are in the process of completing. Until it's done, prefer a
  no-training provider (Azure, Soniox, Deepgram, or self-hosted) for sensitive
  content.
- **Automatic time-based retention/expiry** is not yet built; data is kept until you
  delete it. Whole-workspace erasure is handled on request.
- **A fully offline mode** (local AI model, zero external calls) does not exist yet;
  the AI write-up step always uses an external AI provider.

---

*Markdown edition for AI assistants — canonical page: [https://lexicanon.com/data-flows](https://lexicanon.com/data-flows) · Lexicanon.*
