How Encryption Works
TL;DR: Your journal entries are encrypted with AES-256-GCM using a key derived from your Master Password. The key never leaves your device. AI runs locally via Llama 3.2. If you opt into the collective, three layers of mathematical anonymization protect your identity before anything is transmitted.
The Data Flow
Here's exactly what happens from the moment you type to the moment data (optionally) reaches our servers:
Layer 1: Local Encryption
Every journal entry is encrypted before it's saved — not after, not during sync, but immediately upon creation.
The Algorithm: AES-256-GCM
- AES-256 — Advanced Encryption Standard with a 256-bit key. Used by governments and militaries worldwide. Brute-forcing a 256-bit key would take longer than the age of the universe, even with all the computing power on Earth.
- GCM (Galois/Counter Mode) — provides both encryption and authentication. If even a single bit of ciphertext is tampered with, decryption fails instead of silently producing corrupted plaintext. This prevents undetected tampering with your stored entries.
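The encrypt/decrypt round trip can be sketched with Node's built-in crypto module. Function names and the payload shape here are illustrative, not the app's actual API:

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

interface EncryptedEntry { iv: Buffer; ciphertext: Buffer; tag: Buffer }

function encryptEntry(plaintext: string, key: Buffer): EncryptedEntry {
  const iv = randomBytes(12); // 96-bit nonce, freshly generated per entry
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() }; // tag authenticates the ciphertext
}

function decryptEntry(entry: EncryptedEntry, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, entry.iv);
  decipher.setAuthTag(entry.tag); // any tampering makes final() throw
  return Buffer.concat([decipher.update(entry.ciphertext), decipher.final()]).toString("utf8");
}
```

Note that decryption throws on a single flipped bit — that is the GCM authentication guarantee in action.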
Key Derivation
Your encryption key is derived from your Master Password using a Key Derivation Function (KDF) on your device:
- The Master Password is never stored in plaintext
- The derived key exists only in volatile memory during your session
- The key is never transmitted to any server
- Even if our servers were fully compromised, your entries remain encrypted gibberish
What this means: If you forget your Master Password, your data is gone forever. We cannot recover it. This is a feature, not a bug — it proves we don't have a backdoor.
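The derivation step can be sketched with PBKDF2. The app's actual KDF choice, iteration count, and salt handling are not specified on this page, so every parameter below is illustrative:

```typescript
import { pbkdf2Sync, randomBytes } from "node:crypto";

// Derive a 256-bit AES key from the Master Password on-device.
// PBKDF2 with these parameters is an assumption; the real KDF may differ.
function deriveKey(masterPassword: string, salt: Buffer): Buffer {
  // 600k iterations slows brute-force attempts; 32 bytes = AES-256 key size.
  return pbkdf2Sync(masterPassword, salt, 600_000, 32, "sha256");
}

// The salt is random but non-secret: stored locally so the same password
// re-derives the same key on the next unlock.
const salt = randomBytes(16);
```

Because the key is a pure function of password and salt, losing the password really does mean the key (and the data) is unrecoverable.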
Layer 2: Local AI Processing
The Higher Self AI doesn't phone home. Here's how it works:
Engine: Ollama running Llama 3.2 (3B parameters), optimized for edge devices. The model runs entirely in local memory.
Process: Your decrypted entry is passed to the local model in RAM. It never touches disk in plaintext, is never logged, and is never transmitted.
Output: The AI generates insights, identifies patterns, and provides CBT-style reflections. All output stays local.
No data is sent to OpenAI, Google, Anthropic, or any third-party AI provider. The model runs on your CPU/GPU. Full stop.
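A local inference call could look like the sketch below, using Ollama's default loopback address. The prompt text, model tag, and function names are assumptions for illustration, not the app's actual code:

```typescript
// Request shape for Ollama's /api/generate endpoint.
interface OllamaRequest { model: string; prompt: string; stream: boolean }

function buildReflectionRequest(entryText: string): OllamaRequest {
  return {
    model: "llama3.2:3b", // local model tag; illustrative
    prompt: `Reflect on this journal entry with CBT-style insight:\n\n${entryText}`,
    stream: false,
  };
}

async function reflect(entryText: string): Promise<string> {
  // Only ever talks to the local loopback interface -- nothing leaves the machine.
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify(buildReflectionRequest(entryText)),
  });
  return (await res.json()).response;
}
```

The decrypted entry exists only in the request object in RAM; it is never written to disk or sent beyond localhost.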
Layer 3: The Anonymization Pipeline
If you opt in to the collective feature, your data passes through three mathematical privacy guarantees before anything leaves your device:
Step 1: Generalization
Specific identifying data is transformed into broad categories:
- Age: 31 → Age Group: 30-35
- City: Berlin → Region: Europe
- Mood: devastated about breakup → State: Struggling
Raw journal text is never included. Only generalized trait labels are produced.
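A minimal generalization pass might look like this. The bucket width, region map, and state labels are illustrative, not the app's actual taxonomy:

```typescript
// Illustrative lookup tables -- the real taxonomy is in /src/anonymizer/.
const REGION_OF_CITY: Record<string, string> = { Berlin: "Europe", Tokyo: "Asia" };
const STATE_OF_MOOD: Record<string, string> = { devastated: "Struggling", content: "Stable" };

function generalize(profile: { age: number; city: string; mood: string }) {
  const lo = Math.floor(profile.age / 5) * 5; // e.g. 31 falls in the 30-35 bucket
  return {
    ageGroup: `${lo}-${lo + 5}`,
    region: REGION_OF_CITY[profile.city] ?? "Unknown",
    state: STATE_OF_MOOD[profile.mood] ?? "Unspecified",
  };
}
```

Only these coarse labels ever move to the next pipeline stage; the journal text itself is never an input.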
Step 2: Differential Privacy (The "Coin Flip")
A Randomized Response algorithm introduces mathematical noise:
- The algorithm flips a virtual coin
- Heads: Your true generalized trait is used
- Tails: A random decoy trait is substituted
This means any individual data point has plausible deniability — even we can't know if a specific entry is real or noise. But across thousands of users, the noise cancels out and aggregate statistics remain accurate.
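The coin flip can be sketched in a few lines. The coin bias (the privacy parameter) used by the app isn't stated on this page, so 0.5 below is illustrative:

```typescript
// Randomized response: keep the true trait with probability pKeepTruth,
// otherwise substitute a random decoy. 0.5 is an illustrative bias.
function randomizedResponse(trueTrait: string, decoys: string[], pKeepTruth = 0.5): string {
  if (Math.random() < pKeepTruth) return trueTrait; // heads: real trait
  return decoys[Math.floor(Math.random() * decoys.length)]; // tails: decoy
}
```

Any single reported trait could be a decoy, so no individual record is trustworthy on its own; only the population-level frequencies (after debiasing by the known coin bias) are meaningful.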
Step 3: K-Anonymity (The Rule of 5)
Before any archetype record is committed to the database, the system checks: are there at least 5 other people who share this exact combination of traits?
- If yes → the record is stored
- If no → the record is further "blurred" locally until it fits a larger group
This prevents anyone from being identified by a unique combination of attributes (e.g., the only 80-year-old male Scorpio in Iceland).
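The check itself is simple set counting. Type and function names below are illustrative:

```typescript
type Traits = { ageGroup: string; region: string; state: string };

const keyOf = (t: Traits) => `${t.ageGroup}|${t.region}|${t.state}`;

// Rule of 5: the candidate record is only committed if at least k other
// records share its exact trait combination.
function passesRuleOfK(candidate: Traits, others: Traits[], k = 5): boolean {
  const matches = others.filter((t) => keyOf(t) === keyOf(candidate)).length;
  return matches >= k;
}
```

If the check fails, the record is not discarded but re-generalized (e.g. a wider age bucket) until its group is large enough.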
Infrastructure: Why Cloudflare
Our entire backend runs on Cloudflare's edge network. This matters because:
- No IP persistence — the Worker never logs or stores client IP addresses. Cloudflare does expose the connecting IP to Worker code (via the CF-Connecting-IP header), so this is a deliberate guarantee in our code, and you can verify it in the source.
- Edge compute — data is processed at the nearest Cloudflare data center, not routed to a central server
- No traditional servers — there is no VM, no EC2 instance, no container to compromise
- TLS 1.3 — all data in transit is encrypted with modern TLS
What We Store on Our Servers
The Cloudflare D1 database contains exactly two tables:
- archetype_cohorts — generalized, noise-injected archetype records with no user identifiers
- cohort_aggregations — aggregated trend data computed from cohorts that pass the Rule of 5
Zero raw text. Zero user IDs. Zero IP addresses. Zero encryption keys.
Verify It Yourself
Every claim on this page can be verified in our source code:
- Full source code on GitHub
- Encryption implementation: /src/storage/aes-encryption.ts
- Anonymization pipeline: /src/anonymizer/
- AI system prompt: /src/ai-engine/prompts.ts
- Backend Worker: /collective-vault-worker/src/
We believe privacy claims without transparency are worthless. Audit us.