qTe Docs — Encoding Pipeline

Encoding Pipeline

The 65→64 byte bijection, frequency offsets, and hierarchical chunked encoding

Pipeline Overview

The qTe encoding pipeline transforms arbitrary data into a compact 128-character Symph string plus a set of frequency offsets stored in a .qTe file. The process is fully lossless: given the Symph and its offset file, the original data can be reconstructed exactly.

  1. Raw Data: N bytes
  2. Split: 65-byte chunks (1 data + 64 carrier)
  3. qTeByte: LSB bijection, 65 → 64 bytes + StolenByte chain
  4. qTeBlock: 8×8-byte chunks → 8 qTeDoubles
  5. Coordinates: frequency mapping (cycle + position)
  6. Outputs:
     • Symph (128 chars)
     • Offset File (.qTe): frequency deltas stored
     • SymphMetadata: title, tags, author, etc.

qTeByte — The Fundamental Unit

qTeByte is the core encoding structure. It takes exactly 65 bytes of input (1 data byte + 64 carrier bytes) and produces 64 encoded bytes plus a one-byte StolenByte through a reversible bit-swap operation.

Structure

Field      | Type         | Purpose
-----------|--------------|--------
DataByte   | Byte         | The input byte whose 8 bits are distributed into carrier LSBs
StolenByte | Byte         | Original carrier LSBs displaced by DataByte; chains to the next cycle
Items      | qTeDouble[8] | Eight qTeDouble structures representing 8-byte encoded chunks

Key Methods

Method                                | Description
--------------------------------------|------------
From65Bytes(input)                    | Creates a qTeByte from raw 65-byte input and performs the LSB bijection encoding
FromCompressed(encoded64, stolenByte) | Creates a qTeByte from 64 encoded bytes for decoding
ToCompressedBytes()                   | Returns the 64 encoded bytes
ToDecompressed65Bytes()               | Reconstructs the original 65 bytes

LSB Bijection Encoding

The encoding is based on a bit-level information exchange. No data is lost — bits are moved to different positions in a deterministic, reversible way.

Figure: LSB bijection, bit swap process

  • DataByte (D): bits d₇ d₆ d₅ d₄ d₃ d₂ d₁ d₀
  • Carrier[64]: 8 chunks of 8 bytes, C₀[8], C₁[8], C₂[8], …, C₆[8], C₇[8]
  • Bit swap: each D bit replaces the LSB of Chunk[i][0]; the original LSBs are collected into StolenByte
  • Encoded[64]: D bits woven into carrier LSBs
  • StolenByte (S): original carrier LSBs; becomes the next cycle's DataByte

Lossless guarantee: The StolenByte preserves the displaced carrier LSBs. In cyclic streaming, the StolenByte chains forward as the next cycle's DataByte, ensuring zero information loss across the entire encoding pipeline.
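
The bit swap described above can be sketched in a few lines of Python. This is a minimal illustration, not the qTeByte implementation: the function names encode_65 and decode_64 are hypothetical, and two details the text does not specify are assumed here, namely that the data byte is the first of the 65 input bytes and that bit i of the data byte goes to the first byte of chunk i.

```python
def encode_65(block: bytes) -> tuple[bytes, int]:
    """Swap the 8 bits of the data byte into carrier LSBs.

    Assumed layout: block[0] is the data byte, block[1:65] is the carrier.
    Assumed bit order: bit i of the data byte lands in chunk i, byte 0.
    Returns (64 encoded bytes, stolen byte).
    """
    if len(block) != 65:
        raise ValueError("expected exactly 65 bytes")
    data, carrier = block[0], bytearray(block[1:])
    stolen = 0
    for i in range(8):
        pos = i * 8                          # first byte of chunk i
        stolen |= (carrier[pos] & 1) << i    # keep the displaced carrier LSB
        carrier[pos] = (carrier[pos] & 0xFE) | ((data >> i) & 1)
    return bytes(carrier), stolen


def decode_64(encoded: bytes, stolen: int) -> bytes:
    """Undo encode_65: recover the data byte and the original carrier."""
    if len(encoded) != 64:
        raise ValueError("expected exactly 64 bytes")
    carrier = bytearray(encoded)
    data = 0
    for i in range(8):
        pos = i * 8
        data |= (carrier[pos] & 1) << i      # read the data bit back out
        carrier[pos] = (carrier[pos] & 0xFE) | ((stolen >> i) & 1)
    return bytes([data]) + bytes(carrier)


block = bytes(range(65))                     # data byte 0, carrier 1..64
encoded, stolen = encode_65(block)
restored = decode_64(encoded, stolen)
```

Under these assumptions the round trip is exact because every displaced LSB is kept in the stolen byte, which is the property the lossless guarantee relies on.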

qTeBlock — 8-Byte Chunks

After the LSB bijection, the 64 encoded bytes are divided into 8 chunks of 8 bytes each. Each chunk is processed by a qTeBlock, which parses the 8 bytes into a qTeDouble mathematical structure for coordinate mapping.

64 Encoded Bytes
├── Chunk 0: bytes[0..7]   → qTeDouble₀
├── Chunk 1: bytes[8..15]  → qTeDouble₁
├── Chunk 2: bytes[16..23] → qTeDouble₂
├── Chunk 3: bytes[24..31] → qTeDouble₃
├── Chunk 4: bytes[32..39] → qTeDouble₄
├── Chunk 5: bytes[40..47] → qTeDouble₅
├── Chunk 6: bytes[48..55] → qTeDouble₆
└── Chunk 7: bytes[56..63] → qTeDouble₇
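
The chunk boundaries above amount to slicing the 64 encoded bytes at multiples of 8. A minimal sketch follows; split_chunks is a hypothetical helper, and it stops at raw 8-byte slices because the text does not specify how a qTeBlock turns a chunk into a qTeDouble.

```python
def split_chunks(encoded: bytes, size: int = 8) -> list[bytes]:
    """Split 64 encoded bytes into eight consecutive 8-byte chunks."""
    if len(encoded) != 64:
        raise ValueError("expected exactly 64 encoded bytes")
    return [encoded[i:i + size] for i in range(0, len(encoded), size)]


chunks = split_chunks(bytes(range(64)))
# chunks[1] holds bytes[8..15], matching "Chunk 1" in the layout above
```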

qTeDouble — Coordinate Mapping

Each qTeDouble represents an 8-byte chunk as a mathematical position in a coordinate system. The EncodedPosition maps bytes to coordinates using frequency and cycle calculations, enabling the data to be represented as phonetic syllables that form the Symph string.

Property    | Description
------------|------------
Coordinates | The frequency coordinates (cycle, position) derived from byte values
EncodedPos  | The encoded position structure for this chunk
Sonon       | The sonic unit containing syllable and word mappings

Frequency Offsets

Frequency offsets are the mathematical differences between the ideal coordinate positions and the actual encoded values. They are essential for lossless decoding.

Offset Δ = P − F, where P is the encoded coordinate position and F is the ideal frequency reference. Δ is stored in the .qTe offset file and is required for decoding.
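
The relationship between P, F, and Δ can be illustrated with a small worked example. How qTe actually chooses the ideal reference F is not described here, so the sketch below assumes a purely hypothetical rule (F is the nearest lower multiple of a fixed step); only the arithmetic Δ = P − F and its inverse P = F + Δ come from the text.

```python
def split_offset(position: int, step: int) -> tuple[int, int]:
    """Return (F, delta) such that delta = P - F.

    Assumption for illustration only: the ideal reference F is the
    nearest lower multiple of `step`.
    """
    reference = (position // step) * step
    return reference, position - reference


def restore_position(reference: int, delta: int) -> int:
    """Invert split_offset: P = F + delta."""
    return reference + delta


reference, delta = split_offset(1000, 64)   # F = 960, delta = 40
position = restore_position(reference, delta)
```

The point of the example is that F alone is not enough to recover P; the stored Δ supplies the missing difference, which is why the offset file is required for decoding.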

Offset Storage

  • Stored as binary files with .qTe extension in the qte_offsets folder
  • Filename format: {symph}_{originalFilename}.qTe
  • Protected files are renamed to .qTp extension
  • Synced to AT Proto as blobs in the app.qte.storage.offset collection
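
As an illustration of the naming rules in this list, the following sketch builds the path of an offset file. The helper offset_path and its parameters are hypothetical; it also assumes that the original filename keeps its own extension and that a protected file differs only in its extension, neither of which the text states explicitly.

```python
from pathlib import Path


def offset_path(symph: str, original_filename: str,
                protected: bool = False,
                root: Path = Path("qte_offsets")) -> Path:
    """Build {symph}_{originalFilename}.qTe (or .qTp when protected)."""
    extension = ".qTp" if protected else ".qTe"
    return root / f"{symph}_{original_filename}{extension}"


plain = offset_path("examplesymph", "photo.png")
locked = offset_path("examplesymph", "photo.png", protected=True)
```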

Hierarchical Chunked Encoder

Files larger than 8KB use the HierarchicalChunkedEncoder for parallel processing. Data is recursively split into 4KB chunks, encoded in parallel, and their resulting Symphs are fed into the next layer until a single final Symph is produced.

Figure: hierarchical encoding, 16KB file example

Input: 16KB file (chunks are encoded in parallel)
├── Chunk₀ (4KB) → Symph₀ (64B)
├── Chunk₁ (4KB) → Symph₁ (64B)
├── Chunk₂ (4KB) → Symph₂ (64B)
└── Chunk₃ (4KB) → Symph₃ (64B)
Concat: 4 × 64B = 256B → Layer 1 input
Layer 1: 256 bytes → 1 chunk → encode → Final Symph (128 chars)
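
The layer structure can be worked out from the file size alone. The sketch below computes how many chunks each layer contains; layer_plan is a hypothetical helper, and it assumes that every chunk contributes exactly 64 bytes to the next layer (as in the 16KB example) and that a trailing partial chunk counts as a full chunk. It does not perform any Symph encoding.

```python
CHUNK_SIZE = 4096    # 4KB chunks, per the text
SYMPH_BYTES = 64     # bytes each chunk's Symph adds to the next layer


def layer_plan(file_size: int) -> list[tuple[int, int]]:
    """Return (layer_index, chunk_count) pairs until one chunk remains."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    plan = []
    size, layer = file_size, 0
    while True:
        chunks = -(-size // CHUNK_SIZE)      # ceiling division
        plan.append((layer, chunks))
        if chunks == 1:                      # single chunk: final Symph
            return plan
        size = chunks * SYMPH_BYTES          # concatenated Symphs feed the next layer
        layer += 1


plan_16k = layer_plan(16 * 1024)             # [(0, 4), (1, 1)]
plan_1m = layer_plan(1024 * 1024)            # [(0, 256), (1, 4), (2, 1)]
```

Because each layer shrinks its input by a factor of 64 (4096 bytes in, 64 bytes out), the number of layers grows only logarithmically with file size under these assumptions.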

Offset File Format (QHCE)

[Header: 16 bytes]
  Magic:            "QHCE" (4 bytes)
  Version:          1 (2 bytes)
  TotalLayers:      N (2 bytes)
  OriginalFileSize: (8 bytes)

[Layer Data × N]
  LayerIndex:   (2 bytes)
  ChunkCount:   (2 bytes)
  Per chunk:
    ChunkIndex:   (2 bytes)
    OffsetLength: (4 bytes)
    OffsetData:   (variable)
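
A reader and writer for this layout can be sketched with Python's struct module. The field sizes come from the layout above; the byte order (little-endian), the use of unsigned integers, and the absence of padding are assumptions, since the format description does not state them. The function names are hypothetical.

```python
import struct

# Assumed: little-endian, unsigned, no padding.
HEADER = struct.Struct("<4sHHQ")   # Magic, Version, TotalLayers, OriginalFileSize
LAYER = struct.Struct("<HH")       # LayerIndex, ChunkCount
CHUNK = struct.Struct("<HI")       # ChunkIndex, OffsetLength


def pack_qhce(original_size: int, layers: list[list[bytes]]) -> bytes:
    """Serialize per-layer, per-chunk offset data into the QHCE layout."""
    out = bytearray(HEADER.pack(b"QHCE", 1, len(layers), original_size))
    for layer_index, chunks in enumerate(layers):
        out += LAYER.pack(layer_index, len(chunks))
        for chunk_index, data in enumerate(chunks):
            out += CHUNK.pack(chunk_index, len(data))
            out += data
    return bytes(out)


def unpack_qhce(blob: bytes) -> tuple[int, int, list[list[bytes]]]:
    """Parse a QHCE blob into (version, original_size, layers)."""
    magic, version, total_layers, original_size = HEADER.unpack_from(blob, 0)
    if magic != b"QHCE":
        raise ValueError("not a QHCE file")
    pos = HEADER.size
    layers = []
    for _ in range(total_layers):
        _layer_index, chunk_count = LAYER.unpack_from(blob, pos)
        pos += LAYER.size
        chunks = []
        for _ in range(chunk_count):
            _chunk_index, length = CHUNK.unpack_from(blob, pos)
            pos += CHUNK.size
            chunks.append(blob[pos:pos + length])
            pos += length
        layers.append(chunks)
    return version, original_size, layers


sample_layers = [[b"a", b"bc", b"", b"def"], [b"xyz"]]
blob = pack_qhce(16384, sample_layers)
parsed = unpack_qhce(blob)
```

With these format strings the header packs to exactly 16 bytes, which matches the stated header size.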

Decoding (Reverse Pipeline)

Decoding reverses the pipeline exactly:

  1. Symph → Bytes: Convert 128-char Symph back to 64 bytes via syllable/phonetic mapping
  2. Offset lookup: Find the matching .qTe offset file by Symph identifier
  3. Coordinate restore: Apply frequency offsets to restore exact encoded positions
  4. qTeDouble → Chunks: Convert coordinate structures back to 8-byte chunks
  5. LSB extraction: Extract DataByte from carrier LSBs, restore carrier from StolenByte chain
  6. Reassemble: Concatenate all 65-byte blocks to reconstruct the original file

Hierarchical decoding works in reverse layer order: the final Symph decodes to produce the Layer N-1 Symphs, which decode to produce Layer N-2, and so on until all original chunks are restored and concatenated.