Technical Deep Dive
Technical Architecture
A complete technical reference for the Sansqrit quantum simulation engine. This document covers every layer of the system, from the DSL lexer through sparse matrix mathematics to distributed 10-qubit sharding. It is intended for researchers extending the engine and for AI models learning the codebase.
System Overview
Sansqrit is a quantum-classical computing platform implemented entirely in Rust. The system consists of four major components that work together to execute quantum programs written in the Sansqrit DSL (Domain-Specific Language). The first component is the language frontend — a lexer, parser, and tree-walking interpreter that processes .sq source files into an Abstract Syntax Tree (AST) and executes them. The second component is the quantum engine — a three-tier simulation system that selects the optimal execution strategy based on qubit count. The third component is the standard library — a collection of classical computing utilities including collections, file I/O, statistics, and regular expressions. The fourth component is the domain packages — eight specialized packages for chemistry, biology, medical, physics, genetics, machine learning, mathematics, and QASM export.
The engine supports 46 quantum gates, 19 fully-implemented quantum algorithms, and 17 circuit constructors. It can simulate up to 100+ qubits on a standard laptop by exploiting the sparsity structure of most practical quantum circuits. The key insight behind Sansqrit's performance is that most quantum states encountered in real-world algorithms (GHZ states, VQE ansatz outputs, QAOA intermediate states) have far fewer non-zero amplitudes than the theoretical maximum of 2ⁿ. By storing only non-zero entries in a HashMap, Sansqrit achieves memory savings of up to 10²⁸× compared to dense simulation.
Workspace Structure
The Sansqrit project uses a Rust workspace with 11 crates. The workspace root Cargo.toml sets rust-version = "1.80" as the minimum supported compiler version (required by rayon-core 1.13.0 for parallel computation). The crate structure is designed so that adding a new domain package (e.g., climate science) requires only creating a new crates/sansqrit-climate/ directory and adding one line to the workspace members list — zero changes to existing code.
sansqrit/
├── Cargo.toml # Workspace root: rust-version = "1.80"
├── crates/
│ ├── sansqrit-core/src/ # Quantum engine: 10 modules
│ │ ├── complex.rs # Complex64 arithmetic: c(), c_real(), c_exp_i()
│ │ ├── sparse.rs # SparseStateVec: HashMap<u128, Complex64>
│ │ ├── gates.rs # 46 gates: matrices + sparse application
│ │ ├── lookup.rs # O(1) memory-mapped gate lookup tables
│ │ ├── engine.rs # QuantumEngine: 3-tier auto-selection
│ │ ├── measurement.rs # Shot-based histograms, expectation values
│ │ ├── distributed.rs # Rayon parallel chunks, TCP protocol
│ │ ├── qasm_export.rs # OpenQASM2/3, IBM, IonQ, Cirq, Braket
│ │ ├── algorithms.rs # 19 quantum algorithms (Grover, Shor, VQE...)
│ │ └── circuits.rs # 17 circuit constructors (W, Dicke, QEC...)
│ ├── sansqrit-lang/src/ # DSL compiler: lexer → parser → interpreter
│ │ ├── lexer.rs # Tokenization: keywords, literals, operators
│ │ ├── ast.rs # Abstract Syntax Tree: 15 statement types
│ │ ├── parser.rs # Recursive-descent parser: expressions, control flow
│ │ ├── interpreter.rs # Tree-walking interpreter: classical + quantum dispatch
│ │ └── main.rs # CLI: sansqrit run/qasm/version commands
│ ├── sansqrit-stdlib/src/ # Standard library: 7 modules
│ └── sansqrit-{chemistry,biology,medical,physics,genetics,ml,math,qasm}/
├── tools/precompute/generate_blobs.py # Gate lookup table generator
├── samples/ # Example .sq programs
└── .github/workflows/ci.yml # CI: format → build → test → clippy
Source code: github.com/sansqrit/sansqritPy
Execution Pipeline
When a user runs sansqrit run program.sq, the source code passes through four stages before producing results. Each stage is implemented as a separate Rust module in the sansqrit-lang crate. The pipeline is designed to be single-pass — no intermediate compilation step is needed. The interpreter directly executes the AST nodes, dispatching quantum operations to the engine as they are encountered.
Stage 1: Lexer (lexer.rs)
The lexer (also called the tokenizer or scanner) reads the raw .sq source text character by character and produces a stream of tokens. Each token has a type (keyword, identifier, number, string, operator, etc.) and a span indicating its position in the source file for error reporting. The Sansqrit lexer supports the following token categories:
Keywords: let, const, fn, class, struct, if, else, for, while, loop, match, return, break, continue, import, simulate, circuit, molecule, true, false, None, and, or, not, in, try, catch, finally, raise, extends
Operators: +, -, *, /, //, %, **, ==, !=, <, >, <=, >=, =, +=, -=, *=, /=, |> (pipeline), &, |, ^, <<, >>
Literals: Integers (42, -7), floats (3.14, 1e-6), strings ("hello", 'world'), f-strings (f"Energy: {e:.6f}"), triple-quoted multiline strings
Comments: Single-line (# comment or -- comment), multi-line (/* ... */), documentation (/// docstring)
-- Input source code:
let q = quantum_register(4)
H(q[0])
-- Lexer output (token stream):
-- [KW_LET, IDENT("q"), ASSIGN, IDENT("quantum_register"), LPAREN, INT(4), RPAREN, NEWLINE,
-- IDENT("H"), LPAREN, IDENT("q"), LBRACKET, INT(0), RBRACKET, RPAREN, NEWLINE]
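The token stream above can be reproduced with a very small scanner. The following is a minimal sketch, not the actual lexer.rs: the `Token` enum and `tokenize` function are hypothetical names, spans are omitted, and only the handful of token kinds needed for this example are handled.

```rust
/// Hypothetical token type; variants mirror the token stream shown above.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    KwLet,
    Ident(String),
    Int(i64),
    Assign,
    LParen,
    RParen,
    LBracket,
    RBracket,
    Newline,
}

/// Tokenize a tiny subset of the language: `let`, identifiers, integers,
/// `=`, parentheses, brackets, and newlines. Anything else is an error.
pub fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        match c {
            '\n' => { tokens.push(Token::Newline); i += 1; }
            ' ' | '\t' | '\r' => { i += 1; }
            '=' => { tokens.push(Token::Assign); i += 1; }
            '(' => { tokens.push(Token::LParen); i += 1; }
            ')' => { tokens.push(Token::RParen); i += 1; }
            '[' => { tokens.push(Token::LBracket); i += 1; }
            ']' => { tokens.push(Token::RBracket); i += 1; }
            d if d.is_ascii_digit() => {
                // Consume a run of digits and parse it as an integer literal.
                let start = i;
                while i < chars.len() && chars[i].is_ascii_digit() { i += 1; }
                let text: String = chars[start..i].iter().collect();
                let value: i64 = text.parse().map_err(|e| format!("{e}"))?;
                tokens.push(Token::Int(value));
            }
            a if a.is_alphabetic() || a == '_' => {
                // Consume an identifier, then check whether it is a keyword.
                let start = i;
                while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                    i += 1;
                }
                let word: String = chars[start..i].iter().collect();
                tokens.push(if word == "let" { Token::KwLet } else { Token::Ident(word) });
            }
            other => return Err(format!("unexpected character {other:?} at offset {i}")),
        }
    }
    Ok(tokens)
}
```

Calling `tokenize("let q = quantum_register(4)\nH(q[0])\n")` yields the 16 tokens listed in the example; a full lexer would also attach a source span to each token for error reporting.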
Stage 2: Parser & AST (parser.rs, ast.rs)
The parser consumes the token stream and builds an Abstract Syntax Tree (AST). Sansqrit uses a recursive-descent parser with precedence climbing for expressions. The AST defines 15 statement types (LetDecl, Assign, ExprStmt, FnDef, ClassDef, StructDef, IfChain, ForLoop, WhileLoop, Import, Return, Match, Simulate, TryCatch, Circuit) and expression types including IntLit, FloatLit, StringLit, BoolLit, Ident, BinOp, UnaryOp, FnCall, Index, Member, ListLit, DictLit, FString, Lambda, ListComp, and Pipeline.
The operator precedence (from lowest to highest) is: pipeline (|>), logical or, logical and, comparison, bitwise or, bitwise xor, bitwise and, shift, addition/subtraction, multiplication/division/modulo, power, unary, member access/index/call.
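Precedence climbing can be illustrated on a reduced operator set. The sketch below is an assumption-laden illustration rather than the code in parser.rs: it handles only integers, `+`, `-`, `*`, `/`, a right-associative `**`, and parentheses, and it evaluates directly instead of building AST nodes. The `Tok`, `Parser`, and `binding` names and the numeric precedence levels are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok { Num(i64), Plus, Minus, Star, Slash, Pow, LParen, RParen }

/// (precedence, right_associative) for binary operators; higher binds tighter.
fn binding(tok: Tok) -> Option<(u8, bool)> {
    match tok {
        Tok::Plus | Tok::Minus => Some((1, false)),
        Tok::Star | Tok::Slash => Some((2, false)),
        Tok::Pow => Some((3, true)),
        _ => None,
    }
}

pub struct Parser { toks: Vec<Tok>, pos: usize }

impl Parser {
    pub fn new(toks: Vec<Tok>) -> Self { Parser { toks, pos: 0 } }

    fn peek(&self) -> Option<Tok> { self.toks.get(self.pos).copied() }

    /// A primary is a number or a parenthesized expression.
    fn primary(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some(Tok::Num(n)) => { self.pos += 1; Ok(n) }
            Some(Tok::LParen) => {
                self.pos += 1;
                let v = self.expr(1)?;
                if self.peek() != Some(Tok::RParen) {
                    return Err("expected ')'".to_string());
                }
                self.pos += 1;
                Ok(v)
            }
            other => Err(format!("unexpected token {other:?}")),
        }
    }

    /// Parse and evaluate operators whose precedence is at least `min_prec`.
    pub fn expr(&mut self, min_prec: u8) -> Result<i64, String> {
        let mut lhs = self.primary()?;
        while let Some(op) = self.peek() {
            let Some((prec, right_assoc)) = binding(op) else { break };
            if prec < min_prec { break; }
            self.pos += 1;
            // Left-associative operators require a strictly tighter right side.
            let next_min = if right_assoc { prec } else { prec + 1 };
            let rhs = self.expr(next_min)?;
            lhs = match op {
                Tok::Plus => lhs + rhs,
                Tok::Minus => lhs - rhs,
                Tok::Star => lhs * rhs,
                Tok::Slash => {
                    if rhs == 0 { return Err("division by zero".to_string()); }
                    lhs / rhs
                }
                Tok::Pow => {
                    let e = u32::try_from(rhs).map_err(|_| "negative exponent".to_string())?;
                    lhs.pow(e)
                }
                _ => unreachable!("binding() only accepts binary operators"),
            };
        }
        Ok(lhs)
    }
}
```

With this scheme `2 + 3 * 4` groups as `2 + (3 * 4)` and `2 ** 3 ** 2` groups as `2 ** (3 ** 2)`; extending it to the full precedence list is a matter of adding rows to the binding table.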
Stage 3: Interpreter (interpreter.rs)
The tree-walking interpreter executes AST nodes directly. It maintains an environment (scope stack) for variable bindings and dispatches quantum operations to the QuantumEngine. When the interpreter encounters a Simulate { engine, body } block, it creates a new QuantumEngine instance with the specified engine tier (or auto-selects based on qubit count), executes the body statements, and captures measurement results. All quantum gate function calls (H, CNOT, Rx, etc.) within a simulate block are dispatched to the engine's convenience methods.
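The scope stack mentioned above can be pictured as a vector of hash maps, searched from the innermost scope outward. This is a minimal sketch under that assumption; the `Env` and `Value` types and their method names are hypothetical and are not taken from interpreter.rs.

```rust
use std::collections::HashMap;

/// Hypothetical runtime value; the real interpreter has more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Value { Int(i64), Float(f64), Str(String), Bool(bool) }

/// Environment as a stack of scopes; the last element is the innermost scope.
pub struct Env { scopes: Vec<HashMap<String, Value>> }

impl Env {
    pub fn new() -> Self { Env { scopes: vec![HashMap::new()] } }

    /// Enter a block (function body, loop body, simulate block, ...).
    pub fn push_scope(&mut self) { self.scopes.push(HashMap::new()); }

    /// Leave a block; the global scope is never popped.
    pub fn pop_scope(&mut self) {
        if self.scopes.len() > 1 { self.scopes.pop(); }
    }

    /// `let name = value` binds in the innermost scope, shadowing outer ones.
    pub fn define(&mut self, name: &str, value: Value) {
        self.scopes
            .last_mut()
            .expect("at least one scope")
            .insert(name.to_string(), value);
    }

    /// Look a name up from the innermost scope outward.
    pub fn lookup(&self, name: &str) -> Option<&Value> {
        self.scopes.iter().rev().find_map(|scope| scope.get(name))
    }

    /// `name = value` updates the nearest existing binding.
    pub fn assign(&mut self, name: &str, value: Value) -> Result<(), String> {
        for scope in self.scopes.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value;
                return Ok(());
            }
        }
        Err(format!("undefined variable '{name}'"))
    }
}
```

A binding defined inside a pushed scope shadows an outer binding of the same name and disappears again when the scope is popped.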
The interpreter supports 80+ built-in functions organized into categories: quantum gates (46), quantum operations (quantum_register, measure, measure_all, probabilities, expectation_z, expectation_zz, engine_nnz, statevector), math functions (sqrt, sin, cos, log, exp, abs, ceil, floor, round, pow), collection operations (len, range, enumerate, zip, map, filter, reduce, sort, sum, mean, min, max), type functions (int, float, str, bool, type), I/O functions (print, read_csv, write_csv, read_json, write_json), and string methods (len, upper, lower, contains, replace, split, join).
Three-Tier Quantum Engine
The QuantumEngine struct is the primary interface for all quantum operations. It auto-selects the optimal simulation tier based on qubit count, but users can force a specific tier using simulate(engine="chunked") { ... }. All three tiers use the same SparseStateVec data structure under the hood — the difference is in how they manage memory and parallelism.
Tier 1: Dense Engine (≤ 20 qubits)
For small circuits (up to 20 qubits), the state vector contains at most 2²⁰ = 1,048,576 amplitudes, requiring only 16 MB of RAM. The dense engine uses the SparseStateVec but expects it to be fully populated. Gate application iterates over all non-zero entries, which for dense states means all 2ⁿ entries. This is the fastest tier for small circuits because memory access patterns are sequential and cache-friendly. The 20-qubit threshold was chosen because 16 MB fits comfortably in L3 cache on modern processors.
Tier 2: Sparse Engine (21–28 qubits)
For medium circuits, the sparse engine exploits the fact that most quantum states have far fewer non-zero amplitudes than 2ⁿ. A 100-qubit GHZ state, for example, has exactly 2 non-zero entries regardless of qubit count. The sparse engine stores only non-zero entries in a HashMap<u128, Complex64>, achieving massive memory savings. Gate application iterates over only the non-zero entries, computing new amplitudes by applying the gate's unitary matrix. After each gate, amplitudes below a pruning tolerance (default 10⁻¹⁵) are removed to prevent numerical noise from accumulating.
Tier 3: Chunked Engine (> 28 qubits)
For large circuits, the chunked engine splits the quantum register into chunks of 10 qubits each. Each chunk maintains its own SparseStateVec with at most 1,024 basis states (2¹⁰). Chunks execute in parallel using the Rayon library. Gates operating within a single chunk are applied locally. Cross-chunk gates (where control and target qubits are in different chunks) use a coordination protocol that temporarily merges the affected subspaces, applies the gate, and redistributes. The 10-qubit chunk size was chosen as the optimal balance between memory per chunk (16 KB max) and the number of cross-chunk operations needed for typical quantum circuits.
| Engine | Qubits | Memory (30q) | Strategy | Parallelism |
|---|---|---|---|---|
| Dense | ≤ 20 | 16 GB | Full state vector | Single-thread |
| Sparse | 21–28 | ~100 bytes (GHZ) | HashMap of non-zero entries | Single-thread |
| Chunked | > 28 | 10 × 16 KB chunks | 10-qubit shards, parallel | Rayon threads |
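The thresholds in the table translate into a small selection function, and the 10-qubit sharding into simple index arithmetic. The sketch below is illustrative only: `Tier`, `select_tier`, `locate`, `is_cross_chunk`, and `chunk_count` are hypothetical names, and it assumes qubits are assigned to chunks contiguously (qubits 0 to 9 in chunk 0, 10 to 19 in chunk 1, and so on), which the text does not state explicitly.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier { Dense, Sparse, Chunked }

/// Pick a tier from the qubit count unless the caller forces one,
/// as with `simulate(engine="chunked") { ... }`.
pub fn select_tier(n_qubits: usize, forced: Option<Tier>) -> Tier {
    if let Some(tier) = forced {
        return tier;
    }
    match n_qubits {
        0..=20 => Tier::Dense,
        21..=28 => Tier::Sparse,
        _ => Tier::Chunked,
    }
}

/// Number of qubits per chunk in the chunked engine.
pub const CHUNK_QUBITS: usize = 10;

/// Map a global qubit index to (chunk index, position inside the chunk).
/// Assumes contiguous assignment of qubits to chunks.
pub fn locate(qubit: usize) -> (usize, usize) {
    (qubit / CHUNK_QUBITS, qubit % CHUNK_QUBITS)
}

/// A two-qubit gate needs cross-chunk coordination when its qubits
/// live in different chunks.
pub fn is_cross_chunk(control: usize, target: usize) -> bool {
    locate(control).0 != locate(target).0
}

/// Number of chunks needed for a register, rounding up.
pub fn chunk_count(n_qubits: usize) -> usize {
    (n_qubits + CHUNK_QUBITS - 1) / CHUNK_QUBITS
}
```

Under these assumptions a CNOT between qubits 9 and 10 is a cross-chunk gate, while one between qubits 10 and 19 stays local to chunk 1.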
Sparse Matrix Mathematics
The core mathematical innovation in Sansqrit is the sparse representation of quantum state vectors. Traditional quantum simulators allocate a dense vector of 2ⁿ complex numbers. For 50 qubits, this requires 16 petabytes of RAM, which is clearly impossible. Sansqrit's sparse engine stores only the non-zero amplitudes, which for most practical quantum circuits is a tiny fraction of the full state space.
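The dense memory figures quoted in this document follow from 2ⁿ amplitudes at 16 bytes each (two f64 values per Complex64). A short sketch of that arithmetic, with `dense_bytes` as a hypothetical helper name:

```rust
/// Bytes needed for a dense state vector of `n_qubits` qubits:
/// 2^n amplitudes, each a Complex64 made of two 8-byte floats.
/// Returns None when the size does not fit in a u128.
pub fn dense_bytes(n_qubits: u32) -> Option<u128> {
    1u128.checked_shl(n_qubits)?.checked_mul(16)
}
```

This gives 2²⁴ bytes (16 MiB) for 20 qubits, 2³⁴ bytes (16 GiB) for 30 qubits, and 2⁵⁴ bytes (16 PiB) for 50 qubits, matching the numbers used for the dense tier and the 50-qubit example.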
SparseStateVec Data Structure
The SparseStateVec is defined in crates/sansqrit-core/src/sparse.rs. It uses a Rust HashMap<u128, Complex64> where the key is the basis state index (supporting up to 128 qubits) and the value is the complex amplitude. Key operations include:
pub struct SparseStateVec {
pub n_qubits: usize, // number of qubits
entries: HashMap<u128, Complex64>, // ONLY non-zero amplitudes
prune_tol: f64, // prune below 1e-15
}
// Core operations (get, set, and nnz are O(1) on average; drain and total_probability are O(nnz)):
get(index) -> Complex64 // lookup amplitude (0 if absent)
set(index, amp) // insert or remove if near-zero
nnz() -> usize // count of non-zero entries
drain() -> Vec<(u128, Complex64)> // take all entries
total_probability() -> f64 // Σ|aᵢ|² (should always be 1.0)
// Bit manipulation:
bit_of(state, qubit) -> 0|1 // extract bit at position
flip_bit(state, qubit) -> u128 // toggle bit at position
set_bit(state, qubit, val) -> u128
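The listing above can be fleshed out into a compilable sketch using only the standard library. This is an approximation of the described interface, not the contents of sparse.rs: the small `Complex64` struct stands in for the num-complex type, the constructor `new` is hypothetical, and the bit helpers assume qubit 0 is the least significant bit of the basis index.

```rust
use std::collections::HashMap;

/// Stand-in for num-complex's Complex64 so the sketch needs only std.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Complex64 { pub re: f64, pub im: f64 }

impl Complex64 {
    pub const ZERO: Complex64 = Complex64 { re: 0.0, im: 0.0 };
    pub fn norm_sqr(self) -> f64 { self.re * self.re + self.im * self.im }
}

pub struct SparseStateVec {
    pub n_qubits: usize,
    entries: HashMap<u128, Complex64>, // only non-zero amplitudes
    prune_tol: f64,
}

impl SparseStateVec {
    /// Start in |00...0>: a single entry with amplitude 1.
    pub fn new(n_qubits: usize) -> Self {
        assert!(n_qubits <= 128, "u128 keys address at most 128 qubits");
        let mut entries = HashMap::new();
        entries.insert(0u128, Complex64 { re: 1.0, im: 0.0 });
        SparseStateVec { n_qubits, entries, prune_tol: 1e-15 }
    }

    /// Amplitude of a basis state; absent entries read as zero.
    pub fn get(&self, index: u128) -> Complex64 {
        self.entries.get(&index).copied().unwrap_or(Complex64::ZERO)
    }

    /// Insert the amplitude, or remove the entry when it is below the tolerance.
    pub fn set(&mut self, index: u128, amp: Complex64) {
        if amp.norm_sqr().sqrt() < self.prune_tol {
            self.entries.remove(&index);
        } else {
            self.entries.insert(index, amp);
        }
    }

    pub fn nnz(&self) -> usize { self.entries.len() }

    /// Take all entries, leaving the map empty.
    pub fn drain(&mut self) -> Vec<(u128, Complex64)> { self.entries.drain().collect() }

    /// Sum of |a_i|^2 over stored entries; 1.0 for a normalized state.
    pub fn total_probability(&self) -> f64 {
        self.entries.values().map(|a| a.norm_sqr()).sum()
    }
}

/// Bit helpers; qubit 0 is assumed to be the least significant bit.
pub fn bit_of(state: u128, qubit: usize) -> u8 { ((state >> qubit) & 1) as u8 }
pub fn flip_bit(state: u128, qubit: usize) -> u128 { state ^ (1u128 << qubit) }
pub fn set_bit(state: u128, qubit: usize, val: u8) -> u128 {
    if val & 1 == 1 { state | (1u128 << qubit) } else { state & !(1u128 << qubit) }
}
```

Writing amplitudes of 1/√2 at index 0 and at the all-ones index of a 100-qubit register produces a GHZ state held in just two map entries.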
Gate Application Algorithm
Applying a single-qubit gate to a sparse vector works by iterating over all existing non-zero entries. For each entry, the algorithm extracts the target qubit's bit value, computes the partner state (same state but with the target bit flipped), looks up the 2×2 gate matrix elements, and distributes the amplitude between the original and partner states. This is O(nnz), where nnz is the number of non-zero entries, not O(2ⁿ).
// For each non-zero entry (state, amplitude):
// bit = extract target qubit's value from state
// partner = state with target bit flipped
// matrix = [[m00, m01], [m10, m11]] for the gate
//
// if bit == 0:
// new[state] += m00 * amplitude
// new[partner] += m10 * amplitude
// else:
// new[partner] += m01 * amplitude
// new[state] += m11 * amplitude
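The pseudocode above maps directly onto a HashMap-based routine. The following is a minimal sketch of that idea, not the code in gates.rs: amplitudes are plain `(re, im)` tuples instead of Complex64, `apply_single_qubit` is a hypothetical name, qubit 0 is assumed to be the least significant bit, and no pruning is performed.

```rust
use std::collections::HashMap;

/// Complex amplitude as (re, im); a stand-in for Complex64.
type Amp = (f64, f64);

fn mul(a: Amp, b: Amp) -> Amp {
    (a.0 * b.0 - a.1 * b.1, a.0 * b.1 + a.1 * b.0)
}

/// Apply a 2x2 gate [[m00, m01], [m10, m11]] to `target`, visiting only
/// the stored entries, so the cost is O(nnz).
pub fn apply_single_qubit(
    state: &HashMap<u128, Amp>,
    gate: [[Amp; 2]; 2],
    target: usize,
) -> HashMap<u128, Amp> {
    let mask = 1u128 << target;
    let mut out: HashMap<u128, Amp> = HashMap::with_capacity(state.len() * 2);
    for (&basis, &amp) in state {
        // The current bit selects the matrix column that this amplitude feeds.
        let bit = usize::from((basis & mask) != 0);
        for row in 0..2 {
            // Row 0 writes to the state with the target bit cleared, row 1 to
            // the state with it set; one of these is `basis`, the other its partner.
            let dest = if row == 1 { basis | mask } else { basis & !mask };
            let contrib = mul(gate[row][bit], amp);
            let slot = out.entry(dest).or_insert((0.0, 0.0));
            slot.0 += contrib.0;
            slot.1 += contrib.1;
        }
    }
    out
}
```

Applying a Hadamard matrix to |0⟩ gives two entries of 1/√2; applying it again returns amplitude 1 on |0⟩ and leaves a near-zero entry on |1⟩, which is the kind of residue the pruning step is meant to remove.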
Amplitude Pruning
After each gate application, amplitudes with magnitude below prune_tol (default 10⁻¹⁵) are removed from the HashMap. This prevents numerical noise from accumulating over long circuits and keeps the nnz count as low as possible. The tolerance sits well below the level at which pruning could affect measurement probabilities: an amplitude at the tolerance contributes a probability of only |amplitude|² ≈ 10⁻³⁰, yet the cutoff is still large enough to catch floating-point noise.
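Pruning amounts to a single pass that keeps entries at or above the tolerance. A minimal sketch, with amplitudes as `(re, im)` tuples and `prune` as a hypothetical function name:

```rust
use std::collections::HashMap;

/// Remove every entry whose magnitude is below `tol`.
/// Returns how many entries were dropped.
pub fn prune(entries: &mut HashMap<u128, (f64, f64)>, tol: f64) -> usize {
    let before = entries.len();
    // hypot(re, im) is the magnitude |amplitude|.
    entries.retain(|_, amp| amp.0.hypot(amp.1) >= tol);
    before - entries.len()
}
```

Called with `tol = 1e-15` after a gate, this drops residues such as `(1e-17, -1e-17)` while leaving physically meaningful amplitudes untouched.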
O(1) Gate Lookup Tables
Sansqrit's second performance innovation is pre-computed gate lookup tables. Instead of computing gate matrix multiplications at runtime, the system pre-computes every possible gate result for 10-qubit chunks and stores them in memory-mapped binary files. At runtime, applying a gate is a single memory read — O(1) instead of O(nnz).
Table Generation (generate_blobs.py)
The Python script tools/precompute/generate_blobs.py generates the lookup tables. It iterates over all 27 single-qubit gate types, 10 qubit positions within a chunk, and all 1,024 possible chunk states (2¹⁰). For each combination, it pre-computes the output states and amplitudes, writing them to binary files. Two-qubit gates require iterating over 90 qubit pairs × 1,024 states. Generation takes approximately 30 seconds and produces ~52 MB of binary data.
python3 tools/precompute/generate_blobs.py --verify
# Output:
# single_qubit_all.bin ~20 MB (27 gates × 10 positions × 1024 states)
# two_qubit_all.bin ~31 MB (10 gates × 90 pairs × 1024 states)
# phase_table.bin ~1 MB (65536 pre-computed e^(iθ) values)
# manifest.json <1 KB (gate name → byte offset mapping)
Binary File Layout
Each entry in the binary lookup table consists of 36 bytes: two 16-bit output state indices (out0, out1) and four 64-bit floats (real and imaginary parts of both output amplitudes). The files are memory-mapped using the memmap2 crate, so the OS handles paging — only the needed portions are loaded into physical RAM.
Runtime Lookup
At runtime, applying a gate to a chunk state requires: (1) compute the byte offset: gate_id × 10 × 1024 × 36 + qubit × 1024 × 36 + state × 36, (2) read 36 bytes from the memory-mapped file, (3) deserialize into output states and amplitudes. This is a single memory read — no floating-point arithmetic required.
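The offset formula and the 36-byte entry can be sketched as follows. This is illustrative only: the text gives the field sizes but not the byte order or field order, so little-endian encoding and the order out0, out1, re0, im0, re1, im1 are assumptions, as are the names `entry_offset`, `decode_entry`, and `LookupEntry`. The sketch decodes from a byte slice, which is what a memory-mapped file exposes.

```rust
pub const ENTRY_BYTES: usize = 36;
pub const STATES_PER_CHUNK: usize = 1024;
pub const QUBITS_PER_CHUNK: usize = 10;

/// One decoded table entry: two output basis states and their amplitudes.
#[derive(Debug, PartialEq)]
pub struct LookupEntry {
    pub out0: u16,
    pub out1: u16,
    pub amp0: (f64, f64), // (re, im) of the first output amplitude
    pub amp1: (f64, f64), // (re, im) of the second output amplitude
}

/// Byte offset of the entry for (gate_id, qubit, state), equal to
/// gate_id * 10 * 1024 * 36 + qubit * 1024 * 36 + state * 36.
pub fn entry_offset(gate_id: usize, qubit: usize, state: usize) -> usize {
    assert!(qubit < QUBITS_PER_CHUNK && state < STATES_PER_CHUNK);
    (gate_id * QUBITS_PER_CHUNK + qubit) * STATES_PER_CHUNK * ENTRY_BYTES
        + state * ENTRY_BYTES
}

/// Decode one entry from the start of `bytes`.
/// Assumed layout (little-endian): out0, out1, re0, im0, re1, im1.
pub fn decode_entry(bytes: &[u8]) -> Option<LookupEntry> {
    if bytes.len() < ENTRY_BYTES {
        return None;
    }
    let read_u16 = |i: usize| u16::from_le_bytes([bytes[i], bytes[i + 1]]);
    let read_f64 = |i: usize| {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&bytes[i..i + 8]);
        f64::from_le_bytes(buf)
    };
    Some(LookupEntry {
        out0: read_u16(0),
        out1: read_u16(2),
        amp0: (read_f64(4), read_f64(12)),
        amp1: (read_f64(20), read_f64(28)),
    })
}
```

Given a mapped table as `&[u8]`, a lookup would then be `decode_entry(&table[entry_offset(gate_id, qubit, state)..])`; the field sizes add up to 2 × 2 + 4 × 8 = 36 bytes, as stated above.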
Measurement Engine
Sansqrit supports both single-qubit measurement (which collapses the state) and shot-based measurement (which samples from the probability distribution without collapsing). The measure(qubit) function computes P(0) = Σ|aᵢ|² over all states where bit qubit is 0, generates a random number, and collapses to 0 or 1 accordingly. The measure_all(shots) function builds a cumulative probability distribution and samples shots times, returning a histogram of bitstring outcomes.
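Both measurement modes can be sketched over a sparse map of amplitudes. The code below is an illustration under stated assumptions, not the contents of measurement.rs: amplitudes are `(re, im)` tuples, the uniform random draw `r` in [0, 1) is passed in by the caller so the sketch stays deterministic and free of external crates, and the function names are hypothetical.

```rust
use std::collections::HashMap;

/// Complex amplitude as (re, im); a stand-in for Complex64.
type Amp = (f64, f64);

fn prob(a: Amp) -> f64 { a.0 * a.0 + a.1 * a.1 }

/// P(qubit = 0): sum of |a_i|^2 over basis states whose `qubit` bit is 0.
pub fn prob_zero(state: &HashMap<u128, Amp>, qubit: usize) -> f64 {
    let mut p0 = 0.0;
    for (&basis, &amp) in state {
        if (basis >> qubit) & 1 == 0 {
            p0 += prob(amp);
        }
    }
    p0
}

/// Collapse `qubit` using a uniform draw `r` in [0, 1); returns the outcome.
pub fn measure(state: &mut HashMap<u128, Amp>, qubit: usize, r: f64) -> u8 {
    let p0 = prob_zero(state, qubit);
    let outcome: u8 = if r < p0 { 0 } else { 1 };
    let p = if outcome == 0 { p0 } else { 1.0 - p0 };
    let scale = 1.0 / p.sqrt();
    // Keep only the branches consistent with the outcome, then renormalize.
    state.retain(|&basis, _| ((basis >> qubit) & 1) as u8 == outcome);
    for amp in state.values_mut() {
        amp.0 *= scale;
        amp.1 *= scale;
    }
    outcome
}

/// Draw one basis state from the cumulative distribution without collapsing.
pub fn sample(state: &HashMap<u128, Amp>, r: f64) -> Option<u128> {
    let mut entries: Vec<(u128, f64)> =
        state.iter().map(|(&basis, &amp)| (basis, prob(amp))).collect();
    entries.sort_by_key(|e| e.0); // fixed order so a given `r` is reproducible
    let mut acc = 0.0;
    for &(basis, p) in &entries {
        acc += p;
        if r < acc {
            return Some(basis);
        }
    }
    entries.last().map(|e| e.0) // rounding guard when r is very close to 1
}
```

A shot-based histogram is then a loop that calls `sample` once per shot with a fresh random draw and counts the returned bitstrings.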
Hardware Export (5 Backends)
Sansqrit can export circuits to five real quantum hardware backends: OpenQASM 2.0/3.0 (standard text format), IBM Quantum JSON (for IBM Cloud), IonQ JSON (for IonQ trapped-ion hardware), Google Cirq Python (for Google Sycamore), and Amazon Braket Python (for AWS quantum services). The qasm_export.rs module records all gate operations in the circuit_log during execution and serializes them to the target format on demand.
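Serializing a recorded gate log to OpenQASM 2.0 can be sketched in a few lines. The `GateOp` enum and `to_openqasm2` function below are hypothetical and cover only four gates; they illustrate the record-then-serialize idea and do not reproduce the structure of circuit_log or qasm_export.rs.

```rust
/// Hypothetical log entry; the real circuit log covers all 46 gates.
#[derive(Debug, Clone, PartialEq)]
pub enum GateOp {
    H(usize),
    X(usize),
    Cnot { control: usize, target: usize },
    Rz { qubit: usize, theta: f64 },
}

/// Serialize a gate log as OpenQASM 2.0 text with a single register `q`.
pub fn to_openqasm2(n_qubits: usize, log: &[GateOp]) -> String {
    let mut out = String::from("OPENQASM 2.0;\ninclude \"qelib1.inc\";\n");
    out.push_str(&format!("qreg q[{n_qubits}];\n"));
    for op in log {
        let line = match op {
            GateOp::H(q) => format!("h q[{q}];"),
            GateOp::X(q) => format!("x q[{q}];"),
            GateOp::Cnot { control, target } => format!("cx q[{control}],q[{target}];"),
            GateOp::Rz { qubit, theta } => format!("rz({theta}) q[{qubit}];"),
        };
        out.push_str(&line);
        out.push('\n');
    }
    out
}
```

A Bell-pair log of `H(0)` followed by a CNOT from qubit 0 to qubit 1 serializes to a header, `qreg q[2];`, `h q[0];`, and `cx q[0],q[1];`. The other backends would use the same log with a different serializer.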
Crate Architecture (11 crates)
| Crate | Purpose | Key Dependencies |
|---|---|---|
| sansqrit-core | Quantum engine: 10 modules, 3,116 LOC | num-complex, rayon, dashmap, memmap2, bytemuck, rand |
| sansqrit-lang | DSL frontend: lexer, parser, interpreter, CLI | sansqrit-core, sansqrit-stdlib, regex, env_logger |
| sansqrit-stdlib | Standard library: 7 modules | csv, regex, serde_json, rand |
| sansqrit-chemistry | VQE, PES, Trotter, molecular Hamiltonians | sansqrit-core |
| sansqrit-biology | DNA/RNA, protein folding, alignment | sansqrit-core |
| sansqrit-medical | Drug screening, vaccine design, binding energy | sansqrit-core |
| sansqrit-physics | Ising model, Heisenberg chain, time evolution | sansqrit-core |
| sansqrit-genetics | CRISPR guide design, GWAS, variant calling | sansqrit-core |
| sansqrit-ml | QNN, QSVM, QPCA, variational classifiers | sansqrit-core |
| sansqrit-math | Shor factoring, Grover search, HHL solver | sansqrit-core |
| sansqrit-qasm | OpenQASM import/export utilities | sansqrit-core |
Full source code: github.com/sansqrit/sansqritPy