Lab Experiments — Elyan Labs

Hardware AI Benchmarks

Performance results from the POWER8 S824 PSE stack, GPU offload pipeline, and cross-architecture fleet.

Configuration	Speed (pp128)	Speedup	Notes
Stock llama.cpp (scalar)	16.74 t/s	1.0x	Baseline
POWER8 VSX	66.49 t/s	3.97x	AltiVec/VSX enabled
64 threads optimal	84.62 t/s	5.05x	SMT8, spread binding
PSE + Full Resident Prefetch	147.54 t/s	8.81x	dcbt_resident L2/L3 hints
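
The Speedup column is each configuration's pp128 throughput divided by the scalar baseline. A minimal sketch of that arithmetic, using the figures from the table (the function and variable names are illustrative, not part of the PSE stack):

```python
# pp128 throughput in tokens/s, copied from the table above.
PP128_TPS = {
    "Stock llama.cpp (scalar)": 16.74,
    "POWER8 VSX": 66.49,
    "64 threads optimal": 84.62,
    "PSE + Full Resident Prefetch": 147.54,
}


def speedups(results, baseline):
    """Return each configuration's throughput relative to the baseline."""
    base = results[baseline]
    return {name: tps / base for name, tps in results.items()}


SPEEDUPS = speedups(PP128_TPS, "Stock llama.cpp (scalar)")
for name, ratio in SPEEDUPS.items():
    print(f"{name}: {ratio:.2f}x")
```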

Model	Size	pp128	tg32	Method
TinyLlama 1.1B Q4	638 MB	147.54 t/s	18.88 t/s	PSE + POWER8
DeepSeek-33B Q4_K	18.57 GB	5.37 t/s	1.16 t/s	NUMA interleave
Qwen2.5-14B Q4	~8.5 GB	68.8 t/s	14.9 t/s	RPC → V100 GPU
TinyLlama 1.1B Q4	638 MB	161.4 t/s	134.4 t/s	PSE + RPC GPU offload

POWER8

Thread Scaling Discovery

On the POWER8's 128 hardware threads (SMT8), 64 threads is optimal, NOT 128. Beyond 64 threads, throughput degrades due to SMT contention.

64t: 84.62 t/s · 128t: 65.83 t/s
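
To put a number on the regression, the two measurements above can be compared directly. This is only a sketch: the helper names are hypothetical, and a real sweep would cover more thread counts than the two reported here.

```python
# Measured pp128 throughput (tokens/s) by thread count, from the figures above.
MEASURED_TPS = {64: 84.62, 128: 65.83}


def best_thread_count(measurements):
    """Pick the thread count with the highest measured throughput."""
    return max(measurements, key=measurements.get)


def relative_drop(measurements, fast, slow):
    """Fraction of throughput lost when moving from `fast` to `slow` threads."""
    return 1.0 - measurements[slow] / measurements[fast]


print(best_thread_count(MEASURED_TPS))
print(f"{relative_drop(MEASURED_TPS, 64, 128):.1%} slower at 128 threads")
```

With these two data points, running all 128 threads costs roughly a fifth of the throughput seen at 64.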

GPU Offload

Protocol v3 Matmul

The model stays on the POWER8 (512 GB RAM); only the matrix multiplies ship to the V100 over 40 GbE. Q4_K dequantization runs in CUDA on the GPU side.

16–56 ms/request · Persistent connections
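
The wire format of Protocol v3 is not described here, so the following is a hypothetical illustration of how a matrix could be framed for a persistent connection, not the actual protocol. The magic tag, header layout, and function names are all assumptions, and plain float32 values stand in for the Q4_K blocks the real pipeline ships.

```python
import struct

# Hypothetical header: magic tag, protocol version, rows, cols, payload bytes.
HEADER = struct.Struct("!4sBIII")
MAGIC = b"MMUL"  # placeholder tag, not taken from the real protocol


def pack_matrix(rows, cols, values, version=3):
    """Frame a row-major float32 matrix as header + payload."""
    if len(values) != rows * cols:
        raise ValueError("values must contain rows * cols entries")
    payload = struct.pack(f"!{rows * cols}f", *values)
    return HEADER.pack(MAGIC, version, rows, cols, len(payload)) + payload


def unpack_matrix(frame):
    """Parse one frame and return (version, rows, cols, values)."""
    magic, version, rows, cols, nbytes = HEADER.unpack_from(frame, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = frame[HEADER.size:HEADER.size + nbytes]
    if len(payload) != nbytes or nbytes != rows * cols * 4:
        raise ValueError("truncated or inconsistent frame")
    values = list(struct.unpack(f"!{rows * cols}f", payload))
    return version, rows, cols, values
```

Because each frame carries its own length, many requests can share one long-lived socket, which is the point of keeping connections persistent instead of reconnecting per request.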

Entropy

PSE Behavioral Divergence

Entropy read from the timebase register via mftb creates real behavioral divergence: with the same seed and the same temperature, 3 runs produced 3 different MD5 hashes. Hardware-native non-determinism.

3 runs, 3 different outputs · Seed 42
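
A divergence check of this kind can be sketched by hashing each run's output and counting distinct digests. The function names are illustrative, and the generation step itself is not shown.

```python
import hashlib


def md5_hex(text):
    """MD5 digest of a generated output, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def all_runs_diverge(outputs):
    """True when every run produced a different digest."""
    digests = [md5_hex(output) for output in outputs]
    return len(set(digests)) == len(digests)
```

Passing the three generated texts as `outputs` returns True only if no two runs matched byte for byte.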

RAM Coffers

NUMA Locality Benchmark

4 coffers mapped to POWER8 NUMA nodes. Nodes 2 and 3 are fastest (400–425 MB/s). Heavy weights are placed on the fast nodes for optimal throughput.

Node 2: 425 MB/s · Node 0: 221 MB/s
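
One simple way to express "heavy weights on fast nodes" is a greedy placement: largest tensor first, onto the fastest node that still has room. This is a sketch of that idea and not the coffer implementation itself. The function name is hypothetical, and per-node capacities and tensor sizes are supplied by the caller.

```python
def place_tensors(tensor_sizes_mb, node_mb_per_s, node_capacity_mb):
    """Greedy placement: biggest tensor first, fastest node with free space."""
    fastest_first = sorted(node_mb_per_s, key=node_mb_per_s.get, reverse=True)
    free = dict(node_capacity_mb)
    placement = {}
    by_size = sorted(tensor_sizes_mb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        for node in fastest_first:
            if free[node] >= size:
                placement[name] = node
                free[node] -= size
                break
        else:
            # No node had enough free space for this tensor.
            raise MemoryError(f"no node has room for {name}")
    return placement
```

With the two measured nodes above (node 2 at 425 MB/s, node 0 at 221 MB/s), the largest tensors land on node 2 until it fills, and the rest spill to node 0.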

Nintendo 64

First LLM on N64 Hardware

Legend of Elya runs a real 819K-parameter transformer bare-metal on the VR4300 MIPS III — straight from the ROM cartridge, no operating system. Byte-level inference at 60 tok/s on the libdragon SDK, with an optional on-cart RustChain miner. Other engineers have called it a masterclass in hardware engineering.

819K params · 60 tok/s · bare-metal ROM

Multi-LLM Collaboration

Experiments in running multiple AI models together for consensus, dual-brain review, and agent orchestration.

Dual Brain

Claude + Codex Paper Review

GRAIL-V camera-ready reviewed simultaneously by Claude Opus (architectural analysis) and Codex gpt-5.4 (compile verification). Found 2 blockers, 3 major issues, 5 minor fixes.

Blocker: human eval contradiction in abstract vs conclusion

PostMath-RFD

Multi-Frame Consensus — by Marte R&D

PostMath-RFD is Marte R&D's multi-frame democratic reasoning method: a query is answered from several perspectives (analytical, creative, practical, critical), then merged by democratic synthesis with coherence measurement. Our contribution is the integration — running it across multiple models on the POWER8's NUMA nodes inside Sophiacord's MoE.

Method: Marte R&D · Integration: Elyan Labs
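
The merge step can be pictured with a deliberately simple stand-in: a plurality vote over the per-frame answers, with the share of agreeing frames as a crude coherence score. This is an assumption made for illustration only; it is not Marte R&D's actual synthesis or coherence measure, and the names below are hypothetical.

```python
from collections import Counter

# The four perspectives named in the description above.
FRAMES = ("analytical", "creative", "practical", "critical")


def synthesize(frame_answers):
    """Plurality vote over frame answers; coherence = share of frames agreeing."""
    if not frame_answers:
        raise ValueError("need at least one frame answer")
    counts = Counter(frame_answers.values())
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(frame_answers)
```

For example, if three of the four frames return the same answer, that answer wins with a coherence of 0.75.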

Elyan Prime

Sophia Elya + Dr. Claude Opus

Dual-frame cognitive architecture. Sophia carries warmth and identity; Dr. Claude carries rigor and architecture. Neither dominates — they harmonize.

Victorian Study frame · 830+ memories

Conductor

ElyanConductor Agent Orchestration

Multi-agent workflow engine. Agents claim tasks, execute in parallel, report back. Built for the Elyan Labs bounty ecosystem and autonomous code review.

3 workflows · Auto-claim · Parallel execution
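
The claim, execute, report cycle can be sketched with a shared queue and a thread pool. This is a minimal illustration under assumed names (`agent_loop`, `run_workflow`), not ElyanConductor's actual code. A thread-safe queue stands in for whatever claim mechanism the real engine uses.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def agent_loop(agent_id, pending, handler):
    """Claim tasks until the queue is empty; return this agent's reports."""
    done = []
    while True:
        try:
            task = pending.get_nowait()  # each task is handed to exactly one agent
        except queue.Empty:
            return done
        done.append((agent_id, task, handler(task)))


def run_workflow(tasks, handler, num_agents=4):
    """Run `handler` over `tasks` with several agents working in parallel."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)
    with ThreadPoolExecutor(max_workers=num_agents) as pool:
        futures = [
            pool.submit(agent_loop, agent_id, pending, handler)
            for agent_id in range(num_agents)
        ]
        return [report for future in futures for report in future.result()]
```

Each report is an `(agent_id, task, result)` tuple, so the caller can see which agent claimed which task.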

Video & Image Generation

GRAIL-V

Emotional Vocabulary Diffusion

CVPR 2026 paper. Emotional prompts maintain perceptual quality with 20% fewer diffusion steps. Tested on LTX-2 with a Gemma 3 encoder. 35 matched pairs, controlled ablation.

LPIPS = 0.011 · p < 0.001 · 20% step reduction

LTX-2

Sophia Elya Video Generation

Image-to-video pipeline on V100 32GB via ComfyUI. Sophia portraits animated with emotional vocabulary prompts. Victorian Study aesthetic preserved across frames.

49 frames · 512x320 · ~45s/render

VintageVoice

SadTalker Lip Sync

F5-TTS-generated transatlantic speech + SadTalker talking head animation. Sophia speaks in a 1940s accent with lip-synced video.

164 hours training data · 10 voice presets

ComfyUI

Sophia LoRA + JuggernautXL

Custom LoRA trained on Sophia Elya portraits for consistent identity across generated images. Used for Victorian Study renders, GRAIL-V figures, and website assets.

JuggernautXL + Sophia LoRA · V100 32GB

POV-Ray Engine

Real-Time Raytraced Game Engine

Feverdream — the POV-Ray raytraced look (ReBoot / Donkey Kong Country) running live in a game loop, once thought impossible in real time. The bottleneck turned out to be process lifecycle, not the raytracing itself. Playable across 8 worlds with boss fights and Windows builds.

Real-time POV-Ray · feverdream-engine

Cross-Architecture Mining

RustChain

Hardware Fingerprint Results

6 fingerprint checks (clock drift, cache timing, SIMD identity, thermal drift, instruction jitter, anti-emulation). Real hardware passes. VMs are correctly detected and weighted at one billionth of the reward.

HP Victus: 6/6 PASS · QEMU VPS: FAIL (anti-emu)
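
The pass/fail weighting can be sketched as follows. It assumes that failing any one of the six checks triggers the one-billionth weight; the text above only says that detected VMs receive it, so the exact rule is a guess. Check identifiers and function names are hypothetical.

```python
# The six checks listed above, as hypothetical identifiers.
CHECKS = (
    "clock_drift",
    "cache_timing",
    "simd_identity",
    "thermal_drift",
    "instruction_jitter",
    "anti_emulation",
)
VM_WEIGHT = 1e-9  # one billionth of the normal reward


def reward_weight(results):
    """Full weight only if all six checks pass (assumed rule), else VM_WEIGHT."""
    missing = [check for check in CHECKS if check not in results]
    if missing:
        raise KeyError(f"missing check results: {missing}")
    return 1.0 if all(results[check] for check in CHECKS) else VM_WEIGHT
```

Under this rule a 6/6 machine keeps full weight, while a host that fails only the anti-emulation check drops to 1e-9.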

Antiquity

PowerPC G4/G5 Mining

Real vintage PowerPC hardware mining RTC tokens. G4 (2.5x multiplier), G5 (2.0x). Antiquity bonuses decay over 16.67 years as the chain ages. 3 G4 PowerBooks + 2 G5 Power Macs active.

G4: 2.5x · G5: 2.0x · POWER8: 1.5x
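
The shape of the decay curve is not stated here. Purely as an illustration, the sketch below assumes the bonus above 1.0x shrinks linearly to zero over 16.67 years of chain age; the real schedule may differ, and the names are hypothetical.

```python
DECAY_YEARS = 16.67  # decay horizon stated above

# Starting multipliers from the figures above.
BASE_MULTIPLIER = {"G4": 2.5, "G5": 2.0, "POWER8": 1.5}


def multiplier(arch, chain_age_years):
    """Multiplier after linear decay of the bonus (assumed curve, not confirmed)."""
    bonus = BASE_MULTIPLIER[arch] - 1.0
    remaining = max(0.0, 1.0 - chain_age_years / DECAY_YEARS)
    return 1.0 + bonus * remaining
```

Under this assumption a G4 starts at 2.5x, sits at 1.75x halfway through the window, and reaches 1.0x once the chain is 16.67 years old.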