Field notes

The Aspexilary
Technical Blog

LLM fine-tuning, RAG architecture, and on-premises AI deployment for regulated industries.

fine-tuning llm java gemma
Our Gemma 4 31B Fine-Tune Beats Our Qwen Fine-Tunes on Enterprise Java
How we fine-tuned Google's Gemma 4 31B on enterprise Java codebases and achieved a 61% improvement over our previous best model.
April 4, 2026
dashboard rag observability field-note
Drilling Into a Single RAG Domain
The platform dashboard tells you what's running. The domain deep-dive tells you how well it's working — retrieval quality, latency breakdown, document coverage, corpus gaps, and per-container Docker health for any of the 59 domains.
March 23, 2026
dashboard docker infrastructure field-note
Building an Ops Dashboard for 59 RAG Domains
When you're running 245 Docker containers across 61 stacks — each with its own Qdrant, API, and compliance capture sidecars — you need a single pane of glass that shows what's healthy, what's degraded, and what needs attention.
March 22, 2026
docker deployment field-note
Deploy a Construction RAG in 60 Seconds
Every domain RAG system we build ships as a Docker Compose stack with a pre-built vector database. No ingestion pipeline. No GPU. Pull, start, query.
March 21, 2026
post rag docker field-note
Why Every AI Agent Gets Its Own Container
Multi-agent RAG systems hallucinate when agents share context. Docker isolation with per-agent Qdrant collections keeps each agent grounded in its own domain — and makes the failure modes obvious.
March 17, 2026
post compliance executive field-note
What Executives Actually Need to See from Enterprise AI
A compliance dashboard for on-premises RAG isn't about pretty charts. It's about answering two questions an auditor will actually ask — and making sure those answers are generated from live system state, not assembled the night before a review.
March 14, 2026
post rag security field-note
Keeping Enterprise Secrets Out of the Internet
How on-premises RAG deployments can stay current without leaking enterprise data to the internet. A practical architecture for regulated industries.
March 13, 2026
fine-tuning llm java
Fine-Tuning a Code LLM on 73K Java Enterprise Examples
How we trained a Qwen 2.5 Coder 14B model on WildFly, Spring, Kafka, and Hibernate codebases — LoRA, Q4_K_M quantization, and lessons learned.
March 1, 2025
post fine-tuning java llm
Fine-Tuning a Java LLM: From Dataset to Deployment
How we fine-tuned Qwen 2.5 Coder 14B on 73,910 Java enterprise examples using LoRA, quantized it to Q4_K_M, and achieved twice DeepSeek's performance on stream processing tasks.
February 15, 2025
post rag compliance infrastructure
RAG for Regulated Industries: IBC, OSHA, and Retrieval That Actually Works
Building a production RAG pipeline over regulatory corpora using BGE-Large embeddings and Qdrant, with the design decisions that make compliance retrieval different from general-purpose RAG.
January 28, 2025