The fastest way to make a multi-agent system hallucinate is to let the agents share context. It sounds efficient — one big knowledge base, multiple agents querying it — but in practice it produces a specific, dangerous failure mode: an agent answering with confident authority from a domain it was never meant to touch.

We learned this deploying RAG across multiple regulated domains for the same customer. A construction safety agent was pulling HIPAA-adjacent language from a healthcare collection because the embeddings were close enough. The answer sounded authoritative. It was wrong. And in a compliance context, a confident wrong answer is worse than no answer at all.

The fix was not better prompting. It was architecture.

One Agent, One Container, One Collection

Every agent in our platform runs in its own Docker container with its own dedicated Qdrant collection. This is not a soft boundary enforced by system prompts or metadata filters. It is a hard boundary enforced by the fact that the agent literally cannot see data outside its collection.

┌───────────────────────────────────────────────────────────┐
│                    QDRANT VECTOR STORE                    │
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │ construction  │  │ healthcare    │  │ legal         │  │
│  │  _public      │  │  _public      │  │  _public      │  │
│  │  _enterprise  │  │  _enterprise  │  │  _enterprise  │  │
│  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘  │
└──────────┼──────────────────┼──────────────────┼──────────┘
           │                  │                  │
   ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
   │   CONTAINER   │  │   CONTAINER   │  │   CONTAINER   │
   │               │  │               │  │               │
   │ Construction  │  │ Healthcare    │  │ Legal         │
   │ Agent         │  │ Agent         │  │ Agent         │
   │               │  │               │  │               │
   │ collections:  │  │ collections:  │  │ collections:  │
   │ construction  │  │ healthcare    │  │ legal         │
   │  _public,     │  │  _public,     │  │  _public,     │
   │  _enterprise  │  │  _enterprise  │  │  _enterprise  │
   └───────────────┘  └───────────────┘  └───────────────┘

Each container's environment specifies exactly which Qdrant collections it can access. The agent code reads those collection names from environment variables at startup. There is no discovery mechanism, no wildcard access, no "search all collections" capability. If a collection name is not in the container's environment, it does not exist from that agent's perspective.

Why This Prevents Hallucinations

Hallucination in RAG systems is often a retrieval problem, not a generation problem. The LLM is doing exactly what it was asked — synthesizing an answer from the context it was given. The problem is that the context included documents from the wrong domain.

Metadata filtering is the common solution. Tag each document with its domain, filter at query time. It works until it does not. A misconfigured filter, a missing tag, a query that somehow bypasses the filter logic — any of these lets cross-domain documents leak into the retrieval results. And because the documents are semantically similar enough to surface in the first place, the LLM has no reason to question them.

Container-level isolation eliminates the entire category. The agent cannot retrieve what it cannot reach. There is no filter to misconfigure because there is no shared collection to filter against.

The Docker Compose Pattern

In practice, this is straightforward to implement. Each agent is a service in a Docker Compose file with its collections injected via environment variables:

services:
  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant_data:/qdrant/storage

  agent-construction:
    image: aspexilary/rag-agent:latest
    environment:
      - QDRANT_HOST=qdrant
      - QDRANT_COLLECTIONS=construction_public,construction_enterprise
      - AGENT_DOMAIN=construction
      - LLM_ENDPOINT=http://llm:8080/v1
    networks:
      - internal

  agent-healthcare:
    image: aspexilary/rag-agent:latest
    environment:
      - QDRANT_HOST=qdrant
      - QDRANT_COLLECTIONS=healthcare_public,healthcare_enterprise
      - AGENT_DOMAIN=healthcare
      - LLM_ENDPOINT=http://llm:8080/v1
    networks:
      - internal

Same image, different configuration. Adding a new domain is adding a new service block and creating the corresponding Qdrant collections. The agent code is identical — it connects to the collections it is told about and ignores everything else because everything else does not exist in its world.

What You Get from Isolation Beyond Hallucination Prevention

Containing each agent has benefits that go beyond retrieval accuracy.

Independent scaling. A healthcare agent handling heavy query volume does not compete for resources with a lightly-used construction agent. You scale containers independently based on actual demand per domain.

Isolated failures. If the legal agent crashes or enters a bad state, the construction and healthcare agents continue serving queries. A shared-context architecture often means a failure in one domain degrades the entire system.

Per-domain audit trails. Each container produces its own logs. When a compliance officer asks "show me every query the healthcare agent answered last month," you point at one container's logs. You do not parse a shared log to filter by domain after the fact.

Independent update cycles. Updating the healthcare knowledge base — adding new clinical guidelines, refreshing drug interaction data — does not require touching the construction agent. The Qdrant collection is updated, and the healthcare container picks up new vectors on the next query. Zero coordination with other domains.

The Collection Naming Convention

We use a strict naming convention for Qdrant collections that encodes the domain and data classification:

{domain}_{classification}

Where classification is one of:

  • public — sourced from approved external sources via the ingestion pipeline
  • enterprise — the customer's proprietary documents, loaded during onboarding

This convention is simple enough that it can be validated at startup. The agent reads its QDRANT_COLLECTIONS environment variable, splits on commas, and verifies each collection exists in Qdrant before accepting queries. If a collection is missing, the agent refuses to start rather than operating with incomplete knowledge.

When Agents Need to Talk to Each Other

Sometimes a query spans domains. A construction project manager asks about OSHA requirements that intersect with environmental regulations. The construction agent does not have environmental data.

The answer is not to give it access. The answer is a routing layer that sits in front of the agents.

The router receives the query, classifies it, and if necessary forwards it to multiple agents. Each agent answers from its own domain. The router merges the responses and presents a combined answer to the user. At no point does any single agent access another agent's collections. The cross-domain synthesis happens at the routing layer, which has no vector store access of its own — it only orchestrates.

This preserves the isolation guarantee while still handling multi-domain queries. And it produces a clear audit trail: the router log shows which agents were consulted, and each agent's log shows what it retrieved and generated.

The Principle

The broader principle is worth stating explicitly: context boundaries should be enforced by infrastructure, not by prompts. A system prompt that says "only answer questions about construction" is a suggestion to the LLM. A container that can only reach construction collections is a fact about the system.

In regulated environments, the distinction between a suggestion and a fact is the distinction between a policy and a control. Auditors want controls.

Each agent runs in its own Docker container with access limited to its designated Qdrant collections via environment configuration. Cross-domain retrieval is architecturally impossible — not filtered, not discouraged, but unreachable. Multi-domain queries are handled by a routing layer that orchestrates responses from isolated agents without granting any agent access beyond its own domain.

This is the same design philosophy as the network isolation we use for keeping enterprise data off the internet — enforce the boundary at the infrastructure layer, and the compliance story tells itself.