RAG SYSTEMS

RAG Platform for Retail Knowledge

Centralised multi-tenant RAG core turning documentation chaos into conversational clarity

Challenge

  • Knowledge scattered across 60+ unstructured sources (wikis, docs, tickets, drives)
  • Employees relying on 'ask a friend' instead of search; low trust in existing tools
  • Each new assistant was a one-off build, tightly coupled to a specific LLM provider
  • No clean model for multi-tenant isolation (per BU / country / role)

Solution

  • Designed an end-to-end RAG platform with multi-tenant vector store (namespaces per org / BU / country)
  • Implemented hybrid dense + lexical retrieval to handle noisy enterprise text
  • Built pluggable LLM abstraction to swap OpenAI, Gemini, or local models without rewrites
  • Created ingestion pipelines for 60+ sources with cleaning, splitting, enrichment, and change capture
  • Exposed a unified retrieval + generation API consumed by multiple assistants (FAQ bots, search copilots, support tools)
  • Applied hexagonal / ports-and-adapters architecture so new data sources or models mean new adapters, not core rewrites
  • Instrumented everything with observability: retrieval hit-rate, latency, errors, and assistant usage
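
A minimal sketch of how per-tenant namespaces can isolate retrieval, as in the org / BU / country model above. All names here (`TenantKey`, `VectorStore`) are illustrative stand-ins, not the platform's actual API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantKey:
    """Composite tenant identifier used to namespace vector-store access."""
    org: str
    business_unit: str
    country: str

    def namespace(self) -> str:
        # e.g. "acme/retail-ops/de" — one isolated namespace per tenant
        return f"{self.org}/{self.business_unit}/{self.country}"


class VectorStore:
    """In-memory stand-in for a namespaced vector store (e.g. pgvector)."""

    def __init__(self) -> None:
        self._docs: dict[str, list[str]] = {}

    def upsert(self, key: TenantKey, doc: str) -> None:
        self._docs.setdefault(key.namespace(), []).append(doc)

    def search(self, key: TenantKey, query: str) -> list[str]:
        # A query can only ever see documents in the caller's own namespace
        return [d for d in self._docs.get(key.namespace(), [])
                if query.lower() in d.lower()]
```

Because the namespace is derived from the tenant key on every call, cross-tenant leakage is structurally impossible rather than policy-enforced.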
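
Hybrid dense + lexical retrieval needs a way to merge two ranked result lists. One common, tuning-free option is reciprocal rank fusion (RRF); this is a sketch of that idea, not necessarily the exact fusion the platform uses:

```python
def rrf_fuse(dense_ranked: list[str], lexical_ranked: list[str],
             k: int = 60) -> list[str]:
    """Merge two ranked lists of doc ids via reciprocal rank fusion.

    Each list contributes 1 / (k + rank + 1) per document; documents
    ranked well by either retriever float to the top of the fused list.
    """
    scores: dict[str, float] = {}
    for ranked in (dense_ranked, lexical_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

RRF only looks at ranks, never raw scores, which makes it robust when dense similarity and lexical scores live on incomparable scales — exactly the situation with noisy enterprise text.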
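
The pluggable LLM abstraction can be sketched as a ports-and-adapters pattern: the core depends on a narrow port, and each provider is an adapter behind it. The names below (`LLMPort`, `answer`, the stub adapters) are hypothetical; real adapters would call the OpenAI, Gemini, or Ollama clients:

```python
from typing import Protocol


class LLMPort(Protocol):
    """Port the RAG core depends on; providers implement it via adapters."""

    def generate(self, prompt: str) -> str: ...


class OpenAIAdapter:
    def generate(self, prompt: str) -> str:
        # A real adapter would call the OpenAI API here
        return f"[openai] {prompt}"


class OllamaAdapter:
    def generate(self, prompt: str) -> str:
        # A real adapter would call a local Ollama model here
        return f"[ollama] {prompt}"


def answer(question: str, context: list[str], llm: LLMPort) -> str:
    """Core generation step: grounded prompt in, provider-agnostic call out."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return llm.generate(prompt)
```

Swapping providers is then a dependency-injection concern at composition time; `answer` and everything above it never change.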
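
The splitting step of the ingestion pipelines can be illustrated with a basic fixed-size chunker with overlap; actual pipelines would split on semantic boundaries, but this shows the shape of the operation:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks for embedding.

    Overlap keeps a sentence that straddles a boundary retrievable
    from at least one of the neighbouring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```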

Impact

  • One shared RAG core powering several assistants, instead of a separate stack per use case
  • Higher trust in answers and fewer 'where is X?' tickets, based on internal feedback
  • Faster onboarding of new teams: plug in data sources, configure tenant, deploy an assistant
  • Reduced lock-in risk by abstracting LLM providers behind a stable interface

Tech Stack

Core Technologies

  • Python
  • Vector DB (pgvector)
  • Hybrid search (dense + lexical)

AI & ML

  • OpenAI
  • Gemini
  • Ollama
  • Retrieval pipelines
  • Document ETL

Architecture

  • Hexagonal architecture
  • Dependency injection (DI)
  • Observability (logs/metrics/traces)
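
One of the simplest observability signals listed above — retrieval hit-rate — can be sketched as a small counter; in production this would feed a metrics backend rather than live in process memory:

```python
from dataclasses import dataclass


@dataclass
class RetrievalMetrics:
    """Tracks how often retrieval returns at least one candidate document."""
    queries: int = 0
    hits: int = 0

    def record(self, results: list) -> None:
        self.queries += 1
        if results:
            self.hits += 1

    @property
    def hit_rate(self) -> float:
        return self.hits / self.queries if self.queries else 0.0
```

A falling hit-rate is an early warning that ingestion has drifted or a tenant's namespace is under-populated, before users ever report bad answers.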
