Quark

High-performance RAG pipeline

Dual-stream memory architecture with persistent context awareness. Built for deep document analysis and retrieval-augmented generation at scale.

Tech Stack

TypeScript, Python, Unstructured.io, Qdrant, Redis, Mem0, Docker

Overview

Quark is a high-performance RAG (Retrieval-Augmented Generation) system designed for deep document analysis and persistent context awareness. It synchronizes unstructured text and image data extracted from documents, and uses a dual-memory layer (short-term session context plus long-term user memory) to personalize the chat experience.

Architecture Overview

Quark distinguishes itself through a multi-stage pipeline:

Multimodal Ingestion

  • Partitioning: Leveraging Unstructured.io for semantic text decomposition and layout analysis.
  • Extraction: Utilizing pdfplumber for precise image and table coordinate extraction.
  • Sync Layer: A custom orchestration layer that aligns text and visual modalities for comprehensive multimodal embeddings.
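
One way to picture the sync layer is as a page-and-position join: each extracted image or table region is attached to the text chunk on the same page whose vertical span is closest. The sketch below is a minimal illustration in plain Python, not Quark's actual implementation. `TextChunk`, `VisualRegion`, and `align_regions` are hypothetical names, and the inputs are assumed to have already been produced by the partitioning and extraction steps.

```python
from dataclasses import dataclass, field


@dataclass
class VisualRegion:
    """An image or table region (hypothetical shape of extractor output)."""
    page: int
    top: float      # distance from the top of the page, in points
    bottom: float
    kind: str       # "image" or "table"


@dataclass
class TextChunk:
    """A text chunk with its position on the page (hypothetical shape)."""
    page: int
    top: float
    bottom: float
    text: str
    visuals: list = field(default_factory=list)


def vertical_gap(chunk: TextChunk, region: VisualRegion) -> float:
    """Return 0 if the spans overlap vertically, else the gap between them."""
    if region.bottom < chunk.top:
        return chunk.top - region.bottom
    if region.top > chunk.bottom:
        return region.top - chunk.bottom
    return 0.0


def align_regions(chunks, regions):
    """Attach each visual region to the nearest text chunk on the same page."""
    unmatched = []
    for region in regions:
        same_page = [c for c in chunks if c.page == region.page]
        if not same_page:
            unmatched.append(region)  # no text on that page to anchor to
            continue
        nearest = min(same_page, key=lambda c: vertical_gap(c, region))
        nearest.visuals.append(region)
    return unmatched


chunks = [
    TextChunk(page=1, top=50, bottom=120, text="Revenue grew in Q3."),
    TextChunk(page=1, top=400, bottom=460, text="Costs are listed below."),
]
regions = [
    VisualRegion(page=1, top=130, bottom=300, kind="image"),
    VisualRegion(page=1, top=470, bottom=600, kind="table"),
    VisualRegion(page=2, top=10, bottom=90, kind="image"),
]
unmatched = align_regions(chunks, regions)
```

After this runs, the first chunk carries the image, the second carries the table, and the page-2 image is returned as unmatched because no text on that page is available to anchor it. A production sync layer would likely also weigh horizontal position and reading order; this sketch uses vertical distance only.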

Dual-Stream Memory

  • STM (Short-Term Memory): Powered by Redis. Provides sub-millisecond access to session-based context and transient state.
  • LTM (Long-Term Memory): Powered by Mem0. Acts as a persistent intelligence layer that retains user history, evolving preferences, and long-form knowledge over time.
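
The split between the two memory streams can be sketched with in-memory stand-ins: a short-term store whose sessions expire after a TTL, and a long-term store keyed by user that never expires. This is an illustrative sketch only. It does not call Redis or Mem0, and `ShortTermMemory`, `LongTermMemory`, and `build_context` are hypothetical names chosen for the example.

```python
import time


class ShortTermMemory:
    """In-memory stand-in for a Redis-style store with per-session TTL."""

    def __init__(self, ttl_seconds, max_turns=10, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_turns = max_turns
        self.clock = clock
        self._sessions = {}  # session_id -> (expires_at, [messages])

    def append(self, session_id, message):
        turns = self.recent(session_id)  # empty if the session has expired
        turns = (turns + [message])[-self.max_turns:]  # keep recent turns only
        self._sessions[session_id] = (self.clock() + self.ttl, turns)

    def recent(self, session_id):
        expires_at, turns = self._sessions.get(session_id, (0.0, []))
        if self.clock() >= expires_at:
            self._sessions.pop(session_id, None)  # expired: drop the session
            return []
        return list(turns)


class LongTermMemory:
    """In-memory stand-in for a persistent, per-user memory layer."""

    def __init__(self):
        self._facts = {}  # user_id -> list of remembered facts

    def remember(self, user_id, fact):
        facts = self._facts.setdefault(user_id, [])
        if fact not in facts:
            facts.append(fact)

    def recall(self, user_id):
        return list(self._facts.get(user_id, []))


def build_context(stm, ltm, user_id, session_id):
    """Merge durable user facts with the live session into one context."""
    return {
        "user_facts": ltm.recall(user_id),
        "recent_turns": stm.recent(session_id),
    }


# Usage with a fake clock so expiry is deterministic.
now = [0.0]
stm = ShortTermMemory(ttl_seconds=60, clock=lambda: now[0])
ltm = LongTermMemory()

ltm.remember("user-1", "prefers concise answers")
stm.append("sess-1", "user: summarize chapter 2")

fresh = build_context(stm, ltm, "user-1", "sess-1")
now[0] = 61.0  # the session outlives its TTL
stale = build_context(stm, ltm, "user-1", "sess-1")
```

Once the TTL passes, the session turns disappear while the user facts remain, which is the behavior the STM/LTM split is meant to provide: transient state ages out, durable knowledge about the user does not.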

Core Intelligence & Retrieval

  • Embedding & Reranking: Powered by Voyage AI, utilizing advanced rerankers and metadata filtering to maximize retrieval precision.
  • Vector Infrastructure: Qdrant handles high-dimensional vector storage alongside robust relational metadata.
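
The retrieval flow described above can be read as two stages: a metadata-filtered vector search that produces candidates, followed by a reranker that reorders them against the query text. The sketch below illustrates that shape in plain Python under loud assumptions: cosine similarity over toy two-dimensional vectors stands in for the vector store, a simple term-overlap score stands in for a learned reranker, and `search` and `rerank` are hypothetical helpers, not the Voyage AI or Qdrant APIs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query_vector, must_match, top_k):
    """Stage 1: keep points whose metadata matches, rank by cosine similarity."""
    candidates = [
        p for p in points
        if all(p["meta"].get(k) == v for k, v in must_match.items())
    ]
    candidates.sort(key=lambda p: cosine(p["vector"], query_vector), reverse=True)
    return candidates[:top_k]


def rerank(query, candidates, top_n):
    """Stage 2: toy reranker scoring by the share of query terms in the text."""
    terms = set(query.lower().split())

    def score(p):
        words = set(p["text"].lower().split())
        return len(terms & words) / len(terms) if terms else 0.0

    return sorted(candidates, key=score, reverse=True)[:top_n]


points = [
    {"id": 1, "vector": [0.9, 0.1], "text": "quarterly revenue table",
     "meta": {"doc": "report"}},
    {"id": 2, "vector": [0.8, 0.2], "text": "revenue growth by region in q3",
     "meta": {"doc": "report"}},
    {"id": 3, "vector": [1.0, 0.0], "text": "revenue growth in q3",
     "meta": {"doc": "memo"}},
    {"id": 4, "vector": [0.0, 1.0], "text": "office seating plan",
     "meta": {"doc": "report"}},
]

candidates = search(points, query_vector=[1.0, 0.0],
                    must_match={"doc": "report"}, top_k=2)
results = rerank("revenue growth q3", candidates, top_n=1)
```

In this toy data, point 3 is the closest vector but is excluded by the metadata filter, and point 1 outranks point 2 on vector similarity alone. The rerank stage then promotes point 2 because its text covers all three query terms, which is the kind of precision gain a reranker is meant to add on top of vector search.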

Technical Stack

  • Web Framework: ElysiaJS — The high-performance, Bun-native framework for the backend.
  • Identity & DB: Supabase — Unified Auth and PostgreSQL backend.
  • Frontend: React — A minimalist, streaming-responsive interface optimized for real-time AI interactions.
  • Workers: BullMQ + Redis — Persistent workers that offload heavy I/O and compute from the web layer, for scalability by design.
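
The worker pattern in the last item, where heavy jobs are queued and processed away from the web layer, can be sketched with Python's standard library. This is a pattern illustration only: it uses an in-process `queue.Queue` and threads, so unlike a BullMQ queue backed by Redis it does not persist jobs across restarts. `run_workers` and `parse_document` are hypothetical names.

```python
import queue
import threading


def run_workers(jobs, handler, concurrency=2):
    """Process jobs on background threads and collect results by job id."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is None:           # sentinel: no more work for this worker
                pending.task_done()
                return
            try:
                outcome = ("completed", handler(job["data"]))
            except Exception as exc:  # a failed job must not kill the worker
                outcome = ("failed", str(exc))
            with lock:
                results[job["id"]] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)             # one sentinel per worker
    pending.join()
    for t in threads:
        t.join()
    return results


def parse_document(data):
    """Hypothetical heavy job: pretend to parse a document."""
    if not data["path"].endswith(".pdf"):
        raise ValueError("unsupported file type")
    return {"path": data["path"], "pages": data["pages"]}


results = run_workers(
    [
        {"id": "job-1", "data": {"path": "a.pdf", "pages": 12}},
        {"id": "job-2", "data": {"path": "b.txt", "pages": 1}},
    ],
    parse_document,
)
```

The point of the design is isolation: a job that raises is recorded as failed without taking the worker down, and the caller that enqueues work never runs the handler itself.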