Quark
High-performance RAG pipeline
Dual-stream memory architecture with persistent context awareness. Built for deep document analysis and retrieval-augmented generation at scale.
Tech Stack
TypeScript, Python, Unstructured.io, Qdrant, Redis, Mem0, Docker
Overview
A high-performance RAG (Retrieval-Augmented Generation) system designed for deep document analysis and persistent context awareness. This project implements a sophisticated architecture that synchronizes unstructured text and image data, utilizing a dual-memory layer for a truly personalized chat experience.
Architecture Overview
Quark distinguishes itself through a multi-stage pipeline:
Multimodal Ingestion
- Partitioning: Leveraging Unstructured.io for semantic text decomposition and layout analysis.
- Extraction: Utilizing pdfplumber for precise image and table coordinate extraction.
- Sync Layer: A custom orchestration layer that aligns text and visual modalities for comprehensive multimodal embeddings.
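One way to picture the sync layer is as a matching step between partitioned text chunks and extracted image or table regions. The sketch below is an illustration only, not Quark's actual implementation: the `TextChunk`, `VisualRegion`, and `align` names, the page/top/bottom fields, and the `max_gap` threshold are all assumptions. It attaches each visual region to the nearest text chunk on the same page, measured by vertical distance.

```python
from dataclasses import dataclass, field


@dataclass
class TextChunk:
    # Hypothetical shape of a partitioned text element.
    page: int
    top: float      # distance from the top of the page, in points
    bottom: float
    text: str


@dataclass
class VisualRegion:
    # Hypothetical shape of an extracted image or table region.
    page: int
    top: float
    bottom: float
    kind: str       # e.g. "image" or "table"


@dataclass
class AlignedChunk:
    chunk: TextChunk
    visuals: list[VisualRegion] = field(default_factory=list)


def vertical_gap(a_top: float, a_bottom: float, b_top: float, b_bottom: float) -> float:
    """Return 0.0 if the two vertical spans overlap, else the distance between them."""
    return max(0.0, max(a_top, b_top) - min(a_bottom, b_bottom))


def align(chunks: list[TextChunk], regions: list[VisualRegion], max_gap: float = 50.0) -> list[AlignedChunk]:
    """Attach each region to the closest same-page chunk, if it is within max_gap points."""
    aligned = [AlignedChunk(c) for c in chunks]
    for region in regions:
        candidates = [a for a in aligned if a.chunk.page == region.page]
        if not candidates:
            continue

        def gap(a: AlignedChunk) -> float:
            return vertical_gap(a.chunk.top, a.chunk.bottom, region.top, region.bottom)

        nearest = min(candidates, key=gap)
        if gap(nearest) <= max_gap:
            nearest.visuals.append(region)
    return aligned


chunks = [
    TextChunk(page=1, top=100, bottom=200, text="Intro"),
    TextChunk(page=1, top=400, bottom=500, text="Results"),
    TextChunk(page=2, top=100, bottom=200, text="Appendix"),
]
regions = [
    VisualRegion(page=1, top=510, bottom=600, kind="table"),   # just below "Results"
    VisualRegion(page=2, top=600, bottom=700, kind="image"),   # too far from "Appendix"
]
aligned = align(chunks, regions)
```

Each aligned chunk then carries its text together with any nearby visuals, which is the unit a multimodal embedding step would consume. A real layout may need horizontal overlap and reading order as well; this sketch only considers vertical position.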
Dual-Stream Memory
- STM (Short-Term Memory): Powered by Redis. Provides sub-millisecond access to session-based context and transient state.
- LTM (Long-Term Memory): Powered by Mem0. Acts as a persistent intelligence layer that retains user history, evolving preferences, and long-form knowledge over time.
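The split between the two memory streams can be sketched with in-process stand-ins. Nothing below calls Redis or Mem0: `ShortTermMemory` is a hypothetical dictionary-backed substitute for a per-session store with a TTL, `LongTermMemory` is a hypothetical substitute that ranks stored facts by simple word overlap, and `build_context` is an assumed helper name. The point is only to show how expiring session context and persistent user knowledge might be combined per query.

```python
import time


class ShortTermMemory:
    """In-process stand-in for a session store with a TTL (Redis in the real stack)."""

    def __init__(self, ttl_seconds: float = 1800.0, max_turns: int = 20, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max_turns = max_turns
        self._clock = clock
        self._sessions: dict[str, tuple[float, list[str]]] = {}

    def recent(self, session_id: str) -> list[str]:
        entry = self._sessions.get(session_id)
        if entry is None or entry[0] <= self._clock():
            # Expired or unknown session: drop it and return no context.
            self._sessions.pop(session_id, None)
            return []
        return list(entry[1])

    def append(self, session_id: str, turn: str) -> None:
        # Keep only the newest max_turns turns and refresh the expiry.
        turns = (self.recent(session_id) + [turn])[-self._max_turns:]
        self._sessions[session_id] = (self._clock() + self._ttl, turns)


class LongTermMemory:
    """In-process stand-in for a persistent memory layer (Mem0 in the real stack)."""

    def __init__(self):
        self._facts: dict[str, list[str]] = {}

    def add(self, user_id: str, fact: str) -> None:
        facts = self._facts.setdefault(user_id, [])
        if fact not in facts:
            facts.append(fact)

    def search(self, user_id: str, query: str, limit: int = 3) -> list[str]:
        # Naive relevance: count of words shared between the query and each fact.
        words = set(query.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self._facts.get(user_id, [])]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [f for _, f in scored[:limit]]


def build_context(stm: ShortTermMemory, ltm: LongTermMemory, user_id: str, session_id: str, query: str) -> dict:
    """Merge persistent user knowledge with the current session's recent turns."""
    return {"long_term": ltm.search(user_id, query), "recent_turns": stm.recent(session_id)}


now = [0.0]  # controllable clock so expiry can be demonstrated deterministically
stm = ShortTermMemory(ttl_seconds=10, max_turns=2, clock=lambda: now[0])
ltm = LongTermMemory()
ltm.add("u1", "prefers concise answers")
ltm.add("u1", "works on pdf table extraction")
for turn in ["a", "b", "c"]:
    stm.append("s1", turn)
ctx = build_context(stm, ltm, "u1", "s1", "how do I extract a table from a pdf")
```

Keeping the short-term store bounded and expiring keeps prompt context small, while the long-term store survives across sessions.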
Core Intelligence & Retrieval
- Embedding & Reranking: Powered by Voyage AI, utilizing advanced rerankers and metadata filtering to maximize retrieval precision.
- Vector Infrastructure: Qdrant handles high-dimensional vector storage alongside robust relational metadata.
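The retrieval flow described above (metadata filter, vector search, then rerank) can be illustrated without any external service. The sketch below does not use the Voyage AI or Qdrant clients: `Point`, `search`, and `rerank` are hypothetical names, cosine similarity is computed in plain Python, and the reranker is a toy word-overlap scorer standing in for a learned model.

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    # Hypothetical record: an embedding plus its metadata payload.
    id: str
    vector: list[float]
    payload: dict


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points: list[Point], query_vector: list[float], must: dict, top_k: int) -> list[Point]:
    """Filter on exact-match metadata first, then rank the survivors by cosine similarity."""
    matches = [p for p in points if all(p.payload.get(k) == v for k, v in must.items())]
    matches.sort(key=lambda p: cosine(query_vector, p.vector), reverse=True)
    return matches[:top_k]


def rerank(candidates: list[Point], score, top_n: int) -> list[Point]:
    """Second-stage ordering with a (usually more expensive) relevance scorer."""
    return sorted(candidates, key=score, reverse=True)[:top_n]


points = [
    Point("a", [1.0, 0.0], {"doc": "x", "text": "table of revenue"}),
    Point("b", [0.9, 0.1], {"doc": "x", "text": "revenue by quarter table"}),
    Point("c", [1.0, 0.0], {"doc": "y", "text": "revenue by quarter"}),
    Point("d", [0.0, 1.0], {"doc": "x", "text": "unrelated appendix"}),
]
query_text = "revenue by quarter"
query_words = set(query_text.split())

# Stage 1: restrict to document "x", keep the two nearest vectors.
candidates = search(points, [1.0, 0.0], must={"doc": "x"}, top_k=2)

# Stage 2: toy reranker that counts words shared with the query.
best = rerank(candidates, lambda p: len(query_words & set(p.payload["text"].split())), top_n=1)
```

Filtering before the similarity ranking keeps out-of-scope documents from ever reaching the reranker, so the second stage spends its budget only on plausible candidates.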
Technical Stack
- Web Framework: ElysiaJS, a high-performance, Bun-native framework for the backend.
- Identity & DB: Supabase, providing unified Auth and a PostgreSQL backend.
- Frontend: React, with a minimalist, streaming-responsive interface optimized for real-time AI interactions.
- Workers: BullMQ + Redis, running persistent workers that take on heavy I/O and compute, for scalability by design.
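The worker pattern in the last item can be sketched as a queue drained by a fixed pool of workers. This is a stand-in using only Python's standard library, not BullMQ: jobs here live in process memory rather than in Redis, and `run_workers` and `parse` are hypothetical names chosen for illustration.

```python
import queue
import threading


def run_workers(jobs, handler, concurrency: int = 2) -> dict:
    """Process (job_id, payload) pairs on a pool of threads; return outcomes keyed by job id."""
    pending: queue.Queue = queue.Queue()
    results: dict = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            job = pending.get()
            if job is None:  # sentinel: shut this worker down
                pending.task_done()
                return
            job_id, payload = job
            try:
                outcome = ("completed", handler(payload))
            except Exception as exc:  # a failing job must not kill the worker
                outcome = ("failed", str(exc))
            with lock:
                results[job_id] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()
    for t in threads:
        t.join()
    return results


def parse(path: str) -> int:
    # Hypothetical heavy task; here it only validates the extension.
    if not path.endswith(".pdf"):
        raise ValueError("unsupported file")
    return len(path)


results = run_workers([("j1", "a.pdf"), ("j2", "b.txt")], parse)
```

The web process only enqueues work and reads outcomes, so slow parsing or embedding jobs do not block request handling. A Redis-backed queue adds what this sketch lacks: jobs that survive restarts and workers that run on separate machines.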