Cog-Bot: Cognitive Memory System
A reactive Spring Boot application that gives AI agents persistent, evolving memory through vector embeddings, semantic search, and LLM-driven knowledge distillation.
Overview
Cog-Bot is a cognitive memory system built on Spring Boot 3.4 WebFlux that enables AI agents to maintain persistent, evolving memory across conversations. It uses pgvector for semantic similarity search, Ollama for local LLM inference, and a multi-tier memory lifecycle that automatically promotes, demotes, and curates knowledge over time.
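To make the retrieval step concrete, here is a minimal in-memory sketch of ranking memories by cosine similarity to a query embedding. In the real system pgvector would do this ranking in SQL (for example with its `<=>` cosine-distance operator); the `Memory` record, the `topK` method, and the two-dimensional vectors below are illustrative assumptions, not Cog-Bot's actual types.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SemanticSearchSketch {

    // Hypothetical memory record: an id plus its embedding vector.
    record Memory(String id, double[] embedding) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns up to k memories, most similar to the query first.
    static List<Memory> topK(List<Memory> memories, double[] query, int k) {
        List<Memory> sorted = new ArrayList<>(memories);
        sorted.sort(Comparator
                .comparingDouble((Memory m) -> cosine(m.embedding(), query))
                .reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        List<Memory> memories = List.of(
                new Memory("coffee", new double[] {1.0, 0.0}),
                new Memory("tea", new double[] {0.9, 0.1}),
                new Memory("taxes", new double[] {0.0, 1.0}));
        // Prints "coffee", then "tea".
        for (Memory m : topK(memories, new double[] {1.0, 0.0}, 2)) {
            System.out.println(m.id());
        }
    }
}
```

A full scan like this is only practical for small collections; delegating the ranking to the database is what lets the search scale with an index.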
The system features cognitive spaces: semantic clusters of related memories, each with its own centroid embedding, health score, and tier lifecycle. A background heartbeat service handles centroid refresh, tier migration, health checks, and a weekly curation sweep that includes duplicate detection via cosine-similarity self-joins.
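Two of the heartbeat jobs can be sketched in a few lines. The example below assumes a centroid is the element-wise mean of a space's member embeddings, and it imitates a self-join in memory by comparing every unordered pair once (the `i < j` loop plays the role of an `a.id < b.id` join condition). The class name, method names, and the 0.95 threshold are hypothetical; the source does not state how Cog-Bot defines either operation.

```java
import java.util.ArrayList;
import java.util.List;

final class SpaceMaintenanceSketch {

    // Assumed definition: the centroid is the element-wise mean of all member embeddings.
    static double[] centroid(List<double[]> embeddings) {
        if (embeddings.isEmpty()) {
            throw new IllegalArgumentException("space has no memories");
        }
        int dim = embeddings.get(0).length;
        double[] mean = new double[dim];
        for (double[] e : embeddings) {
            if (e.length != dim) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            for (int i = 0; i < dim; i++) {
                mean[i] += e[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= embeddings.size();
        }
        return mean;
    }

    // Cosine similarity; returns 0 if either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // In-memory analogue of a self-join: index pairs (i, j), i < j, with similarity >= threshold.
    static List<int[]> duplicatePairs(List<double[]> embeddings, double threshold) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < embeddings.size(); i++) {
            for (int j = i + 1; j < embeddings.size(); j++) {
                if (cosine(embeddings.get(i), embeddings.get(j)) >= threshold) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<double[]> space = List.of(
                new double[] {1.0, 0.0},
                new double[] {0.99, 0.01},
                new double[] {0.0, 1.0});
        double[] c = centroid(space);
        System.out.println("centroid dims: " + c.length);
        // Only the first two vectors are near-duplicates at a 0.95 threshold.
        for (int[] p : duplicatePairs(space, 0.95)) {
            System.out.println("duplicate candidates: " + p[0] + " and " + p[1]);
        }
    }
}
```

The pairwise comparison is quadratic in the number of memories, which is one reason to run it as an infrequent batch sweep rather than on every write.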
Tech Stack
- Spring Boot 3.4 with WebFlux (Mono/Flux)
- Spring AI
- PostgreSQL with pgvector
- Ollama
Deep Dive
The architecture uses two-tier worker dispatch: external workers register at runtime via REST and receive tasks over Server-Sent Events (SSE), while an in-process worker provides a Spring AI fallback. Memory construction follows a multi-stage pipeline (distill → embed → save → link → evolve), with every stage executed reactively through Mono/Flux chains.
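The shape of the memory construction pipeline can be illustrated without any framework. The sketch below chains five asynchronous stages with the JDK's `CompletableFuture.thenCompose`, used here as a stand-in for `Mono.flatMap` so the example runs on the standard library alone. The `Draft` record and the `stage` helper are hypothetical; each stage only records its own name where a real stage would call the LLM, the embedding model, or the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class MemoryPipelineSketch {

    // Hypothetical work item: the raw text plus the names of the stages completed so far.
    record Draft(String text, List<String> stagesDone) {
        Draft after(String stage) {
            List<String> next = new ArrayList<>(stagesDone);
            next.add(stage);
            return new Draft(text, List.copyOf(next));
        }
    }

    // Builds a placeholder asynchronous stage that just marks itself as done.
    static Function<Draft, CompletableFuture<Draft>> stage(String name) {
        return draft -> CompletableFuture.supplyAsync(() -> draft.after(name));
    }

    // Each stage starts only after the previous one has completed.
    static CompletableFuture<Draft> construct(String rawText) {
        return CompletableFuture.completedFuture(new Draft(rawText, List.of()))
                .thenCompose(stage("distill"))
                .thenCompose(stage("embed"))
                .thenCompose(stage("save"))
                .thenCompose(stage("link"))
                .thenCompose(stage("evolve"));
    }

    public static void main(String[] args) {
        // join() blocks only for this demo; a reactive caller would keep composing instead.
        Draft result = construct("user prefers espresso").join();
        // Prints [distill, embed, save, link, evolve]
        System.out.println(result.stagesDone());
    }
}
```

Because every stage returns a future rather than a value, a failure in any stage short-circuits the rest of the chain, which is the same property a Mono chain provides.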
Cog-Bot was built as a platform for experimenting with agentic engineering patterns: an agent daemon with goal cycles, voice and persona configuration via cognitive spaces, and MCP-compatible tool integration.
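The agent daemon's goal cycle (GOAL → PLAN → EXECUTE → REFLECT → LEARN) maps naturally onto a small state machine. The sketch below is one possible encoding; the `Phase` enum, the `run` method, and the choice to wrap from LEARN back to GOAL are assumptions made for illustration, since the source only names the phases and their order.

```java
import java.util.ArrayList;
import java.util.List;

final class GoalCycleSketch {

    enum Phase {
        GOAL, PLAN, EXECUTE, REFLECT, LEARN;

        // Assumption: LEARN wraps around to GOAL so the daemon can begin the next cycle.
        Phase next() {
            Phase[] all = values();
            return all[(ordinal() + 1) % all.length];
        }
    }

    // Runs the given number of full cycles and returns the phases visited, in order.
    static List<Phase> run(int cycles) {
        List<Phase> visited = new ArrayList<>();
        Phase phase = Phase.GOAL;
        int steps = cycles * Phase.values().length;
        for (int i = 0; i < steps; i++) {
            visited.add(phase);
            phase = phase.next();
        }
        return visited;
    }

    public static void main(String[] args) {
        // Prints [GOAL, PLAN, EXECUTE, REFLECT, LEARN]
        System.out.println(run(1));
    }
}
```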
Key Outcomes
- Reactive architecture using Spring WebFlux with Mono/Flux throughout the entire stack
- Vector similarity search with pgvector for semantic memory retrieval
- Multi-tier memory lifecycle (HOT → WARM → COLD → DORMANT) with automatic curation
- Cognitive spaces with centroid embeddings, health scoring, and organic growth
- Two-tier worker dispatch: external workers via SSE + in-process Spring AI fallback
- Agent daemon with goal-cycle execution (GOAL → PLAN → EXECUTE → REFLECT → LEARN)
- Duplicate detection and batch merge using cosine similarity self-joins
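As an illustration of the tier lifecycle listed above, the following sketch models HOT → WARM → COLD → DORMANT as an ordered enum with one-step promotion and demotion. The migration policy, including the 1-day and 30-day thresholds and the `migrate` method itself, is purely hypothetical; the source does not say which signals drive promotion or demotion.

```java
final class TierLifecycleSketch {

    // Declared from most to least active, so ordinal order matches the lifecycle.
    enum Tier {
        HOT, WARM, COLD, DORMANT;

        // Moves one step toward DORMANT; DORMANT stays where it is.
        Tier demote() {
            return this == DORMANT ? DORMANT : values()[ordinal() + 1];
        }

        // Moves one step toward HOT; HOT stays where it is.
        Tier promote() {
            return this == HOT ? HOT : values()[ordinal() - 1];
        }
    }

    // Hypothetical policy: promote on recent access, demote after a long idle period.
    static Tier migrate(Tier current, long daysSinceLastAccess) {
        if (daysSinceLastAccess <= 1) {
            return current.promote();
        }
        if (daysSinceLastAccess >= 30) {
            return current.demote();
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(migrate(Tier.HOT, 45));  // prints WARM
        System.out.println(migrate(Tier.COLD, 0));  // prints WARM
        System.out.println(migrate(Tier.WARM, 10)); // prints WARM
    }
}
```

Moving one tier per heartbeat, rather than jumping straight to the target tier, keeps a single unusual access from undoing a long demotion history; whether Cog-Bot does this is not stated in the source.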