Brahe: Distributed Search Platform
Enterprise distributed search platform with dynamic resource allocation, MapReduce data pipelines, and infrastructure optimizations with projected savings of $500K+.
Overview
Brahe is a distributed search platform built on ZooKeeper, HBase, and Apache Solr, designed for dynamic resource allocation across a large-scale healthcare data ecosystem. Every other application in the organization depended on the platform, so it carried a 99% uptime guarantee.
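To make the 99% uptime target concrete, the sketch below converts an availability figure into an allowed-downtime budget. Only the 99% value comes from the description above. The class name `UptimeBudget` and the measurement windows (a non-leap year and a 30-day month) are assumptions chosen for illustration, not how the guarantee was actually measured.

```java
// Sketch: converts an availability target into an allowed-downtime budget.
// The 99% figure comes from the text; the measurement windows are assumptions.
final class UptimeBudget {
    private UptimeBudget() {}

    /** Hours of downtime permitted in a window of the given length. */
    static double allowedDowntimeHours(double availability, double windowHours) {
        if (availability < 0.0 || availability > 1.0) {
            throw new IllegalArgumentException("availability must be in [0, 1]");
        }
        return (1.0 - availability) * windowHours;
    }

    public static void main(String[] args) {
        double target = 0.99;          // from the text
        double yearHours = 365 * 24;   // assumed window: one non-leap year
        double monthHours = 30 * 24;   // assumed window: a 30-day month
        System.out.printf("Per year:  %.1f hours%n", allowedDowntimeHours(target, yearHours));
        System.out.printf("Per month: %.1f hours%n", allowedDowntimeHours(target, monthHours));
    }
}
```

Under these assumed windows, a 99% target leaves roughly 87.6 hours of downtime per year, or about 7.2 hours per 30-day month, shared by every dependent application.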
Built and maintained Java services and MapReduce data processing pipelines that powered all search services for a NoSQL datastore. The pipelines generated fresh outputs on 2-, 4-, and 6-hour refresh cycles to keep search data current across the platform.
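The tiered refresh cadence can be sketched as a small planner that decides which pipelines are due at a given moment. This is an illustrative assumption, not Brahe's actual scheduler: `RefreshPlanner`, the `Tier` names, and the rule that a tier is due once a full interval has elapsed since its last run are all hypothetical. Only the 2-, 4-, and 6-hour intervals come from the description above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch only: tier names and the "due" rule are assumptions, not the real scheduler.
final class RefreshPlanner {
    /** The three refresh cadences mentioned in the text. */
    enum Tier {
        TWO_HOUR(Duration.ofHours(2)),
        FOUR_HOUR(Duration.ofHours(4)),
        SIX_HOUR(Duration.ofHours(6));

        final Duration interval;

        Tier(Duration interval) {
            this.interval = interval;
        }
    }

    private RefreshPlanner() {}

    /** A tier is due once a full interval has elapsed since its last run. */
    static boolean isDue(Tier tier, Instant lastRun, Instant now) {
        return !now.isBefore(lastRun.plus(tier.interval));
    }

    /** Returns the tiers whose pipelines should be launched at {@code now}. */
    static List<Tier> dueTiers(Instant lastRun, Instant now) {
        List<Tier> due = new ArrayList<>();
        for (Tier tier : Tier.values()) {
            if (isDue(tier, lastRun, now)) {
                due.add(tier);
            }
        }
        return due;
    }
}
```

For simplicity this sketch shares one `lastRun` timestamp across tiers. A production scheduler would more likely track the last successful run per pipeline so that a failed job is retried without delaying the others.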
Tech Stack
Java, MapReduce, ZooKeeper, HBase, Apache Solr
Deep Dive
Designed and implemented major feature enhancements focused on infrastructure efficiency. The optimizations reduced the ongoing operational cost of the search infrastructure by more than 30%, with projected savings of $500K+ over 3 years, before accounting for expected data growth.
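The relationship between the cost reduction and the multi-year projection can be sketched as simple arithmetic. Only the 30% reduction, the $500K total, and the 3-year horizon come from the description above, and both dollar and percentage figures are stated there as lower bounds. The implied baseline, the class name `SavingsEstimate`, and the idea that cost scales with a fixed yearly growth rate are assumptions made purely for illustration.

```java
// Sketch: back-of-the-envelope savings arithmetic. Only the 30% reduction, the
// $500K total, and the 3-year horizon come from the text; the rest is assumed.
final class SavingsEstimate {
    private SavingsEstimate() {}

    /** Annual baseline cost implied by a total saving, a reduction rate, and a horizon. */
    static double impliedAnnualBaseline(double totalSavings, double reduction, int years) {
        return totalSavings / (reduction * years);
    }

    /** Total savings when the baseline cost grows by a fixed rate each year (assumed model). */
    static double savingsWithGrowth(double annualBaseline, double reduction,
                                    int years, double growthRate) {
        double total = 0.0;
        double cost = annualBaseline;
        for (int year = 0; year < years; year++) {
            total += cost * reduction;   // savings realized this year
            cost *= 1.0 + growthRate;    // cost base for the following year
        }
        return total;
    }
}
```

Taking both figures at their stated floor, `impliedAnnualBaseline(500_000, 0.30, 3)` gives a pre-optimization cost of roughly $556K per year. With a growth rate of zero, `savingsWithGrowth` reproduces the $500K total, and any positive growth rate yields a larger figure, which is why the projection is best read as a lower bound.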
Managed a team of 10–14 staff engineers with full agile ownership: facilitated bi-weekly planning, sprint execution, task assignment, and retrospectives.
Key Outcomes
- Maintained 99% uptime on search systems serving as a dependency for all other applications
- Built MapReduce pipelines with 2/4/6-hour refresh cycles powering platform-wide search
- Infrastructure optimizations reduced operational costs by more than 30%, with projected savings of $500K+ over 3 years
- Led agile team of 10–14 staff engineers through bi-weekly sprint cycles
- Delivered dynamic resource allocation for distributed search across ZooKeeper-coordinated clusters