Enterprise GenAI Retrieval Platform

Built to migrate a large multi-tenant search ecosystem from a legacy index to OpenSearch, support vectorized document retrieval at scale, and improve end-user search performance in a high-volume enterprise environment.

What I led

Owned architecture and implementation end-to-end on AWS, including roadmap/timeline definition, indexing strategy, ingest/search pipeline design, performance evaluation, and cross-team migration rollout.

Stack

  • AWS OpenSearch
  • Per-tenant multi-index strategy
  • AWS API Gateway + Lambda service layer
  • Ingestion Lambdas, search Lambdas, and embedding Lambdas
  • AWS Bedrock Titan Embeddings V2
  • Amazon S3
  • AWS event-driven ingestion triggers and processing queues
  • Vector embeddings for document retrieval
  • OpenSearch dashboards + Slack alerting
  • Honeycomb + OpenTelemetry traces across Lambda flows
  • Python

Highlights

  • Designed and implemented a multi-tenant retrieval architecture on AWS for hundreds of client tenants.
  • Led migration planning and execution from a legacy index to OpenSearch.
  • Implemented event-driven ingestion pipelines (including S3-triggered flows) that collect, clean, chunk, embed, and index new content.
  • Designed chunking approaches for mixed document structures (including page/table-aware handling where needed).
  • Implemented an API-driven search flow in which user queries are embedded and matched against tenant-scoped OpenSearch indexes, with filtering, candidate retrieval, and tuned query passes for performance.
  • Evaluated OpenSearch Serverless vs provisioned OpenSearch architecture and selected provisioned deployment for required multi-tenant flexibility and operational control.
  • Built lifecycle operations for update/delete/batch update and index migration/re-index flows for shard and scale management.
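
As a rough illustration of the chunking and tenant-scoped search steps in the highlights above, here is a minimal Python sketch. It is not the production code: the index naming scheme (`docs-<tenant>`), the vector field name (`embedding`), the filter field, and the word-window sizes are all assumptions made for the example, and the word-window chunker is a simple stand-in for the structure-aware chunking described above. The query body follows the OpenSearch k-NN query shape with an optional filter.

```python
from __future__ import annotations

from typing import Any, Optional


def tenant_index(tenant_id: str) -> str:
    """Map a tenant to its own index (hypothetical naming scheme)."""
    return f"docs-{tenant_id.lower()}"


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows before embedding."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def build_knn_query(
    vector: list[float],
    k: int = 10,
    filters: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Build a k-NN search body; 'embedding' is an assumed field name."""
    knn: dict[str, Any] = {"vector": vector, "k": k}
    if filters:
        # Term filters narrow the candidate set inside the tenant's index.
        knn["filter"] = {
            "bool": {"filter": [{"term": {f: v}} for f, v in filters.items()]}
        }
    return {"size": k, "query": {"knn": {"embedding": knn}}}


if __name__ == "__main__":
    body = build_knn_query([0.1, 0.2, 0.3], k=5, filters={"doc_type": "policy"})
    print(tenant_index("Acme"), body)
```

In a setup like this, the resulting body would typically be passed to a client search call against the tenant's index (for example `client.search(index=tenant_index(tenant_id), body=body)` with opensearch-py), with the query vector coming from the embedding step.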

Outcomes

  • Scaled multi-tenant retrieval workflows across 100+ client contexts and high document volumes.
  • Indexed roughly 100M documents across tenant environments.
  • Supported thousands of user searches per day.
  • Delivered sub-second search performance in key workflows after migration and query tuning.
  • Improved search latency materially versus the legacy indexing approach.
  • Maintained high availability for production operations with incident visibility through dashboarding and trace-based alerting.