Journey & Growth
Deep technical impact across AI research and engineering.
Professional Experience

Software Engineer
Diversified Automation
Mar 2025 – Present
Santa Clara, CA, USA
- Multi-tenant search infrastructure: Designed multi-tenant vector search service (Qdrant) with namespace-level isolation across 100+ independent environments; collection filters enforced strict data boundaries, reducing memory overhead by 80% vs. a per-tenant database approach.
- Hybrid search service: Built search service combining inverted-index keyword matching with semantic vector retrieval to handle exact alphanumeric codes and natural language in a unified pipeline; improved Recall@5 by up to 20% over a vector-only architecture with no latency regression.
- Low-latency query routing layer: Engineered intent-classification service that routes requests before triggering expensive downstream calls; fast-path circuit breaker for direct-lookup queries reduced p99 retrieval latency from 150 ms to <5 ms (30x), eliminating vector search overhead for targeted file queries.
- Output validation service: Built post-generation validation microservice that fuzzy-matches LLM output against an indexed SQL tag database before rendering; achieved >99% redaction accuracy of fabricated hardware identifiers.
- Deployment infrastructure: Owned end-to-end CI/CD pipeline (GitHub Actions + Docker + AWS) for AI microservices; automated testing, container builds, and blue-green deployments, enabling zero-downtime releases.

Data Scientist
BMR Infotek
Aug 2024 – Mar 2025
Dublin, CA, USA
- Large-scale text processing pipeline: Built batch embedding pipeline processing 1M+ customer records using sentence transformer models; outputs fed a segmentation service enabling precision behavioral targeting that improved marketing ROI by 20%.
- Distributed data ingestion service: Designed Apache Airflow DAG-based ETL system on AWS EC2 ingesting and transforming 3.5M+ records/month into Redshift; retry orchestration and task-level fault isolation reduced pipeline processing latency by 35%.
- Model serving & versioning platform: Built orchestration service connecting forecasting model outputs to downstream consumers; CI/CD pipelines automated model versioning, artifact promotion, and rollback, enabling zero-downtime model updates in production.

Data Scientist · Graduate Research Assistant
Kelley School of Business, Indiana University
Dec 2022 – May 2024
Bloomington, IN, USA
- Large-scale NLP & classification: Applied fine-tuned BERT models to a 100M+ row dataset for political campaign and sentiment classification, achieving 91% test precision.
- LLM-based analytics: Implemented LLM pipeline to analyze customer satisfaction and brand equity signals for large US companies; improved binary classification predictive accuracy by 15% and surfaced quantifiable financial market impact.
- Behavioral modeling: Developed and deployed a Cross-Classified Multilevel Model to predict user performance from behavioral data; uncovered key interaction patterns that improved decision-making efficiency and boosted system performance by 35%.
- Led research projects on television advertising, social media, and political marketing under Prof. Beth Fossen and Prof. Lopo Rego.

Data Scientist Intern
Twin Cities Innovation Alliance
Sep 2023 – Dec 2023
Minneapolis, MN, USA
- A/B testing infrastructure: Built traffic-routing framework across application variants; SQL-based conversion analysis drove a 17% lift in user engagement.
- Recommendation engine: Engineered collaborative + content-based filtering backend, optimizing preference retrieval queries to achieve a 20% engagement increase.
- API performance: Developed Node.js RESTful APIs connecting frontends to ML microservices, cutting API response latency by 40% and driving 15% growth in daily active usage.

Data Engineer
Moonplexus Pvt. Ltd.
May 2021 – Aug 2022
Pune, MH, India
- Healthcare platform: Designed microservices web platform (Java, Django, AngularJS, SQL) automating health information management; cut query execution time by 40% through relational DB optimization.
- Cloud infrastructure: Migrated and maintained AWS EC2/RDS infrastructure, building fault-tolerant ETL pipelines with high availability for large-scale medical data storage.
- Deep learning deployment: Deployed TensorFlow/CNN diagnostic models via Docker with FastAPI endpoints, reducing inference latency by 30% and improving system uptime.

Machine Learning Intern
Indian Institute of Technology (IIT)
Oct 2020 – Apr 2021
Guwahati, AS, India
- Algorithm optimization: Researched and developed incremental/decremental versions of DBSCAN and MBSCAN algorithms in Python and C++ for distance-based outlier mining.
- Clustering performance: Improved computational speed by 30% and precision by 20% in clustering algorithms; tested optimizations on a fintech application with 1M+ users for customer segmentation and portfolio optimization.
Education

Master of Science, Data Science
Indiana University Bloomington
Aug 2022 – May 2024
Key Coursework: Applied Machine Learning, Elements of AI, Probability & Statistics, Big Data, Data Warehousing & Mining, Business Intelligence.

Bachelor of Technology, Computer Science & Engineering
MIT World Peace University
Jul 2018 – Jun 2022
Key Coursework: Data Warehousing & Mining, Business Intelligence, Big Data Analysis, Financial Econometrics, Design & Analysis of Algorithms, Database Management.