Experience | Aditya Mhaske

Diversified Automation

Mar 2025 – Present 📍 Santa Clara, CA, USA

Multi-tenant search infrastructure: Designed multi-tenant vector search service (Qdrant) with namespace-level isolation across 100+ independent environments; collection filters enforced strict data boundaries, reducing memory overhead by 80% versus a per-tenant database approach.
Hybrid search service: Built search service combining inverted-index keyword matching with semantic vector retrieval to handle exact alphanumeric codes and natural language in a unified pipeline; improved Recall@5 by up to 20% over vector-only architecture with no latency regression.
Low-latency query routing layer: Engineered intent-classification service that routes requests before triggering expensive downstream calls; fast-path circuit breaker for direct-lookup queries reduced p99 retrieval latency from 150ms to <5ms (30x), eliminating vector search overhead for targeted file queries.
Output validation service: Built post-generation validation microservice that fuzzy-matches LLM output against an indexed SQL tag database before rendering; achieved >99% accuracy in redacting fabricated hardware identifiers.
Deployment infrastructure: Owned end-to-end CI/CD pipeline (GitHub Actions + Docker + AWS) for AI microservices; automated testing, container builds, and blue-green deployments, enabling zero-downtime releases.

Python, Java, PLC Programming, Pandas, NumPy

BMR Infotek

Aug 2024 – Mar 2025 📍 Dublin, CA, USA

Large-scale text processing pipeline: Built batch embedding pipeline processing 1M+ customer records using sentence transformer models; outputs fed a segmentation service enabling precision behavioral targeting that improved marketing ROI by 20%.
Distributed data ingestion service: Designed Apache Airflow DAG-based ETL system on AWS EC2 ingesting and transforming 3.5M+ records/month into Redshift; retry orchestration and task-level fault isolation reduced pipeline processing latency by 35%.
Model serving & versioning platform: Built orchestration service connecting forecasting model outputs to downstream consumers; CI/CD pipelines automated model versioning, artifact promotion, and rollback, enabling zero-downtime model updates in production.

Machine Learning, Power BI, SQL, Deep Learning, RFM Modeling

Kelley School of Business, Indiana University

Dec 2022 – May 2024 📍 Bloomington, IN, USA

Conducted data analysis using NLP techniques to classify and predict outcomes, extracting topics, sentiment, and named entities from large text corpora.
Led research projects on television advertising, social media, and political marketing under Prof. Beth Fossen and Prof. Lopo Rego.

NLP, Seaborn, Feature Engineering, Machine Learning, Excel

Twin Cities Innovation Alliance

Sep 2023 – Dec 2023 📍 Minneapolis, MN, USA

A/B testing infrastructure: Built traffic-routing framework across application variants; SQL-based conversion analysis drove a 17% lift in user engagement.
Recommendation engine: Engineered collaborative + content-based filtering backend, optimizing preference retrieval queries to achieve a 20% increase in engagement.
API performance: Developed Node.js RESTful APIs connecting frontends to ML microservices, cutting API response latency by 40% and driving 15% growth in daily active usage.

Statistical Modeling, A/B Testing, SQL, Recommendation Systems

Moonplexus Pvt. Ltd.

May 2021 – Aug 2022 📍 Pune, MH, India

Healthcare platform: Designed microservices web platform (Java, Django, AngularJS, SQL) automating health information management; reduced query execution time by 40% through relational database optimization.
Cloud infrastructure: Migrated and maintained AWS EC2/RDS infrastructure, building fault-tolerant ETL pipelines with high availability for large-scale medical data storage.
Deep learning deployment: Deployed TensorFlow/CNN diagnostic models via Docker with FastAPI endpoints, reducing inference latency by 30% and improving system uptime.

TensorFlow, AWS EC2/RDS, CNN, Node.js, ETL

Indian Institute of Technology (IIT)

Oct 2020 – Apr 2021 📍 Guwahati, AS, India

Researched incremental/decremental versions of DBSCAN and MBSCAN algorithms for distance-based outlier mining.
Used Python and C++ to improve the clustering algorithms' computational speed by 30% and precision by 20%; tested on a fintech application with over 1 million users for customer segmentation and portfolio optimization.

Python, C++, Clustering, Data Analysis, Machine Learning
