Career Journey

Experience & Education

2+ years spanning industrial automation, academic research, data engineering, and ML deployment.

Professional Experience

Software Engineer

Diversified Automation
Mar 2025 – Present 📍 Santa Clara, CA, USA
  • Multi-tenant search infrastructure: Designed a multi-tenant vector search service (Qdrant) with namespace-level isolation across 100+ independent environments; collection filters enforced strict data boundaries, reducing memory overhead by 80% compared with a per-tenant database approach.
  • Hybrid search service: Built search service combining inverted-index keyword matching with semantic vector retrieval to handle exact alphanumeric codes and natural language in a unified pipeline; improved Recall@5 by up to 20% over vector-only architecture with no latency regression.
  • Low-latency query routing layer: Engineered an intent-classification service that routes requests before triggering expensive downstream calls; a fast-path circuit breaker for direct-lookup queries reduced p99 retrieval latency from 150 ms to <5 ms (30x), eliminating vector search overhead for targeted file queries.
  • Output validation service: Built a post-generation validation microservice that fuzzy-matches LLM output against an indexed SQL tag database before rendering; redacted fabricated hardware identifiers with >99% accuracy.
  • Deployment infrastructure: Owned the end-to-end CI/CD pipeline (GitHub Actions + Docker + AWS) for AI microservices; automated testing, container builds, and blue-green deployments, enabling zero-downtime releases.
Python, Java, PLC Programming, Pandas, NumPy

Data Scientist

BMR Infotek
Aug 2024 – Mar 2025 📍 Dublin, CA, USA
  • Large-scale text processing pipeline: Built batch embedding pipeline processing 1M+ customer records using sentence transformer models; outputs fed a segmentation service enabling precision behavioral targeting that improved marketing ROI by 20%.
  • Distributed data ingestion service: Designed Apache Airflow DAG-based ETL system on AWS EC2 ingesting and transforming 3.5M+ records/month into Redshift; retry orchestration and task-level fault isolation reduced pipeline processing latency by 35%.
  • Model serving & versioning platform: Built orchestration service connecting forecasting model outputs to downstream consumers; CI/CD pipelines automated model versioning, artifact promotion, and rollback, enabling zero-downtime model updates in production.
Machine Learning, Power BI, SQL, Deep Learning, RFM Modeling

Data Scientist · Graduate Research Assistant

Kelley School of Business, Indiana University
Dec 2022 – May 2024 📍 Bloomington, IN, USA
  • Conducted data analysis using NLP techniques to classify and predict outcomes, extracting topics, sentiment, and named entities from large text corpora.
  • Led research projects on television advertising, social media, and political marketing under Prof. Beth Fossen and Prof. Lopo Rego.
NLP, Seaborn, Feature Engineering, Machine Learning, Excel

Data Scientist Intern

Twin Cities Innovation Alliance
Sep 2023 – Dec 2023 📍 Minneapolis, MN, USA
  • A/B testing infrastructure: Built traffic-routing framework across application variants; SQL-based conversion analysis drove a 17% lift in user engagement.
  • Recommendation engine: Engineered a collaborative + content-based filtering backend, optimizing preference retrieval queries to achieve a 20% increase in engagement.
  • API performance: Developed Node.js RESTful APIs connecting frontends to ML microservices, cutting API response latency by 40% and driving 15% growth in daily active usage.
Statistical Modeling, A/B Testing, SQL, Recommendation Systems

Data Engineer

Moonplexus Pvt. Ltd.
May 2021 – Aug 2022 📍 Pune, MH, India
  • Healthcare platform: Designed a microservices web platform (Java, Django, AngularJS, SQL) automating health information management; reduced query execution time by 40% through relational database optimization.
  • Cloud infrastructure: Migrated and maintained AWS EC2/RDS infrastructure, building fault-tolerant ETL pipelines with high availability for large-scale medical data storage.
  • Deep learning deployment: Deployed TensorFlow/CNN diagnostic models via Docker with FastAPI endpoints, reducing inference latency by 30% and improving system uptime.
TensorFlow, AWS EC2 / RDS, CNN, Node.js, ETL

Machine Learning Intern

Indian Institute of Technology (IIT)
Oct 2020 – Apr 2021 📍 Guwahati, AS, India
  • Researched incremental/decremental versions of DBSCAN and MBSCAN algorithms for distance-based outlier mining.
  • Used Python and C++ to improve computational speed by 30% and precision by 20% in clustering algorithms, tested on a fintech application with over 1 million users for customer segmentation and portfolio optimization.
Python, C++, Clustering, Data Analysis, Machine Learning

Education

Master of Science, Data Science

Indiana University Bloomington
Aug 2022 – May 2024

Key Courses

Applied Machine Learning, Elements of AI, Probability & Statistics, Big Data, Data Warehousing & Mining, Business Intelligence

Bachelor of Technology, Computer Science & Engineering

MIT World Peace University
Jul 2018 – Jun 2022

Key Courses

Data Warehousing & Mining, Business Intelligence, Big Data Analysis, Financial Econometrics, Design & Analysis of Algorithms, Database Management