Resume – Nikhil Kasukurthi | bluenotebook.io

Staff Machine Learning Engineer

I have 8 years of experience owning AI systems end-to-end across healthcare, search, and LLM training. Designed retrieval pipelines from the ground up, covering data collection, embedding generation, and hybrid search via Vespa. Built serving infrastructure for multimodal LLMs via vLLM on Kubernetes and torch-compiled models on RayServe. Deep expertise in GPU profiling and distributed training. Led engineering and data science teams.

Work Experience

Independent ML Research

Göttingen, Germany Nov 2025 – Present
  • Implemented entire LLM training stack from scratch: BPE tokenizer, pre-training, instruction tuning, RLHF, and evaluation (Stanford CS336)
  • Published technical deep-dive on distributed training optimization. Built DeepSpeed ZeRO-2 distributed training setup; benchmarked H100 SXM vs PCIe, identifying 2x cost-efficiency difference. Profiled GPU utilization and memory bottlenecks using Nsight Systems and PyTorch Memory Profiler
  • Designed and launched clarifyit.ai, an interactive prompt refinement tool using iterative questioning to improve LLM output quality
  • Built Voice Activity Detection (VAD) at the RTP packet level for a telephony product. Exported ASR models via ONNX for optimized edge inference, reducing costs relative to third-party ASR providers

Lead Data Scientist | Eka.care

Bengaluru, India Jan 2022 – Oct 2025
Healthcare company building AI-powered tools for 100K+ doctors across India

Search & Retrieval

  • For grounding LLMs in Indian medical context, built visual document retrieval pipeline for Medical Protocols with ColQwen-2.5 and Vespa. Through hybrid search (BM25 + vision embeddings), improved top-3 retrieval accuracy by 24% over text embeddings. Chose this over chunking methods for better scalability
  • Improved medication autosuggest nDCG by 55% via query decomposition and dynamic query construction on ElasticSearch, handling shorthand and misspelling issues. Built Go interface for calling ES
  • Transformed unstructured clinical text into structured notes using semantic retrieval models trained with contrastive learning, improving diagnosis coding by 30% and medication coding by 80%

Products & Infrastructure

  • Created a unified MedAssist AI platform, an LLM client with remote MCP server support. Apollo Hospitals (largest hospital chain in India) uses it for appointment booking
  • Early adopter of MCP; open-sourced an MCP server to provide LLMs with Indian medical context
  • Built PySpark pipelines for feature creation. Used Apache Beam to unify scattered records across DBs for cohesive patient profiles
  • Architected the inference serving infrastructure for multimodal LLMs via vLLM on Kubernetes and torch-compiled models on RayServe. Cut inference cost by 50% compared to AWS Sagemaker
  • Deployed a custom Speech LLM (Whisper + Gemma 2) for medical transcription built on vLLM plugins. Reduced STT/ASR inference costs by 60% versus third-party API providers
  • Built medical data collection platform (Django). Gathered 100+ hours of medical speech data and 1000+ medical protocol documents. Directed annotation and protocol curation with medical professionals

LLM Evaluations

  • Open-sourced KARMA-OpenMedEvalKit, an evaluation library for LLMs in healthcare. Released Indian healthcare datasets and tasks
  • Gates Foundation is adopting KARMA for healthcare bot evaluation in collaboration with Eka.care
  • Evaluated custom-built agent harnesses on Tau-Bench2 with LLM-generated clinical scenarios, and overall benchmarks on HELM and MedHELM
  • Defined rubric-driven LLM evaluation standards adopted across all AI products at Eka.care, replacing ad-hoc assessments

Leadership

  • AWS featured Eka.care as a reference customer for healthcare AI use cases, as one of the earliest production adopters of AWS Bedrock
  • Managed a team of 3 (1 engineer, 2 data scientists), owning the AI roadmap

Data Scientist | Udaan

Bengaluru, India May 2021 – Dec 2021
India's largest B2B e-commerce marketplace
  • Learning-to-rank models (gradient-boosted trees) lifted search-to-cart conversion by 10%. A/B tested across business verticals
  • Built 3D point cloud processing pipeline (DGCNN) for LiDAR-based volume estimation of warehouse shipments. Built the data science stack from scratch: collection, annotation, training, and deployment. Achieved 40% cost savings and 50% latency reduction

Visiting Researcher | National Centre for Biological Sciences (NCBS) – TIFR

Bengaluru, India May 2020 – Mar 2021
Concurrent with role at SigTuple
  • Developed PrISM (Precision for Integrative Structural Models) using Variational Autoencoders, a novel unsupervised technique to score integrative models
  • Won Best Poster Award at NCBS Annual Talks 2021
  • Published in Bioinformatics (Vol. 38, Issue 15, August 2022)

Data Scientist III | SigTuple

Bengaluru, India May 2018 – May 2022
Healthcare AI startup building diagnostic products
  • Built retinal disease detection products end-to-end, from annotation strategies and model development to clinical studies and CE certification
  • Published 2 papers at IEEE ISBI 2019
  • Technical lead for two diagnostic products (Fundus analysis, Urine analysis). Directed research and engineering roadmap
  • Led the ML platform team and architected a multi-model inference DAG using TF Serving, Kubernetes, and Cloud Functions. Improved turnaround time by 60% and reduced costs by 40%

Publications

PrISM: precision for integrative structural models

Bioinformatics, Volume 38, Issue 15, August 2022
Varun Ullanat, Nikhil Kasukurthi, Shruthi Viswanath

Deep learning for weak supervision of diabetic retinopathy abnormalities

IEEE International Symposium on Biomedical Imaging (ISBI) July 2019
Maroof Ahmad, Nikhil Kasukurthi, Harshit Pande

Dynamic region proposal networks for semantic segmentation in automated glaucoma screening

IEEE International Symposium on Biomedical Imaging (ISBI) July 2019
Shivam Shah, Nikhil Kasukurthi, Harshit Pande

Technical Skills

ML Systems

PyTorch, DeepSpeed, vLLM, TensorRT, ONNX, RayServe, TorchServe, TensorFlow

LLM / NLP

Whisper, Gemma, BERT, NER/NEL, RLHF, MCP

Search & Retrieval

ElasticSearch, Vespa, FAISS, ColQwen, RAG

Data & Infrastructure

Kubernetes, Docker, SLURM, AWS, GCP, Apache Beam, PySpark, BigQuery, DynamoDB

Profiling & Optimization

Nsight Systems, PyTorch Memory Profiler, GPU Benchmarking

Languages

Python, Go

Education

B. Tech - Computer Science and Engineering

VIT University, Vellore, India 2014 – 2018
CGPA: 8.39/10

Awards

Hackathon Winner — AWS GenAI Hackathon

August 2024
Built appointment booking agent through tool use

Impact Award — Eka.care

2023
For major organizational impact

Best Poster Award — NCBS Annual Talks

2021
For PrISM research presentation

Interests

Scuba Diving (Open Water Certified), Rock Climbing, Trekking, Formula 1