About
I am a Machine Learning Engineer based in Mountain View, CA, building data-driven products that make an impact!
Experience
- Kognitic AI — Machine Learning Engineer (Jun 2024 – Present)
Owned end-to-end development and productionization of a clinical intelligence AI SaaS platform generating $1M in revenue; deployed natural language analytics with MCP-based trust layers and multimodal LLM agents; built product analytics dashboards; developed Transformer models for biomarker and drug-target extraction (87% accuracy; ACL / EMNLP 2025); fine-tuned LLMs with LoRA and RLHF for clinical eligibility prediction.
- Kognitic AI — Machine Learning Intern (Jun 2023 – Sep 2023)
Built scalable ETL pipelines with AWS Glue (Spark), S3, and Athena at 10M+ record scale; developed time-series forecasting models (XGBoost ensembles); automated CI/CD pipelines and reporting on AWS.
Education
- MS in Data Science, UC San Diego (Aug 2022 – Jun 2024)
Courses: NLP, Data Engineering, Statistics, Deep Learning, HPC.
Award: UCSD Excellence in Academic Achievement.
Publications
- BIOPSY — Biomarkers In Oncology: Pipeline for Structured Yielding (ACL / EMNLP 2025)
- Closed Domain Multiple Choice Question Answering (Springer Journal 2023)
- Pneumonia Detection using an Ensemble of Modified ViT-YOLO Models (IEEE Conference 2022)
Skills
Python, SQL, C++, R; PyTorch, TensorFlow, JAX; LangGraph, LangChain; MLflow, Grafana; PySpark, Hadoop; MongoDB, PostgreSQL, Neo4j, Pinecone, Snowflake, Redshift, dbt; Airflow; AWS (Lambda, S3, CloudWatch, Athena, EC2); GCP; GitHub Actions, Jenkins.
Contact
- Email: schetwani1@gmail.com
- LinkedIn: https://www.linkedin.com/in/sanyachetwani
- GitHub: https://github.com/sanyacodes