Hey there, I'm Spandan Das. Currently, I'm studying computer science at Carnegie Mellon University with a minor in Machine Learning. My primary interest is in artificial intelligence and using it to have a positive impact on the world.
Outside of school, I enjoy playing tennis and basketball, working out, and listening to and playing music.
Relevant Coursework: (PhD) Intro to Deep Learning [Python], Deep Reinforcement Learning [Python], (PhD) Advanced NLP [Python], Algorithm Design and Analysis, Machine Learning with Large Datasets [Python], (PhD) Convex Optimization, Intro to ML [Python], Intro to Computer Systems [C], Probability and Computing, Statistics and Computing
Relevant Coursework: Artificial Intelligence [Python], Computer Vision [C++], Machine Learning [Python], Parallel Computing [C], Probability Theory, Concrete Math, Multivariable Calculus, Linear Algebra
Clubs: Senior Computer Team (Captain), Intermediate Computer Team (Captain), Varsity Math Team
Places that I've worked
Built an anomaly detection system for the NVIDIA Tegra chip production environment
Developed an active-learning-based approach to data-efficient LLM pretraining using data influence models
[Paper] [Poster]
Created an LLM integration library to automatically filter and annotate semantically similar Siri queries
Developed an online camera calibration algorithm for a multi-view stereo setup on drones, used to compute real-time depth maps
Predicted precipitation through an ice microphysics-based machine learning approach using remote sensing data from NASA's Global Precipitation Measurement Mission [Paper] [GitHub]
Used web scraping and machine learning to predict the likelihood of premium subscription purchases for a TV streaming platform
Some of my personal projects
Yu, Z.; Das, S.; Xiong, C. MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models. 2024. https://arxiv.org/abs/2406.06046
Das, S.; Wang, Y.; Gong, J.; Ding, L.; Munchak, S.J.; Wang, C.; Wu, D.L.; Liao, L.; Olson, W.S.; Barahona, D.O. A Comprehensive Machine Learning Study to Classify Precipitation Type over Land from Global Precipitation Measurement Microwave Imager (GPM-GMI) Measurements. Remote Sens. 2022, 14, 3631. https://doi.org/10.3390/rs14153631
Pandey, R.; Das, S.; Thrush, T.; Liang, P.P.; Salakhutdinov, R.; Morency, L.-P. WinogroundVQA: Zero-shot Reasoning with Large Language Models for Compositional Visual Question Answering. 2023. [Link to Paper]
Das, S.; Samuel, V.; Noroozizadeh, S. TLDR at SemEval-2024 Task 2: T5-generated Clinical-Language Summaries for DeBERTa Report Analysis. NAACL SemEval Conference 2024. https://arxiv.org/abs/2404.09136
Things I've achieved...