Education
- B.A. in Mathematics, University of Pennsylvania, 2019–2023
- M.S.E. in Data Science, University of Pennsylvania, 2021–2023
- GPA: 3.9/4.0
- Notable courses: Machine Learning, Computer Vision, Theory of Machine Learning, Game Theory, Probability Theory
Work experience
- Spring 2022: Applied Scientist Intern at Amazon
- Worked on the Amazon Help Search Q&A System, extending the English IR model to a multilingual IR model
- Built a data pipeline covering scraping, annotation collection, and data augmentation on millions of data points
- Improved MRR from 22% to 45% at the same runtime by performing knowledge distillation on a multilingual BERT model
- Mentor: Olive Qin
- Manager: Rajesh Kamma
- Summer 2022: Machine Learning Engineer Intern at Roblox
- Worked on the Account Security team, using username features to improve bot detection
- Designed a scalable and distributed text mining algorithm that achieved 90%+ F1 score
- Implemented the algorithm, its data processing, and hyperparameter tuning using Spark, SQL, Hive, Hadoop, Airflow, and Kubeflow
- Implemented Markov chain-based gibberish detection
- Mentors: Derek Farren, Younes Abouelnagah
- Manager: Luke Fu
Skills
- Python: PyTorch, HuggingFace, CUDA, OpenCV, PySpark, Selenium, Multiprocessing, Regex
- Big Data: Spark, SQL, Hive, Hadoop, Airflow, HPC, Slurm
- Others: Linux, GitHub (version control), AWS, GCP