👋 Hello there!
I am an aspiring data scientist and Master’s student in the Department of Statistical Science at Duke University, where I also obtained my Bachelor’s in Statistical Science (Data Science Concentration) and a Minor in Computer Science.
- My academic journey has been marked by a deep research commitment to statistical analysis, machine learning, and data science, with special focuses on natural language processing, Bayesian statistics, and creative data visualizations. Take a look at my previous research and projects in R and Python, and let me know if you are interested!
- Beyond academia, I actively contribute as a Project Manager & Data Analyst @ Duke Impact Investing Group and as the Chief Technology Officer @ Duke Statistical Science Majors Union. Additionally, I have 3+ years of experience as a teaching assistant. Feel free to reach out for project advice and business case studies.
- In my free time, I do 🥊 / 🚴♀️ / 🎹 / 🧁
👩💻 Highlights & Updates
Invitee | R Dev Day @ Hutch (Aug 2024)
Opportunity Scholar | posit::conf(2024) (Aug 2024)
Masters Statistician Intern @
(May 2024 – Aug 2024)
Diabetes Common Safety Tables, Figures, Lists (TFLs) Automation
- Developed and launched a Shiny app to automate the creation, execution, and review of common safety TFLs, integrating R and SAS code with output formatting, progress tracking, and error reporting through front-end UI design and back-end cloud system engineering; consolidated 30+ common safety TFLs from 300+ listings across 5+ Diabetes studies by building a flexible internal TAFFY template project; reimagined the clinical reporting pipeline with enhanced efficiency and consistency
- Orchestrated regular meetings with senior leadership; pitched the app to 600+ global employees; achieved successful implementations in Diabetes, with ongoing rollouts to Neuroscience and other therapeutic areas
Student Research Affiliate @
(May 2022 – Dec 2022)
Lab Test Harmonization: Bio-BERT Based Deduplication of Test Labels
- Optimized lab test deduplication of grouper labels by fine-tuning Bio-BERT, an NLP model pre-trained on biomedical corpora; established a new method of cross-comparison similarity evaluation based on ground-truth text embeddings; uncovered a 95% performance boost in the application to Duke Hospital’s lab database
- Demonstrated academic distinction by contributing to the Duke AI Health 2022 cohort as the sole undergraduate participant; effectively communicated research outcomes through a well-received presentation at the Duke AI Health Poster Showcase 2022
Data Science Intern @
(Jun 2022 – Aug 2022)
Hiya Shield Project: Robocall Identification & Screening
- Spearheaded an NLP-based robocall detection system based on internal audio databases, leveraging SBERT, unsupervised learning, statistical analysis, and AWS Cloud on text- and audio-space manipulation
- Enhanced classification efficiency by discovering optimal audio truncation length and similarity thresholds, driving a 67% faster user experience with a customizable accuracy screening feature for the Hiya mobile app
Lead Author & Research Assistant @
(Jun 2020 – Mar 2021)
Cross-Media Retrieval Based on Big Data Technology
- Refined traditional permutation invariant training with mean squared error loss through BLSTM/LSTM and CNN in a key media separation technique; introduced two new separation methods, the FIX strategy and the masking-based data augmentation strategy, demonstrating notable performance gains
- Publication: Audio-Visual Single-Channel Signal Separation based on Big Data Augmentation in IEEE (IICSPI 2020)
🏫 Education
Institution | Degree | Field of Study | Dates |
---|---|---|---|
Duke University | M.S. Student | Statistics | May 2025 |
Duke University | B.S. | Statistical Science (Data Science Concentration); Minor in Computer Science | May 2023 |
University of California, Santa Barbara | (Transfer Out) | Statistics and Data Science | June 2021 |
⚙️ Skillset
© Visualization created by scraping my resume with the R package wordcloud2.
🎨 Creative Outlet
Lines of code, strokes of art: both speak a language beyond words. Generative art has become my passion for digital creativity. See my inspirations for the art pieces below: 1, 2, 3, 4, 5, 6.