Hello there!
I am a statistician by training, an R developer by habit, and an automation enthusiast by curiosity.
Currently working as a Sr. Computational Statistician at Eli Lilly, I spend most of my time building tools where statistics meets innovation, from clinical trial analytics to R-based automation and emerging AI productivity projects. I received my M.S. + B.S. in Statistics from Duke University, and I enjoy turning stats ideas into practical solutions, including open-source tools like my R package lme4u and other technical projects you can explore in my projects and research sections.
When I'm (unfortunately) away from my laptop, I'm usually at the gym, cycling, or busy becoming a DJ.
Highlights & Updates
Sr. Computational Statistician @ Eli Lilly
(Jul 2025 – Now)
Open-Source Developer | lme4u R Package
(Apr 2025)
Invitee | R Dev Day @ Hutch @ (Aug 2024)
Opportunity Scholar | posit::conf(2024) @ (Aug 2024)
Masters Statistician Intern @
(May 2024 – Aug 2024)
Diabetes Common Safety Tables, Figures, Lists (TFLs) Automation
- Developed and launched a Shiny app to automate the creation, execution, and review of common safety TFLs, integrating R and SAS code with output formatting, progress tracking, and error reporting through front-end UI design and back-end cloud system engineering; consolidated 30+ common safety TFLs from 300+ listings across 5+ Diabetes studies by building a flexible internal TAFFY template project; reimagined the clinical reporting pipeline with enhanced efficiency and consistency
- Orchestrated regular meetings with senior leadership; pitched the app to 600+ global employees; achieved successful implementations in Diabetes, with ongoing rollouts to Neuroscience and other therapeutic areas
Student Research Affiliate @
(May 2022 – Dec 2022)
Lab Test Harmonization: Bio-BERT Based Deduplication of Test Labels
- Optimized lab test deduplication of grouper labels by fine-tuning Bio-BERT, an NLP model pre-trained on biomedical corpora; established a new method of cross-comparison similarity evaluation based on ground-truth text embeddings; achieved a 95% performance boost when applied to Duke Hospital's lab database
- Demonstrated academic distinction by contributing to the Duke AI Health 2022 cohort as the sole undergraduate participant; effectively communicated research outcomes through a well-received presentation at the Duke AI Health Poster Showcase 2022
Data Science Intern @
(Jun 2022 – Aug 2022)
Hiya Shield Project: Robocall Identification & Screening
- Spearheaded an NLP-based robocall detection system built on internal audio databases, leveraging SBERT, unsupervised learning, statistical analysis, and AWS Cloud for text- and audio-space manipulation
- Enhanced classification efficiency by discovering optimal audio truncation length and similarity thresholds, driving a 67% faster user experience with a customizable accuracy screening feature for the Hiya mobile app
Lead Author & Research Assistant @
(Jun 2020 – Mar 2021)
Cross-Media Retrieval Based on Big Data Technology
- Refined traditional permutation invariant training with mean squared error loss through BLSTM/LSTM and CNN in a key media separation technique; introduced two new separation methods, the FIX strategy and the masking-based data augmentation strategy, both demonstrating notable performance gains
- Publication: Audio-Visual Single-Channel Signal Separation based on Big Data Augmentation in IEEE (IICSPI 2020)
Education
| Institution | Degree | Field of Study | Dates |
|---|---|---|---|
| Duke University | M.S. Student | Statistics | May 2025 |
| Duke University | B.S. | Statistical Science (Data Science Concentration); Minor in Computer Science | May 2023 |
| University of California, Santa Barbara | (Transfer Out) | Statistics and Data Science | June 2021 |
Skillset

Creative Outlet
Lines of code, strokes of art: both speak a language beyond words. Generative art has become my passion for digital creativity. See my inspirations for the art pieces below: 1, 2, 3, 4, 5, 6.
