👋 Hello there!

I am a statistician by training, an R developer by habit, and an automation enthusiast by curiosity.

Currently working as a Sr. Computational Statistician at Eli Lilly, I spend most of my time building tools where statistics meets innovation, from clinical trial analytics to R-based automation and emerging AI productivity projects. I received my M.S. and B.S. in Statistics from Duke University, and I enjoy turning statistical ideas into practical solutions, including open-source tools like my R package lme4u and other technical projects you can explore in my projects and research sections.

When I’m (unfortunately) away from my laptop, I’m usually at the gym 🥊, cycling 🚴‍♀, or busy becoming a DJ ✌︎㋡

👉 View My Resume

👩‍💻 Highlights & Updates

Sr. Computational Statistician @ Eli Lilly (Jul 2025 – Now)

Open-Source Developer | lme4u R Package (Apr 2025)

Invitee | R Dev Day @ Hutch (Aug 2024)

Opportunity Scholar | posit::conf(2024) (Aug 2024)

Masters Statistician Intern @

(May 2024 – Aug 2024)

Diabetes Common Safety Tables, Figures, Lists (TFLs) Automation

Developed and launched a Shiny app that automates the creation, execution, and review of common safety TFLs, integrating R and SAS code with output formatting, progress tracking, and error reporting across front-end UI design and back-end cloud system engineering; consolidated 30+ common safety TFLs from 300+ listings across 5+ Diabetes studies by building a flexible internal TAFFY template project; made the clinical reporting pipeline more efficient and consistent
Orchestrated regular meetings with senior leadership; pitched the app to 600+ global employees; achieved successful implementations in Diabetes, with ongoing rollouts to Neuroscience and other therapeutic areas

Student Research Affiliate @ Duke AI Health

(May 2022 – Dec 2022)

Lab Test Harmonization: Bio-BERT Based Deduplication of Test Labels

Optimized lab test deduplication of grouper labels by fine-tuning Bio-BERT, an NLP model pre-trained on biomedical corpora; established a new method of cross-comparison similarity evaluation based on ground-truth text embeddings; achieved a 95% performance boost when applied to Duke Hospital’s lab database
Contributed to the Duke AI Health 2022 cohort as the sole undergraduate participant; communicated research outcomes through a well-received presentation at the Duke AI Health Poster Showcase 2022

Data Science Intern @ Hiya

(Jun 2022 – Aug 2022)

Hiya Shield Project: Robocall Identification & Screening

Spearheaded an NLP-based robocall detection system built on internal audio databases, using SBERT, unsupervised learning, statistical analysis, and AWS Cloud for text- and audio-space manipulation
Enhanced classification efficiency by discovering the optimal audio truncation length and similarity thresholds, driving a 67% faster user experience with a customizable accuracy screening feature for the Hiya mobile app

Lead Author & Research Assistant @

(Jun 2020 – Mar 2021)

Cross-Media Retrieval Based on Big Data Technology

Refined traditional permutation invariant training with mean squared error loss, using BLSTM/LSTM and CNN models, as part of a key media separation technique; introduced two new separation methods, the FIX strategy and the masking-based data augmentation strategy, both demonstrating notable performance gains
Publication: Audio-Visual Single-Channel Signal Separation based on Big Data Augmentation in IEEE (IICSPI 2020)

🏫 Education

Institution	Degree	Field of Study	Dates
Duke University	M.S.	Statistics	May 2025
Duke University	B.S.	Statistical Science (Data Science Concentration); Minor in Computer Science	May 2023
University of California, Santa Barbara	(Transfer Out)	Statistics and Data Science	June 2021

⚙️ Skillset


🎨 Creative Outlet

Lines of code, strokes of art: both speak a language beyond words. Generative art has become my outlet for digital creativity. See my inspirations for the art pieces below: 1, 2, 3, 4, 5, 6.
