Hi, I'm Jen (Ha).
pronounced hah (/hɑː/)
I turn data into impact.
Data Scientist & ML Engineer with dual Master's degrees (GPA 4.0/4.0) and hands-on experience in machine learning, NLP, medical imaging AI, and business intelligence across healthcare, finance, and research.
Jen (Ha) Nguyen
Data Scientist & ML Engineer · USA
About Me
Passionate about data science at the intersection of health, AI & business
I build end-to-end ML pipelines, publish research in top-tier journals, and translate data into meaningful decisions — from ETL and BI dashboards to deep learning medical imaging models.
End-to-end ML pipelines: tree-based models, deep learning (ResNet, BERT, LSTM), and radiomics for real-world impact.
Power BI dashboards, DAX modeling, PySpark ETL pipelines, and Microsoft Fabric — turning raw data into executive insights.
Co-authored papers targeting Q1 journals and top conferences — RSNA, Springer Nature, Diagnostics, ICME.
Experience & Skills
My professional journey
Work Experience
Led an end-to-end tree-based regression pipeline in Python to identify how material characteristics affect battery performance, increasing energy capacity from 100–120 mAh/g to 135 mAh/g and saving over $1M in manufacturing costs.
Designed deep learning pipelines (ResNet34) for mediastinal abnormality detection on chest X-rays. Led feature selection and benchmarking of 16+ ML/DL models, achieving AUC = 0.903 and reducing diagnostic decision time by 50%.
Led economic research using logistic regression in R, developed Power BI dashboards covering 2,000+ students across 5 years, deployed a web app on AWS (EC2, S3, RDS), and mentored 30+ students in ML competitions.
Processed 11M+ physician text messages with BERT & PySpark, built Power BI dashboards unlocking $1M+ in revenue, automated 300+ ServiceNow reports, and developed a 10TB+ ETL pipeline on Microsoft Fabric reducing cloud costs by 40%.
Automated 100+ HR reports, built gradient-boosting models improving headcount accuracy by 80%, led A/B tests increasing applicant volume by 20%, and received the Global DNA STARs Award and the "Best DEI Projects" award.
Technical Skills
Education
Selected Work
Featured Projects
MediRad-MRI: AI Tumor Classification
Radiomics-based ML/DL pipeline for benign–malignant anterior mediastinal tumor classification on MRI. Evaluated 16+ models; AUC = 0.903, reducing radiologist decision time by 50%.
CoxHealth Radiology Analytics Platform
Processed 11M+ physician text messages with BERT & PySpark. Forecast modality demand and capacity with time series analysis. Built an ETL pipeline on Microsoft Fabric handling 10TB+ of distributed HDFS data.
Recognition
Awards & Honors
Data Hackathon competition · 2026
Database Design & Machine Learning Contest · 2025
Recognized for visionary leadership and technical excellence · 2025
Research
Publications
Let's collaborate on something impactful
Whether it's ML research, data analytics, or AI engineering — I'd love to connect and hear about your challenge.