Chao Péter Yang

Data Science Analyst II

Curinos, Inc.

Biography

Having received highest honors in Data Science from the University of Michigan, Chao Peter Yang is currently working as a Data Science Analyst at Curinos, Inc. He is interested in all things related to Data Science and Machine Learning, with an especially keen interest in neural networks and modeling in general.

Interests
  • Artificial Intelligence
  • Deep Neural Networks
  • Data Science
Education
  • Bachelor's in Honors Data Science and Mathematical Sciences, 2018 - 2021

    University of Michigan - Ann Arbor

  • International Baccalaureate, 2018

    American International School of Budapest

Experience
 
Curinos
Modeling Analyst II
September 2023 – Present, Chicago
  • Researched and developed industry-level nonlinear Asset-Liability Management (ALM) models to predict acquisition and other portfolio balances for smaller banks and credit unions, improving out-of-sample acquisition prediction relative to legacy models.
  • Built automated ad hoc regression notebooks in PySpark for creating, testing, and validating models under different configurations, halving the time needed to build proof-of-concept models.
 
Curinos
Data Science Analyst II
April 2022 – September 2023, Chicago
  • Led the ML engineering team in migrating the legacy modeling pipeline from Cloudera to Databricks, coordinating with the DevSecOps and Application teams on testing, promotion, and release plans. The migration saved more than $100k annually in data platform expenses and cut pipeline processing time by 30% on average. (Publicly acknowledged at a company-wide town hall meeting.)
  • Tuned nonlinear hierarchical price elasticity models en masse for multiple major US banks, each with 10,000+ model segments, improving fit in terms of both AIC and R² and achieving a significantly higher rate of convergence.
  • Installed and managed more than 10,000 price elasticity models per client bank to predict and optimize their deposit portfolio across a wide range of interest rates, with precise Model Risk Management documentation.
 
Curinos
Data Science Analyst
August 2021 – April 2022, Chicago
  • Converted a local, single-threaded legacy modeling pipeline to use SparkR and Cloudera, reducing model-fitting run time by up to a factor of 30.
  • Performed Exploratory Data Analysis (EDA) for client banks to tune and reconfigure their models and data segments, leading to better-performing price elasticity models in terms of MAPE, R², and rate of convergence.
  • Set up and automated custom SQL procedures to clean, wrangle, map, and transform clients’ data feeds for use in the modeling pipeline, partially eliminating the need for manual model data refreshes.
 
University of Michigan - Ann Arbor
Honors Student Researcher
May 2020 – April 2021, Ann Arbor
  • Researched a content-based music classification system using neural networks. Advised by Prof. Edward Ionides and Prof. Daniel Forger.
  • Developed new music classification methods using Musical Instrument Digital Interface (MIDI) data and LSTM neural networks, resulting in 82% accuracy in music classification, more than a 10% improvement over conventional ML methods.
  • Improved models using supervised machine learning methods such as Support Vector Machines, Decision Trees, Ensemble Methods, and K-nearest neighbors.
  • Received “Highest Honor” distinction in Data Science from UMich, one of only two awarded in 2021.

Certificates

Gain foundational knowledge, practical skills, and a functional understanding of how generative AI works
See certificate
DataCamp
Introduction to Scala
See certificate
Coursera
Deep Learning Specialization
See certificate
Coursera
Share Data Through the Art of Visualization
See certificate

Projects

Muscribe: Transcribing Music to Scores
A research project into developing a model that can create scores from pieces of music.
Californian House Price Prediction with Kaggle Data
Performed EDA and fit a simple XGBoost model to predict house prices in California in a single Jupyter notebook. This is a simple data project to showcase how I’d approach a relatively straightforward modeling task.
Squirrels API - Use Case Development and Documentation
Developing use cases and documentation for the Squirrels API.

Contact

Feel free to leave me a message and I’ll get back to you as soon as possible!