Jae-Won Chung

Ph.D. Candidate @ UMich CSE

Summary

I'm a fourth-year PhD candidate in CSE at the University of Michigan. I build efficient software systems for deep learning, with a recent focus on managing not only time but also energy efficiently.

I view energy as a new fundamental software systems resource that is worth carefully optimizing and allocating. Doing so has downstream benefits, such as reducing operational expenses and power delivery pressure for datacenters.

I lead the ML.ENERGY initiative. I am fortunate to be advised by Professor Mosharaf Chowdhury and to be part of SymbioticLab.

Publications

Perseus: Reducing Energy Bloat in Large Model Training

SOSP, 2024 (Acceptance rate = 17.34%)

Jae-Won Chung, Yile Gu, Insu Jang, Luoxi Meng, Nikhil Bansal, Mosharaf Chowdhury

Toward Cross-Layer Energy Optimizations in AI Systems

DOE ASCR Energy-Efficient Computing for Science Workshop, 2024

Jae-Won Chung, Nishil Talati, Mosharaf Chowdhury

Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services

Preprint, 2024

Jiachen Liu, Jae-Won Chung, Zhiyu Wu, Fan Lai, Myungjin Lee, Mosharaf Chowdhury

Chasing Low-Carbon Electricity for Practical and Sustainable DNN Training

ICLR Workshop (Tackling Climate Change with Machine Learning), 2023

Zhenning Yang, Luoxi Meng, Jae-Won Chung, Mosharaf Chowdhury

Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training

USENIX NSDI, 2023 (Acceptance rate = 18.38%)

Jie You*, Jae-Won Chung*, Mosharaf Chowdhury (* Equal Contribution)

ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference

ACM ICPP, 2020 (Acceptance rate = 28.99%)

Jae-Won Chung, Jae-Yun Kim, Soo-Mook Moon

Experience

Graduate Student Research Assistant

Sep 2021 - Present

Advisor: Prof. Mosharaf Chowdhury

Building energy-efficient software systems for machine learning. I created Zeus, the first energy optimization system for DNN training on GPUs. Zeus is a PyTorch ecosystem project and serves as the bedrock for Chase, a carbon-efficient DNN training solution; the ML.ENERGY Leaderboard, the first energy benchmark for LLM inference; the ML.ENERGY Colosseum, an interactive service that lets users compare LLM responses in terms of both quality and energy consumption; and Perseus, a large-model training energy optimizer that reduces per-iteration energy consumption by up to 30% without slowing down training.

Keywords:

  • MLSys
  • Energy
  • LLM
  • Training
  • Inference
  • Open-Source

Research Intern

Mar 2020 - May 2022

Advisor: Prof. Byung-Gon Chun

Developed Crane, a GPU cluster manager for elastic AutoML jobs. Wrote components for automatic cluster bootstrapping on Docker Swarm and enabled full operation on top of Kubernetes. Worked on efficient AutoML scheduling policies on GPU clusters.

Keywords:

  • MLSys
  • AutoML
  • Training
  • Cluster Management
  • Scheduling
  • Open-Source

Dec 2019 - Jun 2020

Advisor: Prof. Soo-Mook Moon

Created ShadowTutor, a server-client collaborative DNN inference system that distills knowledge from a server-side large DNN to a small DNN on the client in an online fashion.

Keywords:

  • MLSys
  • Inference
  • Knowledge Distillation

Research Intern

Jun 2019 - Dec 2019

Advisor: Prof. Kyoung Mu Lee

Worked on finding better meta-initialization points for Model-Agnostic Meta-Learning (MAML) using LSTM-based neural memory modules. Also worked on embedding images of the same class into a single class embedding vector and augmenting MAML with self-attention scores derived from class embeddings.

Keywords:

  • ML
  • Computer Vision
  • Meta-Learning
  • Few-Shot Classification
  • Optimization

Jun 2019 - Aug 2019

Advisor: Prof. Jongho Lee

Designed and implemented CAD-QSMNet, a full deep learning pipeline for Quantitative Susceptibility Mapping (QSM) for brain MRI images, including a new U-Net variant model.

Keywords:

  • ML
  • Computer Vision
  • Medical Imaging
  • Data Engineering

Open-Source Projects

Star and fork counts are as of November 8th, 2024.

BERT4Rec-VAE-Pytorch (355 stars, 84 forks)

Implementation of BERT4Rec and Netflix VAE recommendation models.

  • Python
  • PyTorch
  • RecSys

Reason (190 stars, 5 forks)

A shell for research papers. Supports UNIX-like commands that instead work on a set of research papers.

Pegasus (31 stars, 3 forks)

An SSH command runner with a focus on simplicity. Useful when you have a bunch of commands to run and a bunch of SSH nodes available.

Talks

How to Create Mediocre Open-Source Repositories

University of Michigan (SymbioticLab lunch seminar) | August 2024

Power and Energy Considerations in Machine Learning Systems

University of Michigan (EECS 598: Systems for Generative AI) | April 2024

  • Energy
  • MLSys

Energy-Efficient Software Systems for Machine Learning

Seoul National University | October 2023

  • Energy
  • MLSys

Energy-Efficient Deep Learning with PyTorch and Zeus

PyTorch Conference | October 2023

Energy-Efficient Deep Learning with Zeus

Massachusetts Institute of Technology | September 2023

  • Energy
  • MLSys

Memory Plus Meta-Learning

Deepest | August 2019

Education

  • PhD, Computer Science and Engineering
    (In progress)
    University of Michigan
    Sep 2021 - Present
  • MS, Computer Science and Engineering
    University of Michigan
    Sep 2021 - Apr 2023
  • BS, Electrical and Computer Engineering
    Summa cum laude
    Seoul National University, South Korea
    Mar 2015 - Aug 2021

Proficiency

Languages

  • Python
  • Rust
  • Go, C++, CUDA, Verilog
  • Zig, JavaScript

Tools and Frameworks

  • FastAPI, MkDocs, Pandas, NumPy
  • PyTorch, Kubernetes, LaTeX

Others

  • Command line
  • Neovim
  • GitHub
  • Open-Source
  • Documentation

Honors & Awards

  • Second Best Solution in Carbon Hack '22
    $25,000 prize with Chase.
  • Kwanjeong Overseas Scholarship
    $25,000 awarded.
  • Best Tutor Award
    SNU computer architecture, Fall 2020.
  • Kwanjeong Undergraduate Scholarship
    $20,000 awarded over two years.

Teaching

  • Undergrad Operating Systems
    As lead TA, delivered Linux kernel lectures, ran four Linux-based term projects, and conducted team design reviews.
    Spring 2021
  • Undergrad Computer Architecture
    Gave 30 hours of online lectures as a peer tutor and received the Best Tutor Award.
    Fall 2020

Community Service

English Proficiency

Interests

  • Software Systems
  • Deep Learning
  • Fingerstyle Guitar