Hyungyo Kim | 김현교

I am a Ph.D. candidate in Electrical and Computer Engineering at the University of Illinois Urbana-Champaign, advised by Prof. Nam Sung Kim. I was a research intern at Samsung, Intel, and IBM Research. I received my B.S. in Electrical and Computer Engineering from Seoul National University. I am a recipient of the Korean Government Scholarship Program for Study Overseas and the Samsung Ph.D. Fellowship.

Email  /  CV  /  Scholar  /  LinkedIn

Research

My research focuses on AI systems, specifically systems for large language model (LLM) inference and training. I work on addressing system-level bottlenecks with new device/software technologies and algorithms to improve deployment efficiency and scalability.

Selected Publications

2025

LPC: Efficient Lossless Parameter Compression for Deploying LLM Inference on Edge Systems
Nachuan Wang, Hyungyo Kim, Nam Sung Kim
IEEE Embedded Systems Letters (ESL)

The New LLM Bottleneck: A Systems Perspective on Latent Attention and Mixture-of-Experts
Sungmin Yun, Seonyong Park, Hwayong Nam, Younjoo Lee, Gunjun Lee, Kwanhee Kyung, Sangpyo Kim, Nam Sung Kim, Jongmin Kim, Hyungyo Kim, Juhwan Cho, Seungmin Baek, Jung Ho Ahn
arXiv preprint

NetZIP: Algorithm/Hardware Co-design of In-network Lossless Compression for Distributed Large Model Training
Jinghan Huang, Hyungyo Kim, Nachuan Wang, Jaeyoung Kang, Hrishi Shah, Eun Kyung Lee, Minjia Zhang, Fan Lai, Nam Sung Kim
IEEE/ACM International Symposium on Microarchitecture (MICRO)

Stratum: System-Hardware Co-design with Tiered Monolithic 3D-DRAM for Efficient MoE Serving
Yue Pan, Zihan Xia, Po-Kai Hsu, Lanxiang Hu, Hyungyo Kim, Janak Sharda, Minxuan Zhou, Nam Sung Kim, Shimeng Yu, Tajana Rosing, Mingu Kang
IEEE/ACM International Symposium on Microarchitecture (MICRO)

LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading
Hyungyo Kim, Nachuan Wang, Qirong Xia, Jinghan Huang, Amir Yazdanbakhsh, Nam Sung Kim
ACM/IEEE International Symposium on Computer Architecture (ISCA)

2024

Exploiting Intel Advanced Matrix Extensions (AMX) for Large Language Model Inference
Hyungyo Kim, Nachuan Wang, Qirong Xia, Jinghan Huang, Amir Yazdanbakhsh, Nam Sung Kim
IEEE Computer Architecture Letters (CAL)

An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models
Sang-Soo Park, KyungSoo Kim, Jinin So, Jin Jung, Jonggeon Lee, Kyoungwan Woo, Nayeon Kim, Younghyun Lee, Hyungyo Kim, Yongsuk Kwon, Jinhyun Kim, Jieun Lee, YeonGon Cho, Yongmin Tai, Jeonghyeon Cho, Hoyoung Song, Jung Ho Ahn, Nam Sung Kim
IEEE International Symposium on High-Performance Computer Architecture (HPCA)