Reversifying Neural Networks: Efficient Memory Optimization Strategies for Finetuning Large Models

Location
Building 9, Level 2, Room 2325

Abstract

"In recent years, the rapid expansion of model and data scales has notably enhanced the performance of AI systems. However, this growth has significantly increased GPU memory demands, limiting further scaling due to current constraints. Despite adopting techniques like mixed precision, checkpointing, and parameter-efficient fine-tuning, challenges persist, especially with high-resolution data, leading researchers to compromise on end-to-end fine-tuning.
Amidst these challenges, reversible networks have emerged as a transformative solution. These networks can reconstruct their inputs from their outputs, eliminating the need to store intermediate features and drastically reducing memory requirements. As a result, unlike in traditional networks, the activation memory of reversible networks does not grow with network depth.
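To make the idea concrete, here is a minimal sketch of an additive-coupling reversible block in the style of RevNet (not the speaker's specific architecture): the input is split into two halves, and because each half is updated only by a function of the other, the inputs can be exactly recomputed from the outputs during the backward pass instead of being stored. The transforms F and G below are arbitrary stand-ins for learned sub-networks.

```python
import numpy as np

rng = np.random.default_rng(0)
W_f = rng.standard_normal((4, 4))  # illustrative fixed weights; in practice
W_g = rng.standard_normal((4, 4))  # F and G are learned sub-networks

def F(h):
    return np.tanh(h @ W_f)

def G(h):
    return np.tanh(h @ W_g)

def forward(x1, x2):
    """Forward pass: (x1, x2) -> (y1, y2).
    No intermediate activations need to be cached for the backward pass."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    """Exact inversion: recompute the inputs from the outputs alone,
    reversing the two coupling updates in the opposite order."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.standard_normal((2, 4))
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(x1, r1), np.allclose(x2, r2))  # True True
```

Because the inverse only reuses F and G, stacking many such blocks keeps activation memory constant in depth, which is the property the talk exploits.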
This talk introduces the concept of "neural network reversification," a method for transforming non-reversible networks into reversible ones during fine-tuning, which substantially lowers memory usage while maintaining competitive performance. We will examine the memory bottlenecks in neural network training, describe various reversible network architectures, and present our reversification strategies along with their practical applications and benefits.

Brief Biography

Chen Zhao is a Research Scientist at KAUST and Lead of the Video Understanding Theme in the Image and Video Understanding Lab (IVUL). She received her Ph.D. from Peking University (PKU), China, in 2016. She studied at the University of Washington (UW), US, from 2012 to 2013, and at the National Institute of Informatics (NII), Japan, in 2016. Her research interests include computer vision and deep learning, with a focus on video understanding and efficient models. She has published 40+ papers in leading journals and conferences such as T-PAMI, CVPR, ICCV, and ECCV, and has received over 3,000 citations according to Google Scholar. She has served as a reviewer for T-PAMI, T-IP, T-CSVT, CVPR, ICCV, ECCV, NeurIPS, ICLR, etc., and was recognized as an Outstanding Reviewer at CVPR 2021. She received the Best Paper Award at a CVPR 2023 workshop, a Best Paper Nomination at CVPR 2022 (top 0.4%), and the Best Paper Award at NCMT 2015. She was also awarded the Outstanding Talent scholarship of Peking University, First Prize in the Qualcomm Innovation Fellowship Contest (QInF) (only 2 in China), the Goldman Sachs Global Leaders Award (only 150 worldwide), and a National Scholarship. Her website is www.zhao-chen.com.

Contact Person