Reversifying Neural Networks: Efficient Memory Optimization Strategies for Finetuning Large Models

Location
Building 9, Level 2, Room 2325

Abstract

"In recent years, the rapid expansion of model and data scales has notably enhanced the performance of AI systems. However, this growth has significantly increased GPU memory demands, limiting further scaling due to current constraints. Despite adopting techniques like mixed precision, checkpointing, and parameter-efficient fine-tuning, challenges persist, especially with high-resolution data, leading researchers to compromise on end-to-end fine-tuning.
Amidst these challenges, reversible networks have emerged as a transformative solution. These networks can reconstruct their inputs from their outputs, eliminating the need to store intermediate features and drastically reducing memory requirements. As a result, unlike in traditional networks, the activation memory of reversible networks does not grow with network depth.
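To make the idea concrete, here is a minimal sketch of an additive-coupling reversible block in the style of RevNet (not the speaker's specific architecture): the input is split into two halves, and because each half is updated only by a function of the other, the inputs can be exactly recomputed from the outputs during the backward pass instead of being stored. The transforms F and G below are arbitrary stand-ins for learned sub-networks.

```python
import numpy as np

rng = np.random.default_rng(0)
W_f = rng.standard_normal((4, 4))  # illustrative fixed weights; in practice
W_g = rng.standard_normal((4, 4))  # F and G are learned sub-networks

def F(h):
    return np.tanh(h @ W_f)

def G(h):
    return np.tanh(h @ W_g)

def forward(x1, x2):
    """Forward pass: (x1, x2) -> (y1, y2).
    No intermediate activations need to be cached for the backward pass."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    """Exact inversion: recompute the inputs from the outputs alone,
    reversing the two coupling updates in the opposite order."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.standard_normal((2, 4))
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(x1, r1), np.allclose(x2, r2))  # True True
```

Because the inverse only reuses F and G, stacking many such blocks keeps activation memory constant in depth, which is the property the talk exploits.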
This talk introduces the concept of "neural network reversification," a method for transforming non-reversible networks into reversible ones during fine-tuning, which substantially lowers memory usage while maintaining competitive performance. We will examine the memory bottlenecks in neural network training, describe various reversible network architectures, and present our reversification strategies along with their practical applications and benefits.

Brief Biography

Chen Zhao is a Research Scientist at KAUST and Lead of the Video Understanding Theme in the Image and Video Understanding Lab (IVUL). She received her Ph.D. from Peking University (PKU), China, in 2016. She studied at the University of Washington (UW), US, from 2012 to 2013, and at the National Institute of Informatics (NII), Japan, in 2016. Her research interests include computer vision and deep learning, with a focus on video understanding and efficient models. She has published 40+ papers in leading journals and conferences such as T-PAMI, CVPR, ICCV, and ECCV, and has received over 3,000 citations according to Google Scholar. She has served as a reviewer for T-PAMI, T-IP, T-CSVT, CVPR, ICCV, ECCV, NeurIPS, ICLR, etc., and was recognized as an Outstanding Reviewer at CVPR 2021. She received the Best Paper Award at a CVPR 2023 workshop, a Best Paper Nomination at CVPR 2022 (top 0.4%), and the Best Paper Award at NCMT 2015. She was also awarded the Outstanding Talent scholarship of Peking University, First Prize in the Qualcomm Innovation Fellowship Contest (QInF) (only 2 in China), the Goldman Sachs Global Leaders Award (only 150 worldwide), and a National Scholarship. Her website is www.zhao-chen.com.

Contact Person