Group Normalization (Paper Explained)

The dirty little secret of Batch Normalization is its intrinsic dependence on the training batch size. Group Normalization attempts to achieve the benefits of normalization without batch statistics and, most importantly, without sacrificing performance compared to Batch Normalization.

Abstract:
Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems — BN’s error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN’s usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN’s computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.
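To make the abstract's one-line description concrete ("GN divides the channels into groups and computes within each group the mean and variance for normalization"), here is a minimal NumPy sketch. It is not the authors' reference code. The function name `group_norm`, the (N, C, H, W) input layout, the `eps` default, and the optional per-channel `gamma`/`beta` scale and shift are all assumptions made for illustration.

```python
import numpy as np


def group_norm(x, num_groups, gamma=None, beta=None, eps=1e-5):
    """Illustrative group normalization for an array shaped (N, C, H, W).

    Hypothetical helper: the name, argument order and eps default are
    assumptions for this sketch, not taken from the paper's code.
    """
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")

    # Split the channel axis into (groups, channels_per_group).
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)

    # Statistics are computed per sample and per group, over the channels
    # in the group and all spatial positions. The batch axis (axis 0) is
    # never reduced, so the result does not depend on the batch size.
    mean = grouped.mean(axis=(2, 3, 4), keepdims=True)
    var = grouped.var(axis=(2, 3, 4), keepdims=True)

    normalized = (grouped - mean) / np.sqrt(var + eps)
    normalized = normalized.reshape(n, c, h, w)

    # Optional learnable per-channel scale and shift (assumed shape: (C,)).
    if gamma is not None:
        normalized = normalized * gamma.reshape(1, c, 1, 1)
    if beta is not None:
        normalized = normalized + beta.reshape(1, c, 1, 1)
    return normalized


# Small check of the batch-size independence claim: normalizing one sample
# alone should match normalizing it as part of a larger batch.
rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8, 5, 5))
out_full = group_norm(batch, num_groups=2)
out_single = group_norm(batch[:1], num_groups=2)
print(np.allclose(out_full[:1], out_single))
```

Because no statistic is ever taken across the batch axis, the first sample's output should agree (up to floating-point rounding) whether it is processed alone or together with three others. That is the property that Batch Normalization lacks, since it averages across the batch.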

Authors: Yuxin Wu, Kaiming He

