Why Warmup the Learning Rate?


Why Do We Use Learning Rate Warm-Up in Deep Learning? Training deep neural networks is notoriously sensitive to hyperparameters, especially the learning rate. One widely adopted technique to improve stability and performance is learning rate warm-up. But why does warm-up help, what exactly does it do, how does the warm-up duration affect training, and how does it behave with different optimizers like SGD and Adam? What Is Learning Rate Warm-Up? Learning rate warm-up is a simple technique where the learning rate starts small and gradually increases to a target value over a few iterations or epochs. ...
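The linear warm-up described in this excerpt can be sketched as follows; the target rate and warm-up length here are illustrative values, not taken from the article:

```python
# Minimal sketch of linear learning-rate warm-up.
# target_lr and warmup_steps are assumed, illustrative hyperparameters.
def warmup_lr(step, target_lr=0.1, warmup_steps=1000):
    """Linearly ramp the learning rate from near zero up to target_lr."""
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr  # after warm-up, hand off to the main schedule

# At step 0 the rate is tiny; by warmup_steps it reaches the target.
print(warmup_lr(0))      # small initial rate
print(warmup_lr(1000))   # 0.1
```

In practice this ramp is usually composed with a decay schedule (e.g. cosine or step decay) that takes over once the warm-up phase ends.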

June 30, 2025 · 6 min · 1115 words · Sherif Ahmed
Detection Transformer

DETR End-to-End Object Detection with Transformers

DETR Architecture Components Backbone for Feature Extraction The input image is first processed by a backbone, typically a Convolutional Neural Network (CNN) such as ResNet-50 or ResNet-101, pre-trained on the ImageNet classification task. The final pooling and classification layers are discarded to produce a feature map that captures semantic information for different regions of the image. The network stride is typically 32. The feature map has dimensions $C \times \text{feature map height} \times \text{feature map width}$, where $C$ is the number of output channels of the last convolution layer. ...
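The output shape described in this excerpt can be sketched with a small helper; the channel count of 2048 is the typical value for ResNet-50's last stage, assumed here for illustration:

```python
# Sketch of the backbone output shape for a stride-32 CNN backbone.
# channels=2048 is an assumed, ResNet-50-typical value.
def backbone_output_shape(height, width, channels=2048, stride=32):
    """Return (C, H // stride, W // stride) for an input image."""
    return (channels, height // stride, width // stride)

# For an 800 x 1066 input, the feature map is 25 x 33 with 2048 channels.
print(backbone_output_shape(800, 1066))  # (2048, 25, 33)
```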

September 26, 2025 · 6 min · 1149 words · Sherif Ahmed