Why Warmup the Learning Rate?

In this post, I break down the findings from the paper “Why Warmup the Learning Rate? Underlying Mechanisms and Improvements” and explain how warm-up stabilizes training by reducing the sharpness of the loss landscape encountered early in training, which in turn allows the use of larger learning rates.
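To make the idea concrete, here is a minimal sketch of a linear warm-up schedule. The function, its name, and the default values (`base_lr`, `warmup_steps`) are illustrative assumptions, not the exact schedule studied in the paper.

```python
# Minimal sketch of linear learning-rate warm-up (illustrative values, not the paper's exact schedule).
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linearly ramp the learning rate from ~0 to base_lr over warmup_steps, then hold it constant."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# Example: inspect the learning rate at a few points in training.
for step in [0, 250, 999, 1000, 5000]:
    print(step, warmup_lr(step))
```

In practice this value would be written into the optimizer's learning rate at every step; the ramp keeps early updates small while the model moves out of the initial high-sharpness region.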

Detection Transformer

DETR: End-to-End Object Detection with Transformers

The Detection Transformer (DETR) is an end-to-end approach to object detection that reframes the task as a set prediction problem. The architecture combines a transformer encoder-decoder with a set-based loss that assigns predictions to ground-truth objects via bipartite (Hungarian) matching, allowing the model to bypass post-processing steps such as Non-Maximum Suppression (NMS) and hand-crafted anchor design.
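Below is a minimal sketch of the bipartite matching step. The cost terms, their weights, and the function name are simplified assumptions for illustration; DETR's actual matcher also includes a generalized IoU term and operates on normalized box coordinates.

```python
# Sketch of DETR-style bipartite matching between query predictions and ground-truth objects.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_predictions(pred_probs, pred_boxes, gt_labels, gt_boxes, box_weight=5.0):
    """Return (pred_idx, gt_idx) pairs minimizing the total matching cost.

    pred_probs: (num_queries, num_classes) class probabilities per query
    pred_boxes: (num_queries, 4) predicted boxes (cx, cy, w, h)
    gt_labels:  (num_targets,) ground-truth class indices
    gt_boxes:   (num_targets, 4) ground-truth boxes (cx, cy, w, h)
    """
    # Classification cost: negative probability assigned to the correct class.
    cost_class = -pred_probs[:, gt_labels]                                   # (num_queries, num_targets)
    # Box cost: L1 distance between predicted and ground-truth boxes.
    cost_box = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)
    cost = cost_class + box_weight * cost_box
    # Hungarian algorithm: one-to-one assignment with minimal total cost.
    return linear_sum_assignment(cost)

# Toy example: 3 queries, 2 ground-truth objects.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=3)     # 3 queries, 4 classes
boxes = rng.random((3, 4))
pred_idx, gt_idx = match_predictions(probs, boxes, np.array([1, 3]), rng.random((2, 4)))
print(pred_idx, gt_idx)
```

Because each ground-truth object is matched to exactly one query, duplicate detections are penalized during training rather than removed afterwards, which is what makes NMS unnecessary.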