Learning Rate Decay
Learning rate decay is a technique used to adjust the learning rate during training in order to improve the convergence of the optimization algorithm. The idea is to gradually decrease the learning rate as training progresses, so that the optimization process can focus on the finer details of the loss function.

Learning rate decay methods:

Exponential decay: lr = lr_0 * e^(-d * t)
The learning rate is decreased exponentially over time, where the decay rate d and the initial learning rate lr_0 are hyperparameters. The learning rate decreases quickly at the beginning and then slows down as training goes on.

Time-based decay: lr = lr_0 / (1 + d * t)
The learning rate is decreased in inverse proportion to time, where the decay rate d and the initial learning rate lr_0 are hyperparameters. The learning rate decreases in a steady and gradual manner.

Step decay: lr = lr_0 * d^(epoch // epoch_drops)
The learning rate is reduced by a factor d every epoch_drops epochs, where the drop factor d, the drop interval epoch_drops, and the initial learning rate lr_0 are hyperparameters. The learning rate stays constant within each interval and then drops sharply at the interval boundaries.
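To make the formulas concrete, here is a minimal Python sketch of the three schedules. The function names and the example values (lr_0 = 0.1, d = 0.5, epoch_drops = 2) are illustrative assumptions; only the update rules come from the formulas above.

import math

def exponential_decay(lr_0, d, t):
    # lr = lr_0 * e^(-d * t): fast decrease early, slowing over time
    return lr_0 * math.exp(-d * t)

def time_based_decay(lr_0, d, t):
    # lr = lr_0 / (1 + d * t): steady, gradual decrease
    return lr_0 / (1 + d * t)

def step_decay(lr_0, d, epoch, epoch_drops):
    # lr = lr_0 * d^(epoch // epoch_drops): drop by factor d every epoch_drops epochs
    return lr_0 * d ** (epoch // epoch_drops)

# Example: learning rates over the first few epochs
for epoch in range(5):
    print(epoch,
          exponential_decay(0.1, 0.5, epoch),
          time_based_decay(0.1, 0.5, epoch),
          step_decay(0.1, 0.5, epoch, epoch_drops=2))

Running the loop shows the characteristic shapes: the exponential schedule falls fastest at the start, the time-based schedule declines smoothly, and the step schedule stays flat for two epochs before each drop.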