Exploding Gradients Detection


Difficulty: 2 | Problem written by zeyad_omar


Gradient descent is an optimization algorithm that aims to find the weights that minimize the error. At each step it updates the weights by subtracting the gradient multiplied by a small constant (a step towards the bottom of the search space). This small constant is known as the learning rate (alpha).
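The update rule above can be sketched in a few lines of Python (the function name and the example values are illustrative, not part of the problem):

```python
import numpy as np

def gradient_descent_step(weights, grads, alpha=0.1):
    # One gradient descent update: move the weights a small
    # step in the direction opposite to the gradient.
    return weights - alpha * grads

w = np.array([1.0, -2.0])   # current weights
g = np.array([0.5, -0.5])   # gradients of the error w.r.t. the weights
print(gradient_descent_step(w, g))  # → [ 0.95 -1.95]
```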

How small should the learning rate be?

If alpha is too high, the model error diverges and the optimal weights will never be obtained. On the other hand, if alpha is too small, the learning process becomes very slow.

In this problem, you are given a 1D vector of the model's gradients over time, as well as a threshold. You should write a function that returns the index at which the gradients explode (reach too high a value): the first index at which the absolute difference between the previous gradient and the current gradient exceeds the given threshold.

Return -1 if the exploding gradients problem never occurs.

Sample Input:
grads (list): [1.01, 0.9, 1.56, 10.2, 30.25]
threshold (int): 15

Expected Output:
(int)

Comments
soumyajit • 6 months, 1 week ago


Absolute Difference?

uozcan12 • 2 months, 3 weeks ago


Yes, I think absolute difference should be used.


Note: numpy has already been imported as np (import numpy as np).