Explanation for Coordinate Descent and Subgradient
How can I get an easy explanation of coordinate descent and the subgradient solution in the context of lasso?
An intuitive explanation followed by a proof would be helpful.
Suppose you have a multivariate function F(W) with K variables/parameters w (w_1, w_2, w_3, ..., w_K). The parameters are the knobs, and the goal is to turn these knobs in a way that minimizes the function F.
Coordinate descent is a greedy method in the sense that on each iteration you change one parameter w_i at a time so as to decrease F. It is very easy to implement and, like gradient descent, it does not increase F on any iteration, so it steadily moves toward a local minimum.
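Written as a formula (a standard way to state the per-coordinate step; iteration superscripts are omitted for readability), one pass of coordinate descent updates each coordinate in turn while holding the others fixed:

$$
w_i \leftarrow \arg\min_{w_i} F(w_1, \dots, w_{i-1},\, w_i,\, w_{i+1}, \dots, w_K)
$$

Each such one-dimensional problem is usually much easier to solve than the full K-dimensional one, which is part of what makes the method attractive for problems like lasso.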
[Figure: picture borrowed from the Internet through a Bing image search.]
As shown in the picture above, the function F has two parameters x and y. On each iteration, one parameter at a time is changed by a fixed value c and the function is evaluated at the new point. If the value is higher (and the goal is to minimize the function), the change is reversed for that parameter. The same procedure is then repeated for the second parameter. Together, this makes up one iteration of the algorithm, as sketched in the code below.
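A minimal Python sketch of this fixed-step procedure, under the assumptions above (the function name coordinate_search, the step-size name, and the example objective at the end are hypothetical, made up for illustration):

```python
import numpy as np

def coordinate_search(F, w, step=0.1, n_iters=100):
    """Fixed-step coordinate search: perturb one coordinate at a time,
    keeping the change only if it does not increase F."""
    w = np.asarray(w, dtype=float).copy()
    best = F(w)
    for _ in range(n_iters):
        for i in range(len(w)):            # visit each knob in turn
            for delta in (step, -step):    # try +c first, then -c
                w[i] += delta
                value = F(w)
                if value < best:           # keep the move only if F decreased
                    best = value
                    break
                w[i] -= delta              # otherwise reverse the change
    return w

# Example: minimize F(x, y) = (x - 3)^2 + 2*(y + 1)^2
F = lambda w: (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2
print(coordinate_search(F, w=[0.0, 0.0]))  # approaches (3, -1) up to the step size
```

Note that this sketch only evaluates F, never its gradient, which ties in with the advantage mentioned next.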
An advantage of using coordinate descent is in problems where computing the gradient of the function is expensive.
Sources
Coordinate Descent
Gradient Descent