This script shows how to use the LBFGS minimizer with curvatures.
Authors: Pavel V. Afonine, Dorothee Liebschner
See the LBFGS script without curvatures for some background on the LBFGS algorithm.
To illustrate the use of the minimizer, we will use the Rosenbrock function (see also LBFGS script without curvature).
It is defined by:
\begin{equation} f(x,y) = (a-x)^2 + b(y-x^2)^2 \end{equation}We will use the coefficients a=1 and b=100:
\begin{equation} f(x,y) = (1-x)^2 +100(y-x^2)^2 \end{equation}The partial first derivatives are:
\begin{equation} \frac{\partial f(x,y)}{\partial x} = -2 (1-x) + 400 (-xy + x^3) \end{equation} \begin{equation} \frac{\partial f(x,y)}{\partial y} = 200(y - x^2) \end{equation}The partial second derivatives are (the diagonal elements of the Hessian matrix):
\begin{equation} \frac{\partial^2 f(x,y)}{\partial x ^2} = 2 + 400(-y+3x^2) \end{equation} \begin{equation} \frac{\partial^2 f(x,y)}{\partial y ^2} = 200 \end{equation}The global minimum is where the partial derivatives are equal to zero. For the Rosenbrock function with coefficients a=1 and b=100, this is at the coordinates: \( x = 1 \), \( y = 1 \). So the LBFGS minimizer should find coordinates equal (or very close) to this point. The function value of the Rosenbrock function at (1,1) is equal to 0.
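These properties are easy to verify numerically. A small self-contained Python check (independent of cctbx) confirms that the function value and both partial derivatives vanish at (1,1):

```python
def f(x, y):
    # Rosenbrock function with a=1, b=100.
    return (1 - x)**2 + 100 * (y - x**2)**2

def df_dx(x, y):
    return -2 * (1 - x) + 400 * (-x * y + x**3)

def df_dy(x, y):
    return 200 * (y - x**2)

# At the global minimum (1, 1), the value and both derivatives are zero.
print(f(1, 1), df_dx(1, 1), df_dy(1, 1))  # 0 0 0
```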
from __future__ import division, print_function
from scitbx import lbfgs as scitbx_lbfgs
from scitbx.array_family import flex
from libtbx import adopt_init_args

# Rosenbrock's function, gradients and curvatures
def target(x,y):
  return (1-x)**2+100*(y-x**2)**2

def grad_x(x,y):
  return -2*(1-x) + 400*(-x*y+x**3)

def grad_y(x,y):
  return 2*100*(y-x**2)

def curv_xx(x,y):
  return 2 + 400*(-y+3*x**2)

def curv_yy(x,y):
  return 200

def lbfgs_run(target_evaluator,
              min_iterations=0,
              max_iterations=None,
              traditional_convergence_test=True,
              use_curvatures=False):
  ext = scitbx_lbfgs.ext
  minimizer = ext.minimizer(target_evaluator.n)
  minimizer.error = None
  is_converged = ext.traditional_convergence_test(target_evaluator.n)
  try:
    icall = 0
    requests_f_and_g = True
    requests_diag = use_curvatures
    while 1:
      if (requests_f_and_g):
        icall += 1
      x, f, g, d = target_evaluator(
        requests_f_and_g=requests_f_and_g,
        requests_diag=requests_diag)
      if (use_curvatures):
        if (d is None): d = flex.double(x.size())
        have_request = minimizer.run(x, f, g, d)
      else:
        have_request = minimizer.run(x, f, g)
      if (have_request):
        requests_f_and_g = minimizer.requests_f_and_g()
        requests_diag = minimizer.requests_diag()
        continue
      assert not minimizer.requests_f_and_g()
      assert not minimizer.requests_diag()
      if (minimizer.iter() >= min_iterations and is_converged(x, g)): break
      if (max_iterations is not None and minimizer.iter() >= max_iterations):
        break
      if (use_curvatures):
        have_request = minimizer.run(x, f, g, d)
      else:
        have_request = minimizer.run(x, f, g)
      if (not have_request): break
      requests_f_and_g = minimizer.requests_f_and_g()
      requests_diag = minimizer.requests_diag()
  except RuntimeError as e:
    minimizer.error = str(e)
  minimizer.n_calls = icall
  return minimizer

class minimizer:

  def __init__(self, xx=-3, yy=-4, min_iterations=0, max_iterations=10000):
    adopt_init_args(self, locals())
    self.x = flex.double([xx, yy])
    self.n = self.x.size()

  def run(self, use_curvatures=False):
    self.minimizer = lbfgs_run(
      target_evaluator=self,
      min_iterations=self.min_iterations,
      max_iterations=self.max_iterations,
      use_curvatures=use_curvatures)
    self(requests_f_and_g=True, requests_diag=False)
    return self

  def __call__(self, requests_f_and_g, requests_diag):
    self.xx, self.yy = self.x
    if (not requests_f_and_g and not requests_diag):
      requests_f_and_g = True
      requests_diag = True
    if (requests_f_and_g):
      self.f = target(self.xx, self.yy)
      self.g = flex.double(
        (grad_x(self.xx, self.yy),
         grad_y(self.xx, self.yy)))
    self.d = None
    if (requests_diag):
      self.d = flex.double(
        (curv_xx(self.xx, self.yy),
         curv_yy(self.xx, self.yy)))
      assert self.d.all_ne(0)
      self.d = 1 / self.d
    return self.x, self.f, self.g, self.d

def run():
  for use_curvatures in (False, True):
    print("use_curvatures:", use_curvatures)
    m = minimizer().run(use_curvatures=use_curvatures)
    print(tuple(m.x), "final")
    if (abs(m.x[0]-1) > 1.e-4 or abs(m.x[1]-1) > 1.e-4):
      print(tuple(m.x), "failure, use_curvatures="+str(use_curvatures))
    print("iter,exception:", m.minimizer.iter(), m.minimizer.error)
    print("n_calls:", m.minimizer.n_calls)
    print()
    assert m.minimizer.n_calls == m.minimizer.nfun()

if (__name__ == "__main__"):
  run()
The script consists of several parts:

Imports: The import statements at the beginning of the script load the modules needed for our task.

Rosenbrock function: target returns the value of the Rosenbrock function, grad_x and grad_y return the partial first derivatives, and curv_xx and curv_yy return the diagonal elements of the Hessian matrix (the partial second derivatives).

Main function: The run() function calls the minimizer and prints the result to standard output.

Class minimizer: Class to perform minimization of the Rosenbrock function.

Function lbfgs_run: Helper function to run the scitbx minimizer.
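Since the hand-coded derivatives drive the entire minimization, it is worth checking them against central finite differences. The sketch below is self-contained (plain Python, no scitbx needed); the step sizes h1 and h2 are arbitrary choices:

```python
def target(x, y):
    return (1 - x)**2 + 100 * (y - x**2)**2

def grad_x(x, y):
    return -2 * (1 - x) + 400 * (-x * y + x**3)

def grad_y(x, y):
    return 200 * (y - x**2)

def curv_xx(x, y):
    return 2 + 400 * (-y + 3 * x**2)

def curv_yy(x, y):
    return 200

x, y = -3.0, -4.0    # the starting point used in the script
h1, h2 = 1e-5, 1e-3  # step sizes for first / second differences

# Central finite differences for the first and second derivatives.
num_gx = (target(x + h1, y) - target(x - h1, y)) / (2 * h1)
num_gy = (target(x, y + h1) - target(x, y - h1)) / (2 * h1)
num_cxx = (target(x + h2, y) - 2 * target(x, y) + target(x - h2, y)) / h2**2
num_cyy = (target(x, y + h2) - 2 * target(x, y) + target(x, y - h2)) / h2**2

for num, exact in ((num_gx, grad_x(x, y)), (num_gy, grad_y(x, y)),
                   (num_cxx, curv_xx(x, y)), (num_cyy, curv_yy(x, y))):
    assert abs(num - exact) < 1e-3 * (1 + abs(exact))
```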
The run() function executes the run() method of the minimizer class, once without curvatures and once with curvatures:

m = minimizer().run(use_curvatures=use_curvatures)

This launches the minimization. For both computations, we print some results to standard output.
The minimizer class has three methods: __init__(), run() and __call__().

The __init__() method initializes the class with the starting point (-3, -4) (xx and yy) and sets the minimum and maximum number of iterations to 0 and 10,000, respectively. x is a two-dimensional array representing the x and y coordinates, so x[0] = -3 and x[1] = -4 is the (arbitrarily chosen) starting point of the minimization task. self.n is the length of the array x.
The run() method starts the L-BFGS minimizer:

self.minimizer = lbfgs_run(
  target_evaluator=self,
  min_iterations=self.min_iterations,
  max_iterations=self.max_iterations,
  use_curvatures=use_curvatures)

Then it executes the __call__() method:

self(requests_f_and_g=True, requests_diag=False)

The __call__() method allows an instance of the minimizer class to behave like a function, i.e. it can be called like a function; minimizer(requests_f_and_g=True, requests_diag=False) is then shorthand for minimizer.__call__(requests_f_and_g=True, requests_diag=False).
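The callable-instance pattern itself has nothing to do with scitbx; here is a minimal toy example (the class name and parameter are made up for illustration):

```python
class Evaluator:
    """Toy illustration of a callable instance (not the scitbx class)."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __call__(self, requests_f_and_g):
        # Return the Rosenbrock value only when it is requested.
        if requests_f_and_g:
            return (1 - self.x)**2 + 100 * (self.y - self.x**2)**2
        return None

e = Evaluator(1, 1)
# e(...) is shorthand for e.__call__(...).
print(e(requests_f_and_g=True))  # 0
```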
Depending on the values of the boolean parameters requests_f_and_g and requests_diag, the __call__() method updates the target value and the gradients, and/or the diagonal elements of the Hessian matrix (second derivatives). Note that it actually stores the inverse of the diagonal elements (the Newton method uses the inverse of the Hessian matrix, which the LBFGS algorithm approximates):

assert self.d.all_ne(0)
self.d = 1 / self.d

(As we are calculating the inverse, we first assert that the diagonal elements are different from zero.)
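To see why the inverse diagonal elements are the useful quantity, consider a diagonal Newton step, which multiplies each gradient component by the corresponding inverse curvature. On a separable quadratic (chosen here so that the diagonal Hessian is exact, unlike for the Rosenbrock function), a single such step lands on the minimum:

```python
# Separable quadratic: f(x, y) = x**2 + 100 * y**2.
# Gradient: (2x, 200y); diagonal of the Hessian: (2, 200).
x, y = 3.0, -4.0
gx, gy = 2 * x, 200 * y
inv_dxx, inv_dyy = 1.0 / 2, 1.0 / 200

# One Newton-like step, scaling each gradient component by the
# inverse curvature, reaches the minimum (0, 0) exactly.
x -= inv_dxx * gx
y -= inv_dyy * gy
assert (x, y) == (0.0, 0.0)
```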
Here, we import the lbfgs module from scitbx:

ext = scitbx_lbfgs.ext

Then we instantiate the minimizer from lbfgs with the dimension of our function (in our case: n=2):

minimizer = ext.minimizer(target_evaluator.n)

Then we get a function to check if the minimization has converged. Its arguments are the current coordinates and the gradients:

is_converged = ext.traditional_convergence_test(target_evaluator.n)

Now we can start our minimization loop. We get the values of the booleans requests_f_and_g and requests_diag from the scitbx minimizer; their starting values are True and the value of use_curvatures, respectively. We get the coordinates, the function value, the gradients and the curvatures from our minimizer class:

x, f, g, d = target_evaluator(
  requests_f_and_g=requests_f_and_g,
  requests_diag=requests_diag)

Then we run the scitbx minimizer with the coordinates, the function value and the gradients as input parameters. If we are using curvatures, we also supply the second derivatives. The minimizer then reports whether it requests function values and gradients and/or curvatures:

requests_f_and_g = minimizer.requests_f_and_g()
requests_diag = minimizer.requests_diag()

If the minimizer requests values, we continue in the loop. Otherwise, we check if we meet the convergence criteria. Finally, the function returns the scitbx minimizer object.
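The overall control flow is a reverse-communication pattern: the minimizer hands requests back to the caller, which supplies values until no request remains. The mock class below is a made-up stand-in (not the scitbx API) that only illustrates the loop shape, using a 1-D toy target:

```python
class MockMinimizer:
    """Hypothetical stand-in for ext.minimizer; only mimics the request loop."""
    def __init__(self, n_evaluations=3):
        self._remaining = n_evaluations

    def run(self, x, f, g):
        # True means: the minimizer requests more information.
        self._remaining -= 1
        return self._remaining > 0

    def requests_f_and_g(self):
        return self._remaining > 0

def evaluator(x):
    # 1-D toy target: f(x) = (1 - x)**2 and its derivative.
    return (1 - x)**2, -2 * (1 - x)

minimizer = MockMinimizer()
x = -3.0
requests_f_and_g = True
n_calls = 0
while True:
    if requests_f_and_g:
        f, g = evaluator(x)
        n_calls += 1
    if not minimizer.run(x, f, g):
        break
    requests_f_and_g = minimizer.requests_f_and_g()
print(n_calls)  # 3
```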
The run() function prints the final coordinates, the number of iterations performed by the scitbx minimizer, the number of evaluator calls, and an exception message, if present.
use_curvatures: False
(0.9999998308201578, 0.9999996829964546) final
iter,exception: 27 None
n_calls: 34
use_curvatures: True
(1.0000002135019253, 1.000000406037093) final
iter,exception: 43 None
n_calls: 60
Both methods converged to the minimum at (1,1).
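The printed coordinates can be double-checked without cctbx by evaluating the gradient there; at both solutions it is very close to zero (well below an arbitrary tolerance of 1e-3):

```python
def grad(x, y):
    # Partial first derivatives of the Rosenbrock function (a=1, b=100).
    return (-2 * (1 - x) + 400 * (-x * y + x**3),
            200 * (y - x**2))

for x, y in ((0.9999998308201578, 0.9999996829964546),  # use_curvatures=False
             (1.0000002135019253, 1.000000406037093)):  # use_curvatures=True
    gx, gy = grad(x, y)
    assert abs(gx) < 1e-3 and abs(gy) < 1e-3
```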