multimin: an interface for GSL optimization routines
Introduction
multimin is an interface to the GNU Scientific Library (GSL)
minimization routines. It is a simple C function: when invoked, all
the information necessary to perform the minimization, that is, the
objective function, possibly its derivatives, the initial conditions
and the minimization options, is passed as formal parameters.
The main advantage is that multimin automatically handles different
kinds of boundary conditions (open, closed, semi-closed). This is
done by transforming the problem using a change of coordinates. For
instance, assume one is interested in the minimum \(x_0\) of the
function \(f(x)\) on the closed interval \([a,b]\). Considering the
change of variable
\[ x=h(y)=\frac{a+b}{2} + \frac{b-a}{2} \, \sin(y) \]
one can restate the problem above as finding the minimum \(y_0\) of the function \(f(h(y))\) on the open real line. Once the minimum (or rather, a minimum) has been found, the solution of the original problem is recovered as \(x_0=h(y_0)\). The list of the implemented transformations can be found below.
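As a concrete illustration (a sketch for exposition only: multimin applies these transformations internally), the change of variable above could be coded as

#include <math.h>

/* Map the unconstrained variable y to x in the closed interval [a,b]. */
double h(double y, double a, double b)
{
  const double SS = (a + b) / 2.0; /* interval midpoint   */
  const double SD = (b - a) / 2.0; /* interval half-width */
  return SS + SD * sin(y);
}

Since \(\sin(y)\) spans \([-1,1]\) while \(y\) runs over the whole real line, \(h(y)\) spans exactly \([a,b]\).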
Using multimin one can safely forget about the details of the inner
workings of the GSL minimization routines. The pointer to the
objective function to be minimized and, when required, its
derivatives, are passed as parameters, so the only programming effort
required from the user is to properly code these functions. The C
function returning the value of the objective function at the point
\(x\) should be coded as
void (*f) (const size_t n, const double *x, void *fparams, double *fval)
where n is the dimension of the problem (the number of variables),
fval is the return value, x is an array of type double of dimension n
containing the coordinates at which the objective function has to be
computed, and fparams is a pointer to a list of parameters, distinct
from the variables, on which the function depends. The function
returning the derivatives of the objective is coded similarly (see
below).
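For instance, a minimal sketch of an objective function computing \(f(x)=\sum_i x_i^2\) (an illustrative example, not part of the library) could read

#include <stddef.h>

/* Objective f(x) = x[0]^2 + ... + x[n-1]^2 */
void f(const size_t n, const double *x, void *fparams, double *fval)
{
  size_t i;
  double sum = 0.0;

  (void) fparams; /* no external parameters needed here */
  for (i = 0; i < n; i++)
    sum += x[i] * x[i];
  *fval = sum;
}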
Download and install
Download the files multimin.c and multimin.h and copy them in the
directory of your choice. The GNU Scientific Library must be properly
installed on your system for multimin to work. To compile a program
using multimin, remember to include the appropriate header file
multimin.h. To test the function, download the file multimin_test.c
and compile it using
gcc -Wall multimin_test.c multimin.c -lgsl -lgslcblas -o multimin_test
The example is discussed in the section below. For a list of recent
additions to multimin, check the ChangeLog file distributed with the
software.
Calling convention
Let's now analyze the calling convention in detail:
multimin(size_t n, double *x, double *fmin, unsigned *type,
         double *xmin, double *xmax,
         void (*f) (const size_t, const double *, void *, double *),
         void (*df) (const size_t, const double *, void *, double *),
         void (*fdf) (const size_t, const double *, void *, double *, double *),
         void *fparams, const struct multimin_params oparams)
The list of parameters is as follows (INPUT stands for the input value of a parameter, while OUTPUT indicates its value when the function returns):
n
- INPUT: dimension of the problem, i.e. the number of independent variables of the function.
x
- INPUT: pointer to an array of n values x[0],...,x[n-1] containing the initial estimate of the minimum point. OUTPUT: contains the final estimate of the minimum position.
fmin
- OUTPUT: returns the value of the function at the minimum.
type
- INPUT: pointer to an array of n integers type[0],...,type[n-1] describing the boundary conditions for the different variables. The problem is solved as an unconstrained one on a suitably transformed variable y. The possible values are listed in Table 1.

Table 1: Available transformations

value  interval              function
0      unconstrained         \(x=y\)
1      \([x_m,+\infty)\)     \(x=x_m+y^2\)
2      \((-\infty,x_M]\)     \(x=x_M-y^2\)
3      \([x_m,x_M]\)         \(x=SS+SD\,\sin(y)\)
4      \((x_m,+\infty)\)     \(x=x_m+\exp(y)\)
5      \((-\infty,x_M)\)     \(x=x_M-\exp(y)\)
6      \((x_m,x_M)\)         \(x=SS+SD\,\tanh(y)\)
7*     \((x_m,x_M)\)         \(x=SS+SD\,(1+y/\sqrt{1+y^2})\)
8*     \((x_m,+\infty)\)     \(x=x_m+.5\,(y+\sqrt{1+y^2})\)
9*     \((-\infty,x_M)\)     \(x=x_M+.5\,(y-\sqrt{1+y^2})\)

where \(SS=(x_m+x_M)/2\) and \(SD=(x_M-x_m)/2\).

(*) These are UNSUPPORTED transformations: they have been used for testing purposes but are not recommended.
xmin, xmax
- INPUT: pointers to arrays of double containing, respectively, the lower and upper boundaries of the different variables. For a given variable, only the values implied by its type of constraint, as defined in type, are actually inspected.
f
- function that calculates the objective function at a specified point x. Its specification is
void (*f) (const size_t n, const double *x, void *fparams, double *fval)
where
n
- INPUT: the number of variables.
x
- INPUT: the point at which the function is required.
fparams
- INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
fval
- OUTPUT: the value of the objective function at the current point x.
df
- function that calculates the gradient of the objective function at a specified point x. Its specification is
void (*df) (const size_t n, const double *x, void *fparams, double *grad)
where
n
- INPUT: the number of variables.
x
- INPUT: the point at which the gradient is required.
fparams
- INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
grad
- OUTPUT: the values of the gradient of the objective function at the current point x are stored in grad[0],...,grad[n-1].
fdf
- function that calculates both the value and the gradient of the objective function at a specified point x. Its specification is
void (*fdf) (const size_t n, const double *x, void *fparams, double *fval, double *grad)
where
n
- INPUT: the number of variables.
x
- INPUT: the point at which the value and gradient are required.
fparams
- INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
fval
- OUTPUT: the value of the objective function at the current point x.
grad
- OUTPUT: the values of the gradient of the objective function at the current point x are stored in grad[0],...,grad[n-1].
fparams
- INPUT: pointer to a structure containing the parameters required by the objective function and its derivatives. If no external parameters are required it can be set to NULL.
oparams
- INPUT: structure of type "multimin_params" containing the optimization parameters (a sketch showing how to fill it follows this list). The members are
double step_size
- size of the first trial step
double tol
- accuracy of the line minimization
unsigned maxiter
- maximum number of iterations
double epsabs
- accuracy of the minimization
double maxsize
- final size of the simplex
unsigned method
- method to use. Possible values are:
0 - Fletcher-Reeves conjugate gradient
1 - Polak-Ribiere conjugate gradient
2 - vector Broyden-Fletcher-Goldfarb-Shanno method
3 - steepest descent algorithm
4 - Nelder-Mead simplex
5 - vector Broyden-Fletcher-Goldfarb-Shanno method ver. 2
6 - simplex algorithm of Nelder and Mead ver. 2
7 - simplex algorithm of Nelder and Mead: random initialization
unsigned verbosity
- if greater than 0 print information on intermediate steps
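As mentioned above, here is a minimal sketch of how the options structure can be filled, using the member names listed above (the specific values are only illustrative):

struct multimin_params op;
op.step_size = 0.1;  /* size of the first trial step      */
op.tol       = 1e-2; /* accuracy of the line minimization */
op.maxiter   = 100;  /* maximum number of iterations      */
op.epsabs    = 1e-3; /* accuracy of the minimization      */
op.maxsize   = 1e-5; /* final size of the simplex         */
op.method    = 2;    /* vector BFGS method                */
op.verbosity = 0;    /* no info on intermediate steps     */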
Example
Consider the same example described in the GSL manual: a two-dimensional paraboloid centered in \((a,b)\), with scale factors \((A,B)\) and minimum value \(C\):
\[ f(x,y) = A (x-a)^2 + B (y-b)^2 + C \;. \]
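The gradient that the derivative functions must return is then
\[ \frac{\partial f}{\partial x} = 2A\,(x-a) \;, \qquad \frac{\partial f}{\partial y} = 2B\,(y-b) \;. \]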
Assume the parameters \((a,b,A,B,C)\) are stored, in this order, in a
five-dimensional array passed to the function via fparams and
recovered there as p. The function returning the value of the
objective can be coded as follows
void f(const size_t n, const double *x, void *fparams, double *fval)
{
  double *p = (double *) fparams; /* (a,b,A,B,C) */

  *fval = p[2]*(x[0]-p[0])*(x[0]-p[0]) +
          p[3]*(x[1]-p[1])*(x[1]-p[1]) + p[4];
}
while the function returning the derivatives reads
void df(const size_t n, const double *x, void *fparams, double *grad)
{
  double *p = (double *) fparams;

  grad[0] = 2*p[2]*(x[0]-p[0]);
  grad[1] = 2*p[3]*(x[1]-p[1]);
}
and, in addition, one needs a function returning both the value of the objective and its gradient at a given point
void fdf(const size_t n, const double *x, void *fparams, double *fval, double *grad)
{
  double *p = (double *) fparams;

  *fval = p[2]*(x[0]-p[0])*(x[0]-p[0]) +
          p[3]*(x[1]-p[1])*(x[1]-p[1]) + p[4];
  grad[0] = 2*p[2]*(x[0]-p[0]);
  grad[1] = 2*p[3]*(x[1]-p[1]);
}
Once these functions are defined, the program to minimize the objective can simply be
#include <stdio.h>
#include "multimin.h"

int main(){
  double par[5] = { 1.0, 2.0, 10.0, 20.0, 30.0 }; /* (a,b,A,B,C) */
  double x[2] = {0, 0};   /* initial estimate */
  double minimum;
  double xmin[2], xmax[2];
  unsigned type[2];
  struct multimin_params optim_par = {.1, 1e-2, 100, 1e-3, 1e-5, 2, 0};

  /* unconstrained minimization */
  multimin(2, x, &minimum, NULL, NULL, NULL, &f, &df, &fdf, (void *) par, optim_par);
  printf("unconstrained minimum %e at x=%e y=%e\n", minimum, x[0], x[1]);

  /* minimum constrained in [-1,2] x (-inf,5] */
  type[0] = 3; xmin[0] = -1; xmax[0] = 2; x[0] = 0;
  type[1] = 2; xmin[1] = -5; xmax[1] = 5; x[1] = 0; /* type 2: only xmax[1] is inspected */

  /* increase verbosity */
  optim_par.verbosity = 1;

  multimin(2, x, &minimum, type, xmin, xmax, &f, &df, &fdf, par, optim_par);
  printf("constrained minimum %e at x=%e y=%e\n", minimum, x[0], x[1]);

  return 0;
}
Notice that in the unconstrained minimization NULL pointers have been
passed in place of the type, xmin and xmax arrays. To search for a
constrained minimum, these arrays have been set accordingly. Of
course, in this case the two minima coincide: the unconstrained
minimum \((a,b)=(1,2)\) already satisfies the constraints, so both
searches converge to the same point, where the objective takes the
value \(C=30\).