
Simultaneous solution of multiple equations for multiple unknowns is a daunting task, in general. Many of the solution strategies from the solution of single equations in single unknowns (Newton's method, for example) can be applied, but the possibility of no solution or of multiple solutions is very much greater.


The subject breaks naturally into two divisions:

- solution of sets of *linear* equations
- solution of sets of *non-linear* equations

No attempt is made here to provide a tutorial on solution of equations. Rather, some typical examples are presented and some conclusions are indicated. If you find yourself much in need of a tool in this area, consider the routines in the Optimization Toolbox (provided by the MathWorks).

Finally, before going too far with the material in this section, you should

- know how to write functions in MATLAB.
- be familiar with the solution of a single equation in a single unknown.
- have a good familiarity with arrays and linear algebra.

If the equations you wish to solve are linear, then the methods of linear algebra are the way to go. There are two cases that are usually of interest:

***N* equations in *N* unknowns**

**Example**: Interpolating polynomial of degree *N* - 1 passing through *N* distinct points.

Given: *N* distinct *x*-*y* pairs (*x*_{j}, *y*_{j}), (*j* = 1:*N*)

Find: *P*_{n}(*x*) such that *P*_{n}(*x*_{j}) = *y*_{j} and *n* = *N* - 1.

For each *x*-*y* pair (*j* = 1:*N*), the equation

*c*_{1} *x*_{j}^{N-1} + *c*_{2} *x*_{j}^{N-2} + ... + *c*_{N-1} *x*_{j} + *c*_{N} = *y*_{j}

holds, since the interpolating polynomial must pass *exactly* through all the points in the given set. The goal is to use this set of *N* equations to determine the *N* unknown coefficients *c*_{k} (*k* = 1:*N*) of the polynomial.

If we create a set of coefficients *a*_{jk} = *x*_{j}^{N-k} (i.e., the *j*-th *x*-value raised to the *N* - *k* power), then the set of equations may be written as *A***c** = **y**, or

**f**(**c**) = *A***c** - **y** = **0**

where *A* is the matrix of coefficients just defined, **c** is an *N*-by-1 vector of the unknown coefficients, and **y** is a vector that contains the *y*-values from the given data.

For example, find the coefficients of the parabola (*N* = 3) that passes through the points (-1,3), (1,2) and (2,5):

    x = [-1 1 2]'                % x-values
    y = [3 2 5]'                 % y-values
    A = [x.^2 x ones(size(x))]   % coefficient matrix
    c = A\y                      % the coefficients
    check = polyfit(x,y,2)       % apply polyfit
    c - check'                   % are they the same?

Except for the shape of the coefficient vector, the solution produced by `polyfit` is the same (within roundoff) as that obtained using matrix division.
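As an aside, MATLAB can build the coefficient matrix for you: the built-in `vander` function returns the Vandermonde matrix whose columns are the decreasing powers of the *x*-values, which is exactly the matrix constructed by hand above. A minimal sketch, using the same three points:

```matlab
x = [-1 1 2]';   % x-values of the three points
y = [3 2 5]';    % y-values
A = vander(x);   % same as [x.^2 x ones(size(x))]
c = A\y          % coefficients in descending powers, as polyfit orders them
```

Because `vander` orders its columns in decreasing powers, the resulting coefficient vector can be compared directly (after transposing) with the row vector returned by `polyfit(x,y,2)`.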

***M* equations in *N* unknowns (with *M* > *N*)**

**Example**: Fitting functions other than polynomials (generalization of `polyfit`)

When you use `polyfit` to find the unknown coefficients in the fit of a polynomial of specified order to a set of data, you are carrying out a particular case of linear regression. More generally, given the functions *f*_{1}(*x*), *f*_{2}(*x*), ..., *f*_{N}(*x*), the linear model

*c*_{1} *f*_{1}(*x*) + *c*_{2} *f*_{2}(*x*) + ... + *c*_{N} *f*_{N}(*x*) = *y*

can be fit to a set of *x*-*y* data ((*x*_{j}, *y*_{j}), *j* = 1:*M*) by determining the unknown coefficients *c*_{k} (*k* = 1:*N*). The number of data pairs (*M*) is generally much greater than the number of fitting functions (*N*).

Inserting each of the *M* *x*-*y* pairs into the expression for the linear model given above produces the following set of *M* equations

*c*_{1} *f*_{1}(*x*_{j}) + *c*_{2} *f*_{2}(*x*_{j}) + ... + *c*_{N} *f*_{N}(*x*_{j}) = *y*_{j} (*j* = 1:*M*)

which constitutes a set of equations that are *linear in the unknown coefficients*, *c*_{k}. Note that there is no restriction on the linearity of the *functions* chosen for the model. What is important here is that the functions are *combined* linearly.

Defining the elements of an *M*-by-*N* matrix *A* as *a*_{jk} = *f*_{k}(*x*_{j}), the equations above may be written in compact matrix form as *A***c** = **y**, which is the same **f**(**c**) that was found for the case of polynomial interpolation in the *M* = *N* case. The only difference here is that *M* > *N*, and so there will not, in general, be an exact solution.

To generate a unique solution, the hope that **f**(**c**) be exactly zero is given up and some additional criterion for assessing the quality of the solution is imposed. A common criterion is that the coefficients *c*_{k} are chosen such that the residual **f**(**c**) = *A***c** - **y** is minimized in some sense. If the square of the residual (**f**^{T}**f**) is used to guide the minimization, then it may be shown that the coefficients satisfy the *N*-by-*N* set of equations *B***c** = **d**, where *B* = *A*^{T}*A* and **d** = *A*^{T}**y**. These are the so-called "normal" equations for the given system. The solution in this case (using MATLAB) is simply

    c = A\y

Note that the matrix *B* and the vector **d** need not (and should not) be formed.

As a specific example, consider fitting the function

*y* = *c*_{1} + *c*_{2} *x* + *c*_{3} sin(*x*)

to a set of synthetic data (a model with error added):

    x = linspace(0,5,50)';         % x-values for the data set
    y = 1 + 3*x - 2*sin(x);        % the answer
    noise = 0.5*randn(size(x));    % some noise to throw in
    y = y + noise;                 % the disguised answer
    A = [ones(size(x)) x sin(x)];  % coefficient matrix
    c = A\y                        % the solution: is it [1 3 -2]'?

(You should be able to paste the above lines right into the MATLAB command window.) A graphical look at the quality of the solution can also be generated:

    xi = linspace(0,5,200);               % plotting coordinates
    yi = c(1) + c(2)*xi + c(3)*sin(xi);   % evaluate the fitted model
    plot(x,y,'o',xi,yi)                   % display a comparison

Strictly speaking, this particular type of problem is not root-finding since the function **f**(**c**) is not **0** at the solution. Rather, a feature of the solution (the minimum of its residual) is found. In a sense, however, a root-find did occur: at the minimum of a function, the derivative is zero. Hence, for this problem a root of the derivative of the residual was found.
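To see why the normal equations need not be formed, the two routes can be compared directly. The sketch below (reusing the synthetic-data idea from above) solves the same least-squares problem both ways; the answers agree to roundoff, but the backslash route works with *A* itself rather than *A*^{T}*A*, whose condition number is the square of that of *A*. For well-conditioned problems like this one the difference is invisible; for ill-conditioned fits it is not.

```matlab
x = linspace(0,5,50)';                       % x-values
y = 1 + 3*x - 2*sin(x) + 0.5*randn(size(x)); % noisy synthetic data
A = [ones(size(x)) x sin(x)];                % coefficient matrix
c1 = A\y;                                    % least squares via backslash (preferred)
c2 = (A'*A)\(A'*y);                          % explicitly formed normal equations
norm(c1 - c2)                                % tiny: the two routes agree here
```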

Solution of sets of non-linear equations is a difficult task and MATLAB
does not have a built-in solver for more than one variable (though the
Optimization Toolbox does have one). Presently, you can retrieve
`fsolve35.m` from the MathWorks FTP site
(`ftp://ftp.mathworks.com/contrib`) that will give you some
functionality in this area. The focus here will again be on examples that
illustrate the way to set up and deal with these sorts of problems.

As with the linear case (above), two major cases can be distinguished:

- *N* equations in *N* unknowns (a true root-finding problem)
- more equations than unknowns (a data-fitting/minimization problem)

**Example**: Newton's method for *N* equations in *N* unknowns

One way of deriving the up-dating procedure for the 1-dimensional
version of Newton's method is to write the Taylor-series representation of
the function and then truncate the expression. For the
*N*-dimensional case, the same approach can be used but keeping track
of the differentiation is most effectively done with matrix notation.

At the point **x**_{o}, the function **f**(**x**) can
be expanded in a Taylor series as

**f**(**x**) =
**f**(**x**_{o}) +
*J* (**x** - **x**_{o}) +
higher-order terms

where *J* is a matrix of first derivatives of the function, called the
*Jacobian matrix*. The elements of this matrix are

*J*_{ij} =
partial derivative of *f*_{i} with respect to *x*_{j}

Truncating this expansion at the first derivative and setting this
expression to zero (which is what we would like **f**(**x**) to be
when the root is found) gives

**0** = **f**(**x**_{o}) +
*J* (**x** - **x**_{o})

which, on re-arrangement, leads to the following estimate of where the root can be found.

**x** = **x**_{o} -
*J*^{-1} **f**(**x**_{o})

If the function at this new estimate of the root is not close enough to
zero, the estimate can replace **x**_{o} and thus generate a new estimate.

A simple implementation of this version of Newton's method follows:

- Write a function which, when provided with a vector of *x*-values, computes the *N* elements of the function as well as the *N*-by-*N* Jacobian matrix. A typical definition line might look like

        function [f,J] = myfun(x)
        %MYFUN Returns the function (f) and Jacobian (J) given x

- Use that function in a version of the following script:

        tolerance = 1e-6;
        xo = ...  % a column vector of initial values
        [f,J] = myfun(xo);
        x = xo - J\f;
        while norm(x-xo) > tolerance
            xo = x;
            [f,J] = myfun(xo);
            x = xo - J\f;
        end

While this is not a "production" version of a robust root-finder, it is simple enough to incorporate in other functions or improve upon to create a robust root-finder. Aside from its simplistic nature, the method suffers from the need for you to provide a matrix of first-derivatives, a time-consuming if not impossible task. For "small" problems, however, the approach may be feasible.
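As a concrete illustration of the script above, here is a small worked case. The system (*x*_{1}^{2} + *x*_{2}^{2} = 4, *x*_{1} *x*_{2} = 1) and the starting point are hypothetical choices made for this sketch; anonymous functions stand in for the separate M-file `myfun` described above (older MATLAB releases would require the M-file form).

```matlab
% f(x) = [x1^2 + x2^2 - 4; x1*x2 - 1], with its 2-by-2 Jacobian
f = @(x) [x(1)^2 + x(2)^2 - 4; x(1)*x(2) - 1];
J = @(x) [2*x(1) 2*x(2); x(2) x(1)];

tolerance = 1e-6;
xo = [2; 0.3];            % initial guess (illustrative)
x = xo - J(xo)\f(xo);     % first Newton step
while norm(x - xo) > tolerance
    xo = x;
    x = xo - J(xo)\f(xo); % repeat until the iterates stop moving
end
x                          % a root: f(x) is (nearly) zero here
```

From this starting point the iteration converges in a handful of steps to a point where both equations are satisfied to roundoff.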

**Example**: Fitting a multi-exponential decay

A fit of the expression

*y* = *a*_{1} exp(*b*_{1} *x*) +
*a*_{2} exp(*b*_{2} *x*)

to a set of data points ((*x*_{j},*y*_{j}),
*j* = 1:*M*) occurs in many guises in engineering and physics.
If the number of points involved is greater than 4 (and it generally should
be) an exact determination of the parameters of the model
(*a*_{k}, *b*_{k}, *k* = 1:2) is not
possible. In such an instance, determination of the parameters is usually
found by minimizing the difference between the prediction (generated by
evaluating the function using an assumed set of parameters) and the actual
*y*-values from the data.

The degree of mismatch between the model and the data is often characterized by the sum of the squares of the individual difference for each data point. Specifically, the "objective function" to be minimized is

*f* = *d*_{1}^{2} +
*d*_{2}^{2} + ... +
*d*_{M}^{2}

where *d*_{j}^{2} = {*y*_{j} - (*a*_{1} exp(*b*_{1} *x*_{j}) + *a*_{2} exp(*b*_{2} *x*_{j}))}^{2}. While not strictly a
root-finding problem, this data-fitting/optimization problem can be
approached as a root-finding problem from the point of view of its first
derivative (which is zero at the minimum of the function defined above).

The built-in function `fmins` (renamed `fminsearch` in later MATLAB releases) can be used to carry out this minimization. The basic steps are:

- Write a function M-file that evaluates the objective function defined
above:

        function sse = expmodel(p,x,y)
        %EXPMODEL Fit of a 2-term exponential model to a set of data
        %   The parameters are in p (4-by-1 vector) and assumed to
        %   be in the following order:
        a1 = p(1); b1 = p(2);
        a2 = p(3); b2 = p(4);
        ypred = a1*exp(b1*x) + a2*exp(b2*x);  % predictions
        err = y - ypred;                      % mismatches
        sse = sum(err.^2);                    % residual measure

- Load in a set of *x*-*y* data and use `fmins` to find the values of the parameters that minimize the residual measure given above. The following script gives an idea of the direction to head in:

        x = ...   % an M-by-1 vector of x-values
        y = ...   % an M-by-1 vector of y-values
        po = ...  % a 4-by-1 vector of initial estimates of the parameters
        opt = foptions;
        p = fmins('expmodel',po,opt,[],x,y);

The variable `opt` is used by `fmins` to customize the performance and accuracy of the computation. See the on-line help information on `foptions` for your options here.
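In later MATLAB releases, `fmins` and `foptions` were replaced by `fminsearch` and `optimset`. A sketch of the same fit in the newer syntax follows; since no data file is assumed here, the data and "true" parameter values are made up for illustration, and the objective is written as an anonymous function rather than the M-file above.

```matlab
x = linspace(0,2,30)';                  % synthetic x-values (illustrative)
y = 2*exp(-1.5*x) + 0.5*exp(-0.2*x);    % synthetic data from a 2-term model
% sum-of-squares objective, parameters ordered [a1 b1 a2 b2]
expmodel = @(p) sum((y - (p(1)*exp(p(2)*x) + p(3)*exp(p(4)*x))).^2);
po = [1; -1; 1; -1];                    % initial estimates of the parameters
p = fminsearch(expmodel, po)            % minimize the residual measure
```

The returned `p` gives a (local) minimizer of the residual; as with any multi-exponential fit, the quality of the result depends strongly on the initial estimates in `po`.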

You should be able to see the obvious generalization of this approach
to fitting functions other than the one focused on here. When
constructing M-files for use with `fmins`, you can either
include the data set as extra parameters (as was done above) or you can
eliminate the two extra input arguments and put the data in the M-file
directly. See the discussion of the syntax
options for `fzero` for additional information.


Comments? Contact Jim Maneval at