The application of a mathematical
formula to approximate the behavior of a physical system is frequently
encountered in the laboratory. The most common such approximation is the
fitting of a straight line to a collection of data. This is usually done using
a method called "least squares," which is described in the following
section.

Consider the data shown in Figure 1 and in Table
1. These data appear to have a roughly linear relationship
between the abscissa (*x*) and ordinate
(*y*) values. If we pick some arbitrary
input value, *X*, repeatedly apply this input, and record the output each time, we will observe variability of the output
about some mean ordinate, *Y*. This is shown pictorially, in a more abstract
form, by the normal distribution at *X*.

Figure 1. A normal distribution of observations for a fixed input value.

There are a total of *N* observations of a *y*-value for the input *x*. If we assume the data may be
represented mathematically by the equation for a straight line, we write

$$y = a + bx \qquad \text{Equation 1}$$

By carefully selecting the two parameters *a*
and *b*, we may find an equation which closely imitates the relationship
between *y* and *x* portrayed in Figure 1.

Table 1. A small linear dataset used to construct the
previous figure.

Unfortunately, for any given input *x_i*, the observation *y_i* deviates from the value predicted by the line by some error *e_i*:

$$y_i = a + b x_i + e_i \qquad \text{Equation 2}$$

In an effort to find the **best** values of
*a* and *b* which minimize the errors, we might take the derivative of the total error (the summation
of all the *e_i*) with respect to *a* and *b*, set these derivatives equal
to zero, and solve the resulting simultaneous equations. Unfortunately, a simple summation of the
errors is not an adequate measure, since opposing positive and negative errors
can cancel one another. In fact, if we define the mean values *X* and *Y* as

$$X = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad \text{Equation 3}$$

and

$$Y = \frac{1}{N}\sum_{i=1}^{N} y_i \qquad \text{Equation 4}$$

then any line going through the point (*X, Y*) has a zero total error, irrespective
of its slope. To avoid this problem, the measure of goodness of fit we normally
use is the sum of the squares of the errors. This is
written as

$$E = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N}\left(y_i - a - b x_i\right)^2 \qquad \text{Equation 5}$$

To minimize this value with respect to the choice of *a* and *b*, we equate the
derivatives of Equation 5 (with respect to *a* and *b*) to
zero and solve them simultaneously:

$$\frac{\partial E}{\partial a} = -2\sum_{i=1}^{N}\left(y_i - a - b x_i\right) = 0, \qquad \frac{\partial E}{\partial b} = -2\sum_{i=1}^{N} x_i\left(y_i - a - b x_i\right) = 0 \qquad \text{Equation 6}$$

It
can be shown that solving them simultaneously yields:

$$b = \frac{N\sum x_i y_i - \sum x_i \sum y_i}{N\sum x_i^2 - \left(\sum x_i\right)^2} \qquad \text{Equation 7}$$

and

$$a = \frac{\sum y_i - b\sum x_i}{N} = Y - bX \qquad \text{Equation 8}$$

The
above formulas represent the best values of *a* and *b* to minimize the sum of the square errors for the set of data
chosen. This approach is called the *Method
of Least Squares*.
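As a concrete illustration, the slope and intercept formulas above can be computed directly from a list of (x, y) pairs. The following is a minimal Python sketch (not part of the original handout; the function and variable names are our own):

```python
def linear_least_squares(xs, ys):
    """Fit y = a + b*x by the Method of Least Squares.

    Implements the closed-form solutions for the intercept a and
    slope b obtained by setting dE/da = dE/db = 0 (Equations 7 and 8).
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    a = (sy - b * sx) / n                           # intercept, a = Y - b*X
    return a, b

# Example: data generated from y = 2 + 3x is recovered exactly.
a, b = linear_least_squares([0, 1, 2, 3], [2, 5, 8, 11])
```

With noisy data the same formulas return the line minimizing the sum of squared errors rather than an exact fit.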

In cases where the observations are perfectly random, the variability in *y_i* is described by the normal distribution

$$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{\left(y - Y\right)^2}{2\sigma^2}\right] \qquad \text{Equation 9}$$

where

*f(y)* = relative frequency of observation *y* at *X*,

*σ* = standard deviation of *y* (±1σ contains ≈68.3% of observations), and

*σ²* = variance.
In Equation 9, we are assuming that the variance is independent of *x* and distributed
uniformly along *x*. If the errors arise from the
observations of *y*, then we write our model equation as in Equation 1.

Conversely,
if the errors are in *x*, then we write
our model equation as

$$x = a' + b'y \qquad \text{Equation 10}$$

A measure of the closeness of the data to our assumed linear expression is given
by Equation 11, where *b* is the slope of the fit of *y* on *x* and *b′* is the slope of the fit of *x* on *y*:

$$r^2 = b\,b' \qquad \text{Equation 11}$$

The value *r* is a statistical measure of the linearity of the curve fit and is called the correlation
coefficient (its square, *r²*, is called the coefficient of determination). When the fit is good, the
magnitude of *r* approaches unity.

In Figure 2, we have shown two curve fits, one assuming the errors
are in *x*, the other in *y*. Note that **any** line through the mean point (*X, Y*) of this data is as good a fit as any
other. In such a case, there is no
relationship between *x* and *y*, and the resulting correlation
coefficient is *r = 0*. If, however, every point lies on a straight line,
then the resulting correlation coefficient is *r = ±1*. Data having a
nonlinear shape will have a correlation coefficient whose magnitude is less
than unity by an amount related to the nonlinearity involved. Data which is essentially linearly related
but has a wide variability will likewise yield a magnitude of *r* between 0 and 1.

Figure 2. A set of x,y
coordinates which represent a statistically indeterminate relationship.

By finding a similar set of values for *y = a
+ bx* and *x = a′ + b′y*, we can solve Equation 11 to obtain:

$$r = \frac{N\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[N\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[N\sum y_i^2 - \left(\sum y_i\right)^2\right]}} \qquad \text{Equation 12}$$
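Equation 12 can likewise be evaluated directly from running sums. The sketch below (our own Python illustration, not from the handout) returns *r*; for perfectly linear data it gives *r* = 1.

```python
import math

def correlation_coefficient(xs, ys):
    """Compute r (Equation 12) from running sums of the data."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return num / den

# Points lying exactly on y = 1 + 2x give r = 1.
r = correlation_coefficient([1, 2, 3, 4], [3, 5, 7, 9])
```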

We
next define the standard error as follows

$$S_{yx} = \sqrt{\frac{\sum_{i=1}^{N}\left(y_i - a - b x_i\right)^2}{N - 2}} \qquad \text{Equation 13}$$

If we expand Equation 13, we obtain a “short form” solution for the standard
deviation of a predicted *y*-value for
a given *x*-value, as shown in Equation 14.

$$S_{yx} = \sqrt{\frac{\sum y_i^2 - a\sum y_i - b\sum x_i y_i}{N - 2}} \qquad \text{Equation 14}$$

In a similar manner, applying Equation 9 to the equations for the slope
and intercept yields the variability of the estimate of the zero intercept,

$$S_a = S_{yx}\sqrt{\frac{\sum x_i^2}{N\sum x_i^2 - \left(\sum x_i\right)^2}} \qquad \text{Equation 15}$$

The
error of the estimate of the slope is found to be

$$S_b = S_{yx}\sqrt{\frac{N}{N\sum x_i^2 - \left(\sum x_i\right)^2}} \qquad \text{Equation 16}$$
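The standard error and the resulting uncertainties of the intercept and slope (Equations 13, 15, and 16) can be sketched in Python as follows (our own illustration; the function name is an assumption). For data lying exactly on a line, all three quantities are zero:

```python
import math

def fit_uncertainties(xs, ys, a, b):
    """Standard error S_yx (Eq. 13) and standard deviations of the
    intercept (Eq. 15) and slope (Eq. 16) for the fit y = a + b*x."""
    n = len(xs)
    sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(sse / (n - 2))
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    d = n * sxx - sx * sx
    s_a = s_yx * math.sqrt(sxx / d)  # uncertainty of the intercept
    s_b = s_yx * math.sqrt(n / d)    # uncertainty of the slope
    return s_yx, s_a, s_b

# Perfectly linear data: every residual is zero, so all uncertainties vanish.
s_yx, s_a, s_b = fit_uncertainties([0, 1, 2, 3], [2, 5, 8, 11], a=2, b=3)
```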

**Linear Example**

Consider
the data shown in the left three columns of Table
2 below.

Table
2. Data and
analysis for a linear curve fit.

Here,
the standard deviations of the slope and intercept are high because of the
spread of the errors. The coefficient of
determination, *r²* = 0.925, however, tells us that a linear approximation
is a good fit to this data. The large
variability in the data and the small number of data points have resulted in a
large standard deviation of a y-estimate for a given value of x. The result of this analysis is depicted
graphically in Figure
3.

Figure 3. Linear curve fit showing lines at ±1 standard deviation.

One characteristic inferred in Figure 3 is that the errors are not only “normal” at any given
*x*, but that the standard deviation is constant over all *x*. Thus, at any given *x*, the fit predicts a value of *y = a + bx ± S_yx*.
In other words, if one took large amounts of data and plotted them on
Figure 3, then approximately 68.3% of all the data points would fall within the dotted lines
at ±1 standard deviation of the fitted line.

It
is important to note that the foregoing development is specific to a linear model
for the data. The method shown, however, is general. Assume that we wish to
represent the data with another (nonlinear) function, y = f(x). The total
square error is, therefore,

$$E = \sum_{i=1}^{N}\left[y_i - f(x_i)\right]^2 \qquad \text{Equation 17}$$

If
there are m parameters in the function f(x), then each derivative of the total squared
error with respect to each parameter must be equated to zero and the system of
equations solved for the parameters, λ_{i}.

$$\frac{\partial E}{\partial \lambda_i} = 0, \qquad i = 1 \ldots m \qquad \text{Equation 18}$$

Note that we continue to use the
Method of Least Squares, even though the function is not a linear one.
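To make Equation 17 concrete, the total squared error for any candidate model can be written as a small helper function; comparing E for different parameter choices is the essence of the method. This Python sketch is our own illustration, not from the handout:

```python
def total_squared_error(f, xs, ys):
    """Total squared error E = sum of (y_i - f(x_i))**2 (Equation 17)."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))

xs = [0, 1, 2, 3]
ys = [2, 5, 8, 11]

# The true model y = 2 + 3x has zero error; a perturbed model does not.
e_true = total_squared_error(lambda x: 2 + 3 * x, xs, ys)
e_bad = total_squared_error(lambda x: 2 + 2.5 * x, xs, ys)
```

Minimizing E over the parameters λ_i, whether analytically or numerically, is what Equation 18 expresses.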

**Exponential Curve Fit**

Suppose we have data that, when plotted, appear to have an exponential character. If we choose an exponential function to represent the data, we write

$$y = a e^{bx} \qquad \text{Equation 19}$$

In this formulation, a and b are
the λ_{i} parameters we need to find
that would best fit the function to the data.
Thus, we write for the total squared error:

$$E = \sum_{i=1}^{N}\left(y_i - a e^{b x_i}\right)^2 \qquad \text{Equation 20}$$

The two parameters are found by linearizing the model: taking the natural
logarithm of Equation 19 gives ln *y* = ln *a* + *bx*, which is linear in
*x*. Applying the linear least-squares formulas (Equations 7 and 8) to the
pairs (*x_i*, ln *y_i*) yields

$$b = \frac{N\sum x_i \ln y_i - \sum x_i \sum \ln y_i}{N\sum x_i^2 - \left(\sum x_i\right)^2} \qquad \text{Equation 21}$$

$$a = \exp\!\left(\frac{\sum \ln y_i - b\sum x_i}{N}\right) \qquad \text{Equation 22}$$

It can be shown that this yields a coefficient of determination of

$$r^2 = \frac{\left(N\sum x_i \ln y_i - \sum x_i \sum \ln y_i\right)^2}{\left[N\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[N\sum \left(\ln y_i\right)^2 - \left(\sum \ln y_i\right)^2\right]}$$

As in the linear case, a value of *r²* close to 1 indicates a “good fit”
of the model to the data.
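Because ln *y* = ln *a* + *bx*, the exponential fit reduces to a linear fit of ln *y* against *x*. A minimal Python sketch (our own illustration; it assumes all *y* values are positive so the logarithm is defined):

```python
import math

def exponential_fit(xs, ys):
    """Fit y = a * exp(b*x) by linear least squares on (x, ln y).

    Requires all y > 0 so the logarithm is defined.
    """
    n = len(xs)
    ls = [math.log(y) for y in ys]
    sx = sum(xs)
    sl = sum(ls)
    sxx = sum(x * x for x in xs)
    sxl = sum(x * l for x, l in zip(xs, ls))
    b = (n * sxl - sx * sl) / (n * sxx - sx * sx)   # exponent coefficient
    a = math.exp((sl - b * sx) / n)                 # amplitude
    return a, b

# Data generated from y = 2*exp(0.5*x) is recovered exactly.
xs = [0, 1, 2, 3]
ys = [2 * math.exp(0.5 * x) for x in xs]
a, b = exponential_fit(xs, ys)
```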

**Exponential Example: **Given the data in Table 3, find the appropriate exponential curve fit.

Table 3. Data for an exponential curve fit.

The results of this analysis are shown in the figure below.

Figure
4.
Results of an exponential curve fit.

Using the appropriate formulae for an exponential curve fit,
we obtain *a* = 3.45, *b* = −0.58, and a coefficient of determination of
*r²* = 0.98. As can be
seen in the above graph, the function fits well, as indicated by the closeness of *r²* to
1.

**Power Curve Fit**

Suppose we have data that, when plotted, appear to have a power-law character. If we choose a power function to represent the data, we write

$$y = a x^b \qquad \text{Equation 23}$$

As in the previous section, *a* and *b* are the λ_i parameters we
seek that would best fit the function to the data. Thus, we write for the total squared error:

$$E = \sum_{i=1}^{N}\left(y_i - a x_i^b\right)^2 \qquad \text{Equation 24}$$

The two parameters are again found by linearizing the model: taking the
natural logarithm of Equation 23 gives ln *y* = ln *a* + *b* ln *x*, which is
linear in ln *x*. Applying the linear least-squares formulas to the pairs
(ln *x_i*, ln *y_i*) yields

$$b = \frac{N\sum \ln x_i \ln y_i - \sum \ln x_i \sum \ln y_i}{N\sum \left(\ln x_i\right)^2 - \left(\sum \ln x_i\right)^2} \qquad \text{Equation 25}$$

$$a = \exp\!\left(\frac{\sum \ln y_i - b\sum \ln x_i}{N}\right) \qquad \text{Equation 26}$$

It can be shown that this yields a Coefficient of Determination of

$$r^2 = \frac{\left(N\sum \ln x_i \ln y_i - \sum \ln x_i \sum \ln y_i\right)^2}{\left[N\sum \left(\ln x_i\right)^2 - \left(\sum \ln x_i\right)^2\right]\left[N\sum \left(\ln y_i\right)^2 - \left(\sum \ln y_i\right)^2\right]} \qquad \text{Equation 27}$$
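Because ln *y* = ln *a* + *b* ln *x*, the power-law fit reduces to a linear fit of ln *y* against ln *x*. A minimal Python sketch (our own illustration; it assumes all *x* and *y* values are positive so the logarithms are defined):

```python
import math

def power_fit(xs, ys):
    """Fit y = a * x**b by linear least squares on (ln x, ln y).

    Requires all x > 0 and y > 0 so the logarithms are defined.
    """
    n = len(xs)
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    sx = sum(lx)
    sy = sum(ly)
    sxx = sum(v * v for v in lx)
    sxy = sum(u * v for u, v in zip(lx, ly))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # exponent
    a = math.exp((sy - b * sx) / n)                 # amplitude
    return a, b

# Data generated from y = 2 * x**1.5 is recovered exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2 * x ** 1.5 for x in xs]
a, b = power_fit(xs, ys)
```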

**Power Law Example: **Given the following data table (see the
first three columns on the left), find the appropriate curve fit assuming a
power law relationship between x and y.
The spreadsheet shown as Table
4 also shows the added spreadsheet columns that help us
compute a, b, and r^{2} based on the formulae of Equations 25-27. The computed values of *a, b, r ^{2}* for a power curve fit are shown at the bottom
of the spreadsheet.

Table 4. Data and analysis spreadsheet for a power law curve fit.

The graph below (Figure 5) shows the best fit curve to this data using the power law function. You will notice that the curve is low in the middle and high on each end.

Figure 5. Results from a power law curve fit.

This disparity suggests that a higher-order power law fit may
be more appropriate. The first step was
to create a log-log plot of the data. A
column was then created for (*y − y₀*), where *y₀* is an offset subtracted
from each observation. We were then able to fit this modified data much more accurately, as depicted in Figure 6.

Figure 6. Linear log-log curve fit.

When portraying results from curve fitting, or just presenting data, it is important to follow your organization’s standards. In this course, here are the few standards we ask you to adhere to.

- When presenting graphics, make certain that all curves are legible and labeled.
- Photographs and drawings are also to be carefully formatted to assure they are understandable.
- Graphic information (curves, drawings, photographs, charts) is to be labeled as figures. The caption is always to be at the bottom of the figure as it is viewed, even when the figure has to be turned 90 degrees in landscape mode. Legends are to be composed so that the reader can clearly identify each of the elements in a single figure. See the figures in this document as examples.
- All tables are to be labeled at the top, as viewed. Headings are to be clearly labeled so that the reader knows what is being presented. See the tables in this document as examples.

You can apply the method presented here to any curve form you wish. There are many additional forms of nonlinear fitting methods which may become useful to you as you seek to characterize your results. Be aware that taking just a few data points can dramatically affect your calculations. Go back over this document and look at the influence of *N* on the equations (Equations 7, 13, 14, 15, etc.) to understand its importance. The standard deviation *S_yx* (Equation 13) is very sensitive to small values of *N*, so keep your sample size as large as is practical.