Curve Fitting


Method of Least Squares


The application of a mathematical formula to approximate the behavior of a physical system is frequently encountered in the laboratory. The most common such approximation is the fitting of a straight line to a collection of data. This is usually done using a method called "least squares," which is described in the following section.


Consider the data shown in Figure 1 and in Table 1. These data appear to have a roughly linear relation between the abscissa (x) and ordinate (y) values. If we pick some arbitrary input value, X, apply this input many times, and record the output each time, the output will vary about some mean ordinate, Y.  This is shown pictorially, in a more abstract form, by the normal distribution at X.

Figure 1. A normal distribution of observations for a fixed input value.


There are a total of N observations of a y-value for the input x. If we assume the data may be represented mathematically by an equation for a straight line, we write

$y = a + b x$                                                                                    Equation 1

By carefully selecting the two parameters a and b, we may find an equation which closely imitates the relationship between y and x portrayed in Figure 1.

Table 1.  A small linear dataset used to construct the previous figure.



Unfortunately, for any given input $x_i$, the observation $y_i$ is not necessarily exactly the value predicted by Equation 1.  The error, $\varepsilon_i$, is therefore given as

$\varepsilon_i = y_i - (a + b x_i)$                                                              Equation 2

In an effort to find the values of a and b which minimize the errors, we might take the derivatives of the total error (the sum of all the $\varepsilon_i$) with respect to a and b, set them equal to zero, and solve the resulting simultaneous equations.  Unfortunately, a simple summation of the errors is not an adequate measure, since opposing positive and negative errors can cancel one another. In fact, if we define the mean values X and Y as

$X = \frac{1}{N} \sum_{i=1}^{N} x_i$                                                             Equation 3

$Y = \frac{1}{N} \sum_{i=1}^{N} y_i$                                                             Equation 4

then any line going through the point (X,Y) has zero total error, irrespective of its slope. To avoid this problem, we normally use the sum of the squares of the errors as our measure of goodness of fit.  This is written as

$\sum_{i=1}^{N} \varepsilon_i^2 = \sum_{i=1}^{N} (y_i - a - b x_i)^2$                            Equation 5
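The cancellation problem can be checked numerically. The following is a minimal sketch, assuming the line is written y = a + bx; the data values and function names are made up for illustration and are not the Table 1 data.

```python
def total_error(xs, ys, a, b):
    """Plain (signed) sum of the errors y_i - (a + b*x_i)."""
    return sum(y - (a + b * x) for x, y in zip(xs, ys))


def sum_squared_error(xs, ys, a, b):
    """Sum of the squared errors for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data only (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

for slope in (-3.0, 0.0, 2.0, 10.0):
    # Choosing the intercept this way forces the line through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    print(slope,
          total_error(xs, ys, intercept, slope),
          sum_squared_error(xs, ys, intercept, slope))
```

Every slope gives a signed total error of essentially zero, while the sum of squared errors clearly distinguishes a reasonable line from a poor one.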

To minimize this value with respect to the choice of a and b, we equate the derivatives of Equation 5 (with respect to a,b) to zero and solve them simultaneously:

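$\frac{\partial}{\partial a} \sum_{i=1}^{N} \varepsilon_i^2 = 0, \qquad \frac{\partial}{\partial b} \sum_{i=1}^{N} \varepsilon_i^2 = 0$          Equation 6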


It can be shown that solving them simultaneously yields:

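$b = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left( \sum x_i \right)^2}$        Equation 7

$a = Y - b X$                                                                                    Equation 8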

The above formulas give the values of a and b that minimize the sum of the squared errors for the chosen set of data. This approach is called the Method of Least Squares.
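As a minimal sketch of the closed-form solution, the function below computes the least-squares line under the assumption that the model is written y = a + bx, with a the intercept and b the slope. The function name `linear_fit` and the sample numbers are illustrative, not taken from the tables in this document.

```python
def linear_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sxy - sx * sy) / denom   # slope
    a = (sy - b * sx) / n             # intercept; the line passes through the mean point
    return a, b


# Points lying exactly on y = 1 + 2x, so the fit should recover a = 1, b = 2.
print(linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))
```

Note how N enters the slope formula directly, which is one reason a small sample can shift the result noticeably.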


Statistical Treatment of the Curve Fit

In cases where the observations are perfectly random, the variability in $y_i$ for a specific $x_i$ has a bell shape.  This type of variation is termed "normal." For a normal distribution, we can approximate the relative frequency of the variation using a bell-shaped mathematical function, f(y).

$f(y) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(y - Y)^2}{2 \sigma^2} \right]$           Equation 9


f(y)       =  relative frequency of observation y at X.

σ          = standard deviation for y (≈68.3% of observations fall within ±σ of the mean).

σ²         = variance
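The share of observations lying within one standard deviation of the mean can be checked directly. This is a minimal sketch using only the Python standard library; the function names are illustrative.

```python
import math


def normal_pdf(y, mean, sigma):
    """Relative frequency f(y) of a normal distribution with the given mean and sigma."""
    return math.exp(-((y - mean) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def fraction_within(k):
    """Fraction of a normal population within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1.0, 2.0, 3.0):
    print(k, round(fraction_within(k), 4))
```

For k = 1 the fraction is about 0.683, and it rises to about 0.954 and 0.997 for k = 2 and k = 3.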

In Equation 9, we are assuming that the variance is independent of x, that is, the same at every x.  If the errors are a result of the observations in y, then we write our expression as in Equation 1.

Conversely, if the errors are in x, then we write our model equation as

                                                                                                                  Equation 10

A measure of the closeness of the data to our assumed linear expression is given by Equation 11.

                                                                                              Equation 11

The value r² is a statistical measure of the linearity of the curve fit and is called the coefficient of determination; r itself is the correlation coefficient.  When the fit is good, the value of r² is very close to one.  The further it falls below 1, the weaker the linear assumption becomes.


In Figure 2, we have shown two curve fits, one assuming the errors are in x, the other in y. Note that any line through the mean point (X,Y) for this data is as good a fit as any other.  In such a case, there is no relationship between x and y and the resulting correlation coefficient is r = 0.  If, however, every point lies on a straight line, then the resulting correlation coefficient would be r = 1 (or r = -1 for a line that slopes downward).  Data having a nonlinear shape would have a correlation coefficient smaller than unity in magnitude, by an amount related to the nonlinearity involved.  Data which are essentially linearly related but have a wide variability would likewise result in a magnitude of r below 1.

Figure 2. A set of x,y coordinates which represent a statistically indeterminate relationship.
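The limiting cases r = 0 and r = 1 can be reproduced with a few lines of code. The sketch below uses the standard Pearson definition of r, which is an assumption about the form intended here, and also checks the textbook identity that r² equals the product of the slope of y fitted against x and the slope of x fitted against y. All names and data are illustrative.

```python
import math


def _centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy) measured about the mean point."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def slope(xs, ys):
    """Least-squares slope of ys fitted against xs (errors assumed in ys)."""
    sxx, _, sxy = _centered_sums(xs, ys)
    return sxy / sxx


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    sxx, syy, sxy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y does not vary")
    return sxy / math.sqrt(sxx * syy)


print(correlation([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))   # points on a straight line
print(correlation([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))            # symmetric, no linear trend

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
# Slope with errors in y times slope with errors in x equals r squared.
print(slope(xs, ys) * slope(ys, xs), correlation(xs, ys) ** 2)
```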


By finding a similar set of values for y = a + bx and  , we can solve Equation 11 to obtain:

                                                                Equation 12

We next define the standard error as follows

                                                                       Equation 13

If we expand Equation 13, we obtain a "short form" solution for the standard deviation of a predicted y-value for a given x-value, as shown in Equation 14.

                                                                               Equation 14

In a similar manner, applying Equation 9 to the equations for the slope and intercept yields the variability of the estimate of the zero intercept:

                                                                                   Equation 15

The error of the estimate of the slope is found to be

                                                                                             Equation 16
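The following sketch computes the standard error of the fit and the standard errors of the intercept and slope. It uses the common textbook forms, which are an assumption and may differ in detail from Equations 13 to 16: S_yx = sqrt(SSE / (N - 2)), S_b = S_yx / sqrt(Sxx), and S_a = S_yx * sqrt(sum(x²) / (N * Sxx)), where Sxx is the sum of squared deviations of x from its mean. The function name and data are illustrative.

```python
import math


def linear_fit_with_errors(xs, ys):
    """Fit y = a + b*x and return (a, b, s_yx, s_a, s_b).

    Assumes the usual textbook estimates with N - 2 degrees of freedom.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points to estimate the standard error")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(sse / (n - 2))                                   # standard error of the fit
    s_b = s_yx / math.sqrt(sxx)                                       # standard error of the slope
    s_a = s_yx * math.sqrt(sum(x * x for x in xs) / (n * sxx))        # standard error of the intercept
    return a, b, s_yx, s_a, s_b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]   # hypothetical data
print(linear_fit_with_errors(xs, ys))
```

Because of the N - 2 in the denominator, removing even one point from a five-point data set changes S_yx appreciably.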


Linear Example


Consider the data shown in the left three columns of Table 2 below.

Table 2. Data and analysis for a linear curve fit.


Here, the standard deviations of the slope and intercept are high because of the spread of the errors.  The coefficient of determination, r² = 0.925, however, tells us that a linear approximation is a good fit to this data.  The large variability in the data and the small number of data points have resulted in a large standard deviation of a y-estimate for a given value of x.  The result of this analysis is depicted graphically in Figure 3.

Figure 3.  Linear curve fit showing lines at ±1 standard deviation.


One characteristic implied in Figure 3 is that the errors are not only "normal" at any given x, but that the standard deviation is constant over all x.  Thus, at any given x, the fitted line y = a + bx predicts a value of y = a + bx ± $S_{yx}$.  In other words, if one took a large amount of data and plotted it on Figure 3, then about 68% of all the data points would fall within the dotted lines at ±$S_{yx}$ (above and below the line).



Alternate Curve Fits

It is important to note that the foregoing development is specific to a linear model for the data. The method shown, however, is general. Assume that we wish to represent the data with another (nonlinear) function, y = f(x). The total squared error is, therefore,

$\sum_{k=1}^{N} \left[ y_k - f(x_k) \right]^2$                                                   Equation 17

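If there are m parameters in the function f(x), then the derivative of the total squared error with respect to each parameter must be equated to zero and the resulting system of equations solved for the parameters, $\lambda_i$.

$\frac{\partial}{\partial \lambda_i} \sum_{k=1}^{N} \left[ y_k - f(x_k) \right]^2 = 0, \qquad i = 1, \ldots, m$          Equation 18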

Note that we continue to use the Method of Least Squares, even though the function is not a linear one.
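The zero-derivative condition can be verified numerically for any model function. The sketch below is an illustration only: it estimates each partial derivative of the total squared error by central differences and confirms that they vanish at a least-squares solution. The helper names, the step size h, and the data are assumptions made for the example.

```python
def sum_squared_error(f, params, xs, ys):
    """Total squared error of the model f(x, *params) over the data."""
    return sum((y - f(x, *params)) ** 2 for x, y in zip(xs, ys))


def error_gradient(f, params, xs, ys, h=1e-6):
    """Central-difference estimate of d(total squared error)/d(param_j) for each j."""
    grad = []
    for j in range(len(params)):
        up = list(params)
        down = list(params)
        up[j] += h
        down[j] -= h
        grad.append((sum_squared_error(f, up, xs, ys)
                     - sum_squared_error(f, down, xs, ys)) / (2.0 * h))
    return grad


def line(x, a, b):
    """Straight-line model used here as the simplest test case."""
    return a + b * x


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]   # hypothetical data

# At the least-squares solution for this data (a = 0.09, b = 1.97) both derivatives vanish.
print(error_gradient(line, [0.09, 1.97], xs, ys))
# Away from the solution they do not.
print(error_gradient(line, [0.0, 1.0], xs, ys))
```

The same two helpers accept any other model function with any number of parameters, which is the sense in which the method is general.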


Exponential Curve Fit


Suppose we have data that, when plotted, appear to have an exponential character.  If we choose an exponential function to represent the data, we write

$y = a e^{bx}$                                                                                   Equation 19

In this formulation, a and b are the $\lambda_i$ parameters we need to find to best fit the function to the data.  Thus, we write for the total squared error:

                                                               Equation 20

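The two parameters are found by setting the derivatives of the total squared error with respect to a and b equal to zero; the two resulting equations are solved simultaneously to obtain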

                                                                                                 Equation 21

                                                                                                                Equation 22
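One common way to obtain closed-form expressions for an exponential fit is to take logarithms and fit a straight line to ln(y) against x. The sketch below follows that route under two assumptions: the model is y = a·exp(bx), and all y values are positive. It may not match Equations 21 and 22 term for term, and the function name and data are illustrative.

```python
import math


def exponential_fit(xs, ys):
    """Fit y = a*exp(b*x) by least squares on ln(y) = ln(a) + b*x. Returns (a, b)."""
    if any(y <= 0 for y in ys):
        raise ValueError("log-linearization requires all y values to be positive")
    ln_ys = [math.log(y) for y in ys]
    n = len(xs)
    sx = sum(xs)
    sv = sum(ln_ys)
    sxx = sum(x * x for x in xs)
    sxv = sum(x * v for x, v in zip(xs, ln_ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical")
    b = (n * sxv - sx * sv) / denom
    ln_a = (sv - b * sx) / n
    return math.exp(ln_a), b


# Synthetic, noise-free data generated from a = 2.0, b = -0.5 (not the Table 3 data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * x) for x in xs]
print(exponential_fit(xs, ys))
```

This minimizes the squared error in ln(y) rather than in y itself, so with noisy data it weights small y values more heavily than a direct fit would.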

It can be shown that this yields a Coefficient of Determination of

As in the linear case, a value of r² close to 1 implies a "good fit" of the model to the data.



Exponential Example:  Given the data in Table 3, find the appropriate exponential curve fit.

Table 3.  Data for an exponential curve fit.


The results of this analysis are shown in the figure below.

Figure 4. Results of an exponential curve fit.

Using the appropriate formulae for an exponential curve fit, we obtain a = 3.45, b = -0.58, and a coefficient of determination of r² = 0.98.  As can be seen in the above graph, the function fits well, as confirmed by the closeness of r² to 1.



Power Curve Fit

Suppose we have data that, when plotted, appear to have a power-law character.  If we choose a power function to represent the data, we write

$y = a x^{b}$                                                                                    Equation 23

As in the previous section, a and b are the $\lambda_i$ parameters we seek to best fit the function to the data.  Thus, we write for the total squared error:

                                                                 Equation 24

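The two parameters are found by setting the derivatives of the total squared error with respect to a and b equal to zero; the two resulting equations are solved simultaneously to obtain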

                                                                                     Equation 25

                                                                                                            Equation 26

It can be shown that this yields a Coefficient of Determination of

                                                        Equation 27
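A power-law fit can be carried out the same way a spreadsheet would, by adding columns for ln(x), ln(y), and their products and then applying the straight-line formulas. The sketch below assumes the model y = a·x^b with all x and y positive, and reports r² in log-log space. It is an illustration that may differ in detail from Equations 25 to 27; the names and data are hypothetical.

```python
import math


def power_fit(xs, ys):
    """Fit y = a*x**b via ln(y) = ln(a) + b*ln(x). Returns (a, b, r_squared)."""
    if any(x <= 0 for x in xs) or any(y <= 0 for y in ys):
        raise ValueError("log-log fitting requires positive x and y values")
    u = [math.log(x) for x in xs]          # "ln x" column
    v = [math.log(y) for y in ys]          # "ln y" column
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(ui * ui for ui in u)         # "(ln x)^2" column total
    svv = sum(vi * vi for vi in v)         # "(ln y)^2" column total
    suv = sum(ui * vi for ui, vi in zip(u, v))
    den_u = n * suu - su * su
    den_v = n * svv - sv * sv
    if den_u == 0 or den_v == 0:
        raise ValueError("x or y does not vary; the fit is undefined")
    num = n * suv - su * sv
    b = num / den_u
    a = math.exp((sv - b * su) / n)
    r_squared = num * num / (den_u * den_v)
    return a, b, r_squared


# Synthetic, noise-free data from a = 3.0, b = 1.5 (not the Table 4 data).
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 1.5 for x in xs]
print(power_fit(xs, ys))
```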


Power Law Example:  Given the following data table (see the first three columns on the left), find the appropriate curve fit assuming a power law relationship between x and y.  The spreadsheet shown as Table 4 also shows the added spreadsheet columns that help us compute a, b, and r² based on the formulae of Equations 25-27.  The computed values of a, b, and r² for a power curve fit are shown at the bottom of the spreadsheet.

Table 4. Data and analysis spreadsheet for a power law curve fit.

The graph below (Figure 5) shows the best fit curve to this data using the power law function.  You will notice that the curve is low in the middle and high on each end.



Figure 5. Results from a power law curve fit.

The disparity suggests that a higher order power law fit may be more appropriate.  The first step was to create a log-log plot of the data.  A column was then created for (y - $y_0$) (where $y_0$ = constant) and that data was included in the plot.  The function was then linearized in log-log space by manually adjusting the value of $y_0$.  A model for a straight line with a slope of m on a log-log plot was used, or

I was able to fit this modified data much more accurately as depicted in Figure 6.
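The manual adjustment of the offset can also be automated. The sketch below is one possible approach, not the procedure used to produce Figure 6: it assumes a model of the form y = y0 + c·x^m, tries a list of candidate offsets, and keeps the one whose shifted data is most nearly a straight line on a log-log plot (highest r²). The function names, the candidate grid, and the data are all hypothetical.

```python
import math


def loglog_fit(xs, ys):
    """Fit ln(y) = ln(c) + m*ln(x). Returns (c, m, r_squared) in log-log space."""
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    suu = sum((ui - u_mean) ** 2 for ui in u)
    svv = sum((vi - v_mean) ** 2 for vi in v)
    suv = sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
    if suu == 0 or svv == 0:
        raise ValueError("x or y does not vary; the fit is undefined")
    m = suv / suu
    c = math.exp(v_mean - m * u_mean)
    return c, m, suv * suv / (suu * svv)


def best_offset(xs, ys, candidates):
    """Return (y0, c, m, r_squared) for the candidate offset giving the straightest log-log line."""
    best = None
    for y0 in candidates:
        shifted = [y - y0 for y in ys]
        if min(shifted) <= 0:
            continue  # the logarithm needs positive values
        c, m, r2 = loglog_fit(xs, shifted)
        if best is None or r2 > best[3]:
            best = (y0, c, m, r2)
    return best


# Synthetic data from y = 2 + 3*x**1.5; candidate offsets 0.00, 0.25, ..., 4.75.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 + 3.0 * x ** 1.5 for x in xs]
print(best_offset(xs, ys, [0.25 * i for i in range(20)]))
```

A coarse grid like this only locates the offset to within the grid spacing; with real, noisy data the r² peak is broader and a finer search near the best candidate may be worthwhile.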


Figure 6.  Linear log-log curve fit.




Presenting Results


When portraying results from curve fitting or just presenting data, it is important to follow your organization's standards.  In this course, here are the few standards we ask you to adhere to.

  • When presenting graphics, make certain that all curves are legible and labeled.
  • Photographs and drawings are also to be carefully formatted to assure they are understandable.
  • Graphic information (curves, drawings, photographs, charts) is to be labeled as figures.  The caption is always to be at the bottom of the figure as it is viewed, even when the figure has to be turned 90 degrees in landscape mode. Legends are to be composed so that the reader can clearly identify each of the elements in a single figure.  See the figures in this document as examples.
  • All tables are to be labeled at the top, as viewed.  Headings are to be clearly labeled so that the reader knows what is being presented. See the tables in this document as examples.


General Comments


You can apply the method presented here to any curve form you wish.  There are many additional nonlinear fitting methods which may become useful to you as you seek to characterize your results.  Be aware that taking just a few data points can dramatically affect your calculations.  Go back over this document and look at the influence of N on the equations (Equations 7, 13, 14, 15, etc.) to understand its importance.  The standard deviation, $S_{yx}$ (Equation 13), is very sensitive to small values of N, so keep your sample size as large as is practical.