Chemistry
2031: Research Methods in Chemistry
Good news and bad news. The good news is, this is our last assignment involving Excel. It's about time to expand our computer experience to other programs and uses. The bad news is, this is our last assignment involving Excel. At about the moment you really begin to feel comfortable working with a program, it's time to move on to something else.
Our last lesson on graphing and data analysis in Excel involves finding linear relationships out of equations that, at first glance, don't look linear. In the last tutorial, on Beer's Law, the raw data as given (absorbance vs concentration) was linear. Often, however, data measured in the lab will not give a linear relationship when plotted. However, with some mathematical manipulation the data can be made linear. Vital information can then be extracted from manipulation and analysis of the linear relationship.
This is another tutorial that has two sets of data, one for the example tutorial itself, and one for the actual assignment. I can think of no better way to get confident and efficient in determining linear relationships than to actually do several of them. The example we will use comes from the marvelous world of chemical kinetics, which all of you may remember fondly from general chemistry, if not from later courses as well.
Tutorial Example: Imagine you are studying a chemical reaction:
A -----> B + C
where one molecule of A falls apart into two pieces over time. You've carefully measured the concentration of A over time, and obtained the following results:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Since this is an equation describing something changing over
time, it's written as a differential equation.
Integrating this equation over time
where t is time, [A]t is the concentration at that
time,
and k and [A]0 are constants (the rate constant
and the "initial" concentartion at time=0)
Look carefully at this one a moment, and don't be put off by the
logarithm terms. We have a linear equation
here! The y-values are the logarithm of the
concentrations, the x-values are time.
So IF we plot the data in This format (natural log of the concentrations vs. time) we would expect a straight line: the slope would be -k, and the y-intercept will be ln[A]o. Thus, after doing the linear regression we will get the value of the rate constant, k (- slope) for this reaction. Then, applying a little of our recently learned statistics, we can determine the error range in our measured rate constant.
In
order to properly analyze this in Excel, we will need to plot our data
in
the right format, which first means getting our data in the
right
format: Natural Logs (of concentrations) vs Time
Convert your data to the linear format.
The new equation still has time in seconds, so that does not need to be
changed from the raw format. However, the new equation requires
the
natural log of the concentrations. In the column in your
spreadsheet
next to your concentration data, use the =LN equation to make a new
column
of data containing the natural logs of the concentrations.
Make a graph of the linear format data.
On your spreadsheet, make a second graph (not on
the same
axes as the first, but a completely separate graph) that plots ln[A]
versus
time. Does it look linear?
Do the linear regression using the data from the
second graph.
Carry out a statistical analysis on the slope and y-intercept. Be sure to list all the pertinent values and report the rate constant for this reaction properly.
I
obtained a value of 7.00 x 10-4 +/- 8.1 x 10-6 s-1
for the rate constant here.
See
if you can figure out how to treat the y-intercept data. From it you
should be able to extract [A]0, an estimate of the original reactant concentration, and
its error
range. There is a complication in getting the error range in [A]0 out of the y-intercept. If
you can
spot what the complication is, and how to solve the problem, you'll be
way
ahead of the game. Major hints that point to the answer are in the
spreadsheet I used. Download it by clicking here.
A Final Word About Graphing and Linear Regression:
Whenever you make a plot of data and fit it to a linear or some other
more
complicated function...
1. Graph the raw data (concentration vs. time in the example above) and see if it looks as expected. Look for any obvious mistyped data points.
2. Look at the regression line (line only) and compare it to your data points. By eye, how good does the data look? You can often pick up systematic curvature in the data at this point, despite the fact the R2 value will come out 0.9 or higher. It is sometimes better to have random scatter and a lower R2 value, rather than points with a slight curvature to them but a higher R2 value.
3. Look at the R2 value. Generally, this must be at least 0.9 to consider the data linear. There is a formal statistical way to judge this that we are not going to get into too deeply, but for your information, if your set of data only has three points, your R2 value has to be greater than 0.994 in order for you to believe to 95% confidence that those three points fit a straight line. For five plotted data points your R2 value needs to be at least 0.77 in order for you to have significant (95%) confidence that your data is linear. (See Statistical Treatment of Experimental Data, by Hugh D. Young, p. 164) Remember also that this is only one of three criteria you must use in judging the linearity of your data.
4. Finally, the residuals must be random. If one of the residual points were removed, you should have NO IDEA where it was sitting. If in your mind you can "fill in" where the removed residual point sat, based on the residuals around it, it's an indication of non-random error in your data. Also, you should not be able to predict where the next data point at either end of your data will be. "Curvature" as indicated by NON-RANDOM residuals can indicate, if you are studying a reaction, that other side reactions are playing a part. It may mean the temperature of the reaction mixture was not held steady. Or, it can mean the model you are using that predicts a linear relationship may be incorrect.
5. Numbers derived from a linear regression, like all results that come from measurements, must be reported properly: the value from the regression routine +/- the confidence limits, typically at 95%, rounded to the proper number of significant figures and with appropriate units attached.
Whew!
That's a lot of stuff. But it's by adhering to the details here
that chemists (and scientists in general) can generate numbers that
they can
believe in, and make critical decisions about. From dosages of
medication
to levels of pollutants that are harmful, it all boils down to the
careful
and repeated measurements of numbers, analyzed thoroughly, with error
limits
well defined. While we have only touched the surface of statistics and
error
analysis, it all follows the principles outlined here.
Consider the following:
When any liquid evaporates it generates a pressure called vapor pressure. Written as a standard chemical equation, this process can be written like this for a molecule like benzene, C6 H6:
C 6H6 (l) <===> C6 H6 (g)
As you might guess, the hotter the liquid becomes, the greater vapor pressure it generates. This is an important point we'll come back to in a moment.
Recall that the absorption or liberation of heat is an important aspect of chemical reactions. If the reaction absorbs heat it's said to be endothermic, and if heat is liberated, the reaction is exothermic. Recall further that the amount of heat generated or absorbed by a chemical reaction is referred to as the reaction's enthalpy, symbolized DH. The determination of a reaction's enthalpy is often a primary goal of chemists who study chemical reactions. Knowing the DH is important in order to get a complete understanding of the reaction.
Returning to the discussion of vapor pressure, recall that when you heat a liquid to the point where the vapor pressure equals the prevailing atmospheric pressure, the liquid will boil. For the process of boiling, the heat absorbed as the process occurs is called the enthalpy of vaporization, DHvap. This is a physical property like density or index of refraction that one can use to identify a substance uniquely. The entalpy of vaporization is always a positive number because a liquid ALWAYS requires energy to make it boil (i.e. boiling is always endothermic).
There is an equation that brings together all of the above discussion about boiling and vapor pressure. This equation relates theoretically the temperature and the vapor pressure generated by an evaporating liquid and is called the Clausius-Clapeyron equation. See below:

Believe it or not, this equation can be used to take data that is distinctly NON-Linear and plot it to get something that looks linear (just like the first order kinetics in the tutorial). Done properly, a linear regression on temperature-vapor pressure data will yield a value for the enthalpy of vaporization. In addition, one can also determine a value for the constant P inf.
Here is some data from a series of measurements for the vapor pressure of benzene at various temperatures:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Remember, it's always a good idea to graph the raw data first. The reason for doing a quick graph of it is to make sure you type it in properly. One data point not in place with the others will usually stick out very distinctly.
1. Graph the vapor pressure vs temperature data. Notice that when you plot the data this way it is definitely NOT linear. You need to find a way to graph this data so that it IS linear, which is the next step of this assignment. DO NOT try to do a linear regression on this data directly. It will be imporatant for you first to perform the manipulations described below. Read on.
2.
In order to find a way to use this data to get a linear-looking plot
you will need to pay close attention to the Clausius-Clapeyron
equation. Use the Clausius-Clapeyron
equation to help you decide which values to graph. This will
require you to manipulate the vapor pressure data and the temperature,
converted, of
course,
to degrees Kelvin, before doing the regression. Check out THIS EXPLANATION. It is
important that you distinguich the gas constant R from the symbol used
to represent the correlation coefficient. You must also
use the correct value for the gas constant. What units do you
think the Clausius-Clapeyron R should have?
Once you have made the correct plot, you should
get
a good "R-squared" value (the correlation coefficient R this time) and
the data should LOOK linear if you did the
exercise
correctly. Generate a graph that is based on this regression
(this
will be a similar exercise to the one's you have done in the previous
two
assignments).
3. Calculate the value of DH and of Pinf from your slope and intercept, respectively. In order to get a value for Pinf you will need to use the exponentional function again: =exp(). You will also need to use as a value of R, the Gas Constant, 8.3145 J/mol-K. Next, calculate the 95% confidence limit uncertainty in m and with this quantity calculate the 95% confidence limit in DH. Caution: you will have to do some simple algebra to get the enthalpy from the slope. In order to get the error in DH, you will have to treat the ERROR IN THE SLOPE in the same way you treated the slope to get DH. Do not worry about calculating the error in the y-intercept or the error in Pinf . As it turns out, the calculation of the error in DH is a very simple special case of a type of problem called "Propagation of Error." Depending on the situation, the determination of the uncertainty will, in general, be more complicated. If you want to know more, take physical chemistry!
4. In a shape or text box on the spreadsheet:
Properly report your values of DH and along with its uncertainty, and give your Pinf value no more than 3 significant digits (use scientific notation).Justify your linear equation. What did you have to graph to get these calculated values? How did you come up with this?
How good of a model is the Clausius-Clapeyron equation for probing the relationship between vapor pressure and temperature (ie. is the data truly linear)?
How would you expect your plot to change if you did a study on another organic liquid? What physical property or properties are responsible for that change?
Please
make sure you follow the directions at the end of the Assign4 tutotial.
Return to Chemistry 2031 Homepage.