Useful Excel Statistics Formulas

Basic Statistics

Average (Mean), Median, Mode:

The mean, median, and mode are useful functions for summarizing a set of data. For example, suppose we are timing an experiment and have taken 10 trials. Rather than examining each recorded time individually, it is easier to describe the data set as a whole. Each of these three functions gives a kind of average: a single value that represents what we can expect from the entire set.

Note: All of our examples assume you have 10 data points stored from A1 to A10.

Average:

The average of the data is also called the arithmetic mean. It is the value we expect to get when performing a single trial of an experiment. It is calculated by adding all of the numbers in the data set and then dividing the sum by the number of trials. In this example, we would take the sum of cells A1 through A10 and divide by 10 (the number of trials taken). The average is useful because, without running another trial, it gives us a reasonable estimate of what the outcome should be.

To find the average of our data set in Microsoft Excel, we must enter the formula =AVERAGE(A1:A10) as in the image below:

[Image: Average formula]

If you have only a few data points and would like to find the average by hand, simply add the values and divide by the number of data points.
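As a cross-check outside Excel, the same hand calculation can be sketched in Python. This is only an illustration: the list `times` reuses the ten values listed in the mode example of this article, and the variable names are our own.

```python
import statistics

# Ten timing values (the data set listed in the mode example).
times = [4.3, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.6, 4.8, 4.9]

# By hand: add the values, then divide by the number of data points.
manual_mean = sum(times) / len(times)

# The standard library computes the same arithmetic mean.
library_mean = statistics.mean(times)

print(f"{manual_mean:.2f}")   # prints 4.54
print(f"{library_mean:.2f}")  # prints 4.54
```

Both approaches should agree with what =AVERAGE(A1:A10) reports for the same ten values.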

Median:

The median is the middle value of the data once it is sorted; with an even number of data points, it is the average of the two middle values. It is useful for data sets that are very lopsided. For example, if 9 trials are measured at around 4.5 seconds and one trial is measured at around 10 seconds, then the average (arithmetic mean) will be higher than the value we would actually expect to get in a further trial. In this case the median is more useful, because a single extreme value barely affects it.

To find the median of our data set in Microsoft Excel, we must enter the formula =MEDIAN(A1:A10) as below:

[Image: Median formula]
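To see why the median resists a lopsided value, here is a small Python sketch. The data is hypothetical, invented to match the description above (nine trials at exactly 4.5 seconds and one at 10 seconds); it is not taken from the article's spreadsheet.

```python
import statistics

# Hypothetical lopsided data: nine trials at 4.5 s and one outlier at 10 s.
lopsided = [4.5] * 9 + [10.0]

mean_value = statistics.mean(lopsided)      # pulled upward by the outlier
median_value = statistics.median(lopsided)  # middle of the sorted values

print(f"mean = {mean_value:.2f}, median = {median_value:.2f}")
# prints: mean = 5.05, median = 4.50
```

The single 10-second trial raises the mean by more than half a second, while the median stays at 4.5.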

Mode:

The mode of the data is the value that appears most frequently. For example, in our set {4.3, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.6, 4.8, 4.9} the data point 4.4 appears three times (while 4.5 and 4.6 appear only twice), so the mode is 4.4. The mode is useful because it gives another indication of the value we can expect in a further trial. Sometimes the mode can also help show which values are being measured most consistently.

To find the mode of our data set in Microsoft Excel, we must enter the formula =MODE(A1:A10) as below:

[Image: Mode formula]
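The counting behind the mode can also be sketched in Python. This is an illustrative cross-check rather than part of the Excel workflow; `times` is the data set quoted above, and `Counter` is used only to show the tallies.

```python
from collections import Counter
import statistics

# The ten values quoted in the mode example.
times = [4.3, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.6, 4.8, 4.9]

# Tally how many times each value occurs.
counts = Counter(times)

# The mode is the value with the highest tally.
mode_value = statistics.mode(times)

print(mode_value, counts[mode_value])  # prints 4.4 3
```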

Intermediate Statistics

Variance:

So far we have seen some very basic statistical analysis functions in Excel. Now, given a new set of data (in our example, just 1, 2, 3, and 4), how else can we analyze it? One way is to find the variance of the data set, which describes how the values spread around the expected value. The expected value of a set is its mean, the value that you could "expect" to see. Variance is a measure of how much the set actually varies from that expected value. In our set, the expected value is 2.5. Here is an example of how to quickly find the variance in Microsoft Excel.

[Image: Variance]
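The text here does not say whether the sample or the population variance is meant for the set 1, 2, 3, 4, so the Python sketch below computes both as a point of comparison. The pairing with Excel's VAR.S (sample) and VAR.P (population) functions is noted in the comments; the variable names are our own.

```python
import statistics

# The small example set from the text.
values = [1.0, 2.0, 3.0, 4.0]

expected_value = statistics.mean(values)            # the mean of the set
sample_variance = statistics.variance(values)       # divides by n - 1, like VAR.S
population_variance = statistics.pvariance(values)  # divides by n, like VAR.P

print(expected_value)             # prints 2.5
print(f"{sample_variance:.4f}")   # prints 1.6667
print(population_variance)        # prints 1.25
```

The two results differ only in the divisor: the sum of squared deviations is 5 in both cases, divided by 3 for the sample variance and by 4 for the population variance.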

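The sample variance is a bit more complicated to compute by hand. First compute the average of the data, $\bar{x}$. Then use the following formula, where $n$ is the number of data points and $x_i$ is the $i$-th data point:

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$

Taking the data we were given above in the average example, we step through a hand computation of the variance. First, we are given 10 data points, so $n = 10$. Then we use the average, 4.54, computed above. We can then put this into the variance equation to get

$$ s^2 = \frac{(4.3 - 4.54)^2 + (4.4 - 4.54)^2 + \cdots + (4.9 - 4.54)^2}{10 - 1} = \frac{0.324}{9} $$

which comes out to 0.0360. Verify this with Microsoft Excel for additional practice (for example, with the sample-variance function =VAR.S(A1:A10)).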
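The hand computation can be double-checked with a short Python sketch. It follows the steps described above (average first, then squared deviations divided by n - 1) and assumes the ten values are the set listed in the mode example, whose average is 4.54.

```python
import statistics

# The ten data points (average 4.54).
times = [4.3, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.6, 4.8, 4.9]

n = len(times)
mean = sum(times) / n

# Squared distance of each point from the mean.
squared_deviations = [(x - mean) ** 2 for x in times]

# Sample variance: divide the total by n - 1, not n.
sample_variance = sum(squared_deviations) / (n - 1)

print(f"{sample_variance:.4f}")             # prints 0.0360
print(f"{statistics.variance(times):.4f}")  # prints 0.0360
```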

Standard Deviation:

The standard deviation is another way of measuring the spread of your data values. It describes how far data values typically fall from the mean, or expected value; the name refers to this typical deviation from the "standard", or average, value. It is a valuable tool for judging the typical range of values we can expect to observe, and a good measure of how widely spread the values are in the data set. Below is an example of how to easily find the standard deviation in Microsoft Excel.

[Image: Standard deviation]

To compute this by hand, simply compute the variance as above and take the square root of the result.
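As a quick illustration of the square-root step, the Python sketch below takes the sample variance of the ten timing values and compares its square root with the library's sample standard deviation. The data set is assumed to be the one from the mode example; in Excel, the matching sample function is STDEV.S.

```python
import math
import statistics

# The ten data points used in the variance hand computation.
times = [4.3, 4.4, 4.4, 4.4, 4.5, 4.5, 4.6, 4.6, 4.8, 4.9]

# Step 1: sample variance (about 0.0360).
sample_variance = statistics.variance(times)

# Step 2: the standard deviation is its square root.
std_by_hand = math.sqrt(sample_variance)

# The library's sample standard deviation should match.
std_library = statistics.stdev(times)

print(f"{std_by_hand:.4f}")  # prints 0.1897
print(f"{std_library:.4f}")  # prints 0.1897
```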
