Updated March 29, 2023

What is the Empirical Rule?

Robinhood Learn

Democratize Finance For All. Our writers’ work has appeared in The Wall Street Journal, Forbes, the Chicago Tribune, Quartz, the San Francisco Chronicle, and more.

Definition:

The empirical rule estimates that nearly all data in a normal distribution falls within three standard deviations of the mean (average). The data clusters around the mean and thins out toward both sides, creating a bell-shaped curve.

🤔 Understanding the empirical rule

The empirical rule (also called the three-sigma rule or the 68-95-99.7 rule) says that almost all data in a normal distribution will land within a specific distance from the average of the data set (the mean). The value that measures how far the data typically falls from the average is the standard deviation. The rule tells us that about 68% of the data will fall within one standard deviation of the mean, 95% will fall within two standard deviations, and 99.7% will fall within three standard deviations. The empirical rule only applies to normal distributions, which are symmetrical and bell-shaped. Traders are interested in how spread out data points are because it helps them analyze risk and return. By comparing patterns in assets' past data, traders can gain insight into future possibilities.

Example

Normal distribution curves are bell-shaped, meaning that data points tend to cluster more around the average or mean. Consider three people playing basketball. One is a good player, one is an OK player, and the third has never played basketball. In all likelihood, the good player will sink more shots into the net, and those they miss will be close to the target. The OK player may make a few baskets, with missed shots landing farther away from the basket. And the newbie player’s shots will probably be all over the place. Data tends to cluster around the mean, or average, like the good player’s shots cluster around the basket, reflecting a normal distribution and creating a bell-shaped curve. Comparatively, the newbie’s shots that land everywhere are an example of a random distribution that won't create a curve at all. The empirical rule only deals with data that is normally distributed, like the good player’s shots.

Takeaway

The empirical rule is like a social experiment to study herd mentality…

People tend to conform to the conventional way, but there will always be eccentrics and rebels. Data in a normal distribution tend to cluster toward the average, but there can be outliers. In the stock market, outliers show us deviations away from the typical amount of stock movement in a day. We can often predict what people might do by looking at their past behavior. Similarly, past behavior of data dispersion can help us recognize possible future trends.

Tell me more…

What is the Empirical Rule?

The empirical rule is a rule of thumb that estimates where data falls in a normal distribution, given its mean (average) and its standard deviation (a measure of how far values typically sit from the average).

Normal distribution curves (also called Gaussian curves) frequently appear in business, medicine, nature, education, and stock analysis. They are bell-shaped and symmetrical (right and left sides are the same). The data is distributed more heavily around the mean in the center.

The standard deviation measures the typical distance between a data point and the mean. The smaller the standard deviation, the closer the data will be to the mean. The larger the standard deviation, the farther the data will be from the mean.

The empirical rule states that:

68% of the data in a data set will fall within one standard deviation of the mean (between -1sd and 1sd)
95% of the data in a data set will fall within two standard deviations of the mean (between -2sd and 2sd)
99.7% of the data in a data set will fall within three standard deviations of the mean (between -3sd and 3sd)

By using the empirical rule, we may be able to determine the likelihood of data falling within a specific range. The empirical rule estimates that:

68% of the data points will lie within one standard deviation of the mean.
27% of the data points will lie between one and two standard deviations from the mean.
4.7% of the data points will lie between two and three standard deviations from the mean.
0.3% of the data points will lie more than three standard deviations from the mean.
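If you want to check these percentages yourself, here is a minimal Python sketch. It assumes a perfectly normal distribution and uses the standard library's math.erf function. The names share_within, within, and bands are our own, chosen for illustration.

```python
import math

def share_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Cumulative shares for 1, 2, and 3 standard deviations
within = {k: share_within(k) for k in (1, 2, 3)}

# Shares for each band, found by subtracting neighboring cumulative shares
bands = {
    "within 1 sd": within[1],
    "between 1 and 2 sd": within[2] - within[1],
    "between 2 and 3 sd": within[3] - within[2],
    "beyond 3 sd": 1 - within[3],
}

for label, share in bands.items():
    print(f"{label}: {share:.1%}")
```

The exact shares are about 68.3%, 95.4%, and 99.7%. Because the rule rounds the middle figure to 95%, the exact band between two and three standard deviations comes out closer to 4.3% than to 4.7%.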

How is the empirical rule useful?

In trading, both fundamental analysis (researching a company, industry, competitors, products, news, politics, etc.) and technical analysis (analyzing the movements of assets to try and predict future movements) are ways that traders try to determine whether the price of a stock or security is going to go up or down.

Traders can use the empirical rule as a technical analysis tool to analyze risk and return and to estimate possible future events by considering a range of alternative outcomes. For example, the empirical rule can be applied to historical volatility. Historical volatility is the standard deviation of periodic daily returns (PDR), the rate at which an asset's value rises or falls each day.

How is standard deviation useful?

Standard deviation tells us how spread out the data points are. That can be hard to see if you're just looking at the closing prices each day. Standard deviation gives us some historical context for recognizing whether a given price move is out of the ordinary (an outlier), such as a stock that has a three standard deviation move.
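To make the idea of a "three standard deviation move" concrete, here is a small Python sketch. The returns below are invented for illustration and are not real market data. The sketch measures how many standard deviations today's move sits from the average of past moves.

```python
import statistics

# Hypothetical daily returns (as fractions), invented for illustration only
past_returns = [0.004, -0.006, 0.002, 0.009, -0.003,
                0.001, -0.008, 0.005, -0.002, 0.003]
todays_return = -0.021

mean = statistics.mean(past_returns)
sd = statistics.stdev(past_returns)  # sample standard deviation

# Distance of today's move from the mean, measured in standard deviations
z = (todays_return - mean) / sd

print(f"Today's move is {z:.1f} standard deviations from the mean")
if abs(z) > 3:
    print("That is beyond three standard deviations, an outlier by this measure")
```

With only ten made-up data points this is a toy example. In practice you would use a much longer history, and real returns are often not perfectly normal.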

When the data is more spread out, the standard deviation is larger and the asset is more volatile (the price swings more from day to day). When the data sits closer to the mean, the standard deviation is smaller and the asset is less volatile (the price swings less from day to day).

What is the empirical rule formula?

To use the empirical rule, you need three things:

1 - A set of data.
2 - The mean (average) of the data.
3 - The standard deviation (a measure of how far the data points typically fall from the mean).

Let's take a random set of numbers:

14, 8, 2, 7, 3, 1

First, you need to find the mean or average.

Add the numbers: 14 + 8 + 2 + 7 + 3 + 1 = 35
Divide by how many numbers there are: 35 / 6 ≈ 5.8

The mean is 5.8.

Now find the variance (the standard deviation squared). The formula for the variance is:

σ² = Σ(X − μ)² / N

σ: standard deviation
σ²: variance
N: population size, or the total number of data points used in the calculation
X: each data value
μ: mean
Σ: summation (add up all the terms that follow)

Here is what this equation is saying in English:

The variance equals the sum of (each number minus the mean)², divided by the total number of data values.

For this data set, the squared differences add up to about 118.8, and 118.8 / 6 ≈ 19.8.

σ² (variance) = 19.8

Now that you know the variance (σ² = 19.8), you can take the square root to get the standard deviation.

Standard deviation σ ≈ 4.4.

With a mean μ of 5.8 and a standard deviation σ of 4.4, you can use the empirical rule to place the data in the bell curve.
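The same arithmetic can be sketched in a few lines of Python using the standard library's statistics module. This follows the population formulas (dividing by N), and the variable names are our own. Keep in mind that six numbers are far too few to form a real bell curve, so the ranges below only illustrate the calculation.

```python
import statistics

data = [14, 8, 2, 7, 3, 1]

mu = statistics.fmean(data)            # mean
variance = statistics.pvariance(data)  # population variance (divides by N)
sigma = statistics.pstdev(data)        # population standard deviation

print(f"mean = {mu:.2f}, variance = {variance:.2f}, sd = {sigma:.2f}")

# Ranges covering 1, 2, and 3 standard deviations on either side of the mean
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    print(f"within {k} sd: {low:.2f} to {high:.2f}")
```

Without rounding along the way, the standard deviation works out to about 4.45, in line with the rounded figure used in the text.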

How do you calculate the empirical rule in a spreadsheet?

In both Excel and Google Sheets, we can import live stock data and find standard deviations to visualize the volatility of a stock. This example uses Google Sheets.

Create a Sheet.

In cell A1, enter the following formula to pull one year's worth of closing prices. You can swap in the ticker of any stock you choose.

=GOOGLEFINANCE("TCEHY","close",TODAY()-365,TODAY())

Your data will load with the date in column A and the closing price in column B for about 252 closing days, which is roughly the number of trading days in a year.

Create headings in cells C1, D1, and E1 for the distribution, mean, and standard deviation.

Let’s find the mean. In cell D2, enter the formula below to calculate the average:

=AVERAGE(B2:B252)

Now we need to find the standard deviation. In cell E2, enter the formula below to calculate the standard deviation.

=STDEV(B2:B252)

We need to find the distribution of our data. Enter the formula below in cell C2.

=NORMDIST(B2,$D$2,$E$2,false)

Then click and drag the lower right-hand corner of cell C2 down to C252 to populate the cells.

To make the bell curve chart, select data in columns B and C from row 2 to row 252.

Click ‘insert chart.’

Choose Scatter Chart.
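If you prefer code to a spreadsheet, the distribution column can be sketched in Python as well. The closing prices below are invented for illustration. The NormalDist class from the standard library plays the role of NORMDIST with its last argument set to false, and stdev uses the same sample convention as STDEV.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical closing prices, invented for illustration only
closes = [100.0, 101.5, 100.8, 102.3, 101.9, 103.4]

# A normal distribution with the same mean and standard deviation as the prices
dist = NormalDist(mu=mean(closes), sigma=stdev(closes))

# Height of the bell curve at each closing price (the "distribution" column)
density = [dist.pdf(price) for price in closes]

# Sorted (price, density) pairs, ready to plot as a scatter chart
points = sorted(zip(closes, density))
for price, height in points:
    print(f"{price:.1f}: {height:.4f}")
```

Prices near the mean get the tallest points and prices far from it get the shortest, which is what gives the scatter chart its bell shape.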

You can also calculate Periodic Daily Returns from the closing data.

Go to column F and create the heading PDR for Periodic Daily Returns.

Then move down to cell F3, since the first return needs the previous day's close to compare against, and enter the formula:

=LN(B3/B2)

Drag on the lower right-hand corner of cell F3 down the column to populate all cells in column F with the PDRs.
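Here is a rough Python equivalent of the PDR column, again with invented closing prices. Each return is the natural log of a close divided by the previous close, the same idea as the LN formula. The standard deviation of those returns is the daily historical volatility.

```python
import math
import statistics

# Hypothetical closing prices, invented for illustration only
closes = [100.0, 101.5, 100.8, 102.3, 101.9, 103.4]

# Natural log of each close over the previous close, like =LN(B3/B2)
pdr = [math.log(today / yesterday)
       for yesterday, today in zip(closes, closes[1:])]

# Sample standard deviation of the daily returns
daily_volatility = statistics.stdev(pdr)

print([round(r, 4) for r in pdr])
print(f"daily historical volatility: {daily_volatility:.4f}")
```

There is always one fewer return than there are prices, which is why the spreadsheet column starts one row below the first close.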

You can find the percentage values for 1, 2, and 3 standard deviation moves with the following formulas.

To find the lower (-1) standard deviation move, use this formula:

=AVERAGE(F:F)-1*STDEV(F:F)

Note: In this example the formula is in cell H11, but you can put it in any cell outside column F.

To find the upper 1 standard deviation move, change the minus to a plus:

=AVERAGE(F:F)+1*STDEV(F:F)

Repeat for 2 and 3 standard deviations by replacing the 1 in the formula with 2 or 3:

Lower 2 SD move: =AVERAGE(F:F)-2*STDEV(F:F)
Upper 2 SD move: =AVERAGE(F:F)+2*STDEV(F:F)
Lower 3 SD move: =AVERAGE(F:F)-3*STDEV(F:F)
Upper 3 SD move: =AVERAGE(F:F)+3*STDEV(F:F)

You can show all deviation moves as a percentage by highlighting each cell and clicking the % in the toolbar.
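The standard deviation moves can also be sketched in Python. The returns here are invented for illustration, and the moves dictionary is simply our own way of organizing the results.

```python
import statistics

# Hypothetical periodic daily returns, invented for illustration only
pdr = [0.012, -0.007, 0.004, -0.015, 0.009, 0.001, -0.003, 0.006]

mean = statistics.mean(pdr)
sd = statistics.stdev(pdr)  # sample standard deviation, the same convention as STDEV

# Lower and upper moves for 1, 2, and 3 standard deviations
moves = {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

for k, (lower, upper) in moves.items():
    print(f"{k} sd move: {lower:.2%} to {upper:.2%}")
```

The percent format in the print statement does the same job as the % button in the spreadsheet toolbar.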

And now you have some statistical data that you can easily compare to other assets for historical volatility.

This is just a basic introduction to how you can use standard deviation to look at stock movement and historical volatility. There are many other things you can do, and you can set your sheet up however you want. For example, you can show deviation values in dollars, and you can find fractional deviations such as 0.5. The stock data is pulled in live, so the sheet updates as new closing prices arrive. You can also use the same sheet to pull in data from another stock: just change the stock ticker in the first formula. For example, plug in the ticker for Amazon and the sheet will repopulate with Amazon's data:

=GOOGLEFINANCE("AMZN","close",TODAY()-365,TODAY())
