# What is the Normal Distribution?

A normal distribution is a bell-shaped pattern in a dataset, with observations arranged symmetrically around a mean and appearing less frequently as you move farther away from the center.

## 🤔 Understanding normal distribution

The normal distribution describes the way that many datasets are naturally organized, with the majority of the observations appearing close to the center. Data points appear less and less often as you move farther away from the middle. Normal distributions are symmetrical (meaning one side is the mirror image of the other) and slope away from the center. When drawn on a graph, a normal distribution curve looks like a bell, which is why it’s also called a bell curve. Several analytical methods, including hypothesis testing, assume that the data is normally distributed.

Imagine measuring the size of every pumpkin that has ever grown. For each observation, you could write down how much a pumpkin weighs. Some pumpkins might be small and only weigh a pound or two. Some might be big enough to win a blue ribbon at the state fair. But both of those observations turn out to be unusual. Most of the pumpkins that you weigh are probably somewhere between seven and 11 pounds. If you plot the frequency that each weight comes up, you’ll notice that pumpkins weighing close to the middle of the range appear more often than those tiny gourds or massive prize-winning ones. So your graph would probably take the shape of a normal distribution.

## Takeaway

A normal distribution is like throwing darts…

If you stand in front of a dartboard and aim at the bullseye, you aren’t going to hit it every time. But you’re more likely to miss it by an inch than a foot. Looking at all of the darts you throw in an evening, most probably landed in the general vicinity of where you were aiming. Hopefully only a few were way off the mark. Similarly, in a normal distribution, most data points appear near the middle.


## What is the normal distribution?

A normal distribution is one of many ways that data can be distributed. With a normal distribution, the data is concentrated in the center and tapers off to the sides. There are other types of distributions as well. A dataset might have a lot of observations around one number, with more larger ones than smaller ones (right-skewed distribution). Another distribution can be evenly spread out within a range (uniform distribution). Or data can be randomly distributed, with no cohesive shape at all.

Many statistical methods assume that data is normally distributed. For example, when you conduct a t-test to see if a result is statistically significant, the underlying assumption is that the data comes from a normally distributed population. Normality also shows up through the central limit theorem, which states that the averages of sufficiently large random samples are approximately normally distributed, regardless of the shape of the distribution the samples come from (as long as that distribution has a finite variance). In general, if it’s a normal distribution, values farther from the mean are less likely to show up by random chance.
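To see the central limit theorem at work, here is a minimal simulation sketch. The exponential population, the sample size of 50, and the 2,000 repetitions are all assumptions picked for illustration. It draws from a strongly right-skewed population and compares the skewness of individual draws with the skewness of sample averages.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50    # observations averaged in each sample (assumed)
NUM_SAMPLES = 2000  # number of sample averages collected (assumed)


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric shape."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / stdev) ** 3 for v in values])


# Population: exponential with mean 1, which is strongly right-skewed.
raw_draws = [random.expovariate(1.0) for _ in range(NUM_SAMPLES)]

# Each entry is the average of SAMPLE_SIZE fresh draws from that population.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

print("skewness of single draws:   ", round(skewness(raw_draws), 2))
print("skewness of sample averages:", round(skewness(sample_means), 2))
print("average of sample averages: ", round(statistics.fmean(sample_means), 3))
```

The averages should come out far less lopsided than the individual draws, and they should cluster around the population mean of 1. Larger sample sizes push the shape closer to a bell curve.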

## What are the properties of a normal distribution?

There are a few properties of a normal distribution. First, the data is symmetrically distributed around the center. If you drew a line down the middle, both sides of the line would be mirror images of each other. There is no skewness (asymmetry) to the data, which implies that the center point is the mean (average value), the median (middle value), as well as the mode (most common value).

But symmetry alone doesn’t make a distribution normal. You can imagine many statistical distributions with odd, but symmetrical, shapes. They could be flat, feature a hill on each side, or even be V-shaped and still look the same on each side. Therefore, the second property is that the data is concentrated in the center. As you move away from the middle, the number of observations gets smaller and smaller.

However, a triangular distribution (among others) would meet this definition too. Normal curves don’t have straight lines. Instead, they round off at the peak and flatten into tails that approach zero without ever reaching it (which gives them their bell shape). The implication is that, far from the center, the curve gets flatter and flatter as it moves outward. One way that this property gets expressed is by something called kurtosis, which describes how much of the data sits in the tails relative to the center. A normal distribution always has a kurtosis of 3.
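The contrast with a triangular distribution can be checked numerically. The sketch below is illustrative only: it computes kurtosis as the average fourth power of the Z-scores (the plain form, not "excess" kurtosis) for simulated normal and triangular data. The sample size and seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

N = 50_000  # simulated observations per distribution (assumed)


def kurtosis(values):
    """Average fourth power of the z-scores (plain kurtosis, not excess)."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / stdev) ** 4 for v in values])


normal_sample = [random.gauss(0, 1) for _ in range(N)]
# Symmetric triangle on [-1, 1] with its peak at 0.
triangular_sample = [random.triangular(-1, 1, 0) for _ in range(N)]

print("normal kurtosis:    ", round(kurtosis(normal_sample), 2))
print("triangular kurtosis:", round(kurtosis(triangular_sample), 2))
```

The normal sample should land near 3, while the symmetric triangle sits near its theoretical value of 2.4. Both shapes are symmetric and concentrated in the center, so it is the kurtosis that tells them apart.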

## How do you find the normal distribution?

Two parameters completely define every normal distribution. First, the average value of the dataset is the mean. The mean serves as the center point around which all other data points are distributed. Second, the standard deviation describes how far the data is spread out. The larger the standard deviation, the more often data appears far from the mean. If you calculated the mean and the standard deviation of a dataset, you would have all you need to find a normal distribution that approximates that data.
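As a rough sketch of that last step, Python’s standard library can estimate both parameters from data. The pumpkin weights below are made-up numbers for illustration, not real measurements.

```python
from statistics import NormalDist

# Hypothetical pumpkin weights in pounds (invented for this example).
weights = [7.2, 8.5, 9.1, 9.8, 10.4, 8.9, 9.5, 11.0, 7.9, 9.3]

# from_samples estimates the mean and the sample standard deviation,
# then returns the normal distribution defined by those two parameters.
fitted = NormalDist.from_samples(weights)

print(f"mean  = {fitted.mean:.2f} lb")
print(f"stdev = {fitted.stdev:.2f} lb")

# Height of the fitted bell curve near the center and out in the tail.
print("density at  9 lb:", round(fitted.pdf(9.0), 4))
print("density at 12 lb:", round(fitted.pdf(12.0), 4))
```

The fitted curve is only an approximation. Ten observations are far too few to confirm that the underlying data is actually normal.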

## How do you find the probability of a normal distribution?

In probability theory, the likelihood of each value that a random variable can take is described by a probability distribution (a probability density function, when the variable is continuous). A probability distribution can take many shapes. For example, imagine a variable that could be any whole number from one to 100. If a value is chosen randomly, with equal likelihood for every number, it’d be a uniform distribution. There is a 1% chance of each number being selected. Because the distribution represents the entire universe of possible values, the probabilities of all the possible choices must sum to 100%. For a continuous variable, the equivalent statement is that the total area under a probability density function is always equal to one.

Likewise, a normal density function (aka a Gaussian distribution) has a total area of one and the shape of a bell curve. The majority of the probability is in the center, with lower and lower probabilities assigned to values farther from the middle. When a random variable is drawn from a normal probability distribution, it’s more likely to have a value closer to the middle than farther away. These probabilities are often expressed in terms of a Z-score, which is the number of standard deviations a value lies from the mean.
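A Z-score is a one-line calculation. In this sketch, the pumpkin distribution (mean of 9 pounds, standard deviation of 1 pound) is an assumption made up for the example.

```python
from statistics import NormalDist


def z_score(value, mean, stdev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / stdev


# Assumed distribution of pumpkin weights: mean 9 lb, stdev 1 lb.
pumpkins = NormalDist(mu=9.0, sigma=1.0)

z = z_score(11.0, pumpkins.mean, pumpkins.stdev)
# cdf gives the probability of a value at or below 11 lb.
prob_heavier = 1 - pumpkins.cdf(11.0)

print(f"an 11 lb pumpkin has a Z-score of {z}")
print(f"chance of a heavier pumpkin: {prob_heavier:.2%}")
```

Under these assumptions, an 11-pound pumpkin sits two standard deviations above the mean, so only a couple of pumpkins in a hundred would be heavier.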

There is an empirical rule with the normal distribution. It says that, approximately:

- 68% of the time, a selected value will be within one standard deviation of the center
- 95% of the time, the value will be within two standard deviations
- 99.7% of the time, it will be within three standard deviations
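These percentages can be reproduced from the cumulative distribution function of a standard normal (mean of zero, standard deviation of one). A minimal check using Python’s standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Probability of landing within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
# within 1 standard deviation(s): 68.27%
# within 2 standard deviation(s): 95.45%
# within 3 standard deviations(s) is printed the same way: 99.73%
```

The rule’s round numbers (68%, 95%, 99.7%) are just these values rounded.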

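The famous quality assurance method called Six Sigma takes its name from the idea of keeping defects six standard deviations away from the mean of a normal probability distribution. Its well-known target of 3.4 defects per million opportunities (a probability of error of just 0.00034%) corresponds to a one-sided tail 4.5 standard deviations from the mean, because the method conventionally allows the process average to drift by 1.5 standard deviations over time.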
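The tail probabilities behind these figures are easy to compute. This sketch uses the complementary error function from Python’s `math` module. The helper name `upper_tail` is made up for this example.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Chance of landing more than six standard deviations out, on either side.
beyond_six_both_sides = 2 * upper_tail(6.0)

# One-sided tail at 4.5 standard deviations (six minus a 1.5 drift).
beyond_four_and_half = upper_tail(4.5)

print(f"beyond 6 sigma (two-sided): {beyond_six_both_sides:.2e}")
print(f"beyond 4.5 sigma (one-sided): {beyond_four_and_half * 1e6:.1f} per million")
```

The 0.00034% figure (3.4 per million) matches the one-sided tail at 4.5 standard deviations. A literal six-standard-deviation miss is far rarer, on the order of two in a billion.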

A few other Z-scores from the normal distribution are important. That’s because statisticians like to use a few specific probabilities to test hypotheses. In particular, 95% of a normal density function falls to the left of a Z-value of about 1.65 (and, by symmetry, 95% falls to the right of -1.65). And 99% of the probability mass is to the left of a Z-value of 2.33. For two-sided tests, the middle 95% lies within about 1.96 standard deviations of the center, and the middle 99% within about 2.58.
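These cutoffs come from inverting the cumulative distribution function. A short sketch with Python’s standard library shows the one-sided values alongside their two-sided counterparts.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided: the Z-value with 95% (or 99%) of the probability to its left.
one_sided_95 = standard_normal.inv_cdf(0.95)
one_sided_99 = standard_normal.inv_cdf(0.99)

# Two-sided: the Z-value that leaves 2.5% (or 0.5%) in each tail.
two_sided_95 = standard_normal.inv_cdf(0.975)
two_sided_99 = standard_normal.inv_cdf(0.995)

print(f"one-sided 95%: {one_sided_95:.3f}   99%: {one_sided_99:.3f}")
print(f"two-sided 95%: {two_sided_95:.3f}   99%: {two_sided_99:.3f}")
```

Which pair applies depends on whether the test cares about deviations in one direction or in both.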

## How is the normal distribution used in finance?

Analysts might use a normal distribution when they want to assess risk. Traders who use technical analysis implicitly use the normal distribution whenever they apply standard deviations to their trades. And many investors use some assumptions rooted in the normal distribution when they consider whether an asset’s price is a fair representation of its value.

### Risk Assessment

There is at least a little bit of randomness in everything. But some things are more susceptible to random outcomes than others. The more randomness there is, the less confident you can be about the future. In finance, that lack of confidence represents a risk.

Let’s assume a business wants to borrow $1M for some upgrades to its plant. A lender might look at the business earnings to determine how likely the loan will be repaid. A financial analyst might look at the earnings history of the company, as well as other companies in the same industry, to figure out how much its earnings could go up and down each year. There’s a good chance that the analyst will assume the random fluctuations in earnings follow a normal distribution.

If the company needs to earn at least $5M in gross income to afford the loan, the analyst might look at where $5M falls on the normal distribution. The farther to the right that number sits, the more likely the company is to fall short in a given year. If it is to the right of the average earnings, the company would miss the mark more often than not, and the default risk would be higher. To compensate for that risk, the lender could require a higher interest rate. Or they might just deny the loan.
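Here is a minimal sketch of that calculation. The earnings figures (averages of $6M and $4.5M, a standard deviation of $1M) are hypothetical numbers chosen for illustration, and the normality of earnings is itself an assumption.

```python
from statistics import NormalDist

REQUIRED_EARNINGS = 5.0  # $M needed to afford the loan (from the example)


def shortfall_probability(mean_earnings, stdev_earnings):
    """Chance that a year's earnings land below the required level."""
    earnings = NormalDist(mu=mean_earnings, sigma=stdev_earnings)
    return earnings.cdf(REQUIRED_EARNINGS)


# Hypothetical company A: average earnings above the threshold.
risk_a = shortfall_probability(mean_earnings=6.0, stdev_earnings=1.0)

# Hypothetical company B: average earnings below the threshold.
risk_b = shortfall_probability(mean_earnings=4.5, stdev_earnings=1.0)

print(f"company A falls short {risk_a:.1%} of the time")
print(f"company B falls short {risk_b:.1%} of the time")
```

With the threshold one standard deviation below its average, company A misses in roughly one year out of six. Company B, whose average sits below the threshold, misses in most years.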

### Projecting Returns

If you look at the rate of return that investors get from owning stocks, each investment generates a unique value. But if you plotted all of the different returns on a graph, you would notice something familiar. While some companies excel and others lag, most of the companies in an industry will have performed somewhere in the middle of the pack. In other words, it will look a lot like a normal distribution.

Although every investor hopes to own only the best-performing stocks on the market, there’s a good chance that most investors won’t. That’s why projecting returns is more of an estimate than a guarantee. The most likely outcome will be somewhere around the industry average, and a good representation of the uncertainty is the standard deviation.

### Diversification

Even if the returns on a specific stock don’t follow a normal distribution, the average return of a large suite of stocks tends to look more like one. That tendency comes from the central limit theorem, although it holds only loosely for stocks, whose returns tend to move together rather than independently. If you pick any two stocks, it’s unlikely that you would get the best two. But even if you did, the average return of those two stocks can’t be higher than the return of the best stock by itself. This is always true.

If you increase the number of stocks you hold, your annual return will get closer and closer to the average of the market. You can prove this to yourself by imagining spreading your investments across every single stock in the market. In that case, your return would be the market return. But there’s something else going on behind the scenes. The standard deviation of your portfolio’s return gets smaller as you add more stocks, because the individual ups and downs partly cancel out. That’s the law of large numbers in action.

Once you have more than a handful of stocks in a portfolio, your uncertainty begins to shrink, and your return gets closer to the market return. Your possible outcomes get closer and closer to a normal distribution, and the standard deviation of the mean gets smaller.
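The shrinking uncertainty can be simulated. This sketch makes a strong simplifying assumption: every stock’s return is independent, with the same made-up mean (8%) and standard deviation (20%). Real stocks move together, so actual portfolios shrink their risk more slowly than this.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MEAN_RETURN = 0.08   # assumed average annual return of one stock
STDEV_RETURN = 0.20  # assumed standard deviation of one stock's return
TRIALS = 2000        # simulated portfolios per portfolio size


def portfolio_return(num_stocks):
    """Average return of num_stocks equally weighted, independent stocks."""
    returns = [random.gauss(MEAN_RETURN, STDEV_RETURN) for _ in range(num_stocks)]
    return statistics.fmean(returns)


spread_by_size = {}
for num_stocks in (1, 4, 16, 64):
    outcomes = [portfolio_return(num_stocks) for _ in range(TRIALS)]
    spread_by_size[num_stocks] = statistics.stdev(outcomes)
    # Theory for independent stocks: stdev of the average = stdev / sqrt(n).
    theory = STDEV_RETURN / num_stocks ** 0.5
    print(f"{num_stocks:>2} stocks: simulated {spread_by_size[num_stocks]:.3f}, "
          f"theory {theory:.3f}")
```

Under the independence assumption, quadrupling the number of stocks halves the standard deviation of the portfolio’s return.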

### Hypothesis Testing

Consider an oil futures contract for delivery next month. Every day, the price of that contract changes. Buyers and sellers look at the data, the news, and the weather forecast to try to predict what’s going to happen next. Then, they decide to buy or sell these contracts based on what they speculate will happen.

Consequently, the price of oil moves around each day. If you plotted those daily price movements, it might look like a normal distribution. Then you could get a good sense of whether a price movement is just a random outcome in the market, or if it signals a significant change in the fundamentals.
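One way to make that judgment concrete is to convert a price move into a Z-score and a two-sided probability. The daily price changes below are hypothetical. With so short a history, the normal approximation is rough (a t-distribution would be more cautious).

```python
from statistics import NormalDist, fmean, stdev

# Hypothetical daily price changes in dollars (invented for this example).
history = [0.4, -0.3, 0.1, -0.6, 0.5, 0.2, -0.1, -0.4,
           0.3, 0.0, -0.2, 0.6, -0.5, 0.1, -0.1]

center = fmean(history)
spread = stdev(history)


def two_sided_p(move):
    """Chance of a move at least this far from the average, in either direction."""
    z = abs(move - center) / spread
    return 2 * (1 - NormalDist().cdf(z))


for move in (0.3, 1.8):
    p = two_sided_p(move)
    verdict = "unusual" if p < 0.05 else "ordinary"
    print(f"move of {move:+.1f}: p = {p:.4f} ({verdict})")
```

The 5% cutoff is a common convention rather than a rule. A move flagged as unusual is a prompt to look at the fundamentals, not proof that they changed.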

## What are some examples of real-life normal distributions?

Distributions that have many properties of normality happen everywhere you look. Imagine flipping 100 coins. You should expect about 50 of them to come up heads. You probably wouldn’t be surprised if only 49 landed heads up, but you would (and should) be surprised if none of them did.
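The coin-flip intuition can be checked with exact arithmetic. This sketch computes the binomial probabilities for 100 fair flips and compares one of them with a normal approximation (mean of 50, standard deviation of 5).

```python
import math
from statistics import NormalDist

FLIPS = 100


def prob_heads(k):
    """Exact probability of exactly k heads in FLIPS fair coin flips."""
    return math.comb(FLIPS, k) / 2 ** FLIPS


p50 = prob_heads(50)
p49 = prob_heads(49)
p0 = prob_heads(0)

# Normal approximation: mean = n/2, stdev = sqrt(n)/2, with the bar for
# "exactly 50" stretched from 49.5 to 50.5 (a continuity correction).
approx = NormalDist(mu=FLIPS / 2, sigma=math.sqrt(FLIPS) / 2)
p50_approx = approx.cdf(50.5) - approx.cdf(49.5)

print(f"exactly 50 heads: {p50:.4f} (normal approximation {p50_approx:.4f})")
print(f"exactly 49 heads: {p49:.4f}")
print(f"no heads at all:  {p0:.1e}")
```

Getting 49 heads is almost as likely as getting 50, while getting none is astronomically unlikely. The bell curve tracks the exact count closely.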

The same idea is valid with rolling dice, picking stocks, or looking at daily price charts. You would expect what you observe to be close to the actual average and are probably skeptical when something appears far from the middle of the pack. That intuition stems from a lifetime of observing the real world, in which many things tend to follow a normal distribution.

In general, if something has natural variability in its measurement, its distribution tends to look a lot like a normal distribution. People’s height is a classic example. Some people are tall, while others are short. But there are far more people that are six feet tall than seven feet tall. When values become less common as they get farther from the average, that’s usually a characteristic of a normal distribution. The same is true with the weight of fruit, the circumference of trees, the blood pressure of patients, and the number of times a hummingbird flaps its wings in a minute. Almost anything you can measure in nature that has some proportional random fluctuation involved has observations that look somewhat like a normal distribution.
