What are Descriptive Statistics?
Descriptive statistics are numbers that summarize a larger set of numbers, allowing people to quickly understand important features of the data.
🤔 Understanding descriptive statistics
Descriptive statistics are numbers that convey information about a larger or more complex set of numbers (also called a dataset). In business, descriptive statistics are the outputs from descriptive analytics. They allow someone to gain insight about the data without diving into the details and unique circumstances that each data point represents. In statistics, descriptive statistics are the standard summary statistics reported for a dataset when conducting a research project. They include some information about how dispersed (spread out) the data is, what the average values look like, and how unbalanced the observations are. Descriptive statistics are bite-sized pieces of information that provide general insight about the larger dataset.
On April 13, 2020, Delta Air Lines stock closed at $23.25 while Southwest Airlines stock closed at $34.25. But comparing stock prices alone doesn’t provide enough information.
An investor might want to compare the stock price to its earnings as a measure of value. The descriptive statistic that does this is called the price-to-earnings ratio (P/E ratio). It translates complex information about the company’s operations, and its stock price, into a quick piece of information for investors.
For reference, the P/E ratios on that day were 3.33 for Delta and 8.65 for Southwest. A P/E ratio is a plain multiple rather than a dollar amount, so these figures imply that Delta was selling for less than Southwest per dollar of earnings.
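To make the arithmetic concrete, here is a minimal Python sketch of the P/E calculation. The article gives only the closing prices and the P/E ratios, so the earnings-per-share figures below are back-calculated from those two numbers. Treat them as illustrative assumptions, not reported earnings. The function names are hypothetical.

```python
def pe_ratio(price_per_share: float, earnings_per_share: float) -> float:
    """Price-to-earnings ratio: share price divided by earnings per share."""
    if earnings_per_share <= 0:
        raise ValueError("P/E is not meaningful for zero or negative earnings")
    return price_per_share / earnings_per_share


def implied_eps(price_per_share: float, pe: float) -> float:
    """Earnings per share implied by a price and a P/E ratio (assumption, not reported data)."""
    return price_per_share / pe


# Closing prices and P/E ratios quoted in the article for April 13, 2020.
quotes = {"Delta": (23.25, 3.33), "Southwest": (34.25, 8.65)}

for name, (price, pe) in quotes.items():
    eps = implied_eps(price, pe)
    print(f"{name}: price ${price:.2f}, P/E {pe:.2f}, implied EPS ${eps:.2f}")
```

The implied earnings per share work out to roughly $6.98 for Delta and $3.96 for Southwest. The lower-priced stock also carried the lower multiple, which the share prices alone would not show.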
Takeaway
Descriptive statistics are like the local news…
When you flip on the local 6 o’clock news, you get about 30 minutes of information about what is going on in your community. But there is much more to the stories than what shows up on the TV. Plus, there are a lot of things that didn’t make the broadcast. The producer and the rest of the crew filter out the things they think aren’t as important. Then they distill that information down into a summary that describes the gist of what happened. Descriptive statistics do something similar with data.
What are descriptive statistics?
Descriptive statistics are simple ways to describe a more complicated set of information. It is a term often reserved for statistics, but it can also apply to business.
For example, say your regional manager wants a briefing on your sales from last quarter. You could send a spreadsheet full of numbers that detail every transaction your store made. That would certainly give your manager everything they could want to know. But it is probably not what they are asking for.
More likely, they just want to know a few key performance indicators (KPIs) that they can compare to a benchmark. Those KPIs are descriptive statistics, which might include the average customer order, the amount of labor costs per dollar of revenue, and the gross profit margin (the percentage of total income that doesn’t go toward direct costs).
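As a rough sketch of how those three KPIs might be computed, the Python below uses a handful of made-up transactions. Every figure in it is hypothetical and chosen only to keep the arithmetic easy to follow. None of it comes from a real store.

```python
# All figures below are hypothetical and exist only to illustrate the arithmetic.
order_totals = [42.50, 18.00, 63.25, 27.75, 48.50]  # made-up customer orders, in dollars
labor_cost = 60.00           # made-up labor cost for the same period
cost_of_goods_sold = 110.00  # made-up direct costs for the same period

revenue = sum(order_totals)

# Average customer order: total revenue divided by the number of orders.
average_order = revenue / len(order_totals)

# Labor cost per dollar of revenue.
labor_per_revenue_dollar = labor_cost / revenue

# Gross profit margin: share of revenue left after direct costs, as a percentage.
gross_margin_pct = (revenue - cost_of_goods_sold) / revenue * 100

print(f"Average order:            ${average_order:.2f}")
print(f"Labor per revenue dollar: ${labor_per_revenue_dollar:.2f}")
print(f"Gross profit margin:      {gross_margin_pct:.1f}%")
```

Three numbers stand in for the full list of transactions, and that is the trade descriptive statistics make: less detail in exchange for a quicker read.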
Converting information into insight, including descriptive statistics, is called descriptive analytics. Evaluating the details and nuance of the data, especially the outliers, is called diagnostic analytics.
What is the main purpose of descriptive statistics?
The primary purpose of descriptive statistics is to convey information quickly. In business, the person receiving the information may not have the time or skills required to analyze data. That is one reason why data analysts take complex information and reduce it into something more digestible.
A management team can take the descriptive statistics into account as they consider changes in a company’s strategy. These descriptive statistics allow managers to understand if the current plan is working and if course corrections are required.
Investors often use descriptive statistics to get a feel for a company’s finances, performance, value, and growth potential. Researchers may use descriptive statistics to understand and communicate the details of the set of data they are using.
What are the types of descriptive statistics?
A descriptive statistic is anything that describes a broader set of information. A great example is how people reduce the game of baseball down to a few numbers. The batting average immediately tells viewers how often the batter effectively hits the ball. The runs batted in (RBI) shows viewers how good the hitter is at moving runners. Stolen bases say something about the player’s speed. The home run count informs you about their power. These are all descriptive statistics that allow us to get a good feeling for a player’s ability without watching every at-bat.
Broadly, any number that summarizes a dataset can serve as a descriptive statistic. In the field of statistics, though, descriptive statistics typically fall into three categories: location, dispersion, and shape.
Location Statistics
Probably the most commonly used descriptive statistics are measures of central tendency. Most people are familiar with taking the average of a set of numbers. It is the sum of the numbers divided by how many values are in the set. That is one measure of central tendency, which statisticians call the mean value. The average, or mean, helps tell you what to expect. It sets your baseline assumption about something.
However, some types of data have extreme values (called outliers), which distort the average. Imagine 10 people that have various income levels. Nine of them have incomes around $10,000 per year. The other is a CEO who earns $10,000,000 a year. The average income in that group is about $1,009,000 apiece. But that average isn’t all that descriptive of the group.
They certainly are not all millionaires. A different measure of central tendency, called the median, is a better descriptive statistic here. The median is the middle value. It is the value at the center of the data. In this case, the middle value of these incomes is $10,000.
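The income example can be checked with Python's standard `statistics` module. This sketch assumes the nine incomes are exactly $10,000 each (the article says "around" that figure), which reproduces the $1,009,000 average.

```python
import statistics

# Assumption: nine incomes of exactly $10,000 plus one CEO earning $10,000,000.
incomes = [10_000] * 9 + [10_000_000]

mean_income = statistics.mean(incomes)      # pulled far upward by the single outlier
median_income = statistics.median(incomes)  # middle of the sorted values

print(f"Mean income:   ${mean_income:,.0f}")
print(f"Median income: ${median_income:,.0f}")
```

One extreme value moves the mean by roughly a million dollars, while the median stays at what a typical member of the group earns.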
There is one other common descriptive statistic that measures central tendency. It is called the mode. The mode is simply the most common value in the data. It can be helpful if you want to know what happens most often.
Consider the following small dataset: [1, 1, 3, 3, 3, 4, 4, 5, 6, 7, 117]
The three measures of central tendency work out to the following values. Each conveys a slightly different piece of information about the data.
Mean (the average) = 14
Median (the number in the middle) = 4
Mode (the number that occurs most often) = 3
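For readers who want to verify these values, here is a short sketch using Python's standard `statistics` module. The last line, which drops the outlier, is an added illustration rather than something the article computes.

```python
import statistics

data = [1, 1, 3, 3, 3, 4, 4, 5, 6, 7, 117]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

print("Mean:  ", mean_value)
print("Median:", median_value)
print("Mode:  ", mode_value)

# Added illustration: the mean once the outlier (117) is left out.
mean_without_outlier = statistics.mean(data[:-1])
print("Mean without 117:", mean_without_outlier)
```

Removing the single value of 117 drops the mean from 14 to 3.7, while the median and mode barely depend on it.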
Dispersion Statistics
The central tendency of a dataset conveys a lot of information. But two datasets can be very, very different from each other, and still have the same mean, median, or mode. Consider these two small datasets:
Observation | Dataset A | Dataset B |
Observation 1 | 45 | 5 |
Observation 2 | 50 | 50 |
Observation 3 | 55 | 95 |
Both of these datasets have the same mean and median (50), and neither has a meaningful mode because no value repeats. But the two datasets are nothing alike. Measures of spread help tell the rest of the story.
One such descriptive statistic is the variance. The variance is a single number that describes how far, on average, the data points sit from the mean of the dataset.
Technically, it is the average of the squared distances from the mean. But you can intuitively see that the values in Dataset A are spread out by only five from the middle, while the values in Dataset B reach as far as 45 from the center.
The variance would describe that fact. The square root of the variance is called the standard deviation. Because it is expressed in the same units as the data, it helps show the typical spread of the data around the average.
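A small sketch with Python's standard `statistics` module shows how these measures separate the two datasets. It reports both the population variance (dividing by the number of values) and the sample variance (dividing by one less than that). Which of the two applies depends on whether the data is a whole population or a sample, a choice the article does not specify.

```python
import statistics

datasets = {
    "Dataset A": [45, 50, 55],
    "Dataset B": [5, 50, 95],
}

spread = {}
for name, values in datasets.items():
    spread[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "population_variance": statistics.pvariance(values),  # divides by n
        "sample_variance": statistics.variance(values),       # divides by n - 1
        "sample_std_dev": statistics.stdev(values),           # square root of sample variance
    }

for name, stats in spread.items():
    print(name)
    for label, value in stats.items():
        print(f"  {label}: {value:.2f}")
```

The means and medians match, but the sample standard deviations come out to 5 for Dataset A and 45 for Dataset B, which lines up with the spread you can see by eye.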
Another descriptive statistic that informs you about how much spread there is in the data is the frequency distribution. It shows how often the data falls into various “bins” of values. Sometimes, the sorted data is divided into four equal-sized groups by cut points called quartiles, and descriptive statistics are built on them.
For example, an analysis might report the interquartile range (the distance between the first and third quartiles, which spans the middle half of the data), or the mean or variance within each quarter.
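Here is a minimal sketch of a frequency distribution and the interquartile range, reusing the small dataset from the central tendency example. Quartiles can be computed under several conventions. This one relies on the default ("exclusive") method of Python's `statistics.quantiles`, so other tools may report slightly different cut points.

```python
import statistics
from collections import Counter

data = [1, 1, 3, 3, 3, 4, 4, 5, 6, 7, 117]

# Frequency distribution: how many times each distinct value appears.
frequency = Counter(data)
for value in sorted(frequency):
    print(f"{value:>3}: {'#' * frequency[value]}")

# Quartile cut points split the sorted data into four equal-sized groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: distance between the first and third quartiles.
iqr = q3 - q1
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```

The interquartile range ignores the extreme value of 117 entirely, which is why it is often reported alongside the median when outliers are present.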
Shape Statistics
Other descriptive statistics describe the shape of the data when it is plotted on a graph. One measure is called skewness, which describes how lopsided the observations are around the middle. Another is kurtosis, which describes how heavy the tails of the distribution are, that is, how much of the data sits in extreme values rather than near the center.
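Both shape measures can be sketched in plain Python from their moment-based definitions. This version uses simple population moments with no small-sample correction, so statistical packages that apply such corrections may return somewhat different values. The function names are hypothetical.

```python
def central_moment(values, k):
    """Average of (value - mean) raised to the k-th power."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)


def skewness(values):
    """Third central moment scaled by the variance; 0 for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth central moment scaled by the variance, minus 3 (the normal-curve value)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [45, 50, 55]
lopsided = [1, 1, 3, 3, 3, 4, 4, 5, 6, 7, 117]

for name, values in [("symmetric", symmetric), ("lopsided", lopsided)]:
    print(f"{name}: skewness={skewness(values):.3f}, "
          f"excess kurtosis={excess_kurtosis(values):.3f}")
```

The symmetric dataset has a skewness of zero, while the dataset with the single large value has positive skewness, meaning its long tail points toward the higher numbers.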
What is the difference between descriptive and inferential statistics?
Descriptive statistics describe the dataset in front of you; inferential statistics use that dataset to draw conclusions about a larger group that was not fully observed.
Imagine seeing a Ford Mustang sitting at the starting line of a drag race. You can describe the car. It’s red; it has four tires, and it has a 480 horsepower V-8 engine.
You can also make inferences about the sports car that you can’t directly observe. For instance, you can assume the car is fast. After all, would it be on the racetrack if it weren’t?
The difference between descriptive and inferential statistics is similar. Except, there is a distinction that must be understood when talking about statistics. To do so, we need to define a few terms.
First, a population is the complete set of something. Every person who lives in the United States is part of the population of the United States. If you measured the height of every American, you would know the average height of all Americans. That true value for the whole population is called a parameter, so in that case you would have the height parameter of Americans.
But it is often impossible to measure every single person in a population. That is why you usually take samples from the population to make generalizations about everyone else. The average height of 1,000 randomly chosen Americans can provide you with a height statistic.
Descriptive statistics, like the average height of that 1,000-person sample, only tell you about the sample. That sample might not do a good job of describing the rest of the people. Perhaps it is accidentally made up of only tall people, which would result in a significant sampling error.
Inferential statistics help provide information about how likely a descriptive statistic is to represent the population being estimated. Inferential statistics are important in research studies when scientists conduct hypothesis testing.
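One common way to express that likelihood is a confidence interval around the sample mean. The sketch below is only an illustration under stated assumptions: the 1,000 heights are simulated from a bell curve with a made-up average of 170 cm and a made-up standard deviation of 10 cm, and the interval uses the usual normal approximation (1.96 standard errors for roughly 95% confidence).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the simulated sample is reproducible

# Simulated sample: 1,000 heights in centimeters (hypothetical, not real survey data).
sample = [random.gauss(170, 10) for _ in range(1000)]

# Descriptive statistic: the mean of the sample itself.
sample_mean = statistics.mean(sample)

# Inferential step: estimate how much the sample mean would vary between samples.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval for the population mean.
margin = 1.96 * standard_error
confidence_interval = (sample_mean - margin, sample_mean + margin)

print(f"Sample mean: {sample_mean:.2f} cm")
print(f"Standard error: {standard_error:.3f} cm")
print(f"Approx. 95% CI: {confidence_interval[0]:.2f} to {confidence_interval[1]:.2f} cm")
```

The sample mean is the descriptive statistic. The standard error and the interval around it are the inferential part, since they say how far the population value could plausibly be from that sample mean.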
How do you write a descriptive statistical analysis?
A descriptive data analysis is usually a table of summary statistics. To be understandable to its intended audience, it may be accompanied by a narrative description of key information about the data. In most cases, the information will best be communicated through the use of accompanying visualizations.
The stocks mentioned are for illustrative purposes only and are not a recommendation of a security or investment strategy.