# What is a Population?

A population is the complete set of something of interest — perhaps all of the people in the state of California, or every acre of farmland in the state of Nebraska.

## 🤔 Understanding population

A population is the entire collection of something of interest. In common usage, the word population refers to a count of people or animals. In statistics, however, a population is not limited to living things. For instance, all of the rocks on the moon make up the population of moon rocks. Every case of a genetic disease would be the population of cases of that disease. And every car that has ever rolled off the Ford assembly line is the population of Ford cars. Measuring an entire population is often difficult or impossible. That’s why scientists measure a sample and use it to make inferences about the rest of the members of a population.

Every decade, the US Census Bureau embarks on a mission to count every person in the United States. That effort results in the best possible understanding of the US population. From that information, congressional district boundaries are redrawn to improve how the population is represented around the country. In this case, the population of interest is every human being living inside the borders of the United States. A count of everyone in the population is called a census.

## Takeaway

A population is like a big pot of soup…

Imagine a large pot on the stove in a restaurant. The chef fills it with water and beef stock. He throws in some meat, potatoes, and vegetables. After letting it sit over low heat for a few hours, he adds some herbs and spices. Before serving that soup to customers, the chef wants to know how it tastes. But, tasting every bite wouldn’t leave anything for the diners. Instead, he takes a spoonful (sample) and assumes that the rest of the pot (population) tastes about the same (has the same characteristics).

New customers need to sign up, get approved, and link their bank account. The cash value of the stock rewards may not be withdrawn for 30 days after the reward is claimed. Stock rewards not claimed within 60 days may expire. See full terms and conditions at rbnhd.co/freestock. Securities trading is offered through Robinhood Financial LLC.

## What is a Population?

Most people encounter the word population in everyday life. It’s usually in the context of how many people live in a specific area. For example, the population of California is about 40 million people. Or the population of Turkey is estimated to be 84 million people. In essence, it means that there is an invisible line on the ground. The population is the count of everyone that lives inside that boundary.

In statistics, the same idea applies. You start by defining that boundary, but it doesn’t need to be on the ground. Then, everything inside that boundary makes up the population of interest. For instance, say you wanted to understand the size of redwood trees in Redwood National Park. The borders of the park define the physical boundary. The fact that you’re interested in redwood trees further defines the population of interest. Therefore, you wouldn’t count the Douglas-firs, western hemlocks, tanoaks, or madrones. The population of interest is only the redwood trees that exist inside the park boundary.

Sometimes, the population of interest isn’t so neat and tidy. Say a researcher wanted to know the effectiveness of a new treatment for COVID-19. The population of interest is everyone who has that disease. But that number is constantly changing as some people contract the disease, and others recover or die. So, the population of the study is difficult to measure.

In some instances, the population might not even be finite; for example, all possible rolls of a particular set of dice. Because you could theoretically keep throwing those dice forever, the number of rolls in the population is infinite.
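Since an infinite population can never be listed in full, any analysis of it has to work with a finite number of rolls. The following is a minimal sketch, assuming two fair six-sided dice; the helper name `roll_two_dice` and the sample sizes are chosen only for illustration. It shows the average total from progressively larger samples of rolls.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def roll_two_dice():
    """Return the total of one roll of two fair six-sided dice."""
    return random.randint(1, 6) + random.randint(1, 6)


# The population of all possible rolls is infinite,
# so each analysis below uses a finite sample of rolls.
for n in (10, 1_000, 100_000):
    rolls = [roll_two_dice() for _ in range(n)]
    average = sum(rolls) / n
    print(f"{n:>7,} rolls: average total = {average:.3f}")
```

The theoretical average total for two fair dice is 7, and the larger samples should land closer to it than the smaller ones.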

## What is the difference between a sample and a population?

A population consists of every member within a defined group. A sample is a subset of a population. For example, if the population of interest is all of the books in a specific library, randomly taking a dozen books off the shelves is a sample of what the library has in stock. Or, if you’re planning a large event and need a caterer, all of the food that caterer makes is the population of interest. Tasting one bite of each dish is a sample.

Samples allow you to get a feel for the rest of the population. Those few bites of food give you a decent idea of how the rest of the banquet will taste. If the sample is representative, its characteristics reflect those of the larger population of food. In that case, it’s reasonable to make generalizations about the things you don’t eat based on the things you do.

Likewise, that sample of 12 books from the library should give you an idea about the rest of the books on the shelves, unless you only picked books from one section of the library. If you did, you can’t make reliable assumptions about the rest of the bookshelves. For example, if you only grabbed children’s books, the average length of your sample would not give you an unbiased estimate of the length of all the books in the library.
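The library example can be sketched in a few lines of Python. Everything here is hypothetical: the library size, the split between children's books and other books, and the page counts are simulated for illustration. The sketch compares a random sample of 12 books with a sample of 12 drawn only from the children's section.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical library: page counts are simulated, not real data.
children_books = [random.randint(20, 60) for _ in range(2_000)]
other_books = [random.randint(150, 600) for _ in range(8_000)]
library = children_books + other_books

# Parameter: the true average length of every book in the library.
population_mean = statistics.mean(library)

# Random sample: every book has the same chance of being picked.
random_sample = random.sample(library, 12)

# Biased sample: all 12 books come from the children's section.
biased_sample = random.sample(children_books, 12)

print(f"Population mean length: {population_mean:.0f} pages")
print(f"Random sample mean:     {statistics.mean(random_sample):.0f} pages")
print(f"Biased sample mean:     {statistics.mean(biased_sample):.0f} pages")
```

In this simulated setup, the biased sample can never average more than 60 pages, no matter how many books are drawn, because the longer books have no chance of being selected.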

## What is the difference between population samples and population parameters?

In statistics, a population parameter describes something about the entire population. For instance, say you wanted to know how much money people carry around. You could try to count the amount of money that every person in your city has in their pocket. The average cash held by all of those people would be a parameter. It describes the whole population with no ambiguity.

However, you don’t need to interview every single person to get a good idea of that parameter. You could just interview 50 random people. That would provide you with a sample of the population. The average amount of cash held by those 50 people is a statistic (an estimate of a parameter), which gives you an idea about everyone else. The sample mean (the average of the values in a sample) is a reasonable estimate of the population mean, within some margin of error.
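As a rough illustration of the difference between a parameter and a statistic, the sketch below simulates a hypothetical city of 100,000 people. The cash amounts are generated from an exponential distribution with an average of about $40, which is an assumption made only so there is something to sample from.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical city: simulated cash (in dollars) carried by each person.
city_cash = [round(random.expovariate(1 / 40), 2) for _ in range(100_000)]

# Parameter: the average across the entire population.
population_mean = statistics.mean(city_cash)

# Statistic: the average across a random sample of 50 people.
sample = random.sample(city_cash, 50)
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): ${population_mean:.2f}")
print(f"Sample mean (statistic):     ${sample_mean:.2f}")
```

The two numbers will rarely match exactly. Drawing a different set of 50 people gives a different sample mean, while the parameter stays fixed.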

Parameter has another definition, which can be confusing. The other meaning of parameter is a value that defines the shape of a distribution or model. For example, a normal distribution (aka bell curve) is a function that describes how a set of values is spread around its average. It’s symmetrical, with most of the occurrences located close to the center. There are some observations far from the middle, but they occur less often. The normal distribution has two parameters: the mean (average of all values) and the standard deviation (a measure of dispersion, describing how far observations stray from the middle). Those parameters define the shape of the distribution.
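To see how the two parameters shape a normal distribution, the sketch below uses `NormalDist` from Python's standard `statistics` module. The mean of 100 and the standard deviations of 5 and 20 are arbitrary values picked for illustration.

```python
from statistics import NormalDist

# Two normal distributions with the same mean but different standard deviations.
narrow = NormalDist(mu=100, sigma=5)
wide = NormalDist(mu=100, sigma=20)

for name, dist in [("narrow", narrow), ("wide", wide)]:
    # Share of values within one standard deviation of the mean.
    within_one_sd = dist.cdf(dist.mean + dist.stdev) - dist.cdf(dist.mean - dist.stdev)
    # Height of the curve at its center: taller when the standard deviation is smaller.
    peak = dist.pdf(dist.mean)
    print(f"{name}: peak height = {peak:.4f}, within 1 SD = {within_one_sd:.1%}")
```

Both curves hold about 68% of their values within one standard deviation of the mean, but the narrow curve is taller and more tightly packed around the center.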

## How useful is a population in statistics?

The primary purpose of statistics is to make generalizations about the broader population. In fact, a statistic is, by definition, a value computed from a sample, and it usually serves as an estimate of a population parameter (the real value that describes an entire population).

Descriptive statistics describe the outcomes of a sample, usually including information about its central tendency (the average, middle, and most common values) and its dispersion (how far the values are spread out). Inferential statistics provide information about how reliable the descriptive statistics are as estimates of the broader population. Ultimately, the goal of statistics is to understand a population.
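The sketch below computes a few common descriptive statistics for a small, made-up sample, then adds the standard error of the mean as one simple inferential measure. The ten cash amounts are hypothetical.

```python
import math
import statistics

# Hypothetical sample: cash (in dollars) carried by 10 people.
sample = [0, 5, 12, 20, 20, 25, 40, 60, 75, 120]

# Central tendency
mean = statistics.mean(sample)      # the average
median = statistics.median(sample)  # the middle value
mode = statistics.mode(sample)      # the most common value

# Dispersion
stdev = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)

# Inferential: how much the sample mean is expected to vary between samples.
standard_error = stdev / math.sqrt(len(sample))

print(f"mean={mean}, median={median}, mode={mode}")
print(f"standard deviation={stdev:.1f}, standard error={standard_error:.1f}")
```

A smaller standard error suggests the sample mean is a more reliable estimate of the population mean.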

## Why are samples used more often?

Using a sample is much more manageable than trying to count the target population. Without sampling, it would be extremely challenging to understand anything around us. Imagine trying to understand what’s going on in the labor market. The Bureau of Labor Statistics (BLS) would need to track every working-age adult in the country and know whether or not they have a job at any given moment. If a person doesn’t have a job, the BLS would need to determine if that was by choice or not.

Keeping tabs on millions of people would be an impossibly difficult task. But the BLS doesn’t need to do that to get a decent estimate of what’s going on. By randomly selecting a few thousand households, then interviewing them about their employment situation, the BLS can approximate the status of the rest of the workforce.

Samples provide imprecise but reasonable approximations of the bigger picture. Samples are quicker, cheaper, and easier to collect than a census. And unless the sample is small, statistics computed from a random sample are reasonable estimates of what is really going on.
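The effect of sample size can be sketched with a simulation. The workforce below is hypothetical, and its 5% unemployment rate is an assumption for illustration, not a BLS figure. The helper `estimate_spread` is likewise invented for this sketch and is not a description of how the BLS works. It draws many random samples of a given size and measures how much the resulting estimates vary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical workforce of 200,000 people: 1 = employed, 0 = unemployed.
workforce = [0] * 10_000 + [1] * 190_000
true_rate = 1 - sum(workforce) / len(workforce)


def estimate_spread(sample_size, repetitions=200):
    """Draw many random samples and return the standard deviation of the estimates."""
    estimates = []
    for _ in range(repetitions):
        sample = random.sample(workforce, sample_size)
        estimates.append(1 - sum(sample) / sample_size)
    return statistics.stdev(estimates)


spreads = {n: estimate_spread(n) for n in (50, 500, 5_000)}

print(f"True unemployment rate: {true_rate:.1%}")
for n, spread in spreads.items():
    print(f"Sample size {n:>5,}: estimates typically vary by about {spread:.2%}")
```

Larger samples produce estimates that cluster more tightly around the true rate, which is why a few thousand households can stand in for millions of people.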

## When is a population parameter necessary for research?

A population parameter is a characteristic that defines a population — like the average height of a specific variety of corn stalk. The population mean parameter would be the actual average of all of the corn stalks. But measuring each plant in the entire population is tedious and unnecessary.

Say your research project was to determine the effectiveness of some new fertilizer. You might plant the corn in a few sections of land without any nourishment, plant other plots with the current fertilizer, and others with the new treatment. Then, statisticians would compare the results of each subpopulation using confidence intervals (a range of values that are feasible with a specified level of confidence) and hypothesis tests.

You wouldn’t need the population parameter to determine whether or not the new treatment is effective. Instead, you could take a sufficiently large simple random sample to test the null hypothesis (default assumption) that the fertilizer doesn't work, then make a statistical inference about its effectiveness. In reality, the entire experiment is a sample of all future plants that are yet to be grown. It might be impossible to determine the precise parameter of the overall population. But you can approximate it with a high degree of confidence. It’s almost always the case that research uses statistics to approximate a parameter. Measuring the population parameter directly is rarely necessary.
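One simple way to test such a null hypothesis is a permutation test, sketched below. This is a swapped-in technique chosen because it needs only the standard library; it is not necessarily what a statistician would use for a real field trial. The corn heights are invented for illustration.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical corn stalk heights in centimeters (invented values).
current_fertilizer = [241, 238, 250, 245, 239, 247, 243, 240, 246, 244]
new_fertilizer = [252, 249, 258, 255, 247, 260, 251, 254, 256, 250]


def mean(values):
    return sum(values) / len(values)


observed_diff = mean(new_fertilizer) - mean(current_fertilizer)

# If the null hypothesis is true, the group labels are arbitrary.
# Shuffle the labels many times and count how often chance alone
# produces a difference at least as large as the observed one.
combined = current_fertilizer + new_fertilizer
n_new = len(new_fertilizer)
trials = 10_000
at_least_as_large = 0
for _ in range(trials):
    random.shuffle(combined)
    diff = mean(combined[:n_new]) - mean(combined[n_new:])
    if diff >= observed_diff:
        at_least_as_large += 1

p_value = at_least_as_large / trials
print(f"Observed difference: {observed_diff:.1f} cm")
print(f"Approximate p-value: {p_value:.4f}")
```

A small p-value means a difference this large would be unlikely if the new fertilizer had no effect, which is evidence against the null hypothesis.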

One instance in which a population parameter is required is in determining the amount of representation each state gets in the United States House of Representatives. Once every decade, the US takes a census of the population.
