What is the Sum of Squares?

Robinhood Learn
Democratize Finance For All. Our writers’ work has appeared in The Wall Street Journal, Forbes, the Chicago Tribune, Quartz, the San Francisco Chronicle, and more.
Definition:

The sum of squares is a statistical measure that describes how far data points are spread out from their average.

🤔 Understanding sum of squares

Sum of squares is a way to capture how dispersed (spread out) the numbers in a dataset are. The sum of squares gets its name from the way you calculate it — summing up the squared difference between an observation and the target value. The total sum of squares is a measure of deviation from a mean data point. Those deviations are a combination of numbers above and below the target value. Therefore, some numbers are positive, and some are negative. The simple sum of those deviations is always zero (by definition). To describe how spread out those numbers are from each other, they are first squared (which makes them all positive values) and then added together. The result is the sum of squares. In regression analysis (estimating one variable using another), the sum of squares is a measure of how well an estimate fits the data.

Example

Assume you wanted to compare the stock prices of two companies — Chieps and Dabble. They trade at nearly equal values, but they are not the same. Consider the trading week of May 4th through 8th.

           Chieps     Dabble
Day 1      $312.50    $310.13
Day 2      $304.87    $302.92
Day 3      $297.79    $299.82
Day 4      $293.74    $296.76
Day 5      $291.29    $292.37
Average    $300.04    $300.40

Although the average closing price for the week is nearly identical, Chieps moved around more than Dabble did. That’s information that doesn’t show up when only looking at average values. The sum of squares gives you some information about how much movement there is in the data. In this example, Chieps has a sum of squares value of 300, while Dabble’s is 179. The lower number indicates less dispersion.
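If you'd like to check those two figures yourself, here is a minimal Python sketch that recomputes them from the closing prices in the table. The function name `total_sum_of_squares` is just an illustrative choice, not something taken from a library.

```python
def total_sum_of_squares(values):
    """Sum of the squared deviations of each value from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


# Closing prices from the table above (Day 1 through Day 5).
chieps = [312.50, 304.87, 297.79, 293.74, 291.29]
dabble = [310.13, 302.92, 299.82, 296.76, 292.37]

chieps_ss = total_sum_of_squares(chieps)
dabble_ss = total_sum_of_squares(dabble)

print(f"Chieps: {chieps_ss:.0f}")  # Chieps: 300
print(f"Dabble: {dabble_ss:.0f}")  # Dabble: 179
```

The two stocks have almost the same average, but the larger value for Chieps reflects its wider swings over the week.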

Takeaway

Sum of squares is like an appetizer…

When you sit down for a meal, you might not want to wait for the main course. So, perhaps you order a salad or appetizer to tide you over while you wait for the entrée. That appetizer (sum of squares) is just the first course (step) in completing the dining experience. Like an appetizer, the sum of squares might be all you want. But, more often, you’ll use it to get to a different, more meaningful statistic.

Tell me more…

What is a sum of squares?

The sum of squares is a statistical measure of dispersion. It quantifies how far numbers are from the average of a dataset, or from the predictions of a regression model (an estimate of one variable using another). Those differences are squared and then summed to determine the sum of squares value. The greater a sum of squares becomes, the more spread out a dataset is, or the worse a regression is at predicting an outcome. Because deviations are squared, large differences increase the sum of squares by much more than small ones do.

As a measure of spread, the total sum of squares is an indicator of volatility: a higher value signals more volatility and, therefore, more risk. However, analysts usually use the sum of squares to calculate other measures of volatility rather than using it directly. In linear regression models, the total sum of squares is divided into the explained sum of squares (the variation explained by the regression model) and the residual (unexplained) sum of squares. A statistical model with a high residual sum of squares has less explanatory power than one with a lower value.

How do you calculate the sum of squares?

The sum of squares formula might look a little intimidating at first, but it’s actually quite simple. It just says to take the difference between each data point and the average, square it, then add all the values together. Here is what it looks like:

$$\text{Total Sum of Squares} = \sum_{i=1}^{n} (X_i - X_{avg})^2$$

$X_i$ = data point i

$X_{avg}$ = average of all data points in the set

n = number of data points

∑ = instruction to sum the values together

In words, the formula says to do this:

Step 1: Calculate the average value of the dataset. Do this by adding all of the numbers together and dividing that sum by the number of data points.

Step 2: Determine the difference between each data point and the average value.

Step 3: Square (multiply by itself) each of the differences you developed in Step 2. This action turns all of the numbers into positive values.

Step 4: Add up all of the squared deviations from Step 3.
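To see the four steps in action, here is a small Python sketch. The dataset is hypothetical, chosen only so the arithmetic is easy to follow by hand, and the variable names are illustrative.

```python
data = [2, 4, 6, 8]  # hypothetical values, not from the article

# Step 1: calculate the average value of the dataset.
average = sum(data) / len(data)             # 5.0

# Step 2: difference between each data point and the average.
differences = [x - average for x in data]   # [-3.0, -1.0, 1.0, 3.0]

# Step 3: square each of the differences.
squared = [d ** 2 for d in differences]     # [9.0, 1.0, 1.0, 9.0]

# Step 4: add up all of the squared deviations.
sum_of_squares = sum(squared)               # 20.0

print(sum(differences))  # 0.0
print(sum_of_squares)    # 20.0
```

Notice that the plain differences cancel out to zero, which is why they have to be squared before they are added.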

Calculating the total sum of squares in Excel

Say you wanted to understand the variability of an airline’s common stock closing price in April 2022. You would put the data into an Excel spreadsheet. Here is the data, along with three other columns calculated for you:

Date         Closing Price    Average    Difference    Squared Difference
1-Apr-22     $23.87           $23.54     $0.33         $0.11
2-Apr-22     $22.68           $23.54     -$0.86        $0.73
3-Apr-22     $22.48           $23.54     -$1.06        $1.11
6-Apr-22     $22.32           $23.54     -$1.22        $1.48
7-Apr-22     $22.25           $23.54     -$1.29        $1.65
8-Apr-22     $23.23           $23.54     -$0.31        $0.09
9-Apr-22     $24.39           $23.54     $0.85         $0.73
13-Apr-22    $23.25           $23.54     -$0.29        $0.08
14-Apr-22    $24.54           $23.54     $1.00         $1.01
15-Apr-22    $24.35           $23.54     $0.81         $0.66
16-Apr-22    $22.78           $23.54     -$0.76        $0.57
17-Apr-22    $24.27           $23.54     $0.73         $0.54
20-Apr-22    $23.64           $23.54     $0.10         $0.01
21-Apr-22    $23.10           $23.54     -$0.44        $0.19
22-Apr-22    $22.47           $23.54     -$1.07        $1.13
23-Apr-22    $22.48           $23.54     -$1.06        $1.11
24-Apr-22    $22.41           $23.54     -$1.13        $1.27
27-Apr-22    $22.16           $23.54     -$1.38        $1.89
28-Apr-22    $24.34           $23.54     $0.80         $0.65
29-Apr-22    $27.32           $23.54     $3.78         $14.32
30-Apr-22    $25.91           $23.54     $2.37         $5.64
Sum of Squares                           $3.78         $34.99

The third column is the average value for the entire dataset. In the fourth column, you’ll find the difference between the data point and the average. The last column takes the square of the deviations from the average. At the bottom, the squared differences are added together to determine the sum of squares.

A word of caution: The Excel function =SUMSQ() squares and then sums the values you give it. So, you can’t apply it to the closing prices in the second column and be done. You’ll need to determine the differences first, then apply the function to the fourth column (Difference).
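The same caution can be illustrated outside of Excel. The Python sketch below reproduces the spreadsheet calculation using the closing prices from the table. The helper `sumsq` is a hypothetical stand-in that only mimics the square-then-sum behavior described above; it is not Excel's implementation.

```python
# Closing prices from the table above, in date order.
closing = [
    23.87, 22.68, 22.48, 22.32, 22.25, 23.23, 24.39,
    23.25, 24.54, 24.35, 22.78, 24.27, 23.64, 23.10,
    22.47, 22.48, 22.41, 22.16, 24.34, 27.32, 25.91,
]


def sumsq(values):
    """Square each value, then add the squares together."""
    return sum(v ** 2 for v in values)


average = sum(closing) / len(closing)
differences = [price - average for price in closing]

wrong = sumsq(closing)      # applied to raw prices: not a measure of spread
right = sumsq(differences)  # applied to the differences: the sum of squares

print(round(average, 2))  # 23.54
print(round(right, 2))    # 34.99
print(wrong > right)      # True
```

Squaring the raw prices gives a number in the thousands that mostly reflects the price level, not how much the price moved. If you'd rather skip the helper columns, Excel also has a DEVSQ function that returns the sum of squared deviations from the mean directly.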

Alternative (shortcut) formula

There is a mathematically equivalent way to get the same answer, which might be a little easier in some cases. It involves doing the steps in a different order. The alternative formula looks like this:

$$\text{Total Sum of Squares} = \sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2$$

$X_i$ = data point i

n = number of data points

∑ = instruction to sum the values together

The formula doesn’t look like much of a shortcut, but the steps might illustrate the advantage.

Step 1: Multiply each of the data points by itself and add up all of the resulting values.

Step 2: Add up all of the numbers in the dataset, and square the result.

Step 3: Divide the result from Step 2 by the number of data points.

Step 4: Subtract the result in Step 3 from the answer in Step 1.
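As a quick sanity check that the two approaches agree, here is a minimal Python sketch that runs both on the Chieps closing prices from the earlier example. The variable names are illustrative.

```python
import math

# Chieps closing prices from the earlier example.
prices = [312.50, 304.87, 297.79, 293.74, 291.29]
n = len(prices)

# Original formula: sum of squared differences from the average.
average = sum(prices) / n
definitional = sum((x - average) ** 2 for x in prices)

# Shortcut formula, following the four steps above.
step1 = sum(x ** 2 for x in prices)  # square each data point, then add
step2 = sum(prices) ** 2             # add all data points, then square
step3 = step2 / n                    # divide by the number of data points
shortcut = step1 - step3             # subtract Step 3 from Step 1

print(math.isclose(definitional, shortcut, rel_tol=1e-6))  # True
print(round(shortcut))                                     # 300
```

One thing to keep in mind: the shortcut subtracts two large, nearly equal numbers, so on a computer it can lose some precision to floating-point rounding. For a handful of stock prices the difference is negligible, which is why the comparison above uses a tolerance instead of exact equality.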

What does the sum of squares tell you?

The total sum of squares (TSS or SST) tells you how far the data points in a dataset are from the center. It’s a descriptive statistic called a measure of spread or dispersion. Dividing the TSS by the number of observations in the dataset gives you the average squared deviation within the data, which is called the variance (strictly, the population variance; for a sample, analysts usually divide by one less than the number of observations). Taking the square root of the variance generates the standard deviation (another measure of dispersion).
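Here is a short Python sketch of that chain, from TSS to variance to standard deviation, again using the Chieps closing prices from the earlier example. It cross-checks the hand calculation against Python's built-in `statistics` module, where `pvariance` and `pstdev` divide by the number of observations and `variance` divides by one less.

```python
import math
import statistics

# Chieps closing prices from the earlier example.
prices = [312.50, 304.87, 297.79, 293.74, 291.29]
n = len(prices)

average = sum(prices) / n
tss = sum((x - average) ** 2 for x in prices)

variance = tss / n               # TSS divided by the number of observations
std_dev = math.sqrt(variance)    # square root of the variance
sample_variance = tss / (n - 1)  # the version commonly used for samples

print(f"{variance:.2f}")         # 59.98
print(f"{std_dev:.2f}")          # 7.74
print(f"{sample_variance:.2f}")  # 74.97

# Cross-check against the standard library.
print(math.isclose(variance, statistics.pvariance(prices)))        # True
print(math.isclose(std_dev, statistics.pstdev(prices)))            # True
print(math.isclose(sample_variance, statistics.variance(prices)))  # True
```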

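In a linear regression analysis, the total sum of squares is partitioned into two pieces: the explained sum of squares (ESS or SSE) and the residual sum of squares (RSS or SSR). Be careful with the abbreviations, because some textbooks reverse them, using SSE for the sum of squared errors (the residual) and SSR for the regression (explained) sum of squares. These values are related, in that: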

TSS = ESS + RSS

That formula just says that the total variability equals what is explained by a regression model plus what is left unexplained (the residual). The best fit for a regression line is the one that minimizes the RSS. The method for finding it is called ordinary least squares (OLS).

Analysts look at the percentage of the variability that is explained by the best fit line to determine how valuable a regression model is at predicting one variable based on observations of others. That value is called the R-squared. It’s calculated using the sum of squares as follows:

$$R^2 = \frac{ESS}{TSS}$$

An R-squared value close to one means the model explains almost all of the variability in the data. A value close to zero means the model has little explanatory power.
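To make the partition concrete, here is a minimal Python sketch that fits a least-squares line to a tiny dataset and computes TSS, ESS, RSS, and R-squared. The x and y values are hypothetical, picked only so the numbers come out cleanly, and the slope and intercept use the standard textbook formulas for a one-variable regression.

```python
import math

# Hypothetical data, not from the article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Ordinary least squares fit for a single explanatory variable.
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    / sum((xi - x_mean) ** 2 for xi in x)
)
intercept = y_mean - slope * x_mean
predicted = [intercept + slope * xi for xi in x]

tss = sum((yi - y_mean) ** 2 for yi in y)                # total
ess = sum((pi - y_mean) ** 2 for pi in predicted)        # explained
rss = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # residual
r_squared = ess / tss

print(f"TSS={tss:.2f} ESS={ess:.2f} RSS={rss:.2f} R^2={r_squared:.2f}")
# TSS=6.00 ESS=3.60 RSS=2.40 R^2=0.60
```

Here the line explains 60% of the variability in y. Note that the identity TSS = ESS + RSS relies on the line being a least-squares fit that includes an intercept; for an arbitrary line drawn through the data, the two pieces generally won't add up to the total.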

What are the practical applications and limitations of the sum of squares?

In practice, an analyst could use the total sum of squares (TSS) as a direct measure of variability. However, the TSS is unscaled: it depends on the units of the data and grows with the number of observations. In other words, comparing the TSS of one dataset to another tells you very little on its own. The variability of the size of pumpkin seeds will almost certainly be a smaller number than the variability of the size of pumpkins. But that doesn’t mean the seeds are more like each other than pumpkins are. Maybe it’s more likely for a pumpkin seed to be twice the size of the average seed than it is for a full-grown pumpkin to be twice the size of the average full-grown pumpkin. Comparing TSS won’t tell you that.

That is why it’s important to scale the measure of variability to the data in question. For that reason, analysts don’t use TSS very often. Instead, they use TSS to determine variance, standard deviation, and R-squared values. Each of those statistics is scaled, which allows analysts to compare them to other data. A stock with a larger standard deviation than another stock is more volatile. A regression equation with a bigger R-squared has more explanatory power than another model with a smaller R-squared.

The practical application of the sum of squares is to develop more meaningful measures of volatility. Those measures help analysts determine the associated level of risk in a security, and the required potential reward necessary to compensate for that risk.

This information is educational, and is not an offer to sell or a solicitation of an offer to buy any security. This information is not a recommendation to buy, hold, or sell an investment or financial product, or take any action. This information is neither individualized nor a research report, and must not serve as the basis for any investment decision. All investments involve risk, including the possible loss of capital. Past performance does not guarantee future results or returns. Before making decisions with legal, tax, or accounting effects, you should consult appropriate professionals. Information is from sources deemed reliable on the date of publication, but Robinhood does not guarantee its accuracy.

Options trading entails significant risk and is not appropriate for all customers. Customers must read and understand the Characteristics and Risks of Standardized Options before engaging in any options trading strategies. Options transactions are often complex and may involve the potential of losing the entire investment in a relatively short period of time. Certain complex options strategies carry additional risk, including the potential for losses that may exceed the original investment amount.

Commission-free trading of stocks, ETFs and options refers to $0 commissions for Robinhood Financial self-directed individual cash or margin brokerage accounts that trade U.S. listed securities and certain OTC securities electronically. Keep in mind, other fees such as trading (non-commission) fees, Gold subscription fees, wire transfer fees, and paper statement fees may apply to your brokerage account. Check out Robinhood Financial’s Fee Schedule for details.

Brokerage services are offered through Robinhood Financial LLC, (RHF) a registered broker dealer (member SIPC) and clearing services through Robinhood Securities, LLC, (RHS) a registered broker dealer (member SIPC). Cryptocurrency services are offered through Robinhood Crypto, LLC (RHC) (NMLS ID: 1702840). Robinhood Crypto is licensed to engage in virtual currency business activity by the New York State Department of Financial Services. The Robinhood spending account is offered through Robinhood Money, LLC (RHY) (NMLS ID: 1990968), a licensed money transmitter. A list of our licenses has more information. The Robinhood Cash Card is a prepaid card issued by Sutton Bank, Member FDIC, pursuant to a license from Mastercard®. Mastercard and the circles design are registered trademarks of Mastercard International Incorporated. RHF, RHY, RHC and RHS are affiliated entities and wholly owned subsidiaries of Robinhood Markets, Inc. RHF, RHY, RHC and RHS are not banks. Products offered by RHF are not FDIC insured and involve risk, including possible loss of principal. RHC is not a member of FINRA and accounts are not FDIC insured or protected by SIPC. RHY is not a member of FINRA, and products are not subject to SIPC protection, but funds held in the Robinhood spending account and Robinhood Cash Card account may be eligible for FDIC pass-through insurance (review the Robinhood Cash Card Agreement and the Robinhood Spending Account Agreement).

2784249

Robinhood, 85 Willow Road, Menlo Park, CA 94025.© 2024 Robinhood. All rights reserved.