Consumer Price Index

A Review of Hedonic Price Adjustment Techniques for Products Experiencing Rapid and Complex Quality Change

Introduction

The Bureau of Labor Statistics (BLS) has a long history of using hedonic models to adjust for quality change in the Consumer Price Index (CPI) and Producer Price Index (PPI).[1] BLS built hedonic models for goods and services in several areas including apparel, electronics, and housing for CPI and for computers and, more recently, broadband services for PPI.

BLS began expanding the use of hedonic quality adjustment to other sectors of the digital economy in 2018. The CPI implemented a hedonic model for smartphones which indirectly estimates quality adjusted price change when a substitution is made to the newest model phone as determined through a directed substitution process. [2] Directed substitutions occur twice a year, each spring and fall, to coincide with new hardware releases from manufacturers. Shortly after, the CPI introduced hedonic quality adjustments for residential telephone services, internet services, and cable and satellite television service.[3]

Also in January 2018, BLS began using hedonic models to directly estimate quality adjusted price change in the PPI for microprocessors on a quarterly basis.[4] This work broke new ground by utilizing a time dummy approach as well as statistical learning techniques to determine the most appropriate model. BLS continues to build on these methods in the development of a model for PPI cloud computing services.

This paper examines the BLS directed substitution approach for smartphones in the CPI, as well as quality adjustment approaches for telecommunications services, and identifies which variables (product characteristics) are significant to these hedonic models. In addition, it explains the characteristics of goods and services and of price changes that guide BLS in deciding which hedonic technique to use for the PPI. To do so, we explore the different approaches to estimating the hedonic models for wired broadband internet and cloud computing services, including how BLS is building upon the techniques used for microprocessors. To conclude, a review of the current work on developing time dummy models for select telecommunications services in both the CPI and the PPI is provided.

Recent Developments: CPI Experience

Smartphones

In 2018, BLS began a process of directing substitutions of smartphones in an effort to improve the pace at which price changes of new technology make their way into the CPI. At the same time, BLS began applying hedonic quality adjustments to smartphone prices to control for rapid and complex quality change. Smartphones are included in the price index for telephone hardware, calculators, and other consumer information items (SEEE04) and account for about 47 percent of the sample of price observations used to calculate the index.[5] Chart 1[6] plots the CPI for telephone hardware, calculators, and other consumer information items during the twenty years prior to 2018.

The CPI for telephone hardware, calculators, and other consumer information items fell nearly 80 percent during the 1998 – 2017 period, or roughly 4 percent per year. The composition of items representing the category changed dramatically during this time. In 1998, land-line telephones dominated the index, gradually ceding weight to basic, analog cellular telephones. Through the early 2000s, cellular telephones continued to gain in popularity, soon overtaking land-line telephones. In the 2010s, smartphones replaced basic feature phones as wireless technology continued to advance.

The shift from an index dominated by prices for land-line telephones, to basic cellular phones, and finally to smartphones was gradual. The introduction of these new technologies into the CPI was possibly delayed, in part, due to BLS procedures intended to produce price indexes where quality is held constant from period to period. Typically an item in the CPI sample is priced until it is no longer available for purchase, but BLS has now adopted a process called directed substitution for smartphones in an effort to improve the pace at which price changes of new technology make their way into the CPI.[7] In this process, if a particular smartphone in the sample is two or more generations (models) old, the analyst will determine via random sampling whether the current phone model should continue to be priced, or if a substitution to a newer model is required. These directed substitutions take place approximately twice a year to coincide with new hardware releases from manufacturers. Since BLS adopted this process in 2018, directed substitutions have typically occurred in April and November of each year.
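
The eligibility rule and random draw described above can be sketched in a few lines of Python. This is only an illustration: the function name, the sample of phones, and above all the selection probability are hypothetical, since the text does not say how the random sampling is carried out.

```python
import random

def needs_directed_substitution(generations_old, substitution_probability, rng):
    """Return True if a sampled phone should be replaced by a newer model.

    generations_old: how many model generations behind the newest release.
    substitution_probability: hypothetical chance that an eligible phone is
        selected for substitution (the source does not state this value).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    if generations_old < 2:
        # Phones fewer than two generations old are not eligible.
        return False
    # Eligible phones are selected (or not) by a random draw.
    return rng.random() < substitution_probability

# Hypothetical sample: phone name -> generations behind the newest model.
sample = {"phone_a": 0, "phone_b": 2, "phone_c": 3}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
decisions = {
    name: needs_directed_substitution(age, 0.5, rng)
    for name, age in sample.items()
}
print(decisions)
```

Only phones that pass the two-generation test reach the random draw, so a current-generation phone is never replaced under this rule.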

During directed substitution months, BLS sends its data collectors instructions specifying which smartphone models meet the criteria identifying them as older models which must be replaced by newer models. The directed substitutions from older to newer model smartphones, along with any other smartphone substitutions, are quality adjusted using estimates from the hedonic regression model. Typically, there are around fifteen quality adjusted substitutions in directed substitution months, or approximately 10 percent of the sample, and about half that amount in non-directed substitution months.[8] 

The hedonic regression models for smartphones are constructed using data purchased from a market-research firm that specializes in capturing smartphone prices from a variety of retailers. In addition to providing the detailed characteristics information needed to create a hedonic model, the data also includes the full, non-contract price of each smartphone. This is the same type of price collected by and used in the calculation of the CPI for these items. BLS began using a hedonic model to quality adjust CPI smartphone prices in January 2018. Since then the model has been re-estimated twice a year – coinciding with the directed substitution months of April and November – to capture new technologies as they enter the market. BLS intends to continue to re-estimate the model twice a year using data from the month prior to the directed substitution months.

The data used for modeling includes the non-contract price of the phone (accounting for promotions and sales), brand, model, and retailer. The quality characteristics available in the data include the number of front and rear camera megapixels, screen size, vertical and horizontal screen resolutions, processor speed, number of processor cores, the amount of internal memory, the amount of RAM, measures of the phone’s thickness, height, and width, security software, the release date/age of the phone, and battery specifications.

Table 1 provides a summary of the six hedonic models estimated and used since January 2018 to quality adjust smartphone prices. The six models are specified with a dependent variable equal to the log of price and include independent variables for processor speed (measured in gigahertz/GHz), type and amount of memory (in gigabytes/GB), number and location of cameras, number of camera megapixels, total screen resolution (the product of horizontal and vertical screen resolutions), discount outlets, physical dimensions (phone thickness, width, and height, measured in millimeters/mm), and brand effects.

Table 1. Consumer Price Index Smartphone regression models, 2018 - 2020

Variable                      Jan 2018     Apr 2018     Nov 2018     Apr 2019     Nov 2019     Apr 2020
Processor Speed (GHz)         0.19884***   0.38547***   0.60059***   0.70933***   0.83884***   0.55231***
                              (0.03188)    (0.06494)    (0.05881)    (0.05442)    (0.05292)    (0.0391)
RAM (GB)                      0.27320***   1.02500***   0.08290***   0.03220**    0.03370***   0.09110***
                              (0.0285)     (0.0331)     (0.0199)     (0.0141)     (0.0104)     (0.00636)
Internal Memory (GB)          0.00156***   0.00119***   0.00086***   0.00073***   0.00080***   0.00069***
                              (0.00028)    (0.00028)    (0.00013)    (0.00011)    (0.0001)     (0.00008)
Rear Camera (MP)              0.03118***   0.02000***   0.03019***   0.03630***   .            .
                              (0.0081)     (0.00768)    (0.0064)     (0.0064)     .            .
Dual Rear Camera              .            .            .            .            0.08237***   0.07366***
                              .            .            .            .            (0.02608)    (0.0221)
Front Camera (MP)             0.05367***   0.05010***   0.04675***   0.01773**    0.01702***   0.01285***
                              (0.00994)    (0.0501)     (0.00653)    (0.00571)    (0.00415)    (0.00245)
Dual Front Camera             .            .            .            .            .            0.18075***
                              .            .            .            .            .            (0.0316)
Total Screen Resolution (MP)  0.14600***   0.25400***   0.17800***   0.19500***   0.16100***   0.09680***
                              (0.146)      (0.023)      (0.0212)     (0.0183)     (0.019)      (0.0139)
Foldable                      .            .            .            .            .            0.65381***
                              .            .            .            .            .            (0.09346)
Discount Suppliers            -0.19451***  -0.14961***  -0.10468***  -0.55631***  -0.06668**   .
                              (0.03346)    (0.03421)    (0.02563)    (0.02767)    (0.02807)    .
Thickness (mm)                0.03900*     .            0.07701***   0.09150***   .            .
                              (0.02074)    .            (0.01427)    (0.01481)    .            .
Width (mm)                    .            0.00265      .            .            .            .
                              .            (0.00368)    .            .            .            .
Height (mm)                   .            .            .            .            0.00681***   .
                              .            .            .            .            (0.00194)    .
Brand A                       1.65603***   1.01989***   0.76024***   0.79637***   0.65099***   0.43018***
                              (0.06268)    (0.05256)    (0.04473)    (0.04429)    (0.0359)     (0.0338)
Brand B                       0.34213***   .            .            .            .            -0.15765***
                              (0.05053)    .            .            .            .            (0.02455)
Observations                  600          539          641          992          1034         1001
Adjusted R2                   0.8229       0.8278       0.8942       0.8678       0.8493       0.8752

Footnotes:
*Significant at the 10% level
**Significant at the 5% level
***Significant at the 1% level
Standard errors in parentheses

A few variables, like security technologies that unlock phones based on fingerprints or facial recognition, wireless charging, and phone age, were expected to be significant, but were ultimately found to have little explanatory power. We expect to see new technologies, like phones that function on 5G networks, enter the market and become important features in future regression models.

Since BLS began quality adjusting smartphone prices in the CPI, none of the regression models used for adjusting had an adjusted R2 less than 0.82 and none included fewer than nine regressors, almost all of which were significant at the 5-percent significance level. Parameter estimates for certain features, like those for processor speed and the “Brand A” alias, changed substantially in value from January 2018 to April 2020, highlighting the need for frequent refreshes of the model. The variables in Table 1 displayed in italics are not used in making direct quality adjustments.
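
To make the mechanics concrete, the sketch below shows one common way coefficients from a log-price hedonic model can be turned into a quality adjustment when an older phone is replaced by a newer one. It is an illustration under stated assumptions, not a description of the exact BLS procedure. The coefficients are the April 2020 values from Table 1. The prices, phone specifications, and function names are hypothetical, and only three of the model's variables are used.

```python
import math

# Coefficients from the April 2020 column of Table 1 (dependent variable: log of price).
COEFFICIENTS = {
    "processor_ghz": 0.55231,
    "ram_gb": 0.09110,
    "internal_memory_gb": 0.00069,
}

def quality_adjustment_factor(old_specs, new_specs, coefficients):
    """Multiplier that restates the old item's price at the new item's quality.

    In a log-price model, a change dx in a characteristic shifts log(price)
    by beta * dx, so the combined effect on price is exp(sum(beta * dx)).
    """
    log_difference = sum(
        beta * (new_specs[name] - old_specs[name])
        for name, beta in coefficients.items()
    )
    return math.exp(log_difference)

def quality_adjusted_relative(old_price, new_price, old_specs, new_specs, coefficients):
    """Price relative after removing the estimated value of the quality change."""
    adjusted_old_price = old_price * quality_adjustment_factor(
        old_specs, new_specs, coefficients
    )
    return new_price / adjusted_old_price

# Hypothetical substitution: prices and specifications are invented for illustration.
old_phone = {"processor_ghz": 2.0, "ram_gb": 4, "internal_memory_gb": 64}
new_phone = {"processor_ghz": 2.2, "ram_gb": 4, "internal_memory_gb": 128}

factor = quality_adjustment_factor(old_phone, new_phone, COEFFICIENTS)
relative = quality_adjusted_relative(600.0, 650.0, old_phone, new_phone, COEFFICIENTS)
print(f"adjustment factor: {factor:.4f}, quality-adjusted relative: {relative:.4f}")
```

In this invented case the unadjusted relative (650 / 600) shows a price increase, while the quality-adjusted relative falls below 1 because the model attributes more than the full price difference to the faster processor and larger memory.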

Removing the effects of directly adjusting prices to account for quality change produces a counterfactual index that decreases more slowly than the official CPI for telephone hardware, calculators, and other consumer information items when prices are decreasing. Chart 2 [9] compares this index, which includes the application of direct quality adjustments, with a counterfactual index where the quality adjustments have been removed and replaced by price relatives calculated through imputation based on the price changes of similar price observations.

Over the first 2 years since adopting hedonic price adjustments, the official CPI for telephone hardware, calculators, and other consumer information items fell 24.13 percent, whereas, at the same time, the counterfactual index fell 14.85 percent. The counterfactual index begins to diverge from the official CPI in April 2018, coinciding with the first directed substitution month, and continues this trend, diverging substantially during directed substitution months, throughout the 2-year period. This result supports the hypothesis that quality adjusted indexes decrease faster (or increase slower) than non-quality adjusted indexes when quality is improving and most of the divergence between the indexes occurs in months where the frequency of quality adjusted substitutions is highest in the official CPI. Indeed, during the 1998 – 2017 period when this index was transitioning from land-line telephones to smartphones, the index averaged only a 4 percent decrease per year. This compares to the roughly 12 percent per year decrease observed during the 2018 – 2019 period when hedonic quality adjustments were applied.
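
The per-year figures quoted above appear to be simple averages (total change divided by number of years). The short sketch below reproduces that arithmetic from the totals given in the text and, for comparison, computes a compound annual rate for the 2018 – 2019 period. The function names are illustrative only.

```python
def simple_annual_change(total_pct, years):
    """Average change per year, ignoring compounding."""
    return total_pct / years

def compound_annual_change(total_pct, years):
    """Constant yearly rate that compounds to the total change."""
    return ((1 + total_pct / 100) ** (1 / years) - 1) * 100

official_total = -24.13        # official CPI, first 2 years of hedonic adjustment
counterfactual_total = -14.85  # counterfactual index over the same period

# Gap between the two indexes, in percentage points.
gap = official_total - counterfactual_total

official_simple = simple_annual_change(official_total, 2)
official_compound = compound_annual_change(official_total, 2)
pre_2018_simple = simple_annual_change(-80.0, 20)  # "nearly 80 percent" over 1998 - 2017

print(f"gap: {gap:.2f} percentage points")
print(f"official, simple average per year: {official_simple:.2f}%")
print(f"official, compound rate per year: {official_compound:.2f}%")
print(f"1998 - 2017, simple average per year: {pre_2018_simple:.2f}%")
```

Both ways of annualizing the 2018 – 2019 decline land near the "roughly 12 percent per year" cited in the text.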

Telecommunications services

In January 2019, BLS began applying direct hedonic quality adjustments to the prices of residential telecommunications service plans to control for quality change within these services which were experiencing rapid quality improvements. Residential telecommunications services refer collectively to residential telephone services, residential broadband internet access, cable and satellite television services, and bundled packages of those services. Residential telecommunications services are accounted for in the CPI by the component indexes residential telephone services (SEED04), internet services and electronic information providers (SEEE03), and cable and satellite television services (SERA02). Chart 3 [10] plots the CPIs for the three residential telecommunications services from 2010 through 2018. 

During the 2010 – 2018 period, the three residential telecommunications services indexes increased, although at different rates. Television services rose an average of 3.1 percent per year, followed by residential telephone services at 1.9 percent per year, and trailed by internet services at only 0.2 percent per year.

Two significant consumer trends emerged during this period, both driven in part by rapid technological change in the telecommunications sector. The first trend was the increasing number of households that dropped residential telephone services in favor of the convenience of cellular telephone service.[11] The second trend, cord-cutting, refers to the increasing pattern of consumers canceling their subscriptions to multichannel subscription television services packages provided over cable or satellite in response to competition from new streaming media services provided over the internet such as Netflix, Hulu, Prime Video, Sling TV, YouTube TV, and others.[12] Households that have “cut the cord” no longer pay for subscription television service and instead rely on broadband internet service to stream video content into their homes. This trend was made possible by increased availability of the high-speed internet bandwidth needed to adequately stream video content.

The trends seen in the residential telecommunications services market may help to explain some of the variation in the CPIs over this period. First, the shift away from residential telephone services and, to a degree, the canceling of subscription television services in favor of streaming video content over the internet may have led providers of residential telephone and subscription television services to increase the rates charged for those services to customers who continued to subscribe despite the increasing availability of comparable substitute services. And second, the improvement to the quality of internet bandwidth, both in availability and in increased download speeds, was the most important characteristic to consumers of these services. As providers of these services began offering internet service plans promising faster download speeds, the CPI did not reflect that benefit to consumers by way of lower quality-adjusted prices, and as a result, the index for broadband internet saw almost no change from 2010 through 2018. These market conditions and the need to account for rapid changes in quality led directly to the initiative to develop hedonic price adjustment techniques for these services. 

The hedonic regression models for residential telecommunications services developed by the CPI are constructed using data obtained from a secondary source that specializes in monitoring and collecting data on the prices and characteristics of service plans offered to consumers around the country by most telecommunications providers. Along with the detailed characteristics information necessary to create hedonic models, the secondary source data also includes the price, promotion, and contract terms necessary to calculate an average monthly rate over the contract for each service plan. This price is comparable to the prices collected by and used in the calculation of the CPIs for these services. CPI began using hedonic regression models to quality adjust residential telecommunications services plan prices in January 2019. Since then, the models have been re-estimated every year in January using data from the prior quarter.

The data used for modeling include the average monthly contract price of the service plan (accounting for promotional months), the type of service provided, the service provider, the city in which the service is provided, the service plan name, and information about contract lengths and the types of customers who are eligible to purchase the service plan. The quality characteristics available in the data include the type of long-distance calling included in the plan, the number of calling features included in the plan, the type of technology used for internet transmission, upstream and downstream internet speeds, the number of television channels included, and the type of services included in bundled packages, just to name a few.

Tables 2 through 5 provide summaries of the hedonic models estimated and used since January 2019 to quality adjust telecommunications service plan prices. The models are specified with a dependent variable equal to the average monthly price of the service plan, transformed using a Box-Cox transformation (whose parameter is denoted as lambda) to improve the normality of the data.[13] The dependent variable of each model may have a different Box-Cox transformation. Independent variables in the models include those listed above as well as several variables used as controls for provider and city effects, which are not used to make price adjustments. The provider and city variables are not displayed in the model summaries.
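
For readers unfamiliar with the transformation, the sketch below implements the standard one-parameter Box-Cox form and its inverse. It assumes the textbook definition, since the text does not spell out the exact formula BLS uses. Under that definition, a lambda of 0 is the natural log, so the lambda of 0 reported in Table 3 corresponds to a log-price model. The function names and example price are illustrative.

```python
import math

def box_cox(price, lam):
    """Standard one-parameter Box-Cox transform of a positive price."""
    if price <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        # The limiting case as lambda approaches 0 is the natural log.
        return math.log(price)
    return (price ** lam - 1) / lam

def inverse_box_cox(value, lam):
    """Map a transformed value back to the original price scale."""
    if lam == 0:
        return math.exp(value)
    return (lam * value + 1) ** (1 / lam)

monthly_price = 100.0  # hypothetical average monthly plan price
for lam in (0, 0.25, 0.5, 0.65):  # lambda values reported in Tables 2 through 5
    transformed = box_cox(monthly_price, lam)
    recovered = inverse_box_cox(transformed, lam)
    print(f"lambda={lam}: transformed={transformed:.4f}, recovered={recovered:.2f}")
```

Because the coefficients in each table are on the transformed scale, a predicted value has to be passed through the inverse transform before it can be read as a dollar amount.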

Table 2. Consumer Price Index residential telephone services hedonic regression models, 2019 and 2020

Variable                            2019          2020
Unlimited Long Distance             1.02974***    0.19681***
                                    (0.07781)     (0.02361)
Unlimited Long Distance - Domestic  -0.52777***   1.33176***
                                    (0.12344)     (0.02632)
Unlimited Long Distance - Regional  0.86265***    4.43443***
                                    (0.24483)     (0.21392)
Unlimited Local Calling             2.66212***    .
                                    (0.11434)     .
Number of Calling Features          0.09908***    0.14514***
                                    (0.00619)     (0.00199)
VOIP - Regular Line                 -1.15207***   0.13682***
                                    (0.10991)     (0.04823)
VOIP - Additional Line              -2.62892***   .
                                    (0.10425)     .
24-Month Contract                   .             0.87527***
                                    .             (0.06572)
Existing Customer                   .             -0.58007***
                                    .             (0.05112)
Promotional Price                   -1.04383***   -0.66510***
                                    (0.00619)     (0.03193)
Observations                        562           8,123
Adjusted R2                         0.8334        0.7389
Lambda                              0.5           0.5

Footnotes:
*Significant at the 10% level
**Significant at the 5% level
***Significant at the 1% level
Standard errors in parentheses

Table 3. Consumer Price Index Internet services hedonic regression models, 2019 and 2020

Variable                                       2019          2020
DSL Transmission                               0.31227***    .
                                               (0.03031)     .
Cable Transmission                             .             -0.26633***
                                               .             (0.00799)
Download Speed (DSL trans; MBPS)               0.00634***    .
                                               (0.00065)     .
LOG(Download Speed) (Cable trans; MBPS)        0.20816***    .
                                               (0.00601)     .
LOG(Download Speed) (Fiber Optic trans; MBPS)  0.23799***    .
                                               (0.00516)     .
LOG(Download Speed) (All trans; MBPS)          .             0.11005***
                                               .             (0.00147)
Upload Speed (All trans; MBPS)                 .             0.00027***
                                               .             (0.00002)
Unlimited Data                                 -0.11972***   .
                                               (0.02096)     .
Data Cap Amount (GB)                           .             0.00003***
                                               .             (0.00000)
Router Included                                0.05974**     .
                                               (0.02904)     .
Modem Included                                 -0.06543***   -0.35169***
                                               (0.02077)     (0.00746)
Month-to-Month Contract                        -0.10143***   0.03153***
                                               (0.02699)     (0.00598)
12-Month Contract                              -0.37000***   .
                                               (0.03434)     .
Prepaid Contract                               -0.31185***   .
                                               (0.05022)     .
Promotional Price                              -0.38117***   .
                                               (0.01628)     .
Observations                                   1,633         7,218
Adjusted R2                                    0.8071        0.632
Lambda                                         0             0

Footnotes:
*Significant at the 10% level
**Significant at the 5% level
***Significant at the 1% level
Standard errors in parentheses

Table 4. Consumer Price Index Television services hedonic regression models, 2019 and 2020

Variable                                                  2019          2020
HD Receiver Included                                      -0.05118***   0.39893***
                                                          (0.01237)     (0.00509)
SD Receiver Included                                      -0.23554***   -0.00268
                                                          (0.01812)     (0.01014)
Total Number of Channels Included                         0.00267***    0.00292***
                                                          (0.00004)     (0.00002)
English Language Channel Package                          0.15002***    0.12830***
                                                          (0.01488)     (0.00254)
DVR Fees Included                                         0.19696***    .
                                                          (0.00751)     .
HD Channel Fees Included                                  0.25149***    .
                                                          (0.01669)     .
Plan Available to New and Upgrading Customers             -0.26961***   .
                                                          (0.03649)     .
Plan Available to New, Existing, and Upgrading Customers  0.15906***    .
                                                          (0.01488)     .
12-Month Contract                                         -0.60049***   -0.63904***
                                                          (0.01193)     (0.00653)
Promotional Price                                         -0.03346**    -0.27788***
                                                          (0.01454)     (0.00362)
Observations                                              2,580         34,068
Adjusted R2                                               0.7955        0.8377
Lambda                                                    0.25          0.25

Footnotes:
*Significant at the 10% level
**Significant at the 5% level
***Significant at the 1% level
Standard errors in parentheses

Table 5. Consumer Price Index Bundled telecommunications services hedonic regression models, 2019 and 2020

Variable                                            2019          2020
Telephone & Internet Bundle                         0.02073***    .
                                                    (0.0078)      .
Telephone & Television Bundle                       -0.04774***   .
                                                    (0.00969)     .
Unlimited Long Distance                             0.11397***    -0.30670***
                                                    (0.00301)     (0.07492)
Standard Telephone System                           0.07494***    3.65423***
                                                    (0.00974)     (0.09863)
Number of Calling Features                          .             0.14591***
                                                    .             (0.00706)
LOG(Download Speed) (MBPS)                          0.06858***    1.36731***
                                                    (0.00137)     (0.02481)
Unlimited Data                                      -0.13130***   .
                                                    (0.00633)     .
Data Cap Amount (GB)                                .             -0.00451***
                                                    .             (0.00018)
Upload Speed (MBPS)                                 .             0.00094***
                                                    .             (0.00018)
Router Included                                     0.08780***    -8.13262***
                                                    (0.01155)     (0.22466)
Modem Included                                      -0.05371***   -0.88626***
                                                    (0.00684)     (0.08891)
ONT Included                                        -0.07871***   .
                                                    (0.00963)     .
Total Number of Channels Included                   0.00147***    0.03286***
                                                    (0.00002)     (0.00035)
Spanish Language Channel Package                    -0.04994***   .
                                                    (0.00354)     .
English Language Channel Package                    .             2.17317***
                                                    .             (0.09763)
SD Receiver Included                                -0.08197***   2.40070***
                                                    (0.00748)     (0.09372)
HD Receiver Included                                .             1.14417***
                                                    .             (0.08557)
DVR Fees Included                                   0.15895***    -0.91185***
                                                    (0.00732)     (0.06221)
Satellite Television Provider A                     0.27613***    .
                                                    (0.00824)     .
Satellite Television Provider B                     0.22221***    .
                                                    (0.05007)     .
Includes HBO                                        -0.11502***   .
                                                    (0.03804)     .
Includes Showtime                                   -0.13596***   .
                                                    (0.03804)     .
Includes TMC                                        0.19691***    .
                                                    (0.04844)     .
Plan Available to Existing and Upgrading Customers  0.21916***    .
                                                    (0.053)       .
Plan Available to New Customers                     0.03058***    .
                                                    (0.00791)     .
Plan Available to Existing Customers                .             -0.96448***
                                                    .             (0.12433)
Month-to-Month Contract                             0.05890***    1.01899***
                                                    (0.00617)     (0.06174)
12-Month Contract                                   -0.13445***   -0.82210***
                                                    (0.00786)     (0.07064)
Prepaid Contract                                    0.17743***    .
                                                    (0.02079)     .
Promotional Price                                   -0.14721***   -0.10577*
                                                    (0.00739)     (0.05742)
Observations                                        6,877         33,663
Adjusted R2                                         0.8398        0.8908
Lambda                                              0.25          0.65

Footnotes:
*Significant at the 10% level
**Significant at the 5% level
***Significant at the 1% level
Standard errors in parentheses

With few exceptions, the quality characteristics that were expected to have strong explanatory power were selected for the models. Many variables (those displayed in italics in Tables 2 through 5) are not used to make price adjustments because we found that their estimates did not perform well when applied to our price data.

Over the nearly 2 years in which BLS has been quality adjusting prices of residential telecommunications service plans, no regression model had an adjusted R2 less than 0.63, and the models included an average of 32 regressors, almost all of which were significant at the 5-percent significance level.[14] The most important variables and those used most in making price adjustments are those related to internet speed (from the models for internet services and bundled services) and those related to the number of included television channels (from the models for television services and bundled services). Few of the other model variables are used with any frequency.

Removing the effects of directly adjusting prices to account for quality change produces counterfactual indexes that increase more rapidly than the official CPI when prices are increasing. Chart 4[15], Chart 5[16], and Chart 6[17] compare the official CPIs for residential telephone services, internet services, and television services, which include the application of direct quality adjustments, with counterfactual indexes where the quality adjustments have been removed and replaced by price relatives calculated through class-mean imputation.

In 2019, the first year after BLS adopted hedonic price adjustments, the official CPIs for the three residential telecommunications services all increased. The CPI for residential telephone services increased 6.9 percent, compared to its counterfactual index which increased 5.8 percent, running counter to our expectations that the counterfactual indexes for these services would increase faster over the period. The CPI for internet services increased 1.8 percent, compared to its counterfactual index which increased 2.8 percent. The CPI for television services increased 3.4 percent, while its counterfactual index increased 4.8 percent. These limited results, for both internet services and television services, support the hypothesis that quality adjusted indexes increase more slowly (or decrease faster) than non-quality adjusted indexes when quality is improving. The telecommunications services indexes increased more rapidly in 2019 than in the years prior to the adoption of hedonic price adjustments, and the indexes for internet services and television services would have increased at even faster rates were it not for the application of hedonic price adjustments.

Recent Developments: PPI Experience

In 2016, BLS expanded its use of hedonic models in its services sector PPIs with the introduction of hedonic quality adjustment for broadband items in the Internet access services index. This work has been followed up with the development of a model for estimating quality adjusted prices for cloud computing services. This model has not been employed in the PPI yet.

Broadband Internet Access Services

Broadband Internet access services include digital subscriber lines (DSL), cable, and fiber optic services. These services are subject to rapid technological change because download and upload speeds typically increase over time. This means specific broadband items within the Internet access services index of the PPI are periodically replaced with items with faster download and upload speeds.

To account for these changes, BLS must estimate the value of the quality adjustment, that is, the value of the increased broadband download or upload speed. Ideally, PPI survey participants would provide this information, but often they do not. Consequently, a hedonic quality adjustment model is needed to estimate the value of the increased speed. This model is estimated using data reported by companies in the PPI sample. The model is re-estimated annually to ensure the coefficients reflect technological advancement over time.

In addition to upload and download speed, there are a number of other possible explanatory variables, specifically an indicator for residential service and indicators for various companies, all of which are categorical. The only aspects of the service that change and that are quality adjusted are upload and download speeds because these two variables are the primary price determining characteristics. The other variables are control variables whose purpose is to remove extraneous factors that influence upload and download speed.

It is important to note that the PPI broadband model uses different variables than the CPI internet services model because the datasets are different. The PPI dataset contains services sold to businesses and consumers, while the CPI dataset only contains services sold to consumers. In addition to business services being in scope for the PPI, it is important to include them as they can have additional features not seen in residential services. The CPI dataset is also larger, and a larger dataset is typically able to estimate a model with more variables than a smaller dataset.

Upload and download speed are so closely correlated that a hedonic regression cannot be estimated with both of them because of collinearity. Consequently, the hedonic regressions only use download speed, which represents the quality change associated with both upload and download speed.

Residential is an important control variable because the download speed/price relationship between residential and business customers is known, from industry knowledge, to be different. This difference means that the coefficient on the download variable should be different for residential and business customers. To allow for a different download coefficient, not only is a dummy variable for residential needed, but also an interaction term between download and residential.
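
The specification described above can be written out as a single equation. The form below is a sketch inferred from the variables listed in Tables 6 and 7. The symbols ($P_i$, $D_i$, $R_i$, $C_{ik}$) are notation introduced here for illustration and do not come from the source.

```latex
\log(P_i) = \beta_0 + \beta_1 \log(D_i) + \beta_2 R_i
          + \beta_3 \bigl[\log(D_i) \times R_i\bigr]
          + \sum_{k} \gamma_k C_{ik} + \varepsilon_i
```

Here $P_i$ is the price of broadband item $i$, $D_i$ is its downstream speed, $R_i$ equals 1 for residential service and 0 otherwise, and $C_{ik}$ are the company indicators. Under this form the effect of log download speed on log price is $\beta_1$ for business customers and $\beta_1 + \beta_3$ for residential customers, which is why both the dummy and the interaction term are needed.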

Table 6. Producer Price Index Broadband Hedonic Model 2019[18]

Variable                            Estimate    Std. Error  t value     P(>|t|)
Intercept                           4.08025     0.11548     35.332      <2e-16
Log(Downstream speed)               0.21824     0.04206     5.189       1.37E-05
Residential                         -0.52704    0.12573     -4.192      0.000225
Log(Downstream speed):Residential   -0.05593    0.04284     -1.306      0.201637
Company A                           0.28859     0.13895     2.077       0.046471
Company B                           -0.58119    0.15886     -3.659      0.000967
Company C                           -0.29369    0.17486     -1.68       0.10343
Company D                           -0.35328    0.15853     -2.228      0.033498
Company E                           -0.43315    0.0916      -4.729      5.01E-05

Footnotes:
a. Adjusted R-Squared = 0.8267; F = 23.66; RSE = 0.2067
b. Base Configuration: Several Companies
c. Dependent variable: Log(Price)

Table 7. Producer Price Index Broadband Hedonic Model 2020

Variable                            Estimate    Std. Error  t value     P(>|t|)
Intercept                           4.11582     0.11356     36.245      <2e-16
Log(Downstream speed)               0.23684     0.03577     6.621       2.12E-07
Residential                         -0.57417    0.12295     -4.67       5.52E-05
Log(Downstream speed):Residential   -0.077      0.03702     -2.08       0.04588
Company A                           0.30627     0.11122     2.754       0.00977
Company B                           -0.61737    0.15525     -3.977      0.00039
Company C                           -0.34663    0.16724     -2.073      0.04661
Company D                           -0.3682     0.15554     -2.367      0.02434
Company E                           -0.49015    0.09168     -5.346      7.95E-06

Footnotes:
a. Adjusted R-Squared = 0.8582; F = 30.51; RSE = 0.2014
b. Base Configuration: Several Companies
c. Dependent variable: Log(Price)

The 2019 and 2020 PPI Broadband Hedonic Models are similar. Residential is significant and has a negative value, as expected from prior assumptions about the broadband industry. The interaction of download speed and residential is negative in both models, but is not significant in the 2019 model. This could be because the sample does not allow the interaction to be estimated with enough precision, or because the difference between the residential and business download-speed relationships is too small for the model to detect. Therefore, the Log(Downstream speed) coefficient is used to adjust both residential and business broadband items.
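
As an illustration of how a single log-speed coefficient can be applied, the sketch below computes the implied adjustment when a broadband item is replaced by one with a faster downstream speed. It assumes the usual log-log interpretation of the model and is not a statement of the exact PPI procedure. The coefficient is the 2020 estimate from Table 7. The speeds, prices, and function names are hypothetical.

```python
import math

# Log(Downstream speed) coefficient from the 2020 model in Table 7.
BETA_2020 = 0.23684

def speed_adjustment_factor(old_mbps, new_mbps, beta):
    """Estimated price ratio attributable to the change in downstream speed.

    With log(price) = ... + beta * log(speed), changing the speed multiplies
    the predicted price by (new_mbps / old_mbps) ** beta.
    """
    return math.exp(beta * (math.log(new_mbps) - math.log(old_mbps)))

def quality_adjusted_relative(old_price, new_price, old_mbps, new_mbps, beta):
    """Price relative after removing the estimated value of the speed increase."""
    adjusted_old_price = old_price * speed_adjustment_factor(old_mbps, new_mbps, beta)
    return new_price / adjusted_old_price

# Hypothetical replacement: a 100 Mbps plan at $60 replaced by a 200 Mbps plan at $65.
factor = speed_adjustment_factor(100.0, 200.0, BETA_2020)
relative = quality_adjusted_relative(60.0, 65.0, 100.0, 200.0, BETA_2020)
print(f"speed adjustment factor: {factor:.4f}")
print(f"quality-adjusted price relative: {relative:.4f}")
```

The company and residential indicators do not appear in the calculation because they are the same for the old and new item and cancel out of the ratio.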

Table 8. Log(Downstream Speed) coefficients of the Producer Price Index Broadband model for 2016-2020

Variable                2016      2017      2018      2019      2020
Log(Downstream Speed)   0.3075    0.28416   0.24208   0.21824   0.23684

As Table 8 shows, the coefficient on Log(Downstream Speed) has generally been declining since the broadband model was put into production in 2016. This decline reflects the falling price per megabit of downstream speed over time. The 2019 coefficient on Log(Downstream Speed) is also similar in magnitude to the coefficients on Log(Download Speed) for both cable and fiber optics in the 2019 CPI Internet services model. The Log(Download Speed) coefficient in the 2020 CPI internet services model dropped considerably, but this decrease occurs because the 2020 model combines all of the different types of broadband services in a single Log(Download Speed) variable, while the 2019 model separated them. The download speed coefficient for DSL in the 2019 model was much lower than those for cable and fiber optics, and including DSL in the combined 2020 variable reduced the size of the coefficient.

Chart 7 shows the PPI Internet access services index between 2010 and 2020. The index has shown a downward trend, but it is not possible to determine what the movement of the index would have been had the hedonic model not been introduced in 2016. Since the introduction of the model in December 2016, the index has declined 5.6 percent. Between 2010 and 2016, prior to the introduction of the hedonic model, the index had decreased only 2.7 percent.

Cloud Computing

Cloud computing is a key information technology service with large and growing revenue. Worldwide public cloud services are projected to grow from $182.4 billion in 2018 to $331.2 billion in 2022, a compound annual growth rate of 12.6 percent.[19] Like many high-tech goods and services, cloud computing undergoes rapid improvement.

The PPI for data processing, hosting, and related services encompasses establishments that provide infrastructure or support for hosting or data processing purposes. The establishments are often third-party service providers for other businesses and governments that outsource their business processes and/or data and computing services. These outsourced services are provided by equipment owned, operated, and held by the establishments within the data processing industry.

The data processing, hosting, and related services index is an aggregate index that contains lower-level indexes. These lower-level indexes are based on the types of services provided: Business process management services; Data management, information transformation, and related services; and Hosting, application service provision (ASP), and other IT infrastructure provisioning services. Cloud computing is the provisioning of virtual computer infrastructure, so it is classified in the Hosting, ASP, and other IT infrastructure provisioning services index.

Cloud computing can be classified into three areas: software as a service, platform as a service, and infrastructure as a service. Software as a service (SaaS) is the most fully featured and easily accessible cloud computing package. SaaS provides online access to software hosted by the service provider. With SaaS, there is no need for a customer to install, manage, or purchase hardware. The customer simply connects to the cloud provider and uses the software. Platform as a service (PaaS) has a much broader structure than the polished SaaS packages. As its name suggests, PaaS provides a platform on which developers can build and attach their own applications. These platforms are typically made up of an operating system (OS), a programming language and an environment for it, a database, and a web server. Finally, the service which is the focus of this section, infrastructure as a service (IaaS), is the most basic type of cloud computing. Each package is essentially a virtual version of a blank computer: microprocessor, memory, storage, etc. Furthermore, these packages often include access to a basic OS, such as a limited version of Linux, or the option of purchasing access to a preferred OS, such as Windows.

IaaS packages are chosen as the focus for our model because, as the broadest service offering, IaaS is typically used as a base that the other services are built on. For instance, SaaS and PaaS both use the computer resources that are offered through IaaS. Developing a model for IaaS allows the PPI program to describe many of the factors that affect the price for SaaS and PaaS.

The current pricing method for IaaS in PPI is determined by both the service characteristics and transaction terms for each item. In IaaS, the main types of prices are fee-based transaction prices (average rates, standard rates, or prepaid rates) or estimated flat fees. Flat fees are more commonly seen in contracts with large firms. The contracts are negotiated based on an average or expected sum of cloud usage per month. However, in actuality, cloud usage is so variable that the real value of these contracts is almost never the same month-to-month. Fee-based transaction prices have become much more common in the cloud computing industry because the industry has shifted towards on-demand services in order to cater to small businesses, individual consumers, and large companies simultaneously.

The service characteristics are the variable features, and the combinations of these features determine the price, either as a contract or a sum-of-its-parts fee-based transaction. For IaaS packages, these service characteristics include, but are not limited to:

  • Application support/customer support dedicated to the specific package

  • Shared vs. dedicated/managed environment

  • Microprocessor

  • Operating system

  • Memory

  • Data storage

  • Number of users

  • IP address type (dynamic vs. static, number of addresses)

  • Computer time used in a given period (rented, leased, shared)

  • Training

  • Management

The combination of the above characteristics that a customer may choose to purchase is highly customizable. The cloud computing industry is geared towards meeting the needs of its customers. Although access to cloud services has become more convenient over time, this flexibility has made pricing more complicated. This complexity is why it was necessary to develop this model. Having a hedonic model will allow changes in the quality of the cloud service to be separated from changes in the price.

In 2018, Amazon Web Services (AWS) led the cloud computing service industry with nearly half of the IaaS market (47.8 percent). The next four largest companies in the industry were Microsoft (15.5 percent), Alibaba (7.7 percent), Google (4 percent), and IBM (1.8 percent).[20] [21] As leaders in the industry, these companies have similar pricing structures. AWS, Microsoft Azure, and Google all have customizable on-demand packaging as well as pre-structured packages that are classified or broken down by the same characteristics listed above.

PPI has estimated models for IaaS using time-dummy hedonic models. The methodology used is drawn from “The Rise of Cloud Computing: Minding Your P’s, Q’s and K’s” by David Byrne, Carol Corrado, and Daniel Sichel.[22] One reason time-dummy hedonic models are used is that they allow for the calculation of price changes that result from changes to several interrelated characteristics without concern about the magnitude of the coefficients for the individual characteristics.[23] With traditional hedonic quality adjustment, a cross-section hedonic model would be used to calculate the adjustments for a relatively small sample of products. Because of this small sample size, only some of the coefficients would be used to calculate the adjustments, which means the values of individual coefficients are critical for accurate adjustment values. With a time-dummy hedonic model, much larger samples can be used, which means that the values of the individual coefficients do not matter as much, as long as the overall model reasonably captures the relationship between the price and the explanatory variables. Also, the time-dummy hedonic model uses all of the variables to control for changes in characteristics between time periods, which also means the values of the individual coefficients are not critical as long as the model as a whole performs well.

The dataset for a time-dummy hedonic model consists of two or more time periods, and PPI has used overlapping two-quarter datasets. For example, the dataset for the first model consists of the second quarter and third quarter of 2017 and the dataset for the second model consists of the third quarter and fourth quarter of 2017. In addition to variables representing characteristics of cloud computing services, the models also have a time dummy variable. The time dummy variable represents whether an observation is from the first quarter in the dataset or the second quarter. With differences in characteristics being accounted for by the other independent variables, the time dummy variable gives the quality adjusted price change between the two quarters.
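The two-quarter setup described above can be sketched as follows. This is a minimal illustration on made-up data, not the PPI production model: the service offerings, the coefficient values used to generate prices, and the helper names (`ols`, `fake_price`) are all assumptions. Log price is regressed on characteristics plus a dummy that equals 1 for observations from the second quarter, and the coefficient on that dummy is the quality adjusted log price change.

```python
import math

def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * v for row, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

TRUE_DELTA = -0.03  # assumed quality adjusted log price change built into the fake data

def fake_price(memory_gib, windows, second_quarter):
    """Made-up pricing rule used only to generate illustrative observations."""
    log_price = (-3.0 + 0.5 * math.log(memory_gib) + 0.4 * windows
                 + TRUE_DELTA * second_quarter)
    return math.exp(log_price)

# Quarter 0 offers 2-16 GiB services; in quarter 1 the 2 GiB service exits
# and a 32 GiB service enters.
offerings = {0: [2, 4, 8, 16], 1: [4, 8, 16, 32]}

X, y = [], []
for quarter, memories in offerings.items():
    for memory in memories:
        for windows in (0, 1):
            # Columns: intercept, ln Memory, Windows dummy, quarter (time) dummy.
            X.append([1.0, math.log(memory), float(windows), float(quarter)])
            y.append(math.log(fake_price(memory, windows, quarter)))

intercept, b_memory, b_windows, b_quarter = ols(X, y)
print(f"time dummy coefficient: {b_quarter:.4f}")
```

Because the fake prices are generated without noise, the regression recovers the assumed time effect even though the set of services differs between the two quarters; with real data the estimate would carry sampling error.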

The dataset is assembled to back-test quality adjustment models. It comprises quarterly observations from the second quarter of 2017 to the second quarter of 2019 for the three largest service providers in the industry: Amazon Web Services (AWS), Google Cloud, and Microsoft Azure.

The selection of characteristics to include in the model is important because it helps determine the magnitude of the time dummy variable, and thus the estimated price change. One of the key drivers of quality change with cloud services is the microprocessor used in the servers that provide the service. Microprocessors undergo continual quality change, and this change is responsible for improvements in a range of high-tech goods and services from smartphones to artificial intelligence. Each cloud provider uses a limited number of microprocessor models in its servers. This limited number of models makes it difficult, though not impossible, to have enough variation in microprocessor characteristics for the model to estimate significant coefficients.

AWS has a measure of the performance of the microprocessor used in their Elastic Compute Cloud (EC2) service called EC2 Compute Unit (ECU).[24] Because ECU is available for all AWS cloud services and it is calculated by AWS itself, ECU is a credible gauge of microprocessor performance. Neither Microsoft Azure nor Google Cloud has a microprocessor measure like the AWS ECU. There are third party microprocessor benchmarks, such as SPEC CPU and PassMark CPU benchmark, but both of these only have results for a limited number of microprocessors used in cloud services. This number of results is too small to support a cloud hedonic model.

Fortunately, characteristics information is available for all microprocessors. However, microprocessors are complicated devices, and selecting the characteristics to include in the model is challenging for two reasons.[25] First, only a few different microprocessors are used by any one cloud service provider, which limits the number of microprocessor characteristics the model can support. The main characteristics of microprocessors are as follows.[26]

  • Cores – a hardware term that describes the number of independent central processing units (CPUs) on a single computing component (die or chip)

  • Threads – a software term for the basic ordered sequence of instructions that can be passed through or processed by a single CPU core

  • Thermal design power (TDP) – the average power, in watts, that the microprocessor dissipates when operating at base frequency with all cores active under an Intel-defined high-complexity workload

  • Base frequency – the rate at which the microprocessor’s transistors open and close (The microprocessor base frequency is the operating point at which TDP is defined. Frequency is measured in gigahertz, or billions of cycles per second.)

  • Turbo frequency – the maximum single-core frequency at which the microprocessor is capable of operating using Intel Turbo Boost Technology

  • Cache – an area of fast memory located on the microprocessor (Intel’s Smart Cache refers to the architecture that allows all cores to dynamically share access to the last level cache)

Second, cloud services are priced by virtual CPU (vCPU). Each vCPU corresponds to a microprocessor thread, which means that each vCPU is only using a part of the microprocessor. Consequently, each vCPU only uses part of the microprocessor cache and accounts for part of the TDP. The cache and TDP variables have to be multiplied by the proportion of threads (vCPUs) used by each individual cloud computing service to the total threads in the microprocessor. For example, if a cloud computing service has two vCPUs and the microprocessor used to provide the service has 10 threads, 8 MB of cache, and a TDP of 100 watts, then the cloud service uses 1.6 MB of cache and accounts for 20 watts of TDP. Base and turbo frequency are the same throughout the microprocessor so they do not need to be adjusted.
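The proration described above is simple arithmetic, sketched here with the numbers from the example in the text. The function name `prorate_to_vcpus` is a hypothetical helper introduced only for illustration.

```python
def prorate_to_vcpus(vcpus, total_threads, cache_mb, tdp_watts):
    """Scale whole-chip cache and TDP by the share of threads (vCPUs) a service uses."""
    cache_share = cache_mb * vcpus / total_threads
    tdp_share = tdp_watts * vcpus / total_threads
    return cache_share, tdp_share

# Example from the text: 2 vCPUs on a microprocessor with 10 threads,
# 8 MB of cache, and a TDP of 100 watts.
cache_used, tdp_used = prorate_to_vcpus(vcpus=2, total_threads=10,
                                        cache_mb=8.0, tdp_watts=100.0)
print(f"{cache_used:.1f} MB of cache, {tdp_used:.1f} watts of TDP")
# prints "1.6 MB of cache, 20.0 watts of TDP"
```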

In “A New Approach for Quality Adjusting PPI Microprocessors”, statistical learning techniques were used to select a specification for the hedonic models. All of the models in this paper have log price as the dependent variable and use two adjacent quarters of data. A repeated k-fold cross validation with pre-screening from that paper was used to select the models. Below are the selected models with ECU being used to represent microprocessor performance. The cloud services that continued from one quarter to the next never had any price changes. Only models that contained quarters with exiting or entering services had price change. We ran the statistical learning specification selection technique on the quarters with no exit or entry to show the stability of the service characteristic coefficients.
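As a rough illustration of how repeated k-fold cross validation can compare candidate specifications, consider the sketch below. It is not the Sawyer and So algorithm: the pre-screening step is omitted, the data are randomly generated, and the variable names (`ln_memory`, `ln_storage`, `noise_var`), fold count, and repeat count are assumptions chosen for the example. Each candidate set of regressors is fit on the training folds and scored by squared error on the held-out fold, and the specification with the lowest average out-of-sample error is selected.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * v for row, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def mse(X, y, beta):
    """Mean squared prediction error of coefficients `beta` on (X, y)."""
    residuals = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    return sum(r * r for r in residuals) / len(y)

def repeated_kfold_error(rows, y, columns, k=5, repeats=10, seed=0):
    """Average out-of-sample squared error for the specification using `columns`."""
    rng = random.Random(seed)  # same seed for every specification, so folds match
    X = [[1.0] + [row[c] for c in columns] for row in rows]
    errors = []
    for _ in range(repeats):
        order = list(range(len(y)))
        rng.shuffle(order)
        for fold in range(k):
            test = set(order[fold::k])
            train = [i for i in order if i not in test]
            beta = ols([X[i] for i in train], [y[i] for i in train])
            held_out = sorted(test)
            errors.append(mse([X[i] for i in held_out], [y[i] for i in held_out], beta))
    return sum(errors) / len(errors)

# Made-up data: log price depends on ln_memory and ln_storage; noise_var is irrelevant.
data_rng = random.Random(1)
rows = [{"ln_memory": data_rng.uniform(0, 5),
         "ln_storage": data_rng.uniform(0, 7),
         "noise_var": data_rng.uniform(0, 1)} for _ in range(40)]
y = [-3.0 + 0.5 * r["ln_memory"] + 0.15 * r["ln_storage"] + data_rng.gauss(0, 0.05)
     for r in rows]

candidates = [("ln_memory",),
              ("ln_memory", "ln_storage"),
              ("ln_memory", "ln_storage", "noise_var")]
cv_errors = {spec: repeated_kfold_error(rows, y, spec) for spec in candidates}
best = min(cv_errors, key=cv_errors.get)
print("selected specification:", best)
```

Scoring on held-out folds rather than in-sample fit is what allows a procedure of this kind to drop a variable, since an extra regressor never worsens in-sample fit but can worsen out-of-sample prediction.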

Table 9. AWS Cloud Services Time Dummy Hedonic Model using ECU (2017 Q2-2018 Q2)
2017Q2-2017Q3 2017Q3-2017Q4 2017Q4-2018Q1 2018Q1-2018Q2

Quarter dummy

0 0 -0.0296 0
(0.0455) (0.0455) (0.0435) (0.0372)

ln vCPU

-1.6094* -1.6094*
(0.4586) (0.4586)

ln Memory

0.7051* 0.7051* 0.5379* 0.5376*
(0.0792) (0.0792) (0.0446) (0.0423)

ln Storage

0.1401* 0.1401* 0.1608* 0.1588*
(0.0224) (0.0224) (0.0229) (0.0232)

SSD

-1.1939* -1.1939* -1.2873* -1.2506*
(0.183) (0.183) (0.1911) (0.1931)

ln ECU

1.9327* 1.9327* 0.4407* 0.4454*
(0.4124) (0.4124) (0.0511) (0.0468)

Windows

0.4676* 0.4676* 0.5047* 0.5318*
(0.0455) (0.0455) (0.0415) (0.0372)

Observations

128 128 152 176

Adjusted R2

0.965 0.965 0.9659 0.9683

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 10. AWS Cloud Services Time Dummy Hedonic Model using ECU (2018 Q2-2019 Q2)
2018Q2-2018Q3 2018Q3-2018Q4 2018Q4-2019Q1 2019Q1-2019Q2

Quarter dummy

0 0.0671 0 0
(0.0372) (0.0478) (0.0479) (0.0479)

ln vCPU

1.0276* 1.3102* 1.3102*
(0.2099) (0.1659) (0.1659)

ln Memory

0.5376* 0.4427* 0.452* 0.452*
(0.0423) (0.0436) (0.036) (0.036)

ln Storage

0.1588* 0.1086* 0.0634* 0.0634*
(0.0232) (0.0238) (0.0244) (0.0244)

SSD

-1.2506* -0.8662* -0.5449* -0.5449*
(0.1931) (0.1801) (0.1687) (0.1687)

ln ECU

0.4454* -0.5007* -0.7908* -0.7908*
(0.0468) (0.2005) (0.1709) (0.1709)

Windows

0.5318* 0.5183* 0.5083* 0.5083*
(0.0372) (0.0457) (0.0479) (0.0479)

Observations

176 206 236 236

Adjusted R2

0.9683 0.9466 0.9349 0.9349

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Only the models for 17Q4-18Q1 and 18Q3-18Q4 had price changes. Prices are stable, which means price changes are caused by the entry and exit of cloud services. Some of the variables have counterintuitive signs on their coefficients in some of the models, such as Log (ECU) in the last three models. This phenomenon can arise when variables are correlated with each other. With time dummy models, we are mainly interested in the time dummy coefficient.
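Because the dependent variable is log price, a time dummy coefficient can be read as an approximate proportional price change. The sketch below applies the common exp(coefficient) - 1 conversion to the two nonzero quarter dummies from Tables 9 and 10. The conversion is a standard convention for log-linear models and is an assumption here, since the text does not state which transformation BLS applies.

```python
import math

def percent_change(time_dummy_coefficient):
    """Percent price change implied by a time dummy coefficient in a log-price model."""
    return 100.0 * (math.exp(time_dummy_coefficient) - 1.0)

# Nonzero quarter dummy coefficients from the AWS models using ECU (Tables 9 and 10).
aws_ecu_dummies = {"2017Q4-2018Q1": -0.0296, "2018Q3-2018Q4": 0.0671}

for quarters, coefficient in aws_ecu_dummies.items():
    print(f"{quarters}: {percent_change(coefficient):+.2f} percent")
```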

Even though most variables are being selected, there is still value in using the statistical learning specification algorithm. For instance, Log (vCPU) is not selected for three of the models, and this variable is one of the main price determining characteristics of cloud services. Without the statistical learning specification algorithm, we would not have known that omitting Log (vCPU) would have produced a model with better performance in those three models.

We also estimate models for AWS using microprocessor characteristics instead of ECU. The ECU models serve as a benchmark we can use to measure the performance of the characteristics models. This measure of performance will be useful for gauging the appropriateness of using microprocessor characteristics in the Microsoft Azure and Google Cloud models where an ECU-like variable is not available.

For the characteristics models, the vCPU variable is omitted because it is strongly correlated with cache and TDP. As explained previously, the amount of cache or TDP used by a cloud service is proportional to the number of vCPUs. Because the base and turbo frequency variables are closely correlated and there are so few microprocessors in the data set, model selection using the methodology used by Sawyer and So was done twice, once using base frequency and omitting turbo frequency and once using turbo frequency and omitting base frequency.
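The reason for omitting vCPU can be seen in a small sketch. For services running on a single microprocessor, prorated cache and TDP are exact multiples of the vCPU count, so their logs are perfectly correlated with ln vCPU. The microprocessor specifications below (48 threads, 36 MB of cache, 165 watts) and the list of vCPU sizes are made-up values for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical microprocessor (made-up specifications).
THREADS, CACHE_MB, TDP_W = 48, 36.0, 165.0
vcpus = [2, 4, 8, 16, 32]

ln_vcpu = [math.log(v) for v in vcpus]
ln_cache = [math.log(CACHE_MB * v / THREADS) for v in vcpus]  # prorated cache
ln_tdp = [math.log(TDP_W * v / THREADS) for v in vcpus]       # prorated TDP

print(f"corr(ln vCPU, ln Cache) = {pearson(ln_vcpu, ln_cache):.4f}")
print(f"corr(ln vCPU, ln TDP)   = {pearson(ln_vcpu, ln_tdp):.4f}")
```

With several microprocessors in the data the correlation is no longer exactly one, but it remains high when only a few microprocessor models are in use.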

Table 11. AWS Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Base Frequency (2017 Q2-2018 Q2)
2017Q2-2017Q3 2017Q3-2017Q4 2017Q4-2018Q1 2018Q1-2018Q2

Quarter dummy

0 0 -0.0358 0
(0.0338) (0.0338) (0.0368) (0.0305)

ln Memory

0.999* 0.999* 0.8248* 0.8302*
(0.069) (0.069) (0.0597) (0.0648)

ln Storage

0.0521 0.0521 0.1595* 0.1696*
(0.0285) (0.0285) (0.0237) (0.025)

SSD

-0.5403* -0.5403* -1.3028* -1.3855*
(0.2419) (0.2419) (0.1969) (0.2135)

ln Base frequency

1.5521* 1.5521* 2.7659* 3.1469*
(0.3745) (0.3745) (0.2894) (0.4642)

ln Cache

-2.503* -2.503* 0.1327* 0.4323*
(0.3498) (0.3498) (0.0598) (0.1015)

ln TDP

2.4708* 2.4708* -0.3015*
(0.3265) (0.3265) (0.1429)

Windows

0.4676* 0.4676* 0.5047* 0.5318*
(0.0338) (0.0338) (0.0345) (0.0305)

Observations

128 128 152 176

Adjusted R2

0.9807 0.9807 0.9764 0.9786

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 12. AWS Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Base Frequency (2018 Q2-2019 Q2)
2018Q2-2018Q3 2018Q3-2018Q4 2018Q4-2019Q1 2019Q1-2019Q2

Quarter dummy

0 0.0708 0 0
(0.0305) (0.0431) (0.0456) (0.0456)

ln Memory

0.8302* 0.7399* 0.7052* 0.7052*
(0.0648) (0.057) (0.0542) (0.0542)

ln Storage

0.1696* 0.1066* 0.0598* 0.0598*
(0.025) (0.0229) (0.0231) (0.0231)

SSD

-1.3855* -0.9008* -0.5604* -0.5604*
(0.2135) (0.1753) (0.1591) (0.1591)

ln Base frequency

3.1469* 2.5898* 2.1997* 2.1997*
(0.4642) (0.4648) (0.4955) (0.4955)

ln Cache

0.4323* 0.6801* 0.7981* 0.7981*
(0.1015) (0.1401) (0.1412) (0.1412)

ln TDP

-0.3015* -0.4262* -0.4857* -0.4857*
(0.1429) (0.1717) (0.1777) (0.1777)

Windows

0.5318* 0.5183* 0.5083* 0.5083*
(0.0305) (0.0423) (0.0456) (0.0456)

Observations

176 206 236 236

Adjusted R2

0.9786 0.9544 0.9411 0.9411

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

The statistical learning algorithm is selecting most of the variables for the models. These models are also showing price changes for 17Q4-18Q1 and 18Q3-18Q4, just as the models using ECU did. The price changes are somewhat larger in the models using characteristics, but they are not drastically different.

Table 13. AWS Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Turbo Frequency (2017 Q2-2018 Q2)
2017Q2-2017Q3 2017Q3-2017Q4 2017Q4-2018Q1 2018Q1-2018Q2

Quarter dummy

0 0 -0.0422 0
(0.0349) (0.0349) (0.043) (0.034)

ln Memory

0.949* 0.949* 0.596* 0.5655*
(0.0577) (0.0577) (0.0509) (0.0438)

ln Storage

0.1341* 0.1391*
(0.0222) (0.0227)

SSD

-0.1351* -0.1351* -1.0837* -1.1101*
(0.0499) (0.0499) (0.1859) (0.1877)

ln Turbo frequency

ln Cache

-3.4095* -3.4095* -0.3434* -0.1319
(0.2746) (0.2746) (0.1356) (0.0925)

ln TDP

3.4404* 3.4404* 0.7166* 0.5407*
(0.2275) (0.2275) (0.1059) (0.0701)

Windows

0.4676* 0.4676* 0.5047* 0.5318*
(0.0349) (0.0349) (0.0383) (0.034)

Observations

128 128 152 176

Adjusted R2

0.9794 0.9794 0.9708 0.9735

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 14. AWS Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Turbo Frequency (2018 Q2-2019 Q2)
2018Q2-2018Q3 2018Q3-2018Q4 2018Q4-2019Q1 2019Q1-2019Q2

Quarter dummy

0 0.083 0 0
(0.0341) (0.0453) (0.046) (0.046)

ln Memory

0.5373* 0.5516* 0.5676* 0.5676*
(0.0345) (0.0317) (0.0276) (0.0276)

ln Storage

0.1423* 0.0516*
(0.0229) (0.022)

SSD

-1.1375* -0.4544* -0.0936 -0.0936
(0.1888) (0.1714) (0.0514) (0.0514)

ln Turbo frequency

-1.8409* -1.9731* -1.9731*
(0.6074) (0.5748) (0.5748)

ln Cache

-0.4847 -0.3937 -0.3937
(0.2634) (0.2882) (0.2882)

ln TDP

0.4381* 0.944* 0.8612* 0.8612*
(0.0349) (0.2557) (0.2772) (0.2772)

Windows

0.5318* 0.5183* 0.5083* 0.5083*
(0.0341) (0.0432) (0.046) (0.046)

Observations

176 206 236 236

Adjusted R2

0.9733 0.9524 0.9401 0.9401

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Again, as with the characteristics models using base frequency, the characteristics models using turbo frequency show price changes for 17Q4-18Q1 and 18Q3-18Q4. But the characteristics models using turbo frequency show a larger deviation from the ECU models than the characteristics models using base frequency. Unlike base frequency, which was selected for all models, turbo frequency was selected in only three of the eight models. Overall, the characteristics models using base frequency show better performance than characteristics models using turbo frequency.

The statistical learning algorithm used by Sawyer and So was used to estimate models for Microsoft Azure separately for base frequency and turbo frequency.

Table 15. Microsoft Azure Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Base Frequency (2017 Q2-2018 Q2)
2017Q2-2017Q3 2017Q3-2017Q4 2017Q4-2018Q1 2018Q1-2018Q2

Quarter dummy

0 -0.0079 0.0095 0
(0.0284) (0.0279) (0.0208) (0.0197)

ln Memory

0.4646* 0.4885* 0.4791* 0.4315*
(0.0244) (0.0245) (0.0311) (0.0155)

ln Storage

0.1273* 0.1162* 0.1231* 0.1508*
(0.0165) (0.0165) (0.0193) (0.0115)

SSD

0.4903* 0.5135* 0.5155* 0.4918*
(0.0252) (0.0243) (0.0259) (0.0251)

ln Base frequency

-0.6297 -0.9335* -0.7991*
(0.3723) (0.1297) (0.1036)

ln Cache

0.8013* 0.5283* 0.4003* 0.4039*
(0.061) (0.1492) (0.0262) (0.0243)

ln TDP

-0.3721* -0.1176
(0.0664) (0.154)

Windows

0.4055* 0.422* 0.4436* 0.4503*
(0.0293) (0.028) (0.024) (0.021)

Observations

136 148 150 140

Adjusted R2

0.9866 0.9872 0.9893 0.9919

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 16. Microsoft Azure Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Base Frequency (2018 Q2-2019 Q2)
2018Q2-2018Q3 2018Q3-2018Q4 2018Q4-2019Q1 2019Q1-2019Q2

Quarter dummy

0.0295 0.0049 -0.0018 -0.0036
(0.0238) (0.0168) (0.0155) (0.0157)

ln Memory

0.3166* 0.2625* 0.2631* 0.2584*
(0.0225) (0.0098) (0.0095) (0.0099)

ln Storage

0.2569* 0.3493* 0.3487* 0.3544*
(0.0213) (0.0162) (0.017) (0.0171)

SSD

0.2625* 0.0785* 0.0774* 0.0674*
(0.0316) (0.0138) (0.0137) (0.0144)

ln Base frequency

-2.9143* -4.5942* -4.7023* -4.7626*
(0.3972) (0.3076) (0.3054) (0.3081)

ln Cache

-0.6378* -1.3068* -1.3475* -1.379*
(0.1618) (0.0958) (0.0933) (0.0943)

ln TDP

1.0623* 1.7011* 1.7406* 1.7691*
(0.1607) (0.0834) (0.0779) (0.0788)

Windows

0.4415* 0.4508* 0.4683* 0.4701*
(0.0277) (0.017) (0.0161) (0.0164)

Observations

132 138 152 152

Adjusted R2

0.9873 0.9951 0.9951 0.9949

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 17. Microsoft Azure Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Turbo Frequency (2017 Q2-2018 Q2)
2017Q2-2017Q3 2017Q3-2017Q4 2017Q4-2018Q1 2018Q1-2018Q2

Quarter dummy

0 -0.0024 0.005 0
(0.0284) (0.0283) (0.0204) (0.0197)

ln Memory

0.4646* 0.4814* 0.4659* 0.4308*
(0.0244) (0.0257) (0.0274) (0.0157)

ln Storage

0.1273* 0.1199* 0.1307* 0.1509*
(0.0165) (0.0171) (0.0171) (0.0115)

SSD

0.4903* 0.5057* 0.5059* 0.4894*
(0.0252) (0.0242) (0.0263) (0.0261)

ln Turbo frequency

-1.276* -1.1989* -1.0693*
(0.4215) (0.3074) (0.2869)

ln Cache

0.8013* 0.6062* 0.6136* 0.5848*
(0.061) (0.0721) (0.0556) (0.043)

ln TDP

-0.3721* -0.1892* -0.2075* -0.1787*
(0.0664) (0.0801) (0.067) (0.0517)

Windows

0.4055* 0.4231* 0.4437* 0.4507*
(0.0293) (0.0275) (0.0237) (0.0209)

Observations

136 148 150 140

Adjusted R2

0.9866 0.9873 0.9895 0.9919

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 18. Microsoft Azure Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Turbo Frequency (2018 Q2-2019 Q2)
2018Q2-2018Q3 2018Q3-2018Q4 2018Q4-2019Q1 2019Q1-2019Q2

Quarter dummy

0.0383 0.0441 -0.0018 -0.0036
(0.024) (0.0273) (0.0269) (0.0273)

ln Memory

0.3266* 0.2633* 0.2633* 0.2587*
(0.0223) (0.0189) (0.0203) (0.0206)

ln Storage

0.253* 0.2806* 0.2681* 0.2726*
(0.0214) (0.0239) (0.0275) (0.0279)

SSD

0.2515* 0.1465* 0.1569* 0.1481*
(0.0295) (0.0242) (0.0265) (0.027)

ln Turbo frequency

-3.5819* -1.515* -0.9662* -0.9857*
(0.3235) (0.3951) (0.2928) (0.2934)

ln Cache

-0.2078* -0.2266* -0.2445*
(0.0742) (0.0815) (0.0826)

ln TDP

0.4197* 0.662* 0.6913* 0.7073*
(0.0273) (0.0639) (0.0653) (0.0659)

Windows

0.4379* 0.4052* 0.4198* 0.421*
(0.0269) (0.0247) (0.0247) (0.025)

Observations

132 138 152 152

Adjusted R2

0.9871 0.9859 0.985 0.9846

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Unlike for AWS, there is no major difference between the models using base frequency and the models using turbo frequency. Both sets of models have similar time dummy coefficients, and the magnitude and sign of the base frequency and turbo frequency coefficients are similar in the respective models.

For Google Cloud, all of the cloud services in a given region use the same microprocessors. We constructed our dataset from two different regions to provide a mix of microprocessors. From 17Q2 to 19Q1, there were no changes in products or prices. With this lack of change, there would of course be no price change for a hedonic model to capture. In the second quarter of 2019, there was a change in the frequency of the microprocessors in one of the regions. We used the statistical learning algorithm with both frequency variables, but in both cases the frequency variables were not selected. Because we know exactly what changed with the cloud services, we estimated models with both frequency variables to see if they yielded any appreciable quality adjusted price change. There was a strong correlation between the region variables and the frequency variables, so the region variables were omitted. Likewise, there was a strong correlation between cache and TDP, so TDP was omitted. Tables 19 and 20 show the results for the models using base frequency and turbo frequency, respectively.

Table 19. Google Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Base Frequency (2019 Q1-2019 Q2)
2019Q1-2019Q2

Quarter dummy

-0.0011
(0.0174)

ln Memory

0.1850*
(0.0087)

ln Base frequency

-0.0413
(0.379)

ln Cache

0.8172*
(0.0102)

Windows

0.5034*
(0.0142)

Observations

152

Adjusted R2

0.9956

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

Table 20. Google Cloud Services Time Dummy Hedonic Model using CPU Characteristics with Turbo Frequency (2019 Q1-2019 Q2)
2019Q1-2019Q2

Quarter dummy

0.0003
(0.0143)

ln Memory

0.1850*
(0.0087)

ln Turbo frequency

-0.0133
(0.0873)

ln Cache

0.8172*
(0.0102)

Windows

0.5034*
(0.0142)

Observations

152

Adjusted R2

0.9956

Footnotes:

*Significant at the 5-percent level

Standard errors in parentheses

The models are remarkably similar except for the quarter dummy and frequency coefficients. This similarity suggests that the change in microprocessor frequency in the second quarter of 2019 caused negligible quality adjusted price change and it helps illustrate why the statistical learning algorithm did not select either frequency variable.

The results show that a time dummy hedonic model is able to estimate quality adjusted price change for cloud computing services. This is important because cloud computing is an area that sees rapid technological change and it is an industry that has become crucial for the information technology sector.

Time Dummy Model vs. Hedonic Quality Adjustment Model

The time dummy approach is useful for goods or services for which it is not clear how to indirectly adjust quality with coefficients from the model. In these cases, the model needs to fit the entire product well, inclusive of all of its characteristics. With a time dummy model, we can estimate quality adjusted price change for complicated goods or services directly.

A hedonic quality adjustment method is better suited for situations where there is cost data available for some product changes, but not all of them. A hedonic quality adjustment model can be used for those changes where cost data is not available.

Ongoing research 

BLS has contracted with a data provider to purchase monthly downloads of pricing information from its database of service plans offered by telecommunications providers across the country. This data provider collects data through various modes and channels, though primarily by web-scraping providers’ websites. The data consists of information pertaining to residential telephone, internet access (both residential and wireless), cable and satellite television, bundled service packages (combinations of two or more of residential telephone, internet, and television), streaming video, and wireless telephone service plans. The files include list prices and characteristics of service plans available to consumers in the cities included in the CPI’s geographic sample.

Using this dataset, BLS is evaluating the feasibility of estimating time dummy hedonic models for telecommunications services in both the CPI and the PPI. This evaluation began by analyzing the data and comparing it to comparable CPI and PPI data.

An analysis of the scope of the secondary source data revealed that approximately 95 percent of the combinations of area, provider, and type of service in the CPI are accounted for in the data. Further analysis of the frequency of service plan turnover and service plan price change showed both rates to be very low on a month-to-month basis, mirroring the rates seen in CPI collected data. Overall, the content of the data files, as assessed in the scope analysis, and the behavior of the data, as observed in the frequency of plan turnover and price change, compare quite well to CPI data.

The universe of establishments in the dataset closely matches that from which PPI currently draws samples for the wired telecommunications industry. In addition to data for residential broadband services required by both CPI and PPI, the dataset includes business broadband that is necessary for the PPI calculation. At the item level, different levels of discounted prices are available, which would allow BLS to capture a representative mix of net transaction prices, the preferred prices for PPI. The most important price determining characteristics for developing a reliable model are included as well.

Conclusion

With the continued success in implementing hedonic regression models, BLS has a wide range of quality adjustment techniques to consider. This is especially important for industries that experience rapid technological change where it is not possible to obtain the data necessary to perform quality adjustments.

There are several factors that inform BLS’ decision to develop a model and to select the appropriate quality adjustment technique. First, a hedonic model is developed when the concept of a matched model breaks down for a particular industry, consumer good, or service. The matched model will fail when newly introduced goods or services are vastly different from those currently priced and price change occurs at the time they are introduced to the market.

Second, the selection of quality adjustment technique is based on the type of quality change observed in the product or service. As previously noted, the indirect hedonic method is the best option when cost data is not available for all product changes. A time dummy model, or the direct hedonic approach, is preferred when changes are complicated and it would be difficult to determine how to apply the coefficients from a hedonic quality adjustment model.

As the use of hedonics continues to expand, BLS will consider calculating indexes composed entirely of model estimated prices. This approach will be more feasible with increased access to alternative data and the tools for transforming and analyzing that data. In addition, BLS could explore model development for other areas of the digital economy as resources allow.

Last Modified Date: September 15, 2022


[1] For further reading, see Brent R. Moulton, 2001. "The Expanding Role of Hedonic Methods in the Official Statistics of the United States," BEA Papers 0018, Bureau of Economic Analysis.

[2] For more information, see “Measuring Price Change in the CPI: Telephone hardware, calculators, and other consumer information items” (U.S. Bureau of Labor Statistics), https://www.bls.gov/cpi/factsheets/telephone-hardware.htm.

[3] For more information, see “Measuring Price Change in the CPI: Telecommunications Services” (U.S. Bureau of Labor Statistics), https://www.bls.gov/cpi/factsheets/telecommunications.htm.

[4] Steven D. Sawyer and Alvin So, "A new approach for quality-adjusting PPI microprocessors," Monthly Labor Review, U.S. Bureau of Labor Statistics, December 2018, https://doi.org/10.21916/mlr.2018.29.

[5] This price index category also includes home phones (4 percent of the sample), accessories (24 percent), smartwatches (11 percent), and calculators (14 percent).

[6] Non-seasonally adjusted CPI-U for item code SEEE04, U.S. city average.

[7] Directed substitution is also used in the CPI for personal computers.

[8] Natural/unforced substitutions occur when the item the data collector was attempting to re-price is not available for purchase and the respondent indicates that the item is not going to return. At that point, the data collector is instructed to substitute to an available item of similar quality. 

[9] Official CPI is the non-seasonally adjusted CPI-U for item code SEEE04, U.S. city average. Counterfactual is the same index series as the Official CPI, but where the quality adjustments have been removed and replaced by price relatives calculated through imputation based on the price changes of similar price observations.

[10] Official CPIs are the non-seasonally adjusted CPI-U for item codes SEED04 (residential), SEEE03 (Internet), and SERA02 (Television), U.S. city averages.

[11] Brett Creech, “Are most Americans cutting the cord on landlines?” Beyond the Numbers: Prices & Spending, vol. 8, no. 7 (U.S. Bureau of Labor Statistics, May 2019), https://www.bls.gov/opub/btn/volume-8/are-most-americans-cutting-the-cord-on-landlines.htm.

[12] Edward Carlson, "Cutting the Cord: NTIA Data Show Shift to Streaming Video as Consumers Drop Pay-TV" (National Telecommunications and Information Administration, May 2019), https://www.ntia.doc.gov/blog/2019/cutting-cord-ntia-data-show-shift-streaming-video-consumers-drop-pay-tv.

[13] All transformations made to the dependent variables were based on lambda values equal to 0 (log(Y)), 0.25 (Y^0.25), 0.5 (Y^0.5), or 0.65 (Y^0.65).

[14] The average number of regressors per model includes the control variables for providers and cities that are not used to make price adjustments.

[15] Official CPI is the non-seasonally adjusted CPI-U for item code SEED04, U.S. city average. Counterfactual is the same index series as the Official CPI, but where the quality adjustments have been removed and replaced by price relatives calculated through imputation based on the price changes of similar price observations.

[16] Official CPI is the non-seasonally adjusted CPI-U for item code SEEE03, U.S. city average. Counterfactual is the same index series as the Official CPI, but where the quality adjustments have been removed and replaced by price relatives calculated through imputation based on the price changes of similar price observations.

[17] Official CPI is the non-seasonally adjusted CPI-U for item code SERA02, U.S. city average. Counterfactual is the same index series as the Official CPI, but where the quality adjustments have been removed and replaced by price relatives calculated through imputation based on the price changes of similar price observations.

[18] The number of observations is not disclosed to maintain the confidentiality of PPI respondents.

[19] Louis Columbus, “Public Cloud Soaring to $331B By 2022 According To Gartner” (Forbes, April 2019), https://www.forbes.com/sites/louiscolumbus/2019/04/07/public-cloud-soaring-to-331b-by-2022-according-to-gartner/#3278e08d5739.

[20] Industry leaders Amazon Web Services, Microsoft Azure, and Google have publicly available price data for IaaS packages. PPI used this public information to build the data sample for the hedonic model and surrounding research.

[21] Jeb Su, “Amazon Owns Nearly Half Of The Public-Cloud Infrastructure Market Worth Over $32 Billion: Report” (Forbes, August 2019), https://www.forbes.com/sites/jeanbaptiste/2019/08/02/amazon-owns-nearly-half-of-the-public-cloud-infrastructure-market-worth-over-32-billion-report/#7f7c713d29e0.

[22] David Byrne, Carol Corrado, and Daniel Sichel, “The Rise of Cloud Computing: Minding Your P’s, Q’s and K’s,” Measuring and Accounting for Innovation in the 21st Century.

[23] This is also true of cross-section models used for hedonic imputation.

[24] AWS FAQs, https://aws.amazon.com/ec2/faqs/.

[25] David M. Byrne, Stephen D. Oliner, and Daniel E. Sichel, “How fast are semiconductor prices falling?” Review of Income and Wealth, vol. 64, no. 3, April 2017, pp. 679–702.

[26] Steven D. Sawyer and Alvin So, "A new approach for quality-adjusting PPI microprocessors," Monthly Labor Review, U.S. Bureau of Labor Statistics, December 2018, https://doi.org/10.21916/mlr.2018.29.