
Wage gaps between male and female workers remain a persistent challenge in labor markets worldwide, often stemming from structural inequalities and unmeasured factors beyond productivity differences. This article introduces a novel decomposition methodology to analyze wage differentials between men and women in Peru, offering an alternative to traditional approaches such as the Kitagawa-Oaxaca-Blinder model. By simulating a counterfactual scenario in which male and female workers possess identical productivity characteristics, this method isolates the unexplained component of the wage gap while accounting for regional labor dynamics. The analysis reveals that substantial wage disparities persist even when observable factors are controlled.
Wage gaps between male and female workers affect the global labor market.1 These gaps are not always related to productivity characteristics. The study of wage gaps is essential to determine the main causes of the phenomenon. This article presents a new decomposition methodology for measuring and assessing wage gaps between men and women. This methodology is particularly suited for undergraduate students interested in labor economics. Peru serves as the case study for the methodology; it is a country known for its biodiversity and rich culture. In the last 10 years, Peru’s steady economic growth has improved living conditions and opportunities for the country’s citizens.2
In this article, the decomposition method explores the hypothetical situation in which a male and female worker possess equal characteristics and have a similar economic status. This article matches the characteristics of male workers with the characteristics of female workers through simulation and analyzes the residual value from subtracting the basic wage regression results. These are then compared with the “raw” residual, which is simply the observed wage gap between male and female workers. The method is flexible enough to explore numerous determinant variables and different combinations.
According to the International Monetary Fund World Economic Outlook Database from April 2022, Peru ranks sixth in the list of Latin American countries by gross domestic product (GDP) in terms of purchasing power parity (PPP).3 Thereby, the economic size of Peru is between that of Mexico and Brazil.4 A high GDP PPP per capita shows a country’s ability to purchase goods and services internationally. However, persistent social and economic inequalities among Peru’s population, especially inequality between men and women, constitute a challenge for sustainable development. Since Peru is important to the regional economy of the Americas, further analysis of its economy and labor market is merited.
Peru’s Ministry of Women and Vulnerable Populations emphasizes that women experience limited exercise of their fundamental rights and development opportunities.5 But women have improved their human capital. The percentage of the population that has completed at least 16 years of education can be used as a measure of human capital. As chart 1 shows, the share of women who have completed this level of education rose from 10.3 percent in 2004 to 18.0 percent in 2019 and has exceeded the share of men since 2013. Nevertheless, women still face several barriers in the labor market.6
Year | Women (percent) | Men (percent) |
---|---|---|
2004 | 10.3 | 12.2 |
2005 | 10.8 | 11.8 |
2006 | 11.4 | 12.7 |
2007 | 13.0 | 14.6 |
2008 | 13.3 | 14.4 |
2009 | 14.2 | 14.9 |
2010 | 13.4 | 14.7 |
2011 | 14.1 | 15.1 |
2012 | 16.3 | 16.6 |
2013 | 16.4 | 16.0 |
2014 | 17.2 | 15.7 |
2015 | 16.5 | 15.6 |
2016 | 16.7 | 16.4 |
2017 | 16.8 | 16.7 |
2018 | 17.4 | 17.2 |
2019 | 18.0 | 17.4 |
Source: Instituto Nacional de Estadística e Informática. |
This section begins with a wage differential decomposition model using the Kitagawa-Oaxaca-Blinder wage decomposition method. The base model is further advanced by including techniques from research on wage differentials between male and female workers in Peru. In 2019, the Defensoría del Pueblo produced one of the most relevant research projects.7 The researchers used a quantile decomposition study of the wage gap between male and female workers in the labor market and found that direct discrimination drove 60 percent of the wage gap. Their analysis of this wage gap is based on the quantile regression method developed by Roger Koenker and Gilbert Bassett Jr.8
Quantile decomposition makes it possible to explore differences in outcomes across the income distribution. In addition, the Defensoría del Pueblo study finds its roots in the work of Hugo Ñopo, who developed a methodology using matching comparisons to explain wage differences between male and female workers.9 His methodology emphasizes differences in the distribution of observable characteristics and provides insights into the unmeasured variables related to income differences. Ñopo presents this methodology as an alternative to the Kitagawa-Oaxaca-Blinder decomposition. In his study, Ñopo uses individual data from Peru for the years 1986 to 1999. The matching method that Ñopo uses yields an accurate decomposition result when the dataset is used to represent individual data instead of aggregates. Ñopo found that the wage gap between male and female workers was driven by pay differences mainly at higher income levels. According to Ñopo’s results, wage differentials in the top fifth of the wage distribution explain more than half the wage gap in Peru. In addition, Ñopo found that wage differentials in the lowest percentiles do not substantially affect the average wage gap. Interestingly, his results also show that the unexplained pay differences are more dispersed among married people, and these unexplained differences are substantially higher among the highly educated.10
In another article with similar methods, Giannina Vaccaro, Maria Pia Basurto, Arlette Beltrán, and Mariano Montoya outlined the Peruvian wage gap evolution from 2007 to 2018 and identified key variables that explain its pattern.11 The study by Vaccaro et al. found that the raw wage gap trended upward between 2007 and 2011, ranging from 6 to 12 percent and remaining at that level since then. The authors used the Kitagawa-Oaxaca-Blinder decomposition model to find that the unexplained wage gap remained around 17 percent during the period of study. They observed that there was a reduction in the differences in endowments between men and women. This reduction occurred alongside a persistent, unexplained gap that originated from somewhat broader raw wage gaps over time.12 Although women have made gains in education and work experience, the unexplained portion of the wage gap has held steady at 17 percent. Since this gap cannot be attributed to differences in productivity or changes in the market, it may point to deeper structural factors—like social norms or discrimination—that help account for its persistence.
Contrary to the results obtained by Ñopo, Vaccaro et al. found that wage gaps between male and female workers were larger among the bottom percentiles, with the wage gap shrinking as income increased. Vaccaro et al. also found that a smaller wage gap is associated with regions that have a higher GDP per capita, a lower incidence of domestic violence against women, and a lower percentage of households led by women.13
Essentially, the findings of Vaccaro et al. suggest that economic, social, and demographic factors are interconnected with the regional differences in the wage gap between male and female workers. These social and demographic variables represent a set of features that are related to economic factors that may influence the residual wage differentials. Thus, these social and demographic variables may explain a certain section of the unexplained residual. Also, these social and demographic variables show that the wage gap cannot simply be explained by discrimination against women in the labor market. The presence of a statistically unexplained residual means that wage differences cannot be accounted for by observable factors such as education, experience, or occupation, suggesting the influence of unmeasured variables.
The approaches of Vaccaro et al. and Ñopo become impractical when estimating wage differences between male and female workers because of certain complexities. While the current research accounts for factors like education, hours worked, and occupational differences, there are still unmeasured variables that influence the wage gap between male and female workers. For example, cultural expectations that discourage women from pursuing higher education or policies that do not support work-life balance can shape long-term earning potential. These factors are not always accounted for in wage gap analyses because they are difficult to measure directly. As a result, while statistical controls help isolate the impact of a worker’s sex on wages, they may also obscure broader structural influences that contribute to disparities between male and female workers.
Next, this article presents the methodological framework used to analyze the wage gap between men and women in Peru. First, there is an outline of the regional division of the country, as regional differences in culture and labor markets are critical to understanding wage disparities. Then, there is a description of the traditional Kitagawa-Oaxaca-Blinder decomposition method, followed by a novel regression-based approach that expands on the traditional model to include a broader set of explanatory variables.
Peru is a fundamentally diverse country in culture and social norms. Therefore, capturing the effects of the different demographic characteristics in this model is essential. This article focuses on the three different natural regions in Peru: coast, highlands, and jungle. Dividing the data into regions helps reveal the labor dynamics that are place specific. The distribution of departments (or states) according to each natural region is shown in table 1 below. The distribution is based on a study by Peru’s Instituto Nacional de Estadística e Informática.14
Region | Departments |
---|---|
Coast | Ica, La Libertad, Lambayeque, Lima, Moquegua, Piura, Tacna, and Tumbes |
Highlands | Áncash, Apurímac, Arequipa, Ayacucho, Cajamarca, Cusco, Huancavelica, Huánuco, Junín, Pasco, and Puno |
Jungle | Amazonas, Loreto, San Martín, and Ucayali |
Source: Instituto Nacional de Estadística e Informática. |
The Kitagawa-Oaxaca-Blinder decomposition model divides the wage differential between two groups into two sections: a section that is explained by differences in productivity features, such as skills, and another section that is unexplained by those features.15 The unexplained part of the model is often used to account for several factors. These unobserved predictors may include factors such as social norms, workplace culture, negotiation differences, informal hiring networks, discrimination, and unconscious biases, which all influence wage disparities but are difficult to measure directly. Nevertheless, the Kitagawa-Oaxaca-Blinder decomposition model is useful for defining what comprises the wage differential between male and female workers.
The Kitagawa-Oaxaca-Blinder decomposition model requires separate calculations of earnings equations for the two groups subject to analysis. However, once the separate earnings equations are calculated, the model allows the matching of the characteristics of either female or male workers by using a single earnings equation, which enables the creation of counterfactual scenarios. For example, what if the income for male workers was determined by the same parameters obtained in the female workers equation?
The standard procedure for wage differential analysis is to create equations that relate earnings and observed characteristics, expressed in semilogarithmic form. The basic decomposition equation has several parts. The first part is the basic wage differential, and it is constructed as

$$\Delta W = W_M - W_F, \tag{1}$$

where $W_M$ is wages earned by male workers and $W_F$ is wages earned by female workers. The basic wage differential reflects the observable difference between wages earned by male workers and female workers. According to the Kitagawa-Oaxaca-Blinder model, these two variables are constituted by the male earnings function (2) and the female earnings function (3), which are defined as

$$W_M = a_M + \beta_M S_M \tag{2}$$

and

$$W_F = a_F + \beta_F S_F. \tag{3}$$

In these functions, $a_M$ and $a_F$ are the sex-based intercepts, and $\beta_M$ and $\beta_F$ are coefficients determining the effect of $S$, which refers to the average schooling years for each group ($S_M$ for male workers and $S_F$ for female workers). This variable represents a proxy for the general level of education and skill for each worker. Given these functions, the raw wage differential can be further decomposed into a regression equation for wage differential as

$$\Delta W = a_M + \beta_M S_M - a_F - \beta_F S_F. \tag{4}$$

Finally, and after algebraic manipulation (adding and subtracting $\beta_M S_F$), the decomposition model becomes

$$\Delta W = (a_M - a_F) + (\beta_M - \beta_F) S_F + \beta_M (S_M - S_F). \tag{5}$$

The first section of the model, presented as

$$(a_M - a_F) + (\beta_M - \beta_F) S_F,$$

describes the effect on the wage differential by unobserved variables. The remaining section, $\beta_M (S_M - S_F)$, shows the effect of group productivity differences driven by schooling years, weighted by the male workers' schooling coefficient.
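The two-part split described above can be checked numerically. The following is a minimal sketch, not the author's code: it assumes a single regressor (average schooling years) and linear earnings functions with intercept `a` and slope `b` for each group, and every number in it is hypothetical.

```python
def two_fold_decomposition(a_m, b_m, s_m, a_f, b_f, s_f):
    """Split a mean wage gap into unexplained and explained parts.

    a_m, a_f: intercepts of the male and female earnings functions
    b_m, b_f: schooling coefficients of the male and female functions
    s_m, s_f: average schooling years of male and female workers
    """
    # Raw gap: male earnings function minus female earnings function.
    raw_gap = (a_m + b_m * s_m) - (a_f + b_f * s_f)
    # Unexplained part: intercept gap plus coefficient gap at female schooling.
    unexplained = (a_m - a_f) + (b_m - b_f) * s_f
    # Explained part: schooling gap valued at the male coefficient.
    explained = b_m * (s_m - s_f)
    return raw_gap, unexplained, explained


# Hypothetical inputs, chosen only to illustrate the arithmetic.
raw_gap, unexplained, explained = two_fold_decomposition(
    a_m=500.0, b_m=100.0, s_m=10.0,
    a_f=400.0, b_f=90.0, s_f=11.0,
)
print(raw_gap, unexplained, explained)
```

In this made-up case women have more schooling than men, so the explained part is negative while the unexplained part exceeds the raw gap. The two parts always sum to the raw gap.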
In this section, two regression models are constructed for both male and female workers’ average monthly real income that accounts for more than just average years of schooling. The data used for these models are taken from the Encuesta Nacional de Hogares indices provided by the Instituto Nacional de Estadística e Informática.16 The dataset includes aggregate data from 2005 to 2019, displayed in a panel data and time series format. The analysis is done on 45 observations of data across the three natural regions described in the regions section.
The variables used in the regression model are the following. The average years of schooling is x1. The unemployment rate is x2. Change in the minimum wage policy is x3, in which the indicator takes the value of 1 if a change occurs in minimum wage policy in relation to the previous year and 0 if otherwise. The natural region is x4, in which the indicator takes the value of 0 if the natural region is the coast, 1 if it is the highlands, and 2 if it is the jungle.
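The coding of the two indicator variables can be sketched as follows. This is an illustration under stated assumptions: the function names, the `REGION_CODES` mapping name, and the minimum wage figures are hypothetical, and only the 0/1 and 0/1/2 coding rules come from the description above.

```python
# Region coding as described in the text: coast = 0, highlands = 1, jungle = 2.
REGION_CODES = {"coast": 0, "highlands": 1, "jungle": 2}


def minimum_wage_change_indicator(minimum_wage_by_year):
    """Return {year: 0 or 1}; 1 if the minimum wage differs from the previous year.

    The earliest year has no previous year to compare against, so it is omitted.
    """
    years = sorted(minimum_wage_by_year)
    indicator = {}
    for previous, current in zip(years, years[1:]):
        changed = minimum_wage_by_year[current] != minimum_wage_by_year[previous]
        indicator[current] = 1 if changed else 0
    return indicator


def build_row(schooling_years, unemployment_rate, wage_changed, region):
    """Assemble one observation as [x1, x2, x3, x4]."""
    return [schooling_years, unemployment_rate, wage_changed, REGION_CODES[region]]


# Hypothetical minimum wage levels, used only to show the coding rule.
hypothetical_minimum_wage = {2004: 100, 2005: 100, 2006: 110, 2007: 110}
x3 = minimum_wage_change_indicator(hypothetical_minimum_wage)
row = build_row(9.5, 4.2, x3[2006], "highlands")
print(x3, row)
```

Because the indicator compares each year with the one before it, coding x3 for the first year of a sample requires the minimum wage from the year preceding the sample.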
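Therefore, the specific earnings functions for male and female workers for this analysis are written as follows, with a superscript that designates whether the variable is for female or male workers:

$$W^{M} = a^{M} + \beta_1^{M} x_1^{M} + \beta_2^{M} x_2^{M} + \beta_3^{M} x_3^{M} + \beta_4^{M} x_4^{M}$$

and

$$W^{F} = a^{F} + \beta_1^{F} x_1^{F} + \beta_2^{F} x_2^{F} + \beta_3^{F} x_3^{F} + \beta_4^{F} x_4^{F},$$

where $W$ is average monthly real income, $a$ is the intercept, and $\beta_1$ through $\beta_4$ are the coefficients on the four variables.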
Since the model proposed in this article accounts for more than just average years of schooling, the traditional Kitagawa-Oaxaca-Blinder decomposition model requires some adjustment. This analysis of the wage gap will rely on the counterfactual condition that male and female workers possess the same average characteristics. In other words, after the assessment of the earnings functions, the data for male workers used for estimating their equation will be plugged into the earnings function for female workers. Thus, this article considers the hypothesis in which Peruvian male and female workers share the same average characteristics depicted in the variables used for this model.
By making this assumption, this method accounts for the portion of the residual that is not explained by the average characteristics of male and female workers. In an optimal scenario, female and male workers would earn the same income if they had equal characteristics. Therefore, a residual or wage gap would not exist. The data used for these models are aggregate data. In other words, the dataset is composed of individual data presented in monthly averages.
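The substitution step can be sketched in a few lines. This is one possible reading of the procedure, not the author's implementation: the raw gap compares each group's own equation applied to its own data, and the counterfactual gap applies the female equation to the male data. All coefficients and observations below are hypothetical.

```python
def predict(coefficients, rows):
    """Apply a linear earnings function (intercept first) to each row of regressors."""
    intercept, *slopes = coefficients
    return [intercept + sum(b * x for b, x in zip(slopes, row)) for row in rows]


def mean(values):
    return sum(values) / len(values)


# Hypothetical coefficients: (intercept, x1 schooling, x2 unemployment,
# x3 minimum wage change, x4 region).
male_coefs = (200.0, 120.0, -15.0, 30.0, -50.0)
female_coefs = (150.0, 100.0, -10.0, 20.0, -40.0)

# Hypothetical aggregate observations, one row per region: [x1, x2, x3, x4].
male_rows = [[10.0, 4.0, 1, 0], [9.0, 3.0, 0, 1], [8.0, 2.0, 0, 2]]
female_rows = [[10.5, 5.0, 1, 0], [8.5, 4.0, 0, 1], [7.5, 3.0, 0, 2]]

# Raw gap: each group's equation applied to its own data.
raw_gap = mean(predict(male_coefs, male_rows)) - mean(predict(female_coefs, female_rows))

# Counterfactual gap: both equations are fed the male data, so the two
# groups share the same characteristics and only the equations differ.
counterfactual_gap = (
    mean(predict(male_coefs, male_rows)) - mean(predict(female_coefs, male_rows))
)
print(raw_gap, counterfactual_gap)
```

Under this reading, any gap that remains after the substitution comes from differences in the estimated coefficients, because the characteristics are identical by construction.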
In this section, the results from the regression analysis are discussed. The earnings function for male workers is
The earnings function for male workers has an adjusted R-squared value of 0.803, which suggests that the model explains 80.3 percent of the pattern of real earnings for male workers. Note that the adjusted R-squared is high.17
The earnings function for female workers is
This version has an adjusted R-squared value of 0.885, which indicates that the model explains 88.5 percent of average monthly real income earned by female workers. This model presents a high adjusted R-squared, reinforcing the need for more exploration.
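For readers who want to relate the adjusted R-squared values to the unadjusted ones, the standard adjustment formula can be sketched as below. The sketch assumes that each equation uses all 45 observations and the four regressors x1 through x4. That is an assumption, since the article does not report the degrees of freedom.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Standard adjustment: penalize R-squared for the number of regressors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


def implied_r_squared(adjusted, n_obs, n_regressors):
    """Invert the adjustment to recover the unadjusted R-squared."""
    return 1.0 - (1.0 - adjusted) * (n_obs - n_regressors - 1) / (n_obs - 1)


N_OBS = 45        # observations reported in the article
N_REGRESSORS = 4  # assumed: x1 through x4

male_r2 = implied_r_squared(0.803, N_OBS, N_REGRESSORS)
female_r2 = implied_r_squared(0.885, N_OBS, N_REGRESSORS)
print(round(male_r2, 3), round(female_r2, 3))
```

With so few regressors relative to observations, the adjustment is small, so the adjusted and unadjusted values tell essentially the same story.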
With the use of both earnings equations, the wage differential for male and female workers is calculated without accounting for productivity characteristics. In other words, the “raw” residual is calculated for each regression model. The results from the raw wage differential are shown in the residual 1 column in table 2. The results show the coast region as having the highest average raw wage gap, followed by the highlands and the jungle regions. However, once productivity characteristics are accounted for by changing the male wage determinant equation to the female wage determinant, the results switch. If a male worker has the same productivity characteristics as a female worker, then, according to this simulation, the highlands region shows the highest gap, followed by the jungle and the coast. (See table 2.)
Region | Residual 1 (U.S. dollars) | Residual 2 (U.S. dollars) |
---|---|---|
Coast | 1,925 | 1,072 |
Highlands | 1,347 | 1,375 |
Jungle | 1,134 | 1,235 |
Note: The dollar amounts are 2021 conversions from the Peruvian sol to the U.S. dollar. Source: Author's calculations. |
The residual 1 column shows the average wage gap without considering productivity characteristics, while the residual 2 column shows the adjusted results for these differences, showing the wage gap with productivity accounted for. In residual 1, equations for both men and women determined the wage gap. For residual 2, only the female equation was applied to the data for men and women to compute the residual. The fact that the ordering of the regions switches when productivity is accounted for warrants further investigation. Most likely, the regression model is not capturing the effects of factors outside productivity, such as occupation, industry, company, culture, informal norms, and stereotypes of men and women.
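The change in regional ordering can be verified directly from the reported residuals. The short sketch below uses the values from the residual 1 and residual 2 columns. The variable and function names are illustrative only.

```python
# Reported residuals by region: (residual 1, residual 2).
residuals = {
    "Coast": (1925, 1072),
    "Highlands": (1347, 1375),
    "Jungle": (1134, 1235),
}


def rank_regions(column):
    """Order regions from largest to smallest gap for the given column index."""
    return sorted(residuals, key=lambda region: residuals[region][column], reverse=True)


raw_order = rank_regions(0)
adjusted_order = rank_regions(1)

# Change in the gap when moving from residual 1 to residual 2.
change = {region: r2 - r1 for region, (r1, r2) in residuals.items()}

print(raw_order, adjusted_order, change)
```

The comparison shows that the switch is driven almost entirely by the coast, where the gap falls sharply, while the gaps in the highlands and the jungle rise slightly.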
This article introduces a new methodology for analyzing wage gaps between male and female workers. Using labor market data from Peru, the methodology hypothesizes the case in which a male worker possesses the same productivity characteristics and is influenced by the same market effects as those of a female worker. The decomposition model developed in this article is simple and straightforward because it is a basic substitution and comparison technique, similar to a difference-in-difference analysis. In addition, the model does not require as much data manipulation as Ñopo’s matching technique, and it is versatile for other demographic characteristics, such as race.18
This method is an alternative to the Kitagawa-Oaxaca-Blinder decomposition model because it does not require the manipulation of the regression equations to find the unexplained part of the gap and provides room for using more than one independent variable. Thus, the residuals that result from the homogenization of the earnings determinant functions (running the male workers’ data through the female workers’ equation) reflect the actual wage gap without the need of further equation manipulation. While this decomposition methodology offers a novel approach to analyzing wage gaps, it is important to acknowledge its limitations. One key consideration is the reliance on aggregate data and on the simulated assumption that Peruvian male and female workers have identical productivity characteristics. Future research should focus on accessing regional microdata per socioeconomic sector in order to have a more accurate analysis of the structure of the wage gap between male and female workers in Peru. This way, we will be able to match these characteristics by individual worker rather than in aggregates.
José F. Chaman Alvarez, "Analyzing wage gaps between male and female workers in Peru: a novel decomposition methodology and case study," Monthly Labor Review, U.S. Bureau of Labor Statistics, August 2025, https://doi.org/10.21916/mlr.2025.14
1 José Francisco Chaman Alvarez graduated in May 2023 from St. Mary's University of San Antonio and recently completed the Summer Program in Advanced Economics at the Central Bank of Peru, the country's most prestigious and selective program for economists. Admission is granted only to top candidates after a series of rigorous exams and interviews.
2 Summary information about Peru’s economic health can be found at “The World Bank in Peru” (The World Bank, last updated September 2022), https://www.worldbank.org/en/country/peru/overview.
3 Purchasing power parity is defined by Tim Callen as “the rate at which the currency of one country would have to be converted into that of another country to buy the same amount of goods and services in each country.” Tim Callen, “Purchasing power parity: weights matter,” Finance & Development (International Monetary Fund, February 24, 2020), pp. 44–45, https://www.imf.org/en/Publications/fandd/issues/Series/Back-to-Basics/Purchasing-Power-Parity-PPP.
4 World economic outlook database (International Monetary Fund, April 2022 edition), https://www.imf.org/en/Publications/WEO/weo-database/2022/April.
5 The Política Nacional de Igualdad de Género (National Gender Equality Policy) was approved through Decreto Supremo N° 008-2019-MIMP, issued by Peru’s Ministry of Women and Vulnerable Populations. This policy acknowledges the persistence of inequalities, particularly in employment, income, time use, and political participation. It explicitly states that women, despite comprising 50.8 percent of the population, experience limited exercise of their fundamental rights and development opportunities, requiring state intervention to prevent the systematic reproduction and intergenerational transmission of these disparities. For more information on this policy, see “Decreto Supremo N° 008-2019-MIMP,” Política Nacional de Igualdad de Género, Diario Oficial El Peruano (Ministerio de la Mujer y Poblaciones Vulnerables, 2019), p. 6, https://www.gob.pe/institucion/mimp/normas-legales/271118-008-2019-mimp.
This decree aligns with Peru’s constitutional principles and international commitments, including the Convention on the Elimination of All Forms of Discrimination Against Women (CEDAW) and the Inter-American Convention on the Prevention, Punishment, and Eradication of Violence Against Women (Belém do Pará Convention). The policy framework aims to address structural discrimination by implementing affirmative policies, institutional reforms, and public programs that promote equality between men and women.
6 According to a report by Peru's Defensoría del Pueblo, Peruvian women face the following barriers: discouragement of women, limited career choices, teen pregnancy, informality and job insecurity, wage gap, employer bias against mothers, unpaid domestic work, occupational segregation, sexual harassment, violence against women, and weak labor protections and lack of childcare support. Barreras a la igualdad en la economía formal e informal desde la perspectiva de las mujeres, Serie Igualdad y No Violencia, No. 002 (Defensoría del Pueblo, 2019, Lima, Perú), https://www.defensoria.gob.pe/deunavezportodas/wp-content/uploads/2019/11/Barreras-a-la-Igualdad-en-la-Economia-Formal-e-Informal.pdf.
7 See Patricia Fuertes Medina and Jackeline Velazco Portocarrero, El Impacto Económico de la Brecha Salarial por Razones de Género, Documento de trabajo no. 00502019_DP/ADM (Defensoría del Pueblo: Lima, Peru 2019), https://www.defensoria.gob.pe/deunavezportodas/wp-content/uploads/2019/11/Brecha-salarial-por-razones-de-genero-2019-DP.pdf. The English title of Fuertes and Velazco is The Economic Impact of the Gender Pay Gap.
8 For the original, see Roger Koenker and Gilbert Bassett Jr., “Regression quantiles,” Econometrica 46, no. 1, January 1978, pp. 33–50. https://doi.org/10.2307/1913643.
9 Hugo Ñopo, “Matching as a tool to decompose wage gaps,” The Review of Economics and Statistics 90, no. 2, May 2008, pp. 290–299, https://doi.org/10.1162/rest.90.2.290.
10 The variables considered by Ñopo in his study were the following: age, education, migratory status, marital status, job formality, union membership, tenure, and hours worked per week. These variables are presented as average observable characteristics by sex.
11 Giannina Vaccaro, Maria Pia Basurto, Arlette Beltrán, and Mariano Montoya, “The gender wage gap in Peru: drivers, evolution, and heterogeneities,” Social Inclusion 10, no. 1, January 2022, pp. 19–34, https://doi.org/10.17645/si.v10i1.4757.
12 An endowment difference is the difference in return on some productivity characteristic, in terms of wages. In other words, it is the difference in pay a man and woman would receive given an extra year of education or an extra year of experience, or the like.
13 The study uses the following data variables: dependent variable: log of hourly wages; demographic characteristics: sex, age, tenure, indigenous mother tongue, head of household; education and employment: years of schooling, public sector worker, informal worker; and firm and sector characteristics: firm size, region, industry, occupation. Additional regional-level variables include the following: gross domestic product (GDP) per capita, population size, Gini index, urbanity rate, poverty rate, informality rate, physical violence rate against women, women as household heads.
14 See "Mapa provincial de pobreza provincial y distrital, 2007" (Instituto Nacional de Estadística e Informática, 2007), https://www.inei.gob.pe/media/MenuRecursivo/publicaciones_digitales/Est/Lib0911/index.htm. The title in English is “Logistical regression model by department” by the National Institute of Statistics and Information.
15 Ben Jann, “The Blinder–Oaxaca decomposition for linear regression models,” The Stata Journal 8, no. 4, December 2008, pp. 453–479, https://doi.org/10.1177/1536867X0800800401.
16 The name of the survey in English is “National Household Survey.” The Instituto Nacional de Estadística e Informática’s (INEI) thematic index on economic statistics provides macroeconomic data, including GDP growth, inflation, labor force participation rates, and unemployment rates of men and women. These indicators are crucial for understanding the broader economic conditions that influence wage structures. In this research, such data help contextualize wage disparities between male and female workers by examining whether economic growth has led to reductions in wage gaps or if inequalities persist despite overall improvements in economic performance. The inclusion of unemployment rate data for female and male workers further refines the analysis by highlighting differences in job access and stability. If female unemployment remains consistently higher than male unemployment, it may indicate structural barriers in the labor market, such as occupational segregation, hiring biases, or limited access to formal employment. Additionally, analyzing wage gaps alongside unemployment trends helps assess whether economic expansion benefits male and female workers equally or exacerbates existing disparities. For more information on economic statistics, see “Economía” (Instituto Nacional de Estadística e Informática), https://www.inei.gob.pe/estadisticas/indice-tematico/economia/. The INEI’s statistics include data on average real earnings by sex and differences in years of schooling between men and women, both of which are crucial for measuring wage disparities over time. By using these datasets, this research establishes a baseline for the wage gap between male and female workers, allowing for comparisons across regions, economic sectors, and demographic groups. Since real income adjusts for inflation, it provides a more accurate picture of purchasing power differences between male and female workers.
Additionally, the education gap data helps assess whether differences in human capital accumulation explain wage disparities or if an unexplained residual remains, suggesting structural barriers or discrimination. For more information on statistics for men and women, see “Indicadores de género” (Instituto Nacional de Estadística e Informática), https://www.inei.gob.pe/estadisticas/indice-tematico/brechas-de-genero-7913/. The Central Bank's monthly wage series provides historical and current data on remunerations, including minimum wage levels and average earnings in the formal sector. This research uses changes in the minimum wage to track how real incomes evolve over time and whether differences in earnings between male and female workers persist despite adjustments for inflation. For more information on wage statistics, see “Remuneración, series mensuales” (Banco Central de Reserva del Perú), https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/remuneraciones.
17 A high adjusted R-squared value may indicate potential overfitting, omitted variable bias, or a lack of variation in the data. The exploration should involve diagnostic tests such as variance inflation factor (VIF) analysis for multicollinearity, residual analysis to check for heteroskedasticity, and robustness checks with alternative model specifications.
18 See Ñopo, “Matching as a tool to decompose wage gaps.”