Simplified Asset Indices to Measure Wealth and Equity in Health Programs: A Reliability and Validity Analysis Using Survey Data From 16 Countries

Many program implementers have difficulty collecting and analyzing data on program beneficiaries’ wealth because a large number of survey questions are required to construct the standard wealth index. We created country-specific measures of household wealth with as few as 6 questions that are highly reliable and valid in both urban and rural contexts.


INTRODUCTION
The 2012 unanimous adoption of a United Nations resolution to promote universal health coverage has prioritized a global movement to ensure that all people obtain health services they need without suffering financial hardship. 1 Despite the emphasis on government responsibility to provide primary health care, the private sector is still extensively used for health services in low- and middle-income countries (LMICs). 2,3 Health expenditure in the private sector has been shown to account for 61% of the total health expenditure in low-income countries, and the majority of these costs are out of pocket, which can prove especially difficult for the poor. 4-6 In spite of the financial hardship associated with accessing private-sector health services, particularly high-quality health services, clients often indicate a preference for the private sector because of perceived availability and customer service orientation. 2,6-8 Interventions that harness the power of the private sector to increase the poor's access to necessary, high-quality services without causing undue hardship have the potential to move countries closer to universal health coverage.
While working with the private sector offers great opportunity, it also comes with challenges. In most LMICs, there is no unified oversight of the private sector. Quality standards for private-sector service delivery are often lacking and when they do exist, there is little to no enforcement. 3,9 In the mid-1990s, concerns over the quality of private-sector care led to the creation of social franchising: the application of commercial franchising concepts to deliver socially beneficial products and services in underserved communities worldwide. 10 When applied to clinical care, social franchising connects a network of health care providers through formal agreements to deliver health services under a common franchise brand and to improve overall quality. 11 The social franchising industry has grown from just a few clinical franchises in the mid-1990s to more than 90 franchises in 40 countries around the globe. 12 Costs associated with starting and maintaining social franchises have historically been covered through large donor grants, and while social franchise programs can differ in their scale and scope of services offered, most have the common goal of serving the poor. 10,13
To implement strategies that best reach the poor, social franchisors must first accurately capture the socioeconomic profile of the people they serve. This information allows them to understand if the right clients are benefiting from subsidized services and to subsequently make decisions about where to scale up or modify programs to reach those most in need. This paper first describes the different approaches to measuring wealth and the way many social franchisors have tried to understand the wealth profile of their clients, and then proposes a simplified but robust methodology to improve programmatic understanding and use of wealth measurement.

APPROACHES TO MEASURING WEALTH
Although income may appear to be the most obvious indicator of wealth and is commonly used as a measure of economic status, it has been found to be extremely difficult to capture accurately. Economists have realized that income tends to fluctuate a great deal according to factors such as seasonality and migration, and it does not account for informal earnings, such as payments made in-kind. Furthermore, individuals are often reluctant to share information about their income openly, which makes it difficult to measure during household surveys. 14,15 Thus, it is not the best practical measure of wealth to support programmatic decisions.
One alternative has been to measure consumption instead of income. Economists believe that consumption data, representing the total value of household monetary expenditure and items received as gifts or produced by the household, can be both representative of longer-term wealth and less sensitive to fluctuations in income. This method is used extensively in the World Bank's Living Standards Measurement Study surveys and national Household Income and Expenditure Surveys. However, the surveys are extremely lengthy and are impractical when the primary objective for health program implementers is to collect other information besides consumption data. 14
In the late 1990s, Filmer and Pritchett discovered that household characteristics and material assets were much easier to capture and could be used as a proxy for consumption and, consequently, for economic status. 16 This led to the creation of the wealth index. Data for the wealth index are usually collected through Demographic and Health Surveys (DHS) or other national surveys and cover household ownership of selected assets and quality of living standards, such as housing structure and access to utilities. The raw data are converted into a weighted index using principal components analysis, and populations are divided into quintiles of wealth, each representing 20% of the population. 15 Quintile 1 represents the poorest segment of the population and quintile 5, the wealthiest. Other population-level indicators are then stratified by wealth, allowing for an understanding of equity. Equity refers to an absence of differences (in health indicators) that are avoidable, unfair, and unjust; in this paper, we focus on differences specifically related to socioeconomic status. 17-19 The inclusion of these questions in all DHS and similar surveys has made the wealth index one of the most common measures of equity in health. 20,21 All references to DHS in this article constitute a reference to any party engaged in the collection, analysis, and reporting of the publicly available DHS data and reports.

CONVENTIONAL WAYS OF MEASURING EQUITY IN SOCIAL FRANCHISING
As a community of practice, social franchisors have identified several goals for social franchising and are working together to identify uniform metrics for each goal. 22 One goal, equity, requires an understanding of the socioeconomic status of franchise clients. To identify a practical measure of equity that could be used to inform scale-up of social franchising or modifications to existing strategies, the Social Franchising Metrics Working Group, composed of franchisors and their donors, has worked together to pilot and choose an appropriate measure. The working group started with a rigorous testing process in which both absolute and relative measures of wealth were piloted. Results from the pilot revealed that the wealth index most closely aligned with the needs of franchisors by providing results that were easier to interpret than other measures. To facilitate a common application of this procedure, the working group created data collection and analysis resources.
Data for the wealth index among social franchising clients are typically gathered through client exit surveys and then compared with the national wealth index generated from DHS data. The analytic methods, described elsewhere, have also been automated in a toolkit that franchisors can use. 21,23 Despite the availability of data collection and analysis resources, large social franchising organizations such as Population Services International (PSI) have found it difficult to systematically and accurately collect wealth index data across their franchises, which in the case of PSI span 27 countries. The difficulties relate both to survey implementation and to the replicability of analytic methods.
Gathering wealth index data is simpler than implementing traditional consumption surveys. However, the number of questions needed to capture the variables required for the wealth index, using DHS country-specific questionnaires, ranges from 25 to 50. This adds to survey length, particularly in an exit interview context, and can make data collection time consuming. Additionally, several required questions are difficult for data collectors to ask and for clients to answer, especially since exit interviews take place away from the household. Specifically, the following challenges have been identified in client surveys:
- Respondents have difficulty estimating with confidence the number of hectares of agricultural land their household owns while away from the household.
- Respondents living in peri-urban or partially built-up rural areas are unable to confidently say whether their household is in an urban or a rural area. This is also difficult for data analysts to determine given that definitions of urban and rural residence vary by country.
- Questions on household characteristics are intended to be completed by trained interviewers observing the household. However, in a clinic setting, clients often find it difficult to correctly answer questions with long and detailed response-option lists. For example, the standard DHS question on the type of toilet in a household has 13 response options, some of which may seem similar to the respondent (e.g., ventilated improved pit latrines and pit latrines with slabs).
To improve transparency for data analysis and make the method accessible to all types of programs, a toolkit with standard syntax was created. The syntax mimicked the process used by the DHS Program in creating its country indices. Given variability in the DHS procedure, the toolkit's syntax varies from country to country, including in response options, country-specific assets, and differences in treatment of livestock variables.
The challenges discovered in trying to apply a complex analytic method to data intended for program monitoring and improvement raised the question of whether a simpler index could be created. Simplifications, however, may result in less accurate wealth quintile assignment. In this article, we consider the practical advantages of various alternatives to the standard wealth index and assess the extent to which each alternative's wealth quintile assignment agrees with that of the standard wealth index. As program implementers, our primary concern is that our proposed methods pass muster within a larger community engaged in the measurement and use of equity data.

Preliminary Analyses
To arrive at an alternative, simplified measurement approach, we adapted the Delphi method. 24,25 We prepared preliminary analyses, described below, and presented them to an invited group of experts, who assembled in Washington, DC, in February 2015 for a panel meeting. The panel comprised 15 collaborators (6 men, 9 women), representing a variety of stakeholders including donors, franchise program implementers, developers of the original wealth index methodology, and others working in public health programs actively engaged in the measurement of equity. Members of the panel were not considered human subjects but collaborators in the analysis. Data used in the analyses, described below, are publicly available and de-identified.
We used the most recent DHS, MIS (Malaria Indicator Survey), or AIS (AIDS Indicator Survey) data from 16 countries to assess the validity of each alternative wealth measure. MIS and AIS surveys are nationally representative, as are the DHS, and related resources are publicly available. 26 The 16 countries were selected based on 2 main criteria:
- Implementation of a DHS VI survey, in which the original wealth index factor weights had been published on the DHS website 27 by July 2015
- The presence of a known social franchise in operation
For each country, we compared 4 alternative wealth indices (described as A-D below) against the original DHS-calculated wealth index. The alternative indices had fewer variables than the original wealth index for each country. In these comparisons, the DHS wealth index was conceptualized as the "gold standard," and we aimed to determine how reliable each alternative was against this standard.
Two quantitative measures were used:
- Percent agreement, to determine what percent of individuals were assigned to the same quintile in the alternative measure as they would have been assigned to in the original
- Cohen's kappa statistic (κ), to take into account agreement that could have happened by chance alone
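As a rough illustration of these two measures, the sketch below computes percent agreement and Cohen's kappa for two lists of quintile assignments. It is not the authors' SPSS procedure: the function names and the 10 example assignments are hypothetical, and respondents are treated as equally weighted (no survey weights are applied).

```python
from collections import Counter


def percent_agreement(original, alternative):
    """Share (0-100) of respondents placed in the same category by both indices."""
    if len(original) != len(alternative) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(original, alternative))
    return 100.0 * same / len(original)


def cohens_kappa(original, alternative):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(original)
    observed = percent_agreement(original, alternative) / 100.0
    counts_a, counts_b = Counter(original), Counter(alternative)
    # Chance agreement: probability that both indices pick the same category
    # if each assigned categories independently with its own marginal shares.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both indices use one identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical quintile assignments for 10 respondents (illustrative only).
dhs_quintiles = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
short_quintiles = [1, 2, 2, 2, 3, 3, 4, 4, 5, 5]
agreement = percent_agreement(dhs_quintiles, short_quintiles)
kappa = cohens_kappa(dhs_quintiles, short_quintiles)
```

In this toy example one respondent in ten moves between quintiles, so percent agreement is 90 while kappa is somewhat lower because part of that agreement would be expected by chance.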
Percent agreement can range from 0 to 100, while kappa ranges from -1 to 1, where 0 indicates agreement no better than that expected by chance alone. Researchers have proposed 2 alternative interpretations of kappa. The first is as follows: κ<0 = no agreement; 0-0.20 = poor; 0.21-0.40 = fair; 0.41-0.60 = moderate; 0.61-0.80 = substantial; 0.81-1.0 = almost perfect. 28 The second is that κ>0.75 is considered excellent, as per Fleiss. 29
The original DHS wealth index for each country includes a set of common variables found in the DHS VI questionnaire, as well as country-specific variables. 30 The original index routinely includes a measure of land size (number of hectares of land owned), whether the household is urban or rural, and number of animals owned, by animal type (cow, goat, chicken, etc.). Wealth indices are created for urban and rural respondents separately and then combined. 31
To create each alternative index, we used a standardized process, beginning with the common variables in the DHS VI questionnaire. 23 First, we recoded all categorical variables to binary variables. For questions with multiple response options (such as type of floor), we recoded each response option as a binary variable (none were merged together). Animal ownership was not recoded to a binary variable if it was entered as a continuous variable for each type of animal owned. We manually removed response options with zero cases, as well as those common variables that were not included in the country-specific questionnaire. We manually included country-specific variables. We then conducted a principal component analysis on all variables, with responses weighted at the individual level, and created a score from the factor weights of the first principal component. Scores were ordered and respondents were divided into 5 equal-sized quintiles. Analyses were conducted using SPSS version 23.
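The scoring and quintile steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reproduction of the SPSS analysis: indicators are standardized before the principal component analysis, the first component is found by simple power iteration, survey weights are ignored, and the household data and function names are hypothetical.

```python
import math


def standardize(rows):
    """Convert each indicator column to z-scores (mean 0, SD 1)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(k)]
    if any(sd == 0 for sd in sds):
        raise ValueError("drop indicators with no variation before the analysis")
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]


def first_component_weights(z, iterations=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, k = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(k)] for i in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v  # the sign is arbitrary; check that wealthier households score higher


def wealth_scores(rows):
    """Score each household as the weighted sum of its standardized indicators."""
    z = standardize(rows)
    weights = first_component_weights(z)
    return [sum(w * x for w, x in zip(weights, r)) for r in z], weights


def to_quintiles(scores):
    """Order the scores and cut them into 5 equal-sized groups (1 = poorest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    quintile = [0] * len(scores)
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // len(scores) + 1
    return quintile


# Hypothetical binary indicators per household: [radio, bicycle, television].
households = [
    [0, 0, 0], [0, 0, 0],
    [1, 0, 0], [1, 0, 0],
    [1, 1, 0], [1, 1, 0],
    [1, 1, 1], [1, 1, 1],
    [1, 1, 1], [1, 1, 1],
]
scores, weights = wealth_scores(households)
quintiles = to_quintiles(scores)
```

Households with identical answers receive identical scores, so with few indicators the quintile cut points can fall inside a group of tied households; here ties are broken by input order, which is a simplification.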
Figure 1 indicates which variables are present in each alternative asset index:
- Alternative A included all common variables that should be present in all DHS VI datasets, including land area and animal ownership. Wealth indices were created for urban and rural respondents separately and then combined. Country-specific assets were excluded.
- Alternative B included all common variables in the DHS VI questionnaire, except land area and animal ownership. It excluded country-specific assets, and a separate urban/rural analysis was not conducted.
- Alternative C excluded country-specific assets, land area, and the urban/rural analysis, but included animal ownership.
- Alternative D included country-specific assets. It excluded land area, animal ownership, and the urban/rural analysis.

Consultation With Expert Panel
We presented the 4 alternatives to our panel. Panel members agreed that a shortened index is needed and that achieving the simplest and most practical questionnaires possible in each country was more important than standardizing questions across countries. All panel participants felt that a simplified approach would allow more programs to measure equity, resulting in better decision making and more equitable service delivery. It would also reduce the burden on clients being interviewed.
The panel made several recommendations, which necessitated the creation of another alternative (alternative E). First, the panel advised us to group the respondents into 3 groups. These groups, in order to be relevant to program decision making, were not terciles but rather the lowest 2 quintiles, the middle quintile, and the highest 2 quintiles, representing the relatively poor, middle, and rich. The panel felt that these 3 groupings would have greater face validity than the distinction between clients in the highest and second highest quintile, or between the lowest and second lowest quintiles. The panel also felt that presenting national quintiles alone would provide insufficient information for franchisors located primarily or solely in urban areas. They advised that the simplified set of questions should allow for sub-analyses on residence to determine the distribution of clients across urban wealth quintiles, with sufficient reliability. Further details about the variables included in alternative E are presented in the next section.
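The effect of the panel's 3-group recommendation on agreement can be seen in a small sketch. The group labels, function name, and example assignments below are hypothetical; the point is only that movement between quintiles 1 and 2 (or 4 and 5) no longer counts as disagreement once quintiles are collapsed.

```python
def wealth_group(quintile):
    """Collapse a wealth quintile (1-5) into the panel's 3 groups."""
    if quintile not in (1, 2, 3, 4, 5):
        raise ValueError("quintile must be an integer from 1 to 5")
    if quintile <= 2:
        return "poor"    # lowest 2 quintiles
    if quintile == 3:
        return "middle"  # middle quintile
    return "rich"        # highest 2 quintiles


# Hypothetical assignments for 6 respondents under two indices.
dhs_quintiles = [1, 2, 3, 4, 5, 1]
short_quintiles = [2, 1, 3, 5, 4, 3]

same_quintile = sum(a == b for a, b in zip(dhs_quintiles, short_quintiles))
same_group = sum(
    wealth_group(a) == wealth_group(b)
    for a, b in zip(dhs_quintiles, short_quintiles)
)
```

In this example only 1 of 6 respondents is in the same quintile under both indices, but 5 of 6 are in the same group.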
The best index of the 5 alternative options would be based upon 2 measures of agreement (percent agreement and Cohen's kappa statistic

Revised Analyses for the Panel-Recommended Approach
To create the simplified wealth index for the new alternative E, we used an iterative process. We began by removing the variables related to land size (hectares) and animal ownership. Then, for each remaining variable, we created a measure of its importance to the overall wealth index by multiplying the absolute value of the factor weight of the first principal component (drawn from DHS documentation) by the standard deviation of that variable. Variables that have larger absolute factor weights explain a greater proportion of the variation in the construct. Multiplying the factor weight by the standard deviation captures variation in ownership within the population, and the overall procedure values variables with high variation. All included variables were binary, so the standard deviations were comparable in their units. To further simplify the index, we looked at each response option within a categorical variable independently. The goal was to create one list of variables that was sufficiently reliable in both the overall population and the urban population.
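The importance measure described above can be sketched as follows. The variable names, factor weights, and ownership shares are hypothetical, and the standard deviation of a binary variable is taken as sqrt(p(1 - p)), the population form, which is an assumption here.

```python
import math


def importance(factor_weight, share_owning):
    """|factor weight| times the SD of a binary variable with ownership share p."""
    sd = math.sqrt(share_owning * (1.0 - share_owning))
    return abs(factor_weight) * sd


def rank_variables(variables):
    """Return variable names ordered from most to least important."""
    return sorted(
        variables,
        key=lambda name: importance(*variables[name]),
        reverse=True,
    )


# Hypothetical inputs: name -> (first-component factor weight, share of households).
example = {
    "dvd_player": (0.10, 0.50),
    "no_toilet_facility": (-0.12, 0.20),
    "car": (0.15, 0.02),
}
ranking = rank_variables(example)
```

A rarely owned asset such as the car in this example ranks low despite its large factor weight, because it varies little across the population and so separates few households.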
Specifically, we followed these steps to create the asset index for alternative E:
1. Binary variables, including those constructed from categorical variables, from the original DHS wealth index were listed. These variables were ranked in order of their importance to national wealth index scores and separately ranked in order of importance to the urban wealth index.
2. New wealth index scores were calculated for respondents in the DHS using the 5 most important variables in the overall and urban listings. Thus, up to 10 variables were included in the new wealth index calculation.
3. Respondents were separated into wealth quintiles using the new scores. Respondents in urban areas were also assigned to urban-specific wealth quintiles.
4. A cross-tabulation of the bottom 2 quintiles, middle quintile, and top 2 quintiles according to the original wealth index and according to the simplified wealth index was conducted. This allowed the calculation of the percentage of clients assigned to the same quintiles by the original DHS wealth index and the reduced set of variables, along with calculation of the kappa statistic. This was done for both the national wealth quintiles and for the urban wealth quintiles.
5. Steps 2-4 were repeated until the smallest number of variables was found that met the reliability criterion of kappa ≥ 0.75 for both the national and the urban samples.
   - In cases where the kappa statistic for either the urban or national index or both was below 0.75, the next variable in the list from the distribution with the lower agreement was added, and steps 2-4 were repeated. This process was repeated until both the urban and national indices had kappa statistics of 0.75 or greater.
   - In cases where both urban and national indices had a kappa statistic above 0.75, the smallest list of variables was generated. Variables were removed in ascending order of importance, from the distribution with the lower kappa statistic. Steps 2-4 were repeated, until removing further variables resulted in a kappa statistic below 0.75.

In Table 2, we present 2 measures of reliability, the percent agreement and kappa statistic, between the original wealth index and each alternative index A-D. The reliability calculated here compares respondent movement between each of the 5 quintiles. The percent agreement and kappa statistic were highest overall for alternative D (median agreement, 83.26%; median kappa, 0.79). Although alternative B had fewer question types than alternative C, it produced a higher median agreement and kappa statistic (median agreement, 77.90% vs. 76.10%, respectively; median kappa, 0.72 vs. 0.70, respectively).
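As a rough illustration of the iterative selection used for alternative E, the sketch below starts from the 5 most important variables in each ranking and adds variables until both kappa statistics reach 0.75. It covers only the case where variables are added; the removal of variables when both statistics already exceed 0.75 is not shown. The function name and the `evaluate` callback, which stands in for rebuilding the index and computing both kappa statistics, are hypothetical.

```python
def grow_variable_list(national_ranked, urban_ranked, evaluate,
                       threshold=0.75, start=5):
    """Add variables until both kappa statistics reach the threshold.

    `evaluate` is a hypothetical callback: it takes the selected variable names
    and returns (national_kappa, urban_kappa) for the index built from them.
    """
    selected = []
    for name in national_ranked[:start] + urban_ranked[:start]:
        if name not in selected:  # a variable may rank highly in both lists
            selected.append(name)
    while True:
        national_kappa, urban_kappa = evaluate(selected)
        if national_kappa >= threshold and urban_kappa >= threshold:
            return selected, national_kappa, urban_kappa
        # Draw the next variable from the ranking whose index agrees less well.
        source = national_ranked if national_kappa <= urban_kappa else urban_ranked
        candidate = next((v for v in source if v not in selected), None)
        if candidate is None:
            raise ValueError("ranking exhausted before reaching the threshold")
        selected.append(candidate)
```

In practice `evaluate` would recompute scores from the selected variables, assign national and urban quintiles, collapse them into 3 groups, and compare them with the original DHS assignment.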

RESULTS
The subsequent analyses conducted as per panel group recommendations are presented in Table 3 (national) and Table 4 (urban only). The panel wished to compare alternatives B and D to the newly created alternative E, having decided that alternatives A and C were overly prone to respondent error due to the inclusion of questions on the number of animals owned. The comparisons are presented after combining the 5 wealth quintiles into 3 groups, which may be more programmatically meaningful. Consequently, the agreement and kappa statistics for alternatives B and D are greater in Table 3 than in Table 2, where respondent movement between quintile 1 and 2 would indicate error.
The 6 questions chosen for Benin in alternative E, for example, produced a wealth distribution that agrees with the original wealth index 85% of the time among the national population, when the population was grouped into 3 meaningful divisions (Table 3). Despite having fewer questions, alternative E produced a higher kappa statistic in the national distribution than alternative B (including only the DHS core questions) for Bangladesh, Cambodia, Ethiopia, Malawi, Pakistan, and Uganda. Alternative E also produced a higher kappa statistic in the national distribution than alternative D (also including country-specific assets and animal ownership) for Ethiopia and Malawi. Similarly, when looking at the urban-specific distributions (Table 4), alternative E fared better than B in Bangladesh, Cameroon, Malawi, Pakistan, the Philippines, Senegal, Uganda, and Zimbabwe, and better than D in Malawi and Zimbabwe. In Zimbabwe, the effect of choosing variables that are strong predictors of wealth for the urban population was very evident: alternative E was the only one that produced a highly reliable result (κ = 0.75 for alternative E; κ<0.75 for alternatives B and D).
Figure 2 shows the shortened alternative E questionnaires for Bangladesh and Benin (in English). In Benin, it is obvious, even without seeing the factor scores, that some questions are geared toward assessing wealth (e.g., having a DVD player) while others would be strong indicators of poverty (e.g., not having any toilet facility).

DISCUSSION
This paper describes a methodological innovation that simplifies the collection of data to create relative measures of wealth within program populations. The premise behind the simplification process is that the DHS wealth index (both the construction and the resulting distribution) represents the "gold standard" for program implementers who need to understand the socioeconomic profile of their clients and beneficiaries. Thus, each alternative was judged against the gold standard, to determine if a sufficiently reliable alternative was possible.
Alternative E, presented here as the chosen, simplified approach because it required the fewest survey questions while maintaining a high enough reliability score, is a promising start for program implementers. Rather than advocating for a reduction in the DHS surveys, we acknowledge that quintiles from the standard DHS wealth index are a popular way to stratify populations, and program beneficiaries should be similarly categorized. The shorter questionnaires, however, are faster to use and therefore may improve the use of equity as a metric for internal decision making, as well as further its use as one for external accountability, a vision of the Social Franchising Metrics Working Group.
Other concise approaches to assessing poverty status exist. The Grameen Foundation's Progress out of Poverty Index (PPI) is limited to 10 questions for all available countries and is derived from a household income and expenditure survey. 32 Thus, it is using the 10 easily answerable questions as a proxy of expenditure. It offers the user a probability that the respondent is above or below various poverty lines, thus measuring absolute poverty. A primary selling point of the PPI is the ability to compute the outcome by hand, as all of the scores are whole numbers. As with the approach we present, the questions for each country differ. The absolute measure from the PPI was previously piloted by franchise organizations, but it did not meet their diversity of needs and it was found to be more difficult for program decision makers to interpret than the wealth index.
Rutstein and Staveteig created one unique comparative wealth index, in which all countries with DHS data are benchmarked against Vietnam in 2002. 33 In this measure, the items used to calculate the index are identical across countries, and while the measure is not relative within the country, it is still a measure of wealth relative to Vietnam. The comparative wealth index, however, was not considered to be a viable alternative by members of the expert group, including by the comparative index creators themselves.
We find it an advantage that our method is reliable for both national and urban populations. Previous pilot testing of wealth, benchmarked to the national population, as a metric of equity has indicated that franchised programs are not always able to act upon the results. Many franchise programs are primarily urban and peri-urban. When results have indicated that their clients are from the wealthiest 40% of the population, they have questioned the specificity of the measure and indicated they would like to see how their clients compared with others in the immediate area covered by the franchise network. 34,35 Using the same short list of questions to compute equity in an urban sub-population will allow them this increased contextual information.
It is possible to include further sub-groups beyond the urban population in one of two ways. First, our iterative approach could be replicated to explore whether a shortened list of questions could be found that accurately divides members of the sub-group. A different set of questions than those described here may result. Second, the same short list of questions could be used, but the reference population changed when producing results. In this case, the results may not be as valid (kappa may not be greater than 0.75), but the results would be tailored to the sub-group of interest. One can imagine a large number of different short questionnaires or sub-group analyses that are possible. However, limiting the reference populations to the national and urban populations ensures practicality and comparability. With different evaluations using the same reference population, the level of poverty indicated by wealth quintiles is kept standard. These results are also comparable with data presented in DHS reports using national and urban quintiles. Interpretation of results should keep in mind that national or urban quintiles may not accurately represent quintiles specific to the subgroup eligible for the intervention in question.

Limitations
We identify 3 primary limitations of our approach. First, the DHS surveys, a publicly available data source, are not available in all countries, nor do the surveys occur very frequently. The effect of the age of the source data on the inference to a current population remains to be assessed. Ownership of some assets, such as mobile phones, has rapidly increased in low-income countries. This can bias the results from a current survey, if, in an older reference population such as Cambodia in 2010, mobile phones were still indicative of being relatively wealthy but are now more pervasive. Second, following the DHS methodology, analysis is weighted to be generalizable to the whole population. However, in practice, exit surveys would be applied to a narrower target group, such as women of reproductive age. If the target population is not evenly distributed across the 5 quintiles, this may introduce error into the results. Lastly, in shortening the questionnaires, we may have reduced our ability to distinguish between 2 adjacent quintiles. With fewer questions, it may not be possible to easily distinguish between quintiles 1 and 2, as the distribution is "lumpier." For this reason, we grouped the respondents into 3 groups, which seemed more programmatically relevant, when assessing the reliability of the reduced survey. Piloting reduced questionnaires in varied settings may provide insight into whether less variability affects the utility of the findings.

CONCLUSION
It is possible to use a shorter questionnaire to assess relative wealth within a sample that is benchmarked to the national population, and the resultant measure remains highly correlated to the original DHS wealth index. Through the engagement of an expert panel, this research has galvanized a great deal of interest among a variety of franchising programs, many of which are asking for shorter questionnaires to include within other surveys they conduct, such as for client satisfaction, as well as interest from the International Finance Corporation to assess the wealth of their project beneficiaries. The simplified asset questionnaires will also be embedded into a mobile application to make the collection and analysis of these data easier. (The shortened form of all questionnaires can be found online at www.equitytool.org.) The agreement of our expert panel, a seasoned group of methodologists, program implementers, and donors, adds validity to the proposed methodology. Their conclusion that a simplified approach to assessing wealth is acceptable for programmatic decision making will benefit the use of this measure. As current and former researchers within organizations implementing social franchising, the authors are keenly aware that the measurement of equity, in whatever form, is both desired by, and loathed by, their colleagues. International development organizations exist to serve the underserved, and this measure of socioeconomic status is only one way to define and measure underserved individuals. In many of the countries in which we work, most people are poor in absolute terms. A wealth index, as we have proposed, is relative, and compares those in the same country or subpopulation with each other. The interpretation of results from this simplified method is context-specific and dependent upon program goals and needs of the eligible population.
Future refinements should concentrate on providing both absolute and relative poverty information, in order to improve the understanding of the measure (for example, someone in the wealthiest quintile in Madagascar may still live on less than US$1.25/day) and the justification for providing subsidized services to individuals who appear wealthy on a relative scale.