How Poor Is Your Sample? A Simple Approach for Estimating the Relative Economic Status of Small and Nonrepresentative Samples

The authors demonstrate the simplicity and utility of a method for estimating the economic status of small and nonrepresentative samples relative to existing representative reference populations.


Supplement 1. Formal description of the estimation and prediction approach
Let $\hat{W}_j$ describe the estimated wealth index score of household $j$, defined as a linear combination (or some transformation of a linear combination) of a vector of household characteristics $\boldsymbol{x}_j$ and their average contributions (or weights) $\boldsymbol{w}$ to household wealth.
The vector $\boldsymbol{w}$ is commonly derived from some reference population, either by regressing economic indicators such as expenditures on household characteristics or by principal components analysis, though other approaches to weighting exist. Under the assumption that the associations between components of the index and the construct it intends to measure (e.g., household wealth) are the same in both populations, weights derived from one population represent unbiased estimates of $\hat{\boldsymbol{w}}$ for the other, and thus can be applied to calculate a wealth index score $\hat{W}_j$ for each household $j$ in the target population.

$$\hat{W}_j = \hat{\boldsymbol{w}}\,\boldsymbol{x}_j$$
If the weights of all components of the wealth index in the reference data were known, and if all components were equally observed in the reference and target populations, the wealth index could simply be calculated for each household $j$ in the target population using the vector $\boldsymbol{w}$.
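As a minimal illustration of the known-weights case, the Python sketch below computes a score as the weighted sum of a household's characteristics. The indicator names and weight values are hypothetical and chosen only for illustration; they are not taken from any survey.

```python
# Hypothetical component weights (illustrative values only, not DHS weights).
WEIGHTS = {"electricity": 0.8, "radio": 0.2, "bicycle": 0.1}


def wealth_score(household, weights):
    """Return the linear combination sum_k w_k * x_jk for one household.

    `household` maps each indicator name to its value (1/0 for binary items).
    Every indicator in `weights` must be present in `household`.
    """
    missing = [name for name in weights if name not in household]
    if missing:
        raise KeyError(f"household lacks indicators: {missing}")
    return sum(weights[name] * household[name] for name in weights)


# Two hypothetical target-sample households.
household_a = {"electricity": 1, "radio": 1, "bicycle": 0}
household_b = {"electricity": 0, "radio": 1, "bicycle": 1}

score_a = wealth_score(household_a, WEIGHTS)
score_b = wealth_score(household_b, WEIGHTS)
```

The explicit check for missing indicators reflects the condition stated above: the direct calculation only works when every component of the index is observed in the target data.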
Supplement to: Ostermann J, Hair N, Grzimek V, et al. How poor is your sample? A simple approach for estimating the relative economic status of small and nonrepresentative samples. Glob Health Sci Pract. 2023;11(2):e2200394. https://doi.org/10.9745/GHSP-D-22-00394

However, if the weights are not known, or if not all variables comprising the index are observed in the target data, the weights need to be estimated. The next step therefore involves the development of a model that provides a good fit of the wealth index in the reference data but includes only variables available in both the reference and target data. The vector of parameter estimates $\boldsymbol{\beta}$ represents an estimate of $\hat{\boldsymbol{w}}$, i.e., the contribution of each covariate to household wealth $\hat{W}$.
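The estimation step above, together with the prediction step described next, can be sketched in a few lines of Python. This is a simplified stand-in and not the authors' code: it fits ordinary least squares by solving the normal equations, ignores survey weights and design, and uses a small synthetic reference data set in which the indicator `car` contributes to the index but is assumed to be unobserved in the target data. All names and values are hypothetical.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, outcome):
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * y for xi, y in zip(x, outcome)) for p in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Linear prediction: intercept plus the weighted sum of the indicators."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic reference data: every combination of three binary indicators,
# ordered as (electricity, radio, car). The index uses all three.
reference = list(product([0, 1], repeat=3))
index_scores = [-1.0 + 1.5 * e + 0.5 * r + 2.0 * c for e, r, c in reference]

# Step 2: fit a reduced model using only the indicators shared with the target.
shared = [(e, r) for e, r, _ in reference]
coefs = fit_ols(shared, index_scores)  # [intercept, electricity, radio]

# Step 3: out-of-sample prediction for hypothetical target households.
target = [(1, 1), (0, 1), (0, 0)]
predicted = [predict(coefs, row) for row in target]
```

Because `car` is balanced across the other indicators in this synthetic design, the reduced model recovers the coefficients on `electricity` and `radio`, and the intercept absorbs the average contribution of the omitted indicator. With real data the reduced model will generally fit less well than the full index, which is why candidate models are compared on fit in the reference data before one is used for prediction.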
In the last step, data on household characteristics and the estimated contribution of each characteristic to household wealth are combined to generate an out-of-sample prediction of household wealth $\hat{W}_j$ for each household $j$ in the target sample. Specifically, a wealth index score $\hat{W}_j$ can be generated for each household $j$ as a linear combination of characteristics $\boldsymbol{x}_j$ and the corresponding parameter vector $\boldsymbol{\beta}$ derived in step 2:

$$\hat{W}_j = \boldsymbol{\beta}\,\boldsymbol{x}_j$$

Supplement 2. Stata code for all key steps outlined in Sample Application 1
Note: With the appropriate specification of the data file and path names, the code shown below should run in Stata as-is.

global assetlist
doit hv206 "Electricity"
doit hv207 "Radio"
doit hv208 "Television"
doit hv209 "Refrigerator"
doit hv210 "Bicycle"
doit hv211 "Motorbike"
doit hv212 "Car"
doit sh121h "Iron"
doit hv243e "Computer"
doit hv243a "Mobilephone"
doit hv247 "Bankaccount"
doit hv244 "AgricultLand"
doit hv246 "Livestock"
doit hv243c "Animalcart"

[...]

*identify your preferred model based on changes in RMSE and R2

[...]

*error distributions
for var diff*: dotplot X, name(X)
graph combine diff_wi diff_pctile diff_quintile diff_rank

/** OUT-OF-SAMPLE PREDICTIONS **/
pause: make sure you choose the right estimation model for predictions
estimates restore wealth_model
estimates replay wealth_model

*apply component weights to the corresponding variables in the target data
*Make sure that variable names and definitions are identical to those specified
*in the wealth model that was estimated using the reference data

*number of buckets/bins for visualization of distributions
global BINS 10

*Reference sample: Distribution of actual and predicted scores

[...]

*Distribution of predicted (rescaled) WI across wealth bins
preserve
collapse (sum) share_pred=one (mean) N , by(xtile_pred)

[...]

*Target [...]

[...] red%50) lwidth(vthin) xlab(0(1)10) ylab(0(10)50) xtitle("Distribution of actual (dashed) and predicted (solid) DHS wealth index scores across deciles", size(small) margin(small)) xmtick(0(1)10) xla(0(1)10) ytitle("% of households", size(small)) scheme(s1color) legend(order(2 "DHS actual" 1 "DHS predicted" 3 "Target sample") rows(1) region(lc(white)) symxsize(*.5) symysize(*.6) size(*.6)) name(dhs_target, replace) saving(dhs_target, replace)

exit

Supplement 3. Correlates of the DHS wealth index in the 2016 Tanzania Demographic and Health Survey (T-DHS) and characteristics of participants in the 2017-18 Identifying and Matching HIV/AIDS Counseling and Testing (IMPACT) study
Notes: Estimated means or regression coefficients, with standard errors in parentheses. Regression coefficients estimated using a survey regression model with the DHS wealth index factor score as the dependent variable and continuous (persons per sleeping room, number of rooms used for sleeping) and binary indicator variables (all other household characteristics) as explanatory variables. *, **, and *** indicate statistical significance at the 0.05, 0.01, and 0.001 levels, respectively. Std. Err.: Standard Error; Coeff.: Coefficient.

Supplement 4. Comparison of the sample distributions across wealth quintiles for the reference and target samples
Notes: For comparability, the percentages shown do not account for Demographic and Health Survey sampling weights.

[Figure: actual (solid) and predicted (dashed) DHS wealth index scores]
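A comparison across wealth quintiles such as the one in Supplement 4 can be sketched as follows. The sketch assumes that quintile cut points are taken from the reference distribution and then applied unchanged to the predicted scores of the target sample; it is unweighted, and the scores are hypothetical. Python's `statistics.quantiles` uses its own interpolation rule, which may differ slightly from Stata's `xtile`.

```python
from bisect import bisect_left
from statistics import quantiles


def quantile_shares(scores, cutpoints):
    """Share of `scores` falling in each group defined by sorted `cutpoints`."""
    counts = [0] * (len(cutpoints) + 1)
    for s in scores:
        # A score equal to a cut point is placed in the lower group.
        counts[bisect_left(cutpoints, s)] += 1
    return [c / len(scores) for c in counts]


# Hypothetical reference scores (1..100) and predicted target scores.
reference_scores = [float(v) for v in range(1, 101)]
target_scores = [3.0, 8.0, 12.0, 15.0, 18.0, 22.0, 30.0, 35.0, 50.0, 70.0]

# Quintile cut points are taken from the reference distribution only.
cutpoints = quantiles(reference_scores, n=5)

reference_shares = quantile_shares(reference_scores, cutpoints)
target_shares = quantile_shares(target_scores, cutpoints)
```

In this made-up example, half of the target households fall into the poorest reference quintile and none into the richest, which is the kind of departure from the 20% benchmark that the comparison is meant to reveal.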

Supplement 6. Correlates of the DHS wealth index in five Demographic and Health Surveys (DHS) and the Positive Outcomes for Orphans (POFO) study

Supplement 7. Distributions of actual and predicted DHS wealth indices across POFO study sites
Note: Wealth index scores in each setting were rescaled to range from 0 to 10.
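The source does not state the rescaling formula. One common choice, assumed here purely for illustration, is min-max rescaling within each setting, so that the lowest score maps to 0 and the highest to 10. The function name and input values below are hypothetical.

```python
def rescale_0_10(scores):
    """Linearly map scores so the minimum becomes 0 and the maximum 10."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("cannot rescale: all scores are identical")
    return [10.0 * (s - lo) / (hi - lo) for s in scores]


# Hypothetical raw wealth index scores for one setting.
raw_scores = [-2.0, -1.0, 0.0, 2.0]
rescaled = rescale_0_10(raw_scores)
```

Rescaling of this kind preserves the ordering and relative spacing of households within a setting, so it changes only the units in which the distributions are displayed.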