Public health interventions and evaluations often lack scientifically sound baseline data, both because of survey sampling methods and because many counties are sparsely populated. One standard solution is to pool several years of data to obtain a direct estimate. This method, however, does not allow for rigorous evaluation of intervention impacts. Our modeling approach combines sparsely collected health data and develops modeling techniques that refine these data temporally and spatially. We then use the resulting data in additional machine learning models that address more complex health-related issues, which may be associated with other areas, including economics, the environment, social outcomes, and food security. Our work provides regional and state public health agencies, as well as other health organizations, with baseline county-level data that were not previously available. These county-level data can aid policy leaders, funders, researchers, and public health decision makers in their efforts to assess, improve, and monitor health across the United States.
Our BRFSS modeling work uses population synthesis and microsimulation techniques to estimate health indicators at a county level. Our recent focus has been on applications to data for the state of Idaho, but our ongoing research is exploring alternative regions, additional health indicators, and differing techniques (geographically weighted regression, simulated annealing) to improve performance. Our work on this effort can be viewed on this website.
Current BRFSS projects include:
BRFSS modeling for the State of Idaho: in 2019, we developed an iterative proportional fitting (IPF) modeling approach to downscale health district level data (obesity, overweight, diabetes) to a county level for the state of Idaho. See our Fall 2021 presentation to the state of Idaho. We also have a manuscript in review at Population, Space, and Place.
BRFSS COVID-19 health disparities microsimulation: Expanding on our previous work, this project uses spatial microsimulation/iterative proportional fitting to estimate a wide range of health parameters that are associated with COVID-19. Results of this project can be found on this website.
BRFSS Modeling and COVID-19 Indicators Modeling: Our BRFSS COVID indicators project, which starts in 2022, expands upon our microsimulation of BRFSS data to construct COVID-19 indicators which can then be used to model and predict COVID-19 fatality rates, deaths, and cases.
Idaho Tobacco Modeling: Our team has a multi-year project, funded by the state of Idaho, to examine tobacco usage using 2021 BRFSS data.
Materials and other Documents
BRFSS Microsimulation Approach
To support our spatial microsimulation methodology, the state of Idaho’s Department of Health and Welfare (IDHW) partnered with the University of Idaho’s Institute for Modeling, Collaboration, and Innovation (IMCI - https://imci.uidaho.edu) to develop techniques for modeling county-level estimates. The basis for this data analysis was survey information taken from the Behavioral Risk Factor Surveillance System (BRFSS), an annual survey initiated by the Centers for Disease Control and Prevention (CDC) and conducted in all 50 states, the District of Columbia (DC), and U.S. territories. Established in 1984, BRFSS is a telephone survey instrument that attempts to document health risk behaviors and health care risk and access across the entire United States. Individual states administer the BRFSS, which allows some level of customization while maintaining survey consistency for across-state comparison and aggregate analysis (Centers for Disease Control and Prevention (CDC), 2022). BRFSS results are used extensively at the state and federal levels, primarily for developing and planning interventions, education and training, evaluating public health impacts, and policy development (Figgs et al., 2000). On average, approximately 400,000 individuals nationwide complete the survey annually.
Idaho’s rural communities are typically sparsely populated, mountainous regions that are considerably distant from urban centers. Of Idaho’s forty-four counties, approximately nine (9) have populations below 5,000, and only three (3) have populations above 100,000.
A key aspect of BRFSS is its representative sampling strategy. Given limited resources and available funding, most states (including Idaho) construct a sampling methodology that conforms to regionalized health districts. While this approach has value for generalized health policy implementation, it fails to provide data at the spatial scale required to assess critical health needs and plan effective strategic interventions. For the state of Idaho, survey sampling is based on seven (7) health districts.
Our modeling approach constructs a modified small area estimation (SAE) technique. SAE is a methodology for estimating parameters for small sub-populations that are typically part of a larger whole (Ballas et al., 2007; Whitworth et al., 2017). SAE follows traditional model prediction and imputation structures: a model is constructed using available survey information, and its coefficients are applied to a micro area of interest. Often utilized in situations where the sample size of a micro area may be too small to construct accurate estimates, SAE can be used, in combination with associated "constraining" data (e.g. age, sex, race, education), to estimate the proportional sample size for the sub-population region. While numerous SAE approaches exist (Bishop, 1975; Ghosh & Rao, 1994; Hsia et al., 2020), there are generally two main groupings of SAE: 1) regression-based statistical estimation, and 2) spatial microsimulation (SM). While both techniques result in a set of small area central point estimates (with associated boundary ranges), they differ in how the estimates are determined and in the precision of those estimates.
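As a rough illustration of the general SAE idea of fitting a model on survey data and applying it to a small area, the sketch below uses the simplest possible "model": survey prevalence rates by demographic group, applied to a county's population counts (often called a synthetic estimator). This is not the approach described on this page, and the function name, group labels, and all numbers are hypothetical.

```python
from collections import defaultdict

def synthetic_estimate(survey, county_pop):
    """Apply group-level survey rates to a county's population structure.

    survey: list of (group, outcome, weight) tuples, outcome is 0 or 1
    county_pop: dict mapping group -> county population count
    Assumes every group in county_pop also appears in the survey.
    """
    weighted_cases = defaultdict(float)
    weighted_total = defaultdict(float)
    for group, outcome, weight in survey:
        weighted_cases[group] += weight * outcome
        weighted_total[group] += weight
    # Survey-based prevalence rate within each demographic group
    rates = {g: weighted_cases[g] / weighted_total[g] for g in weighted_total}
    # Population-weighted average of group rates for the county
    total_pop = sum(county_pop.values())
    return sum(rates[g] * n for g, n in county_pop.items()) / total_pop

# Hypothetical district-level survey records: (age group, has condition, weight)
survey = [
    ("18-44", 0, 1.0), ("18-44", 1, 1.0), ("18-44", 0, 1.0), ("18-44", 0, 1.0),
    ("45+", 1, 1.0), ("45+", 1, 1.0), ("45+", 0, 1.0), ("45+", 0, 1.0),
]
# Hypothetical county population counts by age group
county_pop = {"18-44": 1000, "45+": 3000}

estimate = synthetic_estimate(survey, county_pop)
print(f"Estimated county prevalence: {estimate:.4f}")
```

Because the county in this made-up example is older than the survey sample, its estimate is pulled toward the higher rate of the older group. A regression-based SAE would replace the group rates with fitted model coefficients.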
Two primary sets of data are used for our spatial microsimulation/IPF model construction. 1) 2019 American Community Survey (ACS) data, at a county level (U.S. Census Bureau, 2020), are used for our constraining variables. To obtain population counts cross-tabulated across the set of ACS variable constraints (education, age, sex, and race), we use cross-tabulated data from the National Center for Health Statistics (National Center for Health Statistics, 2020). 2) Idaho BRFSS survey responses for 2019. BRFSS survey data consist of individually reported health conditions, health perceptions, and behavior responses, with geographic identification based on Idaho public health districts. Using these two base datasets, we established a transformed data structure for use in our IPF modeling methodology.
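To make the IPF step more concrete, here is a minimal sketch of how survey respondents from a health district might be reweighted so that their weighted totals match a county's constraint totals. It is an illustration under simplifying assumptions, not the project's implementation: it fits one-way marginal totals for two variables rather than cross-tabulated constraints, and the function name, variable names, and all counts are invented.

```python
def ipf_weights(respondents, constraints, max_iter=100, tol=1e-8):
    """Iterative proportional fitting of respondent weights.

    respondents: list of dicts, e.g. {"age": "18-44", "sex": "F", "obese": 1}
    constraints: dict mapping variable -> {category: county population count}
    Assumes every target is positive, every category has at least one
    respondent, and all constraint variables sum to the same total.
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in constraints.items():
            # Current weighted total in each category of this variable
            totals = {cat: 0.0 for cat in targets}
            for person, w in zip(respondents, weights):
                totals[person[var]] += w
            # Rescale weights so each category total matches its target
            for i, person in enumerate(respondents):
                factor = targets[person[var]] / totals[person[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # adjustments are negligible: converged
            break
    return weights

# Hypothetical survey respondents from one health district
respondents = [
    {"age": "18-44", "sex": "F", "obese": 0},
    {"age": "18-44", "sex": "M", "obese": 1},
    {"age": "45+", "sex": "F", "obese": 1},
    {"age": "45+", "sex": "M", "obese": 0},
    {"age": "45+", "sex": "M", "obese": 1},
]
# Hypothetical county constraint totals (ACS-style counts)
constraints = {
    "age": {"18-44": 400, "45+": 600},
    "sex": {"F": 450, "M": 550},
}

weights = ipf_weights(respondents, constraints)
# Weighted share of respondents reporting the condition = county estimate
prevalence = sum(w * p["obese"] for w, p in zip(weights, respondents)) / sum(weights)
print(f"Estimated county obesity prevalence: {prevalence:.3f}")
```

Running the same reweighting once per county, each time with that county's own constraint totals, would give a set of county-level estimates from district-level survey records.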
- Ballas, D., Clarke, G., Dorling, D., & Rossiter, D. (2007). Using SimBritain to model the geographical impact of national government policies. Geographical Analysis, 39(1), 44–77. https://doi.org/10.1111/j.1538-4632.2006.00695.x
- Bishop, Y. M. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge.
- Figgs, L. W., Bloom, Y., Dugbatey, K., Stanwyck, C. A., Nelson, D. E., & Brownson, R. C. (2000). Uses of Behavioral Risk Factor Surveillance System data, 1993-1997. American Journal of Public Health, 90(5), 774–776. https://doi.org/10.2105/AJPH.90.5.774
- Ghosh, M., & Rao, J. N. K. (1994). Small Area Estimation: An Appraisal. Statistical Science, 9(1), 55–76.
- Hsia, J., Zhao, G., Town, M., Ren, J., Okoro, C. A., Pierannunzi, C., & Garvin, W. (2020). Comparisons of Estimates From the Behavioral Risk Factor Surveillance System and Other National Health Surveys, 2011 to 2016. American Journal of Preventive Medicine, 58(6), e181–e190. https://doi.org/10.1016/j.amepre.2020.01.025
- National Center for Health Statistics. (2020). Vintage 2020 postcensal estimates of the resident population of the United States (April 1, 2010, July 1, 2010-July 1, 2020).
- U.S. Census Bureau. (2020). American Community Survey, 5 Year Estimates. https://www.census.gov/
- Whitworth, A., Carter, E., Ballas, D., & Moon, G. (2017). Estimating uncertainty in spatial microsimulation approaches to small area estimation: A new approach to solving an old problem. Computers, Environment and Urban Systems, 63, 50–57. https://doi.org/10.1016/j.compenvurbsys.2016.06.004