OpenIntro Statistics
Behavioral Survey (cdc)

The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 people in the United States conducted by the Centers for Disease Control and Prevention (CDC). As its name implies, the BRFSS is designed to identify risk factors in the adult population and report emerging health trends. For example, respondents are asked about their diet and weekly physical activity, their HIV/AIDS status, possible tobacco use, and even their level of healthcare coverage. The BRFSS website contains a complete description of the survey, the questions that were asked, and research results that have been derived from the data.

This data set is a random sample of 20,000 people from the BRFSS survey conducted in 2000. While the full survey has over 200 questions or variables, this data set includes only 9 of them.

Data Source: US CDC website.

  • genhlth: A categorical vector indicating general health, with categories excellent, very good, good, fair, and poor.
  • exerany: A categorical vector, 1 if the respondent exercised in the past month and 0 otherwise.
  • hlthplan: A categorical vector, 1 if the respondent has some form of health coverage and 0 otherwise.
  • smoke100: A categorical vector, 1 if the respondent has smoked at least 100 cigarettes in their entire life and 0 otherwise.
  • height: A numerical vector, respondent's height in inches.
  • weight: A numerical vector, respondent's weight in pounds.
  • wtdesire: A numerical vector, respondent's desired weight in pounds.
  • age: A numerical vector, respondent's age in years.
  • gender: A categorical vector, respondent's gender.
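
The variable descriptions above translate directly into a small loading routine. The following is a minimal sketch in Python using only the standard library, not an official loader for this data set. It assumes the CSV header uses the nine variable names exactly as listed; the function name read_cdc, the two sample rows, and the m/f coding of gender in those rows are hypothetical and serve only as illustration.

```python
import csv
import io

# Column names grouped by the types given in the variable list.
CATEGORICAL = ["genhlth", "exerany", "hlthplan", "smoke100", "gender"]
NUMERICAL = ["height", "weight", "wtdesire", "age"]


def read_cdc(handle):
    """Parse CSV rows into dicts, converting numerical columns to float."""
    reader = csv.DictReader(handle)
    missing = set(CATEGORICAL + NUMERICAL) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for raw in reader:
        # Categorical values stay as strings; numerical values become floats.
        row = {name: raw[name] for name in CATEGORICAL}
        row.update({name: float(raw[name]) for name in NUMERICAL})
        rows.append(row)
    return rows


# Two made-up rows for illustration only; they are NOT taken from the survey.
SAMPLE = """genhlth,exerany,hlthplan,smoke100,height,weight,wtdesire,age,gender
good,1,1,0,68,160,150,40,m
fair,0,1,1,64,130,130,29,f
"""

rows = read_cdc(io.StringIO(SAMPLE))

# Pounds each respondent would like to lose (positive) or gain (negative).
wt_diff = [r["weight"] - r["wtdesire"] for r in rows]
print(wt_diff)
```

To read a downloaded copy instead of the inline sample, pass an open file handle, for example open("cdc.csv", newline=""), where cdc.csv is an assumed local file name.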