Datasets for Epidemiologic Research

| Short Name (Acronym) | Full Name | Sponsor | Total N | Dates Available | Sample Selection | Data Restrictions | Data Summary |
|---|---|---|---|---|---|---|---|
| ACS | American Community Survey | US Census Bureau | Varies | 2005 - 2014 | US Survey Sample | Publicly Available | Ongoing survey that provides vital information about the US population on a yearly basis |
| All of Us Research Program | All of Us Research Program | NIH | >500,000 | 2017 - Present | US diverse cohort | Publicly available with registration | Ongoing cohort with extensive survey, EHR, and biologic data |
| BRFSS | Behavioral Risk Factor Surveillance System | Centers for Disease Control and Prevention (CDC) | Varies (~70,000 observations/year) | 1984 - Present | US landline random sample | Free and publicly available | Collects prevalence data among adult U.S. residents regarding their risk behaviors and preventive health practices |
| CDC WONDER | Wide-ranging Online Data for Epidemiologic Research (WONDER) | CDC | Varies | Varies | US - Population based | Free and publicly available | Various public health data, on topics ranging from AIDS and births to cancer |
| CMF | Compressed Mortality File (CMF) | CDC | Varies | 1968 - 2014 | US - Population based | Free and publicly available | County-level national mortality and population database spanning the years 1968-2014 |
| DHS | Demographic and Health Surveys | U.S. Agency for International Development (USAID) | Varies by country | 1984 - Present | Survey samples for various countries | Free and publicly available | Nationally representative samples from multiple countries (usually between 5,000 and 30,000 households) |
| HRS | Health and Retirement Study | NIH National Institute on Aging | ~20,000 | 1998 - Present | US longitudinal survey sample (adults 50+) | Publicly available (must register) | Examines temporal changes in work and health among older adults |
| NHAMCS | National Hospital Ambulatory Medical Care Survey | CDC | Varies (large) | 1992 - 2022 | US population based | Publicly available | NHAMCS collected data about medical services provided in hospital emergency and outpatient departments. NHAMCS data can answer questions about hospital-based medical care for patients in these settings (ambulatory medical care) |
| HINTS | Health Information National Trends Survey | NCI-NIH | ~5,000 participants per year | 2003 - Present | US population based | Publicly available | The Health Information National Trends Survey (HINTS) routinely collects nationally representative data about the American public's use of cancer-related information. The survey provides updates on changing patterns, needs, and information opportunities in health; identifies changing communications trends and practices; assesses cancer information access and usage; provides information about how cancer risks are perceived; and enables researchers to test new theories in health communication |
| SEER | Surveillance, Epidemiology, and End Results Program | NCI-NIH | 100,000+ cases per year | 1969 - Present | Sample of 21+ US cancer registries | Publicly available | The Surveillance, Epidemiology, and End Results (SEER) Program provides information on cancer statistics in an effort to reduce the cancer burden among the U.S. population. SEER is supported by the Surveillance Research Program (SRP) in NCI's Division of Cancer Control and Population Sciences (DCCPS) |
| UK HealthCare Data | UK Center for Clinical and Translational Science | University of Kentucky | Varies | 2004 - Present (UK HealthCare) | Patients within the UK HealthCare system | UK faculty and students | |
| HCUP Databases | Healthcare Cost and Utilization Project | US Department of Health & Human Services | Varies (~7 million hospital stays/year) | 1997 - 2013 | Pre-2012: sample of US hospitals from which all discharges were retained; post-2012: sample of discharge records from all HCUP-participating hospitals | 1. Free and publicly available; 2. Purchasable and more specific data | Can identify, track, and analyze national trends in health care utilization, access, charges, quality, and outcomes |
| NHGIS | National Historical Geographic Information System | University of Minnesota | Varies | 1790 - Present | US Census and other surveys | Free and publicly available | Free online access to summary statistics and GIS boundary files for U.S. censuses and other nationwide surveys from 1790 through the present |
| NHANES | National Health and Nutrition Examination Survey | CDC | Varies | 1959 - Present | US Survey Sample | 1. Free and publicly available; 2. Purchasable and more specific data | Program of studies designed to assess the health and nutritional status of adults and children in the United States |
| NEISS | National Electronic Injury Surveillance System | US Consumer Product Safety Commission | Varies (obtained from >100 US hospitals) | 1979 - Present | National probability sample | Free and publicly available | Information extracted from medical charts, including patient demographics (i.e., age, sex, and race) and injury information |

| Cohort Name | Total N | Dates of Enrollment | Follow-up Through (Years) | Race (%) | Sex (%) | Age at Enrollment | Region/Area | Sample Selection |
|---|---|---|---|---|---|---|---|---|
| Atherosclerosis Risk in Communities Study (ARIC) | 15,792 | 1987 – 1989 | 2014 (27) | Black (27.3), White (72.6) | Male (44.8) | 45 – 69 | 1. Suburban Minneapolis, MN; 2. Jackson, MS; 3. Forsyth County, NC; 4. Washington County, MD | Random |
| Rochester Epidemiology Project (REP) | >493,606 | 1966 - present | Present (49) | Black (4.8), White (85.7) | Male (48.9) | All ages | Southeastern MN (based in Rochester, Olmsted County, MN) | Population-based |
| Health, Aging, and Body Composition (Health ABC) Study | 3,075 | March 1997 – July 1998 | | Black, White | | 70 – 79 | 1. Memphis, TN; 2. Pittsburgh, PA | Random |
| The Cardiovascular Health Study (CHS) | ~5,888 | 1989 – 1999 | 1999 (10) | Black, White | | 65+ | 1. Allegheny County, PA; 2. Forsyth County, NC; 3. Sacramento County, CA; 4. Washington County, MD | Random |
| Jackson Heart Study (JHS) | 5,301 | 2000 - present | Present (15) | Black participants only | | 35 – 84 | Jackson, MS | 17% Random; 22% Volunteer; 31% ARIC; 30% Secondary Family Members, MS |
| Framingham Heart (Offspring) Study | 5,124 | 1971 | 2014 (43) | | Male (48.5) | All ages (0 – 70) | Framingham, MA | Population-based |
| REasons for Geographic and Racial Differences in Stroke (REGARDS) cohort | 30,239 | 2003 - 2007 | Present (12) | White (59), Black (41) | Female (55), Male (45) | 45+ | US population sample | Population-based |
| NIH-AARP | 567,000 | 1995 – Present | Present | Black (4) | Male (50) | 62+ | US sample | Population-based |