Possible Sources of Project Data
The following sources may be useful for locating data for your project.
- LTER Data Portal: A comprehensive search engine for data from all NSF Long-Term Ecological Research sites.
- NTL-LTER: A variety of primarily aquatics-related datasets from the North Temperate Lakes Long-Term Ecological Research site in Northern Wisconsin.
- Hubbard Brook: 50+ years of continuous stream temperature and chemistry data for Hubbard Brook.
- DRYAD: An international repository of data underlying scientific and medical publications.
- figshare: A repository for research output [Use search and then set “type” to “Dataset.”]
- dataverse: A Harvard archive to share research data from many fields.
- Data is Plural Archive: A structured archive of data stories mentioned in the Data is Plural e-newsletter.
- Christmas Bird Count: A large database of annual counts of many bird species at many locations for many years.
- Journal of Fish and Wildlife Management: Most (all?) articles published in this journal include the raw data as supplemental information that appears to be open-source. If you find data here that sounds interesting to you but you cannot access it, let me know.
- Journal of Scientific Data: A searchable database of scientific data.
- GEMStat Water Quality Data: A browsable database of worldwide water quality data.
- Isle Royale Wolf-Moose Data: A spreadsheet of population data for the long-term wolf-moose study on Isle Royale.
- Forest Fires Data: Data about many forest fires in Portugal.
- ORNL DAAC: A wide variety of “Earth data” at the Oak Ridge National Laboratory Distributed Active Archive Center.
- U.S. Census data: Data from the U.S. Census Bureau.
- CDC Wonder Data: Public health data available through the Centers for Disease Control and Prevention.
- areavibes.com: Find a wealth of information about each city that you select (you will have to drill down a bit to get the actual data).
- GasBuddy.com: Search for current gas prices for any U.S. city.
- USGS Water Data: A large database of water data from throughout the United States.
- Environmental Performance Index: A database of metrics used to derive the Environmental Performance Index for nations around the world.
- Bridge: An “ocean” of free marine education resources, including links to various NOAA databases.
- Internet Crossroads in Social Science Data: An annotated list with links to over 825 data-related resources on the internet.
- HealthData.gov: A large compilation of health-related data sets.
- NOAA Climate Data: A large compilation of climate-related data.
- Air Data: Air quality data collected by the EPA.
- UNICEF: A large compilation of data related to women and children around the world.
- Living Planet Index: A database for measuring the state of the world’s biological diversity.
- Walmart sales data: A database of information about a sample of Walmart stores and their sales. [I can help you merge the various datasets available here.]
- fivethirtyeight data: A list of data behind articles on fivethirtyeight.com.
- Buzzfeed data: A list of data behind articles on buzzfeed.com.
- socrata.com: A compendium of open-source clean data files.
- kaggle.com: A list of datasets used in Kaggle data analysis competitions.
- datahub.com: Lots of open-source datasets (you will likely need to search to find something of interest).
- sports-reference.com: A group of sites providing basic statistics and resources for sports fans.
- MLB.com: Lots of data on major league baseball.
- Forest Service Data Catalog: A list of data files from the USDA Forest Service.
- Open Units: Open dataset containing units of alcohol in branded drinks in a variety of standard servings.
- GDP by County: Gross domestic product data by U.S. county distributed by the Bureau of Economic Analysis (BEA).
- UMESC Fisheries: A database of fisheries information provided by the Upper Midwest Environmental Sciences Center of the U.S. Geological Survey. Most of the data relate to the Mississippi River.
- Incarceration Trends Dataset: County-level jail data (1970-) and prison data (1983-).
- Monitoring the Future (MTF) Public-Use Datasets: A continuing study of the lifestyles of American youth since 1975.
- Professional Disc Golf Association Standards Data: Data sets for approved disc golf discs and disc golf targets.
- Bird Egg Shape: Data regarding the shape of bird eggs.
- National Phenology Network: Phenology data from citizen scientists around the U.S.
- Snap Chat Ads: Snapchat has released detailed data about every political ad purchased on its platform in 2018 and 2019. For each ad, the information includes its targeting parameters (age, gender, location, interests, internet service provider, device operating system, and more), the dates it ran, the amount spent, the number of impressions, and a link to the ad itself.
- Palm Traits Database: Species-level functional trait database for palms worldwide.
- New York City Squirrels: Data from the census of gray squirrels in New York City.
- Uniform Crime Reporting Statistics.
- U.S. Bureau of Labor Statistics.
- College Tuition Comparer.
- National Center for Education Statistics, Common Core of Data.
- United Nations Data.
- MacroTrends Stock Market Data.
- Chemical Weapons Attacks in Syria.