Yale Environment Review

Yale Environment Review (YER) is a student-run review that provides weekly updates on environmental research findings.

Citizen Science: Using big data to track birds

Original Paper:
Kelling, S., et al. "Taking a 'Big Data' approach to data quality in a citizen science project." Ambio 44 (Suppl. 4) (2015): S601-S611. DOI: http://dx.doi.org/10.1007/s13280-015-0710-4

Monday, December 12, 2016

Anyone with a smartphone can contribute to science, including data that help us understand changes in animal behavior. But how good are the data they collect, and are they usable?

Where do the birds go? We must understand the movements and habitat needs of bird species in order to protect and conserve them. Doing so requires massive logistical efforts to take measurements over large areas and long timescales, which is almost always impractical for individual researchers with limited resources. However, with the advent of smartphones, ordinary people can work together and contribute to our understanding in what are called "citizen science" projects.
 
One citizen science project, organized by Cornell University, is called eBird. It has collected 260 million bird observations from 250,000 users to better understand species movements across the world. A paper led by Steve Kelling at Cornell, and published in the journal Ambio, shares best practices for improving data quality in citizen science projects. The research team reported on three aspects of the eBird data analysis project: 1) the quality of the submitted data, 2) how to identify differences in user skill, and 3) how to fill in the gaps using models. By doing so, they developed a method to better estimate how many birds will be at any place at any given time.
 
The eBird database uses a combination of computer and human review before accepting any bird observations from individual users. The user data are compared to nearby submissions for similarity and plausibility. For instance, it would be suspect if a user claimed to see an emperor penguin in Connecticut. If a submission is deemed unusual by the computer, it is flagged for expert review. This happens to about 5 percent of entries per year. All submitted data are kept, but records that do not pass expert review are excluded from analysis. There is a feedback loop between humans and computers before data are integrated into the database. That is, more observations give better points of comparison for new submissions, which leads to higher quality measurements and better bird movement predictions.
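
To make the automated check concrete, here is a minimal sketch of how a submission might be compared with nearby reports. It is an illustration only, not eBird's actual filter: the function name flag_for_review, the min_share threshold, and the sample data are all assumptions made for this example.

```python
from collections import Counter

def flag_for_review(species, nearby_species, min_share=0.01):
    """Return True when a species is rare among nearby submissions.

    Hypothetical rule for illustration: flag the record if the species
    makes up less than min_share of the reports from the same area.
    """
    counts = Counter(nearby_species)
    total = sum(counts.values())
    if total == 0:
        return True  # nothing to compare against, so ask an expert
    return counts[species] / total < min_share

# Made-up reports from the same area
nearby = ["American Robin"] * 60 + ["Blue Jay"] * 39 + ["Northern Cardinal"]

print(flag_for_review("Blue Jay", nearby))         # common nearby
print(flag_for_review("Emperor Penguin", nearby))  # never reported nearby
```

As more reports accumulate in nearby, the comparison becomes better informed, which is the feedback loop described above.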
 
Users of eBird were evaluated based on their birding skill, measured as how many bird species they identified per hour. This number acts as a calibration to determine which people are better birders and therefore produce more reliable data. Some users can spot 60 or 80 birds in the same period of time it takes others to identify fewer than 20; those rapid identifiers also proved better at identifying the difficult species. The researchers plan to use this calibration to give more weight to better birders.
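
The skill measure can be sketched in a few lines. The species-per-hour rate follows the description above; the weighting scheme, which scales each user against the most productive observer, is an assumption made here for illustration, since the article says only that the researchers plan to weight better birders more heavily. The function names and sample rates are likewise hypothetical.

```python
def species_per_hour(species_seen, hours):
    """Number of distinct species identified per hour of observation."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return len(set(species_seen)) / hours

def skill_weights(rates):
    """Scale each user's rate against the best observer (assumed scheme)."""
    top = max(rates.values())
    if top == 0:
        return {user: 0.0 for user in rates}
    return {user: rate / top for user, rate in rates.items()}

# Made-up rates for two users
rates = {"expert": 80.0, "novice": 20.0}
weights = skill_weights(rates)
print(weights)
```

Under this assumed scheme, an observation from the novice would count a quarter as much as one from the expert.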
 
Finally, even though citizen science projects provide much more widespread information than individual researchers could alone, they still can't measure everything. As such, there will always be gaps in the database. For example, there are more observations near cities where people live and fewer observations in winter. To fill these data gaps, the researchers built a model to simulate species distribution. This model combined satellite data with user observations. If a bird was found in a certain land type, the researchers' model increased the likelihood of that species being present in similar areas, even where there was no observation. The resulting species model can be used to ask and test questions about which bird species prefer which environments. This greatly aids conservation efforts by efficiently focusing limited resources.
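
The gap-filling idea can be illustrated with a deliberately simple stand-in for the researchers' model, which is considerably more sophisticated. The sketch below assumes that each checklist is tagged with a land type taken from satellite data, and it estimates how often a species is reported in each land type. That rate is then applied to map cells nobody has visited. All names and numbers are hypothetical.

```python
from collections import defaultdict

def detection_rate_by_land_type(checklists):
    """checklists: iterable of (land_type, detected) pairs for one species."""
    seen = defaultdict(int)
    total = defaultdict(int)
    for land_type, detected in checklists:
        total[land_type] += 1
        seen[land_type] += int(detected)
    return {lt: seen[lt] / total[lt] for lt in total}

def predict(cells, rates):
    """Assign each unvisited cell the rate of its land type, if known."""
    return {cell: rates.get(land_type) for cell, land_type in cells.items()}

# Made-up checklists for one shorebird species
checklists = [
    ("wetland", True), ("wetland", True), ("wetland", False), ("wetland", True),
    ("urban", False), ("urban", True),
]
rates = detection_rate_by_land_type(checklists)

# Cells with satellite land cover but no observations
cells = {"cell_17": "wetland", "cell_42": "desert"}
print(predict(cells, rates))
```

A cell in a land type with no checklists at all, such as the desert cell here, gets no estimate rather than a guess.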
 
The Nature Conservancy successfully applied the species model during bird migrations. They used the information it generated to identify California fields with the best habitat for migratory shorebirds. Using this knowledge of where the birds would likely land, the Nature Conservancy paid farmers to flood their fields during the birds' migration. This resulted in the lease of almost 40 square miles of high-quality habitat without the need to purchase the land.
 
The ease of submitting to eBird allows anyone to provide information that helps everyone understand bird movements and gain ecological insights about how they may change as climate change continues to accelerate. Citizen science has large implications for researchers, who could never have recorded by themselves all the bird observations that were entered into the model. The impact of citizen science will continue to grow in the future as scientists and the general population collaborate to answer questions never thought possible in the age of the doomed passenger pigeon.