Welcome to Helping Science
This is a website for processing herbarium specimen sheets using citizen science.
We are currently conducting a closed beta test of our services before any official release.
If you are a herbarium interested in using our services to process specimen sheets, or a volunteer looking to help test and improve our services, please send an email to firstname.lastname@example.org and we will send you more information on how to participate.
How Helping Science Works:
Citizen scientists sign in to the website to perform three different tasks. The first task is to identify the location of all labels and determinations on a given specimen sheet, the second is to identify the words that make up the fields of a label, and the third is to type in the text value of each field image.
Once the labels on a specimen sheet have been identified by outlining their borders with a mouse, a label image is created and sent to Evernote (http://www.evernote.com) for optical character recognition (OCR). Evernote returns the position of every word, all the permutations of each word, and whether the label is handwritten or typed. We use this information to make educated guesses and to expedite the field tagging process. We rely primarily on human input for accuracy and use the OCR information only as a secondary source.
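As an illustration of how OCR word positions might speed up field tagging, the sketch below suggests a starting value for a region that a user has dragged out on a label. It is only a sketch under assumptions: the `OcrWord` structure, its attribute names, and the `suggest_text` helper are hypothetical and do not reflect Evernote's actual response format or the site's real code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrWord:
    """One word reported by OCR (hypothetical structure, not Evernote's format)."""
    candidates: List[str]  # alternative readings of the word, best guess first
    x: int                 # left edge of the word, in pixels
    y: int                 # top edge of the word, in pixels
    w: int                 # width in pixels
    h: int                 # height in pixels


def words_in_region(words, left, top, right, bottom):
    """Return the OCR words whose center falls inside the dragged rectangle."""
    hits = []
    for word in words:
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        if left <= cx <= right and top <= cy <= bottom:
            hits.append(word)
    # Simple reading order: top to bottom, then left to right.
    # Assumes words on the same line share the same top coordinate.
    return sorted(hits, key=lambda wd: (wd.y, wd.x))


def suggest_text(words, left, top, right, bottom):
    """Join the best reading of each word in the region into a suggested value."""
    region = words_in_region(words, left, top, right, bottom)
    return " ".join(wd.candidates[0] for wd in region if wd.candidates)


ocr_words = [
    OcrWord(["alba", "aiba"], x=60, y=10, w=30, h=10),
    OcrWord(["Quercus"], x=10, y=10, w=45, h=10),
    OcrWord(["1987"], x=10, y=100, w=30, h=10),
]
suggestion = suggest_text(ocr_words, 0, 0, 120, 30)
```

A suggestion like this would only pre-fill the form; the human-typed value remains the primary source.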
Each part of the specimen label, whether it is the scientific name, date, country, etc., is mapped to its associated Darwin Core (DwC) field. These tags are assigned by a human using a simple click-and-drag interface. Once this is completed for a label, each marked field is cropped into its own image so that the fields can be processed in parallel.
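The step of turning tagged fields into individual images can be pictured with a small sketch. It assumes a label image held as a plain grid of pixel values and tagged fields stored as DwC term names mapped to rectangles; the function names and the data layout are illustrative assumptions, not the site's actual implementation.

```python
def crop(image, left, top, right, bottom):
    """Crop a row-major pixel grid; right and bottom are exclusive."""
    return [row[left:right] for row in image[top:bottom]]


def split_label(label_image, tagged_fields):
    """Return one cropped image per tagged DwC field.

    tagged_fields maps a DwC term to the (left, top, right, bottom)
    rectangle a user dragged over the label (assumed layout).
    """
    return {term: crop(label_image, *box) for term, box in tagged_fields.items()}


# A tiny 4-row by 6-column stand-in for a label image.
label_image = [[r * 10 + c for c in range(6)] for r in range(4)]
tagged_fields = {
    "scientificName": (0, 0, 4, 1),
    "country": (2, 2, 6, 4),
}
field_images = split_label(label_image, tagged_fields)
```

Because each entry in `field_images` is independent of the others, the field images can be handed to different users at the same time.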
Each tagged field is examined by three or more distinct users (citizen scientists), who each type in what they think the field image says. Typing the words is the most time-consuming task, so we are trying different game-style interfaces to see which type of game gives us the best response. No user will see the same field image twice. Each field image is circulated until enough people type in the same value, which gives a measure of accuracy. When the predetermined level of accuracy has been reached, the value for the field is accepted. Once all the field values are verified for a given label, a Darwin Core record is created.
All processed data runs through a series of taxonomic and geographic validations. Any issues are reported to the collection manager for review. All data will be available in a variety of formats, including DwC.
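The acceptance rule for typed field values described above could be sketched as follows. The thresholds (`min_users=3`, `min_agreement=2`), the whitespace normalization, and the function names are assumptions chosen for illustration; the actual "predetermined level of accuracy" is not specified here.

```python
from collections import Counter


def normalize(text):
    """Collapse runs of whitespace so trivially different entries match."""
    return " ".join(text.split())


def accepted_value(transcriptions, min_users=3, min_agreement=2):
    """Return the agreed value, or None if the image should keep circulating.

    transcriptions maps a user id to the text that user typed, so each
    user contributes at most one entry per field image.
    """
    if len(transcriptions) < min_users:
        return None
    counts = Counter(normalize(text) for text in transcriptions.values())
    value, votes = counts.most_common(1)[0]
    return value if votes >= min_agreement else None


def build_record(field_transcriptions):
    """Build a DwC-style record once every field has an accepted value."""
    record = {}
    for term, transcriptions in field_transcriptions.items():
        value = accepted_value(transcriptions)
        if value is None:
            return None  # at least one field still needs more input
        record[term] = value
    return record


label_fields = {
    "scientificName": {"u1": "Quercus alba", "u2": "Quercus  alba ", "u3": "Quercus aiba"},
    "country": {"u1": "Canada", "u2": "Canada", "u3": "Canada"},
}
record = build_record(label_fields)
```

Under these assumed thresholds, a field typed by only two users, or one where all three users disagree, returns `None` and would go back into circulation.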
Last Updated: April 7th, 2010
- Stage 1 - Identifying & Marking Label Locations on Specimen Sheets
- Stage 2 - Identifying Darwin Core (DwC) Fields on a Label
Presented at TDWG 2009: http://www.silverbiology.com/blog/2009/11/17/tdwg-2009-summary/
Revised Project Overview (4/15/2010) (PowerPoint ~1MB)
PowerPoint Overview: http://www.tdwg.org/fileadmin/2009conference/slides/Giddens_SilverBiology.ppt (PowerPoint)
This website is owned and operated by: SilverBiology www.silverbiology.com
* More information on the SilverArchive engine can be found at: http://www.silverbiology.com/products/silverarchive/