There are lots of ways to participate at Data Rescue Pittsburgh, based on your interests and expertise!
Activities: Seeders canvass the resources of a given government agency and identify important URLs. They determine whether those URLs can be crawled by the Internet Archive’s web crawler. Using the EDGI Nomination Chrome extension, Seeders nominate crawlable URLs to the Internet Archive; URLs that require manual archiving are added to the Archivers app instead.
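The nominate-or-add decision described above can be pictured as a small triage function. This is only a rough sketch: the two rules it applies (non-HTTP schemes such as FTP, and query-driven pages that may front a database) are assumptions chosen for illustration, not the event's actual criteria, and `triage_url` is a hypothetical name.

```python
from urllib.parse import urlparse


def triage_url(url: str) -> str:
    """Return "nominate" for likely crawlable pages, "manual" otherwise.

    The rules below are illustrative assumptions, not official guidance.
    """
    parsed = urlparse(url)
    # Assumption: anything that is not plain HTTP(S), such as an FTP
    # server, needs manual archiving.
    if parsed.scheme not in ("http", "https"):
        return "manual"
    # Assumption: a query string suggests a search form or database
    # interface that a crawler cannot fully explore.
    if parsed.query:
        return "manual"
    return "nominate"


for candidate in (
    "https://example.gov/about",
    "ftp://example.gov/data/",
    "https://example.gov/search?station=42",
):
    print(candidate, "->", triage_url(candidate))
```

A person still makes the final call; a helper like this would at most pre-sort a long list of URLs before review.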
Activities: Researchers review “uncrawlables” identified during Seeding, confirm that the URL or dataset is indeed uncrawlable, and investigate how the dataset could best be harvested. Researchers need a good understanding of harvesting goals and some familiarity with datasets.
Activities: Harvesters take the “uncrawlable” data and work out how to actually capture it, based on the recommendations of the Researchers. This is a complex task that can require substantial technical expertise, and different tasks call for different techniques.
Activities: Baggers perform quality assurance on the dataset to make sure the content is correct and corresponds to what was described in the spreadsheet. They then package the data into a BagIt file (or “bag”), which includes basic technical metadata, and upload it to the final DataRefuge destination.
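To give a feel for what a bag contains, here is a minimal sketch that builds a BagIt-style directory using only the Python standard library. It assumes a plain directory layout (a `data/` payload folder, a `bagit.txt` declaration, and a SHA-256 manifest); the function name `make_minimal_bag` is hypothetical, and the actual event tooling may differ and record more metadata.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir: str, bag_dir: str) -> Path:
    """Copy source_dir into bag_dir/data and write basic BagIt tag files."""
    source = Path(source_dir)
    bag = Path(bag_dir)
    payload = bag / "data"
    # copytree creates bag_dir as needed and fails if bag_dir/data exists.
    shutil.copytree(source, payload)

    # Bag declaration: version and tag-file encoding.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" line per file.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}")
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    return bag
```

The checksums in the manifest are what let anyone downstream verify that the files were not altered or corrupted after bagging.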
Activities: Describers create a descriptive record in the DataRefuge CKAN repository for each bag. Then they link the record to the bag and make the record public.
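Describers would normally do this through the repository's web interface, but the same record could in principle be created through CKAN's action API. The sketch below only builds the request and does not send it. The site URL, API key, field values, and the `build_dataset_request` helper are all hypothetical, and a given CKAN instance may require extra fields such as an owning organization.

```python
import json
import urllib.request


def build_dataset_request(ckan_url, api_key, name, title, notes, bag_url):
    """Build (but do not send) a CKAN package_create request for one bag."""
    payload = {
        "name": name,        # URL-friendly identifier for the record
        "title": title,      # human-readable title
        "notes": notes,      # free-text description
        "private": False,    # the record is meant to be public
        # Link the record to the bag; the "format" value is an assumption.
        "resources": [
            {"url": bag_url, "name": "BagIt package", "format": "ZIP"}
        ],
    }
    return urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_dataset_request(
    "https://ckan.example.org",
    "YOUR-API-KEY",
    "example-dataset",
    "Example Dataset",
    "Short description of what the bag contains.",
    "https://storage.example.org/bags/example-dataset.zip",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it; that step is left out so the sketch has no side effects.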
Activities: You will record stories about the importance of climate and environmental data to our everyday lives, share this work on social media, and document the event.
Recommended Skills: Consider this path if you’re on social media (Facebook, Instagram, Twitter, whatever), if you can use Storify, if you have good listening and writing skills, and/or if you can make creative and engaging materials.
This event requires individuals who are interested in participating in the meta-narrative of Data Rescue events. Below are some sub-paths.
Activities: Surveyors identify key programs, datasets, and documents on federal agency websites that are vulnerable to change and loss. Using templates and how-to guides, they create Main Agency Primers, which introduce a particular agency, and Sub-Agency Primers, which guide web archiving efforts by laying out a list of URLs that cover the breadth of an office.
Activities: Help groups at this event document their workflows, or improve the DataRefuge documentation to make it clearer and easier to use.