As part of our commitment to keeping data open, accessible, and available, the University of Minnesota Libraries is hosting a DataRescue Event in the Twin Cities on February 24 and 25. Join us in our efforts to capture and archive federally produced datasets to ensure they remain available.
Data that exist only on government websites have been unavailable during government shutdowns, and historical data can be lost if archival practices change over time.
The University of Minnesota Libraries is committed to making data open, accessible, and available over time. As part of a national effort to find and preserve federal data, the University of Minnesota Libraries and the Liberal Arts Technologies and Innovation Services are hosting a DataRescue event.
Friday, February 24 from 1:00 to 6:00 p.m.
Saturday, February 25 from 10:00 a.m. to 6:00 p.m.
Humphrey School of Public Affairs, 50B
We need researchers, tech-savvy coders, archivists, librarians, and passionate community members to help capture and archive vulnerable, federally produced datasets.
If you have any questions about anything related to DataRescue-Twin Cities, please email us at firstname.lastname@example.org.
If you would like to join us, choose one of the paths below according to your skills and interests. Each path will have guides/leaders to help you with your work, and people will be on hand to direct you to the right place when you arrive.
- Seeding & Sorting Path (to feed the End of Term Archive): This is the widest path and requires a variety of skill levels. Consider this path if you are a coder or hacker, have front-end web experience, or just have great attention to detail.
- Researchers/Harvesters: These two paths work very closely together and could be done by one person with the necessary skills, so we’ve listed them together. Researchers inspect the "uncrawlable" list to confirm that seeders' assessments were correct (that is, that the URL/dataset is indeed uncrawlable). Harvesters then work out how to capture each dataset based on the Researchers' recommendations. Consider this path if you’re a hacker or have exceptional technical skills. You could also consider joining the IoTFuse or Social Data Science groups.
- Checkers (includes a few people from the Baggers path): Consider this path if you have experience working with scientific data (particularly climate or environmental data) or with creating metadata. Trained librarians and scientists will be very helpful on this path.
- Baggers (for datasets that can’t be captured as WARC files): Consider this path if you’ve used WGET or another web scraping technology or are confident you’re a quick study. Command line experience of all sorts is useful.
- Toolbuilders (addressing larger issues and creating better ways to streamline this process): Consider this path if you’re a hacker. If you have these exceptional technical skills, consider joining the IoTFuse or Social Data Science meetups mentioned above.
We are also looking for volunteers to serve as guides/leaders for some of these paths/tasks. Guides/leaders need to attend a brief training on Friday (2/24) morning from 10:00 a.m. to noon, and then commit to attending a four-hour shift of the event that Friday afternoon or Saturday. If this sounds like something you’d be interested in, please sign up on our volunteer spreadsheet!
If you are interested in helping out, but can't make the training, can't attend for that long, or don't wish to be a guide, feel free to show up when you can.
Frequently Asked Questions
|Q: How much time should I plan to be at the event?|
|A: This is a drop-in event, so you can spend as little or as much time as you like.|
|Q: If I’m coming from off-campus, where is the best place to park?|
|A: A circular drive in front of Humphrey is available for drop off and unloading. The 19th Avenue Ramp is just across the street from the Humphrey Center. The 21st Avenue Ramp is just south of the adjacent Carlson School of Management. Additional parking facilities can be viewed at University Parking and Transportation Services. Also consider taking public transit.|
|Q: If I am not particularly tech-savvy, is there still a job for me?|
|A: Yes. The Seeding & Sorting Path is the best path in DataRescue for those who don’t have high-level technical and coding skills.|
|Q: What do you mean by “vulnerable” data?|
|A: We are referring to federally produced data that are hosted on government websites. We are particularly interested in federal data that researchers are using in their work. We consider these data vulnerable because their availability can change based on government funding, administrative policy, and how useful the government perceives the data to be. In addition, data that exist only on a government website have been unavailable during government shutdowns, and historical data can easily be lost if archival practices change over time.|
|Q: Are there other DataRescue efforts going on that weekend in the Twin Cities?|
|A: Yes. In conjunction with the efforts at the University, IoTFuse and Social Data Science, two external groups interested in data science and in improving the access and usability of data, will use their advanced technical and coding skills to help harvest "uncrawlable" federal data, to create an application that addresses federal data updated after it has been harvested, and to work on other issues requiring advanced technical knowledge. Those interested in participating at this skill level can get involved at their St. Paul meetup location on Saturday, February 25, starting at 9:00 a.m. by going to the IoTFuse meetup page or the Social Data Science meetup page.|
|Q: Will there be computers or laptops available for use?|
|A: We have a few laptops that can be used, but we encourage you to bring your own device if possible.|