CURATEcamp DLF 2012 Discussion Ideas
From CURATEcamp
Revision as of 21:20, 3 November 2012 by Robin Dean (talk | contribs) (→Long-term Preservation of Complex Objects)
Contents
- 1 Agenda
- 1.1 Contribution and Ingest: Lowering Barriers
- 1.2 Fighting the "One Tool to Rule Them All" Mindset
- 1.3 Funding Repositories and Showing Value
- 1.4 Long-term Preservation of Complex Objects
- 1.5 User Experience
- 1.6 Getting Started with Repository Services
- 1.7 Selection for Digital Preservation
- 1.8 Demos
- 1.9 Wrap-up Session
- 2 Topics
- 3 Timeline
Agenda
Contribution and Ingest: Lowering Barriers
- just-in-case vs. just-in-time metadata
- what's "good enough": department, creator, hit send?
- how can we get digital stuff with the minimum amount of effort?
- set up a pre-ingest staging area for review to make sure content is "repository worthy"
- build in metadata and structure requirements as part of data creation - "data counseling" for reuse
- your digital ingest backlog already exists, it's just distributed across your institution
- should it be the content creators' job to deposit? mediated deposit yields better results
- set minimum compliance requirements for researchers
- this is a "knowledge translation" problem--it won't be easy
- tempting to scale back to absolute minimums, but that's not a good long-term solution for reuse and discovery. Need a better balance.
- Email and Dropbox submission: accounts/authentication are already built in, why not use them?
- Do mediated deposit and self deposit need to be mutually exclusive? How about "mediate on demand"?
- Levels of Service: process and team to spec out how long ingest will take. Data prioritized by content type/project.
- Help clean up existing content, and apply "lessons learned" to make templates for future data/metadata creation
- Summary: long process with a lot of complexity - make small, incremental steps. It's OK to do what you can for now. Doing what you can is better than over-promising.
- DataStaR from Cornell - staging area for repositories http://datastar.mannlib.cornell.edu/
- recap of C4L 2011 discussion: why does ingest suck?
- promise of permanence sets up a barrier (forever is intimidating)
- perception that ingest makes objects less discoverable
- rights - need to be cleared before ingest?
- metadata - requires too much? need a way to ingest easily and augment later
- curation happens outside repository, preservation happens inside
- more content in staging area than the "actual" repository
- content creators don't have time to ingest
- lessons learned
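One way to picture the "pre-ingest staging area" and "mediate on demand" ideas above is a small triage step. This is only a sketch: the required fields come from the "good enough" bullet (department, creator), while the function names and status strings are hypothetical and not part of any tool mentioned in the session.

```python
# Hypothetical minimum fields, taken from the "good enough" bullet above.
REQUIRED_FIELDS = ("creator", "department")


def missing_fields(record):
    """Return the required fields that are absent or blank in a deposit record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def triage(record):
    """Route a staged deposit: ready for ingest, or sent to a curator for mediation."""
    missing = missing_fields(record)
    if missing:
        # Self-deposit falls back to mediated deposit only when needed.
        return ("needs_mediation", missing)
    return ("ready_for_ingest", [])
```

A deposit with both fields filled in passes straight through; anything else stays in staging with a list of what is missing, so metadata can be augmented later rather than blocking the submission.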
Fighting the "One Tool to Rule Them All" Mindset
- need to understand what each tool does well and its limitations
- the only way to cope with the weaknesses in each tool is multiple access/use layers. interoperability is our job security
- what is the usefulness of the comparison matrix? really have to install and use the tools to evaluate them, but sometimes that isn't possible
- more useful to think about a "framework" than "tools" -- but it's hard to do anything without programmers. Is outsourcing/contracting really feasible?
- compare to the times when we only had an OPAC and that was good enough--then came ERMs, discovery layers
- managing any preservation repository takes resources, so it's difficult to have more than one. Have one repository with multiple access/view layers.
- Managing/displaying many different types of descriptive metadata--do you need multiple tools to do this?
- The discovery layer is going to be Google! (or other search engines) - use RDF/schema.org and focus on how to expose metadata as broadly and usefully as possible
- cultural heritage interests are so specific--Google can actually work pretty well with basic information
- How to do SEO? are the typical methods of improving relevancy rankings applicable to library content/metadata? the approach that's worked for us
- Search engines are "dumb" - they don't tolerate AJAX, JavaScript, etc. Sitemaps can help.
- schema.org extension project for books in the works
- Getting things into Google Scholar - DSpace has tools that help
- ETDs - need a national solution (repository, harvesting, metadata) for ETDs that doesn't involve proprietary licenses
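The schema.org and sitemap suggestions above could look roughly like the following. This is a minimal sketch under stated assumptions: the record keys (`title`, `creator`, `url`), the choice of JSON-LD as the serialization, the `CreativeWork` type, and the example URLs are illustrative, not something the session agreed on.

```python
import json
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def to_jsonld(record):
    """Map a simple repository record to a schema.org CreativeWork description."""
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": record["title"],
        "author": {"@type": "Person", "name": record["creator"]},
        "url": record["url"],
    }


def build_sitemap(urls):
    """Return sitemap XML with one <url><loc> entry per item page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for item_url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = item_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical record; the JSON output would be embedded in the item page.
record = {
    "title": "Field notebook, 1912",
    "creator": "A. Researcher",
    "url": "https://repository.example.edu/items/1",
}
print(json.dumps(to_jsonld(record), indent=2))
print(build_sitemap([record["url"]]))
```

Both outputs are static text, so crawlers that do not execute JavaScript can still read them.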
Funding Repositories and Showing Value
- Know how to measure goals/outcomes (what do you want to measure and how) before you get started
- Base your outcomes on your institutional values. Why are you preserving/curating these objects?
- IT and Libraries need to be better at taking credit for the services they provide. You need to tell people how the work you do helps them. Why should people care? Tell stories, not numbers.
- Communication channels: both inside and outside the library, materials that go out to alums
- Infrastructure is a lot harder to build compelling stories about
- Tools don't have quantitative statistics built in, and librarians don't often have the training/experience to do qualitative research
- Capture quotes from constituents (satisfied customers)
- Still need to manage all the "ordinary" stuff -- have baseline metrics for everything.
- What was the original business case for preserving these things? Business case is not "this will turn a profit" but why are you doing this and what benefits will it bring?
- Digital preservation is more expensive than keeping stuff on a shelf, so you have to make a stronger case. What is the decision-making process for keeping digital stuff? Being digital is not enough.
- Have content creators tell you the significance/importance of their content and set priorities. This should be a mediated process.
- Access is easier to sell--how do you sell the importance of long-term preservation without making up worst-case scenarios?
- Ingest is half the cost of the overall digital preservation lifecycle--make sure what you ingest is worth it. The "collection development" aspect of physical collections isn't present with digital content in the same way.
- Risk assessment - how do you measure the risks that you mitigate?
Long-term Preservation of Complex Objects
- Rights management complicates preservation of linked objects
- Memento protocol for crawling/harvesting web sites
- How/when do you preserve fluid objects? We are used to preserving fixed objects.
- Make project/discipline-based decisions on when data should be captured (raw vs. processed data)
- Samples of complex objects
- Hyperlinks from journal articles to external data; complexities of not having all parts of the objects under your control
- GIS data: how much documentation/metadata do you need?
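For the Memento bullet above: Memento (RFC 7089) lets a client ask for the archived version of a resource closest to a given time by sending an `Accept-Datetime` request header. The sketch below only builds that header; the function name is hypothetical, and which TimeGate or archive the request would be sent to is left open.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_headers(when):
    """Build the Accept-Datetime header for Memento datetime negotiation.

    `when` should be a timezone-aware datetime; it is converted to UTC and
    formatted as an HTTP date ending in "GMT".
    """
    utc_time = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc_time, usegmt=True)}


# Ask for the state of a page as of this wiki revision's timestamp (taken as UTC).
headers = memento_headers(datetime(2012, 11, 3, 21, 20, tzinfo=timezone.utc))
print(headers)
```

The header dictionary can be passed to whatever HTTP client is in use when requesting the original URL through a TimeGate.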
User Experience
- UI/UX development and reuse (how to do this, formal roles, community development) - usage of curation tools by users (vs. curators) - 16
Getting Started with Repository Services
- bootstrapping repository services (getting started with minimal resources) curation & preservation in the wild (sans repo) - 15
Selection for Digital Preservation
- Has the digital realm affected our idea of what digital preservation means? selection (e.g. of content types) for digital preservation - are we saving too much? who decides? - 15
Demos
- Gather round for demos at 3:30
- data model from UCSD (17)
Wrap-up Session
- Wrap-up session: community-building: future of CURATEcamp, sustainability - 19
Topics
- linked data (7)
- digital curation
- records management
- metadata & authority control (10)
- long-term preservation of complex objects (16)
- data model from UCSD (17)
- bootstrapping repository services (getting started with minimal resources) curation & preservation in the wild (sans repo) - 15
- development trends
- standards
- data management tools & processes
- Cylinders of Excellence: living with multiple systems (interoperability, one system to rule them all?) combatting "one tool" philosophy (three tools: DAMS for simple items, repo for authorial/ETD workflow, GIS data somethingsomethin'), how not to shoehorn everything (platform/layers vs. monolithic) - especially issues with multiple workflows - 18
- expanding the value of library infrastructure/tools (business use, scholarship) - 12
- Contribution/ingest - 20
- Abstraction layer for repositories especially from early and/or bespoke systems - 2
- community development (e.g. Hydra project on top of not Fedora) - 15
- METS development - 1
- UI/UX development and reuse (how to do this, formal roles, community development) - usage of curation tools by users (vs. curators) - 16
- Has the digital realm affected our idea of what digital preservation means? selection (e.g. of content types) for digital preservation - are we saving too much? who decides? - 15
- Now that the bits are preserved, how do we preserve behavior/experience - 7
- multi-institutional repositories (UC, CIC, etc.)
- Wrap-up session: community-building: future of CURATEcamp, sustainability - 19
- PREMIS for preservation metadata (user feedback, requests) and changes coming in PREMIS 3 - 4
- persistent identifiers, e.g., ARKs - 8
- Gather round for demos at 3:30 - 25ish
- service models for ingest: internal repos vs external or subject repos - 8
- project is done, now what? - proving value of investment - ROI; also funding models for repository/curation services (grants, etc.) - 18
- e-book preservation - 4
Timeline
- 09:00-09:40 Introductions
- 09:40-10:00 Break
- 10:00-10:45 Voting/Ranking
- 10:45-11:15 Session 1: Ingest Barriers
- 11:30-12:00 Session 2: One Tool
- 12:00-12:30 Session 3: ROI
- 12:30-2:00 Lunch
- 2:00-2:30 Session 4:
- 2:30-3:00 Session 5:
- 3:00-3:30 Break
- 3:30-4:00 Session 6:
- 4:00-4:30 Demos
- 4:30-5:00 Wrap-up/Future of CURATEcamp