DLF 2011 - Microservices to support cataloging/metadata work

From CURATEcamp
Revision as of 19:47, 2 November 2011 by Jennrileylib (talk | contribs)

Microservices are an idea in heavy use at the California Digital Library. Each one is standalone and independent of any particular repository technology. Examples:

  • "identity" service - assign an ID before the object actually gets created in a repository
  • "storage" service - all it knows about is versions of hierarchical objects; repository services build on top of it
  • search
  • annotation
  • inventory (where files are)
  • fixity
  • replication
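
To make the "small job" idea concrete, here is a minimal sketch of what an "identity" and a "fixity" service could each boil down to. The function names, the use of UUIDs, and the choice of SHA-256 are all assumptions made for illustration; the notes do not say how the California Digital Library implements either service.

```python
import hashlib
import uuid


def mint_identifier():
    """Identity: hand out an ID before any object exists in a repository.

    A random UUID is an assumption of this sketch, not a described scheme.
    """
    return "urn:uuid:" + str(uuid.uuid4())


def compute_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, expected_digest, algorithm="sha256"):
    """Fixity: True if the file still matches the digest recorded earlier."""
    return compute_digest(path, algorithm) == expected_digest.lower()
```

Neither function knows anything about a repository, which is the point: an ID can be minted first, and a digest recorded at ingest can be rechecked later by any tool that can read the file.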

The microservices idea is to break work down into jobs that are as small as is reasonable.

What could we pull out of a hard drive? Dates, longitude/latitude, owner's name.

Ideas for metadata microservices:

  • navigate hierarchy
  • transform
  • normalize (a specific type of data, e.g., dates, or values to a controlled vocabulary)
  • autocomplete
  • resolve ID to preferred label
  • get identifier from a text label
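
Three of the ideas above (normalize, resolve ID to preferred label, get identifier from a text label) are small enough to sketch directly. Everything here is hypothetical: the accepted date formats, the `ex:` identifiers, and the labels are made up for illustration, and a real service would sit in front of an actual vocabulary rather than an in-memory dictionary.

```python
from datetime import datetime

# Input formats this sketch tries, in order. The list is an assumption;
# "%m/%d/%Y" in particular assumes US month-first order. Month names are
# parsed in the current locale (English by default).
DATE_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y", "%m/%d/%Y")


def normalize_date(text):
    """Return an ISO 8601 date (YYYY-MM-DD), or None if no format matches."""
    cleaned = " ".join(text.split())
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


# Hypothetical vocabulary: identifiers and labels are invented.
VOCABULARY = {
    "ex:0001": {"preferred": "Example heading", "variants": ["Sample heading"]},
    "ex:0002": {"preferred": "Another heading", "variants": []},
}


def resolve_label(identifier):
    """Resolve an ID to its preferred label, or None if the ID is unknown."""
    entry = VOCABULARY.get(identifier)
    return entry["preferred"] if entry else None


def find_identifier(label):
    """Get the ID for a preferred or variant label, ignoring case."""
    wanted = label.strip().casefold()
    for identifier, entry in VOCABULARY.items():
        labels = [entry["preferred"], *entry["variants"]]
        if any(wanted == candidate.casefold() for candidate in labels):
            return identifier
    return None
```

Each function does one job and returns `None` rather than guessing, so a caller can chain them (label to ID, then ID to preferred label) or use any one alone.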

Is there a role for microservices in catalog maintenance, e.g., changing records that use a heading that's been updated? Maybe in the future we ditch the idea of "authorized forms" of headings entirely; then we wouldn't need tools to manage the change. But we would still need to make sure we aren't displaying outdated words to our users. If we do keep authorized forms, keep a complete history of them and of when they change.
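
One way to picture "a complete history of authorized forms and when they change" is a dated list per heading that can answer both "what was the form on this date?" and "is this form outdated?". The sketch below is only an assumption about how such a service might look; the class name, its methods, and any headings fed to it are hypothetical.

```python
from bisect import bisect_right


class HeadingHistory:
    """Dated history of the authorized form of one heading."""

    def __init__(self):
        self._dates = []  # effective dates, kept sorted
        self._forms = []  # form that took effect on the matching date

    def record_change(self, effective, form):
        """Record that `form` became the authorized form on `effective`."""
        index = bisect_right(self._dates, effective)
        self._dates.insert(index, effective)
        self._forms.insert(index, form)

    def form_on(self, when):
        """Return the form in force on `when`, or None if none existed yet."""
        index = bisect_right(self._dates, when)
        return self._forms[index - 1] if index else None

    def current(self):
        """Return the most recent form, or None for an empty history."""
        return self._forms[-1] if self._forms else None

    def is_outdated(self, form):
        """True if `form` was authorized once but is not the current form."""
        return form in self._forms and form != self.current()
```

A display layer could call `is_outdated` before showing a stored heading, and a maintenance job could call `current` to find the replacement, without either needing to know how the history is stored.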

It's unlikely that one tool can pull data from different places together. Instead we will need lots of tools to be used in different environments, and should build a community to share them as much as possible. Avoid the "alignment problem."

With very small microservices, sharing tools might be easier because the tasks are generalized to the greatest degree.

Would an online community of scripts/tools/code help? PCC has one, and the OAI-PMH community has shared tools pages. This could also help cataloger/coder communication - catalogers say "I have this problem," coders say "I have this code."

Is FRBRize a microservice, or is it too big? It's too big. Break it down into services like xISSN, xISBN. But some (e.g., @edsu) might call it a microservice. And it should be possible to have a (macro)service that does FRBRization, if you know what type of resource the input is.
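
As a rough illustration of a (macro)service built from smaller ones, the sketch below clusters records by a crude author/title key. It is not how xISBN, xISSN, or any real FRBRization tool works; the record fields (`id`, `author`, `title`) and the matching rule are assumptions chosen to keep the example short.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_text(value):
    """Small service: lowercase, strip accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    words = re.sub(r"[^\w\s]", " ", without_marks.lower())
    return " ".join(words.split())


def work_key(record):
    """Small service: build a crude author/title key for one record."""
    return (normalize_text(record["author"]), normalize_text(record["title"]))


def group_by_work(records):
    """Macroservice: cluster record IDs that share the same work key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[work_key(record)].append(record["id"])
    return dict(clusters)
```

The two small functions are reusable on their own (for deduplication or search, say), while `group_by_work` is the part that only makes sense once the type of resource, and so the right key, is known.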

Is running locally a requirement of something being a "microservice"? No. Microservices are just an approach to architecture, with no assumption about where they run.

The CODE4LIB mailing list is a good resource for asking these sorts of questions.