Crowdsourcing

From CURATEcamp
Revision as of 07:09, 26 May 2012

Questions About Crowd-sourced Metadata

  1. Who's using user-added metadata?
    • Successes and Failures.
    • Importing (Ingesting) or Exporting.
  2. What types of metadata fields are good candidates for crowd-sourced metadata?
  3. What effective incentives can be provided for metadata entry?
  4. Mechanical Turk.
    • Run by Amazon.
    • Output looks computer-generated but is created by humans.
  5. What possible legal issues might there be with crowd-sourced metadata?
  6. What quality control or authority control systems can be implemented?
    • What reputation systems might be employed to handle quality / authority issues? (See the sketch after this list.)
  7. What methods of integration could there be with non-user-generated metadata?
  8. Controlled Vocabularies.
    • There must be some consideration of domain-specific vocabularies.
    • One size does not fit all.
  9. Awareness - Users must be aware of crowd-sourcing features.
    • Marketing.
    • Advertising.
    • Public Relations.
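
One way to make question 6 concrete is a reputation-weighted acceptance rule: a suggested tag is only promoted once enough cumulative contributor reputation backs it. The sketch below is an illustration only; the User and Record shapes, the 2.0 threshold, and the 0.1 reward are assumptions, not a description of any existing system.

# A minimal sketch of one possible reputation system for crowd-sourced tags.
# The record/user shapes, threshold, and reward values are illustrative assumptions.
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    reputation: float = 1.0  # grows when a user's suggestions reach acceptance

@dataclass
class Record:
    identifier: str
    tag_support: dict = field(default_factory=lambda: defaultdict(float))  # tag -> weighted support
    accepted_tags: set = field(default_factory=set)

ACCEPT_THRESHOLD = 2.0  # assumed value; tuning this is exactly the quality-control question

def suggest_tag(record, user, tag):
    """Register a tag suggestion, weighted by the suggesting user's reputation."""
    record.tag_support[tag] += user.reputation
    if tag not in record.accepted_tags and record.tag_support[tag] >= ACCEPT_THRESHOLD:
        record.accepted_tags.add(tag)
        user.reputation += 0.1  # reward the user whose suggestion tipped the tag over

# Usage: two ordinary users together push a tag over the threshold.
etd = Record("etd:1234")
alice, bob = User("alice"), User("bob")
suggest_tag(etd, alice, "machine learning")
suggest_tag(etd, bob, "machine learning")
print(etd.accepted_tags)  # {'machine learning'}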

Examples of Systems Using Crowd-sourced Metadata

  1. Wikipedia (open-source)
  2. Amazon (corporate)
  3. GT ETDs "keywords"
  4. E-page "keywords via online forms - no schema"
  5. UT OPAC - Tags to any entry
  6. GT VuFind / Primo Central
  7. Emory ETDs "keywords"

Blue Sky

  1. Controlled Vocabularies (via something like LCSH)
  2. Search Options by Labeling Authority (see the sketch after this list)
    • Public (wide open)
    • Authority (domain experts)
    • Paid sources via harvest
      • Amazon
      • LibraryThing
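
One way to read "Search Options by Labeling Authority" is that every label carries a source class (public, domain expert, or harvested/paid) and a search can be scoped to whichever sources the searcher trusts. The sketch below assumes that reading; the Label and Item shapes and the source names are illustrative only.

# A minimal sketch: labels carry a source class, and search is scoped to trusted sources.
from dataclasses import dataclass

PUBLIC, AUTHORITY, PAID = "public", "authority", "paid"

@dataclass(frozen=True)
class Label:
    value: str
    source: str  # PUBLIC (wide open), AUTHORITY (domain experts), or PAID (harvested)

@dataclass
class Item:
    identifier: str
    labels: list

def search(items, query, allowed_sources):
    """Return items with a matching label whose source the searcher accepts."""
    query = query.lower()
    return [item for item in items
            if any(query in lbl.value.lower() and lbl.source in allowed_sources
                   for lbl in item.labels)]

catalog = [
    Item("opac:1", [Label("jazz", PUBLIC), Label("Music -- History", AUTHORITY)]),
    Item("opac:2", [Label("jazz", PAID)]),
]
print(len(search(catalog, "jazz", {PUBLIC, AUTHORITY})))        # 1: the paid-only record drops out
print(len(search(catalog, "jazz", {PUBLIC, AUTHORITY, PAID})))  # 2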

Significantly Relevant Concepts

  1. Critical Mass
  2. Key:Value Pairs vs. Tags/Labeling (contrasted in the sketch below)
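
A minimal contrast, assuming nothing beyond plain Python containers: flat tags lose track of which field a value belongs to, while key:value pairs keep it, and that distinction is what later faceting and vocabulary control hang on. The field names are illustrative assumptions.

# Flat tags: cheap to collect, but "1969" could be a date, a call number, or part of a title.
tags = {"jazz", "vinyl", "1969", "blue note"}

# Key:value pairs: the same user input, with each value anchored to a field.
pairs = {
    "genre": ["jazz"],
    "format": ["vinyl"],
    "date": ["1969"],
    "label": ["blue note"],
}

# With pairs, a facet for one field is a direct lookup; with flat tags it is guesswork.
print(pairs.get("date", []))  # ['1969']
print("1969" in tags)         # True, but true of what is not recorded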

Conclusions

  1. Leave tagging wide open to users
  2. The amount of data being gathered is growing at an increasing rate, so keeping up with metadata creation by hand will eventually become insurmountable.
  3. Paid (or possibly unpaid) services that harvest metadata from where it already exists, rather than generating it locally, may be necessary (Amazon album art, LibraryThing tags, etc.).
  4. Without controlled vocabularies, faceting becomes significantly less effective. For this reason, controlled vocabularies may be a requirement of future systems (see the sketch after this list).
  5. It may become necessary to make researched decisions concerning controlled vocabularies before implementing systems.
    • There may be a need for supersets or subsets of vocabularies that extend within or beyond the original domain.
    • Can one size fit all with minimal accuracy and minimal effort while domain-specific vocabularies are maintained?
  6. It seems that there will be a significant need for an "Analytics tool for combined labeling systems" to expose usage for evaluation of labeling / metadata effectiveness.
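
To make conclusions 4 through 6 concrete, the sketch below normalizes free-text user tags against a small controlled vocabulary before faceting, and counts both the mappings and the misses so the labeling can later be evaluated. The vocabulary, the synonym table, and the sample tags are illustrative assumptions, not LCSH data or any particular system's behavior.

# A minimal sketch: normalize user tags to a controlled vocabulary, count facet
# hits, and keep unmapped tags as raw material for the analytics tool in conclusion 6.
from collections import Counter

VOCABULARY = {"Jazz", "Sound recordings", "Dissertations, Academic"}  # stand-in for a slice of LCSH

SYNONYMS = {  # raw user tag -> preferred term (assumed mappings)
    "jazz music": "Jazz",
    "jazz": "Jazz",
    "lp": "Sound recordings",
    "vinyl": "Sound recordings",
    "etd": "Dissertations, Academic",
}

facet_counts = Counter()     # drives faceted browse
unmapped_counts = Counter()  # candidate terms the vocabulary is missing

def normalize(raw_tag):
    """Map a raw user tag to a controlled term, or record it as unmapped."""
    term = SYNONYMS.get(raw_tag.strip().lower())
    if term in VOCABULARY:
        facet_counts[term] += 1
        return term
    unmapped_counts[raw_tag] += 1
    return None

for raw in ["Jazz", "vinyl", "LP", "bebop", "etd"]:
    normalize(raw)

print(facet_counts.most_common())     # which controlled terms user tags actually reach
print(unmapped_counts.most_common())  # where the vocabulary falls short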

New York Public Library was mentioned - this wiki won't take YouTube links (it thinks they're spam) but check out Barbara Taranto's Crowd Sourcing Metadata link on this page: http://www.cni.org/events/membership-meetings/past-meetings/fall-2011/