Crowdsourcing

Revision as of 08:12, 26 May 2012

Questions About Crowd-sourced Metadata

  1. Who's using user-added metadata?
    • Successes and Failures.
    • Importing (Ingesting) or Exporting.
  2. What types of metadata fields are good candidates for crowd-sourced metadata?
  3. What effective incentives can be provided for metadata entry?
  4. Mechanical Turk.
    • Run by Amazon.
    • Output looks computer-generated but is actually human-created.
  5. What possible legal issues might there be with crowd-sourced metadata?
  6. What quality control or authority control systems can be implemented?
    • What reputation systems might be employed to handle quality / authority issues? (A rough sketch follows this list.)
  7. What methods of integration could there be with non-user-generated metadata?
  8. Controlled Vocabularies.
    • There must be some consideration of domain-specific vocabularies.
    • One size does not fit all.
  9. Awareness - Users must be aware of crowd-sourcing features.
    • Marketing.
    • Advertising.
    • Public Relations.
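
A minimal sketch in Python of the reputation idea from question 6: each user carries a trust score, and a user-submitted tag is promoted to the public record only once the combined trust of its contributors crosses a threshold. The class, scores, and threshold here are illustrative assumptions, not a description of any existing system.

  from collections import defaultdict

  # Illustrative promotion threshold; a real system would tune this empirically.
  PROMOTION_THRESHOLD = 2.0

  class TagModeration:
      """Toy reputation system: a tag goes live once enough trusted users agree."""

      def __init__(self):
          self.pending = defaultdict(set)             # tag -> contributing user ids
          self.approved = set()                       # tags promoted to the public record
          self.reputation = defaultdict(lambda: 0.5)  # user id -> trust score

      def submit(self, user, tag):
          self.pending[tag].add(user)
          # Promote the tag when the summed trust of its contributors crosses
          # the threshold (one trusted expert, or several unknown users, suffices).
          if sum(self.reputation[u] for u in self.pending[tag]) >= PROMOTION_THRESHOLD:
              self.approved.add(tag)

      def endorse(self, user, delta=0.25):
          # Raise a user's trust when curators confirm their earlier tags.
          self.reputation[user] = min(2.0, self.reputation[user] + delta)

  mod = TagModeration()
  mod.endorse("curator_1", delta=1.5)   # domain expert, high trust
  mod.submit("curator_1", "civil-war")  # promoted immediately
  mod.submit("visitor_9", "menus")      # stays pending until more users agree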

Examples of Systems Using Crowd-sourced Metadata

  1. Wikipedia (open-source)
  2. Amazon (corporate)
  3. GT ETDs "keywords"
  4. E-page "keywords via online forms - no schema"
  5. UT OPAC - Tags to any entry
  6. GT VuFind / Primo Central
  7. Emory ETDs "keywords" also selecting categories, soon digitized books.
  8. New York Public Library - menus project (this wiki won't take YouTube links because it thinks they're spam, but check out Barbara Taranto's Crowd Sourcing Metadata link on this page: http://www.cni.org/events/membership-meetings/past-meetings/fall-2011/)

Blue Sky

  1. Controlled Vocabularies (via something like LCSH)
  2. Search Options by Labeling Authority
    • Public (wide open)
    • Authority (domain experts)
    • Paid sources via harvest (see the sketch after this list)
      • Amazon
      • LibraryThing
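
The "paid sources via harvest" option might look something like the sketch below: instead of generating metadata locally, pull it from a service that already maintains it. The endpoint URL and JSON response shape are hypothetical placeholders; the real Amazon and LibraryThing APIs differ and require licensing and API keys.

  import json
  from urllib.request import urlopen

  # Hypothetical harvest endpoint; a stand-in for a real licensed service
  # such as LibraryThing, which has its own schema and requires an API key.
  HARVEST_URL = "https://api.example.org/tags?isbn={isbn}"

  def harvest_tags(isbn):
      """Fetch externally maintained tags for a record instead of creating them locally."""
      with urlopen(HARVEST_URL.format(isbn=isbn)) as resp:
          payload = json.load(resp)
      # Assumed response shape: {"tags": [{"label": "...", "count": 12}, ...]}
      return [t["label"] for t in payload.get("tags", [])]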

Significantly Relevant Concepts

  1. Critical Mass
  2. Key:Value Pairs vs. Tags/Labeling (see the sketch below)
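
The distinction in item 2 is structural, as this sketch illustrates: tags are a flat set of labels, while key:value pairs bind each label to a named field, which is what makes faceted search and validation possible. The sample values are invented, loosely echoing the NYPL menus project.

  # Flat tags: cheap to collect, but the system can't tell what each label describes.
  tags = {"1863", "new-york", "dinner", "oysters"}

  # Key:value pairs: the same information, but every value is bound to a field,
  # so an interface can facet on "date" or "place" specifically.
  fields = {
      "date": "1863",
      "place": "new-york",
      "meal": "dinner",
      "dish": "oysters",
  }

  # Faceting only works on the structured form:
  records = [fields]
  from_new_york = [r for r in records if r.get("place") == "new-york"]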

Conclusions

  1. Leave tagging wide open to users
  2. The amount of data being gathered is growing at an accelerating rate, so creating and maintaining metadata values by hand will eventually become impossible to keep up with.
  3. Paid services (or unpaid services?) that harvest metadata from where it already exists, rather than generating it locally, may be necessary (Amazon album art, LibraryThing tags, etc.).
  4. Without controlled vocabularies, faceting becomes significantly less effective. For this reason, controlled vocabularies may be a requirement of future systems.
  5. It may become necessary to make researched decisions concerning controlled vocabularies before implementing systems.
    • Supersets or subsets of vocabularies may be needed to extend within or beyond the original domain.
    • Can one size fit all with minimal accuracy and minimal effort while domain-specific vocabularies are maintained?
  6. It seems that there will be a significant need for an "Analytics tool for combined labeling systems" to expose usage for evaluation of labeling / metadata effectiveness. (A rough sketch follows.)
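
Conclusions 4-6 point toward a pipeline along these lines: keep tagging wide open, normalize incoming tags against a controlled vocabulary so faceting still works, and count how often each controlled term is actually applied so labeling effectiveness can be evaluated later. This is a minimal sketch; the vocabulary fragment and synonym table are invented for illustration.

  from collections import Counter

  # Invented fragment of a controlled vocabulary (LCSH-like), plus a synonym
  # table mapping free-form user tags onto preferred terms.
  VOCABULARY = {"Cooking", "Menus", "Restaurants"}
  SYNONYMS = {
      "food": "Cooking",
      "recipes": "Cooking",
      "menu": "Menus",
      "bill of fare": "Menus",
      "diners": "Restaurants",
  }

  usage = Counter()  # the "analytics" side: which controlled terms get used, how often

  def normalize(tag):
      """Map a raw user tag onto a controlled term, or None if it can't be mapped."""
      key = tag.strip().lower()
      term = tag.strip().title()
      if term not in VOCABULARY:
          term = SYNONYMS.get(key)
      if term is None:
          return None  # left as an uncontrolled tag: still searchable, not facetable
      usage[term] += 1
      return term

  for raw in ["menu", "food", "oysters", "bill of fare"]:
      print(raw, "->", normalize(raw))
  print(usage.most_common())  # exposes which controlled terms users actually reach for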