Crowdsourcing
From CURATEcamp
Questions About Crowd-sourced Metadata
- Who's using user-added metadata?
  - Successes and failures.
  - Importing (ingesting) or exporting.
- What types of metadata fields are good candidates for crowd-sourced metadata?
- What effective incentives can be provided for metadata entry?
  - Mechanical Turk.
    - Used by Amazon.
    - Output looks computer-generated but is human-created.
- What possible legal issues might there be with crowd-sourced metadata?
- What quality control or authority control systems can be implemented?
- What reputation systems might be employed to handle quality / authority issues? (See the sketch after this list.)
- What methods of integration could there be with non-user-generated metadata?
  - Controlled vocabularies.
    - There must be some consideration of domain-specific vocabularies.
    - One size does not fit all.
- Awareness - users must be aware of crowd-sourcing features.
  - Marketing.
  - Advertising.
  - Public relations.
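
A minimal sketch of one way a reputation system might gate crowd-sourced tags, touching on the quality and reputation questions above. Every name and threshold here is hypothetical, not taken from any system discussed at the session: contributors accrue trust, and a tag is published once enough combined reputation supports it.

```python
from collections import defaultdict

ACCEPT_THRESHOLD = 2.0  # hypothetical: combined reputation needed to publish a tag

class TagModeration:
    def __init__(self):
        self.reputation = defaultdict(lambda: 1.0)  # contributor -> trust score
        self.support = defaultdict(set)             # (item, tag) -> contributors

    def propose(self, contributor, item, tag):
        """Record a contributor's vote for a tag on an item."""
        self.support[(item, tag)].add(contributor)

    def is_accepted(self, item, tag):
        """A tag is published once enough combined reputation backs it."""
        total = sum(self.reputation[c] for c in self.support[(item, tag)])
        return total >= ACCEPT_THRESHOLD

    def endorse(self, contributor, delta=0.5):
        """Raise a contributor's trust, e.g. after a curator approves their work."""
        self.reputation[contributor] += delta

mod = TagModeration()
mod.propose("alice", "etd-42", "machine learning")
mod.propose("bob", "etd-42", "machine learning")
print(mod.is_accepted("etd-42", "machine learning"))  # True: 1.0 + 1.0 >= 2.0
```

The same structure could feed a curator review queue instead of auto-publishing, which keeps a human in the loop for authority control.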
Examples of Systems Using Crowd-sourced Metadata
- Wikipedia (open-source)
- Amazon (corporate)
- GT ETDs "keywords"
- E-page "keywords via online forms - no schema"
- UT OPAC - Tags to any entry
- GT VuFind / Primo Central
- Emory ETDs: "keywords", plus selecting categories; soon digitized books as well.
- New York Public Library menus project (this wiki rejects YouTube links as spam, but see Barbara Taranto's Crowd Sourcing Metadata link on this page: http://www.cni.org/events/membership-meetings/past-meetings/fall-2011/)
Blue Sky
- Controlled vocabularies (via something like LCSH)
- Search options by labeling authority
  - Public (wide open)
  - Authority (domain experts)
  - Paid sources via harvest
    - Amazon
    - LibraryThing
    - Two levels, uncontrolled and authorized, with the interface suggesting controlled terms (auto-complete, or a "did you mean?" prompt); see the sketch after this list.
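
A minimal sketch of the two-level idea in the last bullet: free-text tags are always accepted, while the entry form offers authorized terms via auto-complete and a "did you mean?" fallback. The vocabulary list below is an illustrative stand-in for something like LCSH, not a real service lookup.

```python
import difflib

# Stand-in for a controlled vocabulary such as LCSH (illustrative terms only).
CONTROLLED_VOCAB = [
    "Libraries--Automation",
    "Library catalogs",
    "Library science",
    "Metadata",
    "Metadata harvesting",
]

def suggest(prefix, vocab=CONTROLLED_VOCAB, limit=5):
    """Auto-complete: authorized terms matching what the user has typed so far."""
    p = prefix.lower()
    return [term for term in vocab if term.lower().startswith(p)][:limit]

def did_you_mean(tag, vocab=CONTROLLED_VOCAB):
    """'Did you mean?': nearest authorized term for a free-text tag, if any."""
    matches = difflib.get_close_matches(tag, vocab, n=1, cutoff=0.6)
    return matches[0] if matches else None

print(suggest("metadata"))             # ['Metadata', 'Metadata harvesting']
print(did_you_mean("Libary catalog"))  # 'Library catalogs'
```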
Significantly Relevant Concepts
- Critical Mass
- Key:Value Pairs vs. Tags/Labeling (contrasted in the sketch below)
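
A minimal sketch contrasting the two labeling models named above; the field names are illustrative only. Key:value pairs scope each label to a named field, which is what makes targeted search and faceting possible; bare tags cannot say what a label is about.

```python
# Tags/labeling: a flat bag of uncontrolled strings attached to an item.
tagged_item = {
    "id": "etd-42",
    "tags": ["metadata", "crowdsourcing", "library science"],
}

# Key:value pairs: each label is scoped to a named field, so a search can
# ask for subject:metadata rather than any occurrence of the word anywhere.
kv_item = {
    "id": "etd-42",
    "subject": ["metadata", "crowdsourcing"],
    "discipline": ["library science"],
}

def matches(item, key, value):
    """Field-scoped lookup; only meaningful in the key:value model."""
    return value in item.get(key, [])

print(matches(kv_item, "subject", "metadata"))     # True
print(matches(kv_item, "discipline", "metadata"))  # False
```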
Conclusions
- Leave tagging wide open to users.
- The amount of data being gathered is growing at an increasing rate; keeping up with metadata values by hand will eventually become insurmountable.
- Paid services (or unpaid ones?) that harvest metadata from where it already exists, rather than generating it locally, may be necessary (Amazon album art, LibraryThing tags, etc.).
- Without controlled vocabularies, faceting becomes significantly less effective; for this reason, controlled vocabularies may be a requirement of future systems. (See the facet sketch at the end of these notes.)
- It may become necessary to make researched decisions about controlled vocabularies before implementing systems.
  - There may be a need for supersets or subsets of vocabularies to extend within or beyond the original domain.
  - Can one size fit all, with minimal accuracy and minimal effort, while domain-specific vocabularies are maintained?
- It seems there will be a significant need for an "analytics tool for combined labeling systems" to expose usage for evaluation of labeling / metadata effectiveness.
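
To make the faceting conclusion concrete, a minimal sketch of how uncontrolled tag variants fragment facet buckets, and how normalization collapses them. The authority mapping here is hypothetical, standing in for a real controlled vocabulary.

```python
from collections import Counter

# Uncontrolled tags as users might actually enter them.
raw_tags = ["ETDs", "etd", "E.T.D.", "theses", "Theses", "dissertations"]

# Without control, every variant becomes its own facet value:
print(Counter(raw_tags))
# Counter({'ETDs': 1, 'etd': 1, 'E.T.D.': 1, 'theses': 1, 'Theses': 1, 'dissertations': 1})

# Hypothetical authority mapping, standing in for a real controlled vocabulary.
AUTHORITY = {
    "etds": "Electronic theses and dissertations",
    "etd": "Electronic theses and dissertations",
    "e.t.d.": "Electronic theses and dissertations",
    "theses": "Dissertations, Academic",
    "dissertations": "Dissertations, Academic",
}

def normalize(tag):
    """Map a free-text tag to its authorized form, if one is known."""
    return AUTHORITY.get(tag.lower(), tag)

# With control, the variants collapse into two usable facet buckets:
print(Counter(normalize(t) for t in raw_tags))
# Counter({'Electronic theses and dissertations': 3, 'Dissertations, Academic': 3})
```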