Difference between revisions of "Crowdsourcing"
From CURATEcamp
Latest revision as of 08:16, 26 May 2012
Questions About Crowd-sourced Metadata
- Who's using user-added metadata?
  - Successes and Failures.
  - Importing (Ingesting) or Exporting.
- What types of metadata fields are good candidates for crowd-sourced metadata?
- What effective incentives can be provided for metadata entry?
- Mechanical Turk.
  - Used by Amazon.
  - Looks computer-generated, but is human-created.
- What possible legal issues might there be with crowd-sourced metadata?
- What quality control or authority control systems can be implemented?
  - What reputation systems might be employed to handle quality / authority issues?
- What methods of integration could there be with non-user-generated metadata?
- Controlled Vocabularies.
  - There must be some consideration of domain-specific vocabularies.
  - One size does not fit all.
- Awareness - Users must be aware of crowd-sourcing features.
  - Marketing.
  - Advertising.
  - Public Relations.
Examples of Systems Using Crowd-sourced Metadata
- Wikipedia (open-source)
- Amazon (corporate)
- GT ETDs "keywords"
- E-page "keywords via online forms - no schema"
- UT OPAC - Tags to any entry
- GT VuFind / Primo Central
- Emory ETDs - "keywords", plus selecting categories; digitized books coming soon.
- New York Public Library - menus project (this wiki won't take YouTube links, it thinks they're spam, but check out Barbara Taranto's Crowd Sourcing Metadata link on this page: http://www.cni.org/events/membership-meetings/past-meetings/fall-2011/)
Blue Sky
- Controlled Vocabularies (via something like LCSH)
- Search Options of Labeling Authority
  - Public (wide open)
  - Authority (domain experts)
  - Paid sources via harvest
    - Amazon
    - LibraryThing
    - 2 levels - uncontrolled and authorized; the interface suggests controlled terms (auto-complete? or something like "did you mean"?)
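
The "interface suggests controlled" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the vocabulary list, the function name suggest_terms, and the cutoff value are hypothetical and not something the session specified. It combines prefix matching (auto-complete) with fuzzy matching from Python's standard difflib module (a "did you mean" style hint).

```python
import difflib

# Hypothetical controlled vocabulary; a real system might load LCSH headings instead.
CONTROLLED_TERMS = ["Photography", "Photojournalism", "Cartography", "Menus", "Cooking"]


def suggest_terms(user_tag, vocabulary=CONTROLLED_TERMS, limit=3, cutoff=0.6):
    """Return up to `limit` controlled terms that resemble a free-text tag."""
    needle = user_tag.strip().lower()
    if not needle:
        return []
    # Map lowercase forms back to the display form of each controlled term.
    lowered = {term.lower(): term for term in vocabulary}
    # Auto-complete style: controlled terms that start with what the user typed.
    prefix_hits = [term for key, term in lowered.items() if key.startswith(needle)]
    # "Did you mean" style: controlled terms that are close in spelling.
    fuzzy_keys = difflib.get_close_matches(needle, list(lowered), n=limit, cutoff=cutoff)
    fuzzy_hits = [lowered[key] for key in fuzzy_keys if lowered[key] not in prefix_hits]
    return (prefix_hits + fuzzy_hits)[:limit]
```

Under this sketch the user's own tag is still stored as typed (the uncontrolled level); the suggestions only offer an authorized term that could be attached alongside it.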
Significantly Relevant Concepts
- Critical Mass
- Key:Value Pairs vs. Tags/Labeling
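
To make the Key:Value versus Tags distinction concrete, here is a minimal sketch. The record layout, the sample values, and the helper names items_with_tag and facet_counts are assumptions made for illustration only. Flat tags support simple membership lookups, while key:value pairs say what kind of thing each value is, which is what makes faceting by a named field possible.

```python
from collections import Counter

# Hypothetical records carrying both styles of user-supplied metadata.
records = [
    {"id": "menu-001", "tags": {"menu", "1920s", "seafood"},
     "fields": {"decade": "1920s", "cuisine": "seafood"}},
    {"id": "menu-002", "tags": {"menu", "1920s"},
     "fields": {"decade": "1920s"}},
    {"id": "menu-003", "tags": {"menu", "thirties"},
     "fields": {"decade": "1930s", "cuisine": "french"}},
]


def items_with_tag(records, tag):
    """Tags/labeling: a flat label either is or is not attached to an item."""
    return [record["id"] for record in records if tag in record["tags"]]


def facet_counts(records, key):
    """Key:value pairs: count items per value of one named field."""
    values = (record["fields"][key] for record in records if key in record["fields"])
    return dict(Counter(values))
```

Note that the tag "thirties" on the third record cannot be grouped with any decade unless someone interprets it, whereas the "decade" field can be counted directly.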
Conclusions
- Leave tagging wide open to users
- The amount of data being gathered is growing at an increasing rate, such that keeping up with metadata values by hand will eventually become unmanageable.
- Paid services (unpaid services?) that harvest metadata from where it already exists, rather than generating it locally, may be necessary (Amazon album art, LibraryThing tags, etc.).
- Without controlled vocabularies, faceting becomes significantly less effective. For this reason, controlled vocabularies may be a requirement of future systems.
- It may become necessary to make researched decisions concerning controlled vocabularies before implementing systems.
  - There may be a need for supersets or subsets of vocabularies to extend within or beyond the original domain.
  - Can one size fit all with minimal accuracy and minimal effort while domain-specific vocabularies are maintained?
- It seems that there will be a significant need for an "Analytics tool for combined labeling systems" to expose usage for evaluation of labeling / metadata effectiveness.
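
One possible starting point for such an analytics tool is sketched below. It is a rough illustration only: the function name label_usage, the (item, label) input shape, and the two-bucket report are assumptions, not a design agreed in the session. It counts how often each label is applied and splits the counts by whether the label belongs to the controlled vocabulary.

```python
from collections import Counter


def label_usage(assignments, controlled_terms):
    """Summarize label usage, split into controlled and uncontrolled labels.

    assignments: iterable of (item_id, label) pairs (hypothetical input shape).
    controlled_terms: iterable of authorized vocabulary terms.
    """
    controlled = {term.strip().lower() for term in controlled_terms}
    # Normalize case and whitespace so trivially different spellings count together.
    usage = Counter(label.strip().lower() for _, label in assignments)
    report = {"controlled": {}, "uncontrolled": {}}
    for label, count in usage.most_common():
        bucket = "controlled" if label in controlled else "uncontrolled"
        report[bucket][label] = count
    return report
```

A frequently used uncontrolled label in the report might be a candidate for promotion into the vocabulary, and an unused controlled term might be a candidate for review.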