Difference between revisions of "Association of Moving Image Archivists & Digital Library Federation Hack Day 2014"

From CURATEcamp
(New and Improved PBCore Tools)
Line 97: Line 97:
*Crystal Sanchez, Smithsonian Institution
+ <!--
==Development of a UUID (universally unique identifier - String or Number) system for moving image physical/digital elements==
There will be a UUID registrar. The registrar server would hold the UUID and a pointer to metadata/item information. This would allow a wide range of possible uses, from access information to relational trees. Because we do not want to rule out assignment for elements where there is no internet access, there will be a system similar to MAC addresses/UPCs, in which a registered archivist/lab/individual could be given a block of UUIDs to assign offline and then register online later without collision. The UUID could be made into a 1D or 2D barcode or a human-readable marking on the element, giving instant access through the server pointer to metadata on the content and physical item.
* tommy [at!] videofilmsolutions [dawt] com [https://twitter.com/VideoFilmSol @VideoFilmSol]
* Interested? Your name + any comments/initial ideas
+ -->

==Video Characterization Comparison Viewer==
Line 124: Line 126:
* Karen Cariani
+ <!--
==Broadcast Wave header support/testing==
* Further investigation of software support for Broadcast Wave header information (Audacity customization?) – justinkovar [at] utexas [dawt] edu [https://twitter.com/KovarSound @KovarSound]
* Interested? Your name + any comments/initial ideas
+ -->

==Video thumbnail summaries as metadata==
Line 143: Line 147:
** Jay Brown - sounds interesting and will help out as I can
+ <!--
==Disk usage pie chart==
Disk usage pie chart! I've been looking for a software tool that would let us calculate which projects are using the most server disk space in our collections, how old files are, and when they were last accessed, and then throw all of that data into visual form – like charts, graphs, and especially pie charts! I developed a web tool that shows individual project sizes and how data is added or deleted from day to day, but it only shows a list of projects and their sizes. To convince my supervisors that certain projects are taking up too much room (and are never accessed), I have to create visuals using Excel or other programs, which takes me hours but could easily be automated.
Line 148: Line 153:
* Interested? Your name + any comments/initial ideas
** for a web page presentation, maybe try Google Charts?
+ -->
  
 
<!--
 
<!--
Line 165: Line 171:
* Team: brianjhoffman [at] gmail [dawt] com (instigator and developer) ; jackb [at] illinois [dawt] edu ; seth [at] avpreserve [dawt] com ; ben.moskowitz [at] nyu [dawt] edu ; hfrost [at] stanford [dawt] edu ; villereal [at] gmail [dawt] com
+ <!--
==File format monitors/QA==
* My proposal is for building a File Format Obsolescence Analysis Engine. The purpose of the engine would be to provide information about--and options for--migrating and transcoding obsolete media file formats through simple and intuitive user interactions. The user provides the engine with an arbitrary file, which the engine then analyzes using any number of metadata forensics and validation tools (MediaInfo, JHOVE and DROID to name a few). The engine then decides whether the file needs to be migrated, or whether the current format can be considered "preservation ready". For the purpose of this proposal, "preservation ready" means that the file meets a list of minimum requirements, such as being stable and being supported by certain playback systems. However, determining a comprehensive list of these criteria is outside the scope of this project. The output of the engine is two-fold: First, it will generate a report about the input file. This report will contain the most salient aspects of the file and its technical metadata in a format that is human readable but can also be easily parsed by a computer, in order to facilitate scripting and automation. Second, the engine will move the input file to user-designated output folders according to its state (needs migration or preservation ready). These folders can be used simply to organize the files, or they may function as watch folders for transcoding engines or any other automation systems (which are out of the scope of this proposal). The two most important features of this engine are as follows: 1) The input and output should be as simple and intuitive as possible. The idea is to disseminate the engine as a general tool for the preservation community at large. Due to the wide range of technical skills among potential users in this community, it is critical that the tool be seen as "easy to use". 2) The engine needs to be built in a way that is extensible and easily updated.
Due to the time constraints of this event, building a comprehensive analysis engine is out of the scope of this proposal. However, the engine's utility would be greatly enhanced if the framework were built in such a way that members of the community can easily update and add support for various file formats without compromising the previously mentioned usability. Thus, the idea would be to build a baseline that the community could then expand upon in the future.
* [https://twitter.com/av_morgan @av_morgan]
* Interested? Your name + any comments/initial ideas
+ -->
+ <!--
==FFmpeg GUIs==
* [https://twitter.com/rhfraim @rhfraim]
Line 175: Line 183:
I would love to see a good GUI for FFmpeg. Super is nice, but it doesn't have all the formats and codecs that FFmpeg has, and I think a program that transcodes into any format, and into more than one format at the same time, would be a great benefit to small-budget archives.
• srdbx [at] netvision [dawt] net [dawt] il
+ -->

=Wikipedia Edit-a-thon topic proposals=

Revision as of 21:29, 8 October 2014

SIGN UP HERE: https://docs.google.com/spreadsheets/d/16uZEWO5wDs6FdwpzpRwpKm3IBvonHNFrVUaZ2nkNcis/edit#gid=0

>>> When, Where, What time?

  • Date: Wednesday, October 8, 2014
  • Time: ~9am-5pm (with option of continued work projects throughout the conference in our Developer Lounge TBA location)
  • Location: Hyatt Regency Savannah, Scarborough 3 Room
  • hashtag: #AVhack14
  • IRC: #curatecamp_avpres_1 If using an IRC client the server is chat.freenode.net, or you can use your browser and connect to webchat.freenode.net. If you are unfamiliar with IRC, take a look at this ☞ brief introduction.

How can I participate?

Sign up! As this will be a highly participatory event, registration is limited to those willing to get their hands dirty, so no onlookers please.

If you are unsure whether you can or want to participate in the hack day itself, you can still see the results by attending the AMIA closing plenary, where hack day projects will be presented, and the audience will have an opportunity to vote on their favorites.

What will be the format of the event?

Before the hack day, project ideas and edit-a-thon topics will be collected through the registration form and the event wiki, and participants will review and discuss the submitted ideas. We’ll then break into groups of technologists, practitioners, and Wikipedia editors, each selecting an idea or topic(s) to work on together for the day and (if desired) throughout the AMIA conference in the developers lounge.

The day itself will be structured something like this. Coffee/tea will be provided. Lunch is on your own.

9am – Welcome, introductions

9:30 - noon - Hacking & Wikipedia editing. Snacks and coffee will be served.

Noon-1pm – Lunch on your own.

1 - 4:30 - Hacking & Wikipedia editing. Snacks and coffee will be served.

4:30 - 5 - Wrap up.

Closing plenary & prizes

Projects will be presented towards the end of the conference. Projects will be judged by a panel as well as by conference attendees.

Summary

In association with its annual conference, the Association of Moving Image Archivists will host its 2nd annual hack day on October 8, 2014 in Savannah, GA. The event will be a unique opportunity for practitioners and managers of digital audiovisual collections to join with developers and engineers for an intense day of collaboration to develop solutions for digital audiovisual preservation and access. This year, we will be holding a concurrent Wikipedia Edit-a-thon[1] for those interested in adding to the knowledge pool about audiovisual preservation and access. It will be fun and practical.

AMIA is again partnering with the Digital Library Federation in organizing the hack day. A robust and diverse community of practitioners who advance research, teaching and learning through the application of digital library research, technology and services, DLF brings years of experience creating and hosting events designed to foster collaboration and develop shared solutions for common challenges.

What if I’m not a developer?

Content managers and preservation practitioners are as central to the success of the event as having keen developers. YOU will be responsible for setting the agenda and the outcomes. The goal is to foster collaboration between audiovisual preservation specialists and technologists, to solve problems together and share expertise.

The day will also include a Wikipedia Edit-a-thon. So even if you're not a developer, nor feel compelled to lend your digital preservation ideas to software and code development, you can contribute to creating new or updated content on Wikipedia for the benefit of our community! You can read all about Wikipedia edit-a-thon events here.

Background

What is a hack day?

A hack day or hackathon is an event that brings together computer technologists and practitioners for an intense period of problem solving through computer programming. Within digital preservation and curation communities, hack days provide an opportunity for archivists, collection managers, and others to work together with technologists to develop software solutions for digital collections management needs. Hack days have been held independently by groups such as the Open Planets Foundation, as well as in association with preservation and access oriented conferences including Open Repositories and Museums and the Web.

The manifesto of a recent event at the Open Repositories conference framed the benefits this way: “Transparent, fun, open collaboration in diversely constituted teams...The creation of new professional networks over the ossification of old ones. Effective engagement of non-developers (researchers, repository managers) in development...Work done at the conference over presentation of something prepared earlier.”

Our Manifesto

Manifesto:

  • Transparent, fun, open collaboration in diversely constituted teams over individual brilliance and/or groups of like individuals in cut-throat competition.
  • The creation of new professional networks over the ossification of old ones
  • Effective engagement of non-developers (researchers, repository managers) in development over purely developer driven projects.
  • Work done at the conference over presentation of something prepared earlier (meaning not working on a project you are working on during your day job)

Hack Day Project proposals

Below are loose ideas for projects to hack on! If you're interested in one of the project stubs below, sign up for a wiki login and add your thoughtful comments or possible starting points to the proposal, or contact the proposer via twitter or email. As the Hack Day approaches, we'll brainstorm further and consolidate like-minded projects.

Hack day capture: GUI tool for BMDCapture, using FFmpeg + BMDTools + Blackmagic DeckLink SDK

  •  Proposed by Dave Rice, @dericed
    • Lauren Sorensen - @laurensx
    • Dave Rice - @dericed
    • Tommy from Video Film Solutions
    • Shai Drori, Independent

Adding a GUI with Xcode and some new features to vrecord, an open source tool that can be used with Blackmagic hardware. Capture is controlled from the GUI (Mac) and output to a MOV container with various codec options: uncompressed, FFV1, Motion JPEG 2000, ProRes. https://github.com/amiaopensource/hackdaycapture https://docs.google.com/document/d/1q6qEk3gHKNzf6jtEIzzl2-vKwjgUzOTXr6TmUej2mgc/edit

New and Improved PBCore Tools

This team is creating two PBCore 2.0 tools which can easily be updated when the next PBCore schema version is released. The tools include an updated PBCore XML record validator and PBCore XML record generator.

PBCore Validator

We updated and replaced application components to create an improved version based on the PBCore 2.0 XML schema. We are also including a Dublin Core validator in the same web application. Github Repo: https://github.com/tessafallon/pbcorevalidator/

PBCore Record Generator

We built a basic Rails webform and extracted the PBCore export code from HydraDAM, a project developed by WGBH and DCE, so that we have a lightweight web application for entering metadata based on the PBCore data model and exporting it as well-formed PBCore 2.0 XML. We are also planning to provide a Dublin Core export via this form. View it: http://pb-form.curationexperts.com/ Github repo: github.com/mark-dce/pb-form/
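To make the "enter metadata, export PBCore XML" idea concrete, here is a minimal Python sketch. It is not the pb-form code (which is a Rails application); the function name `build_pbcore_record` and the sample values are hypothetical, and the namespace URI and element names reflect our understanding of PBCore 2.0 and should be checked against the published schema. The output is well-formed XML but has not been validated.

```python
import xml.etree.ElementTree as ET

# Assumed PBCore 2.0 namespace URI; verify against the official schema.
PBCORE_NS = "http://www.pbcore.org/PBCore/PBCoreNamespace.html"


def build_pbcore_record(identifier, identifier_source, title, description):
    """Return a minimal pbcoreDescriptionDocument as an XML string."""
    # Serialize the PBCore namespace as the default (unprefixed) namespace.
    ET.register_namespace("", PBCORE_NS)

    def tag(name):
        return f"{{{PBCORE_NS}}}{name}"

    root = ET.Element(tag("pbcoreDescriptionDocument"))
    # The identifier carries a "source" attribute naming who assigned it.
    ident = ET.SubElement(root, tag("pbcoreIdentifier"), {"source": identifier_source})
    ident.text = identifier
    ET.SubElement(root, tag("pbcoreTitle")).text = title
    ET.SubElement(root, tag("pbcoreDescription")).text = description
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(build_pbcore_record("hackday-0001", "Example Archive",
                              "Test tape", "A sample record."))
```

A record produced this way could then be pasted into a validator to check it against the schema.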

Team Members

  • Proposed by Casey E. Davis, WGBH
  • Mark Bussey, DCE
  • Tessa Fallon, Collective Access
  • Crystal Sanchez, Smithsonian Institution


Video Characterization Comparison Viewer

The tool runs multiple command line video characterization applications (ffprobe, mediainfo, exiftool) on a given AV file/set of files and outputs the results in a format that is easy for comparative analysis. The aim of the tool is to identify differences in the outputs of these common applications, with the goal of better understanding the tools, and possibly submitting reports to their developers and eventually improving them.
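As a rough illustration of the comparison step (not the team's implementation), the Python sketch below flattens each tool's JSON report into dotted keys and lines up the values for a few fields. The helper names, the `field_map`, and the sample values are hypothetical; the command lines are the JSON output options we believe each tool provides, and `run_tool` requires the tools to be installed.

```python
import json
import subprocess

# Command lines that ask each tool for JSON output (assumed options).
TOOL_COMMANDS = {
    "ffprobe": ["ffprobe", "-v", "error", "-print_format", "json",
                "-show_format", "-show_streams"],
    "mediainfo": ["mediainfo", "--Output=JSON"],
    "exiftool": ["exiftool", "-json"],
}


def run_tool(tool, path):
    """Run one characterization tool on `path` and return its parsed JSON."""
    result = subprocess.run(TOOL_COMMANDS[tool] + [path],
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a {"a.0.b": leaf} mapping."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix.rstrip(".")] = value
    return flat


def compare(reports, field_map):
    """Build one row per field, flagging whether the tools agree."""
    flat_reports = {tool: flatten(report) for tool, report in reports.items()}
    rows = []
    for field, keys in field_map.items():
        values = {}
        for tool, key in keys.items():
            if tool in flat_reports:
                raw = flat_reports[tool].get(key)
                # Compare as strings, since tools differ on number vs. text.
                values[tool] = None if raw is None else str(raw)
        rows.append({"field": field, "values": values,
                     "agree": len(set(values.values())) <= 1})
    return rows
```

For example, `compare({"ffprobe": run_tool("ffprobe", path), "mediainfo": run_tool("mediainfo", path)}, field_map)` would yield rows that can be printed as a table, with disagreeing fields highlighted for closer inspection.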

Use cases for the tool

  • Compare different tool outputs in order to determine what tool to use in your characterization workflows
  • Reporting on differences in order to understand how the tool calculates the outputs
  • Identifying bugs in order to report to the tool developers

This project builds on a similar project that was developed during the Open Repositories 2014 Developer Challenge (although the #AVHack14 version is bigger and better). Read more about that here.

Documentation

Team Members

  • Kara Van Malssen
  • Ben Fino-Radin
  • Morgan Morel
  • Joey Heinen
  • Nicole Martin
  • Karen Cariani


Video thumbnail summaries as metadata

Creating a command line program (wrapped as a Ruby gem) that exports thumbnail images, thumbnail sprite image, and WebVTT metadata.
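To show what the WebVTT metadata for a thumbnail sprite might look like, here is a small Python sketch. It is not the gem's code (the project itself is Ruby); the function names, the `sprite.jpg` file name, and the grid layout are assumptions, and it uses the common convention of pointing each cue at a region of the sprite with a `#xywh=` media fragment. Extracting the thumbnails and assembling the sprite image are left out.

```python
def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    millis = int(round(seconds * 1000))
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def sprite_vtt(duration, interval, thumb_w, thumb_h, columns,
               sprite_url="sprite.jpg"):
    """Return WebVTT text with one cue per thumbnail in the sprite.

    Thumbnails are assumed to be laid out left to right, top to bottom,
    `columns` per row, each `thumb_w` x `thumb_h` pixels.
    """
    lines = ["WEBVTT", ""]
    index = 0
    while index * interval < duration:
        start = index * interval
        end = min(start + interval, duration)
        # Pixel offset of this thumbnail inside the sprite grid.
        x = (index % columns) * thumb_w
        y = (index // columns) * thumb_h
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(f"{sprite_url}#xywh={x},{y},{thumb_w},{thumb_h}")
        lines.append("")
        index += 1
    return "\n".join(lines)


if __name__ == "__main__":
    # A 25-second video, one thumbnail every 10 seconds, 2 per sprite row.
    print(sprite_vtt(25, 10, 160, 90, 2))
```

A player plugin can load this track as metadata and, on hover over the time rail, crop the sprite to the region named in the active cue.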

Github repository

Original proposal: I've been interested in using video preview thumbnails as a way to provide summarized access to digitized video that is unlikely to get further description. You can read more about what I've done here: http://ronallo.com/blog/a-plugin-for-mediaelement-js-for-preview-thumbnails-on-hover-over-the-time-rail/ I could use help improving that JavaScript plugin or in turning the production of video thumbnails and the metadata track file into a service of some sort. I'm also happy to help as a developer on another project.

  • Team Leader: Jason Ronallo - jronallo [at] gmail [dawt] com / @ronallo
  • Interested? Your name + any comments/initial ideas
    • Ashley Blewer! - @ablwr -- B-) Open Source Report Card once called me "a distinguished JavaScripter."
    • Nicholas Zoss - Servicizing the processing seems interesting. I'm interested in helping on this project as I'm able.
    • Jay Brown - sounds interesting and will help out as I can


ArchivesSpace plugins – AV_Space

A growing number of archival repositories are adopting ArchivesSpace for managing archival description. But out of the box, ASpace lacks support for AV specific data. We aim to do something about that.

  • Team: brianjhoffman [at] gmail [dawt] com (instigator and developer) ; jackb [at] illinois [dawt] edu ; seth [at] avpreserve [dawt] com ; ben.moskowitz [at] nyu [dawt] edu ; hfrost [at] stanford [dawt] edu ; villereal [at] gmail [dawt] com


Wikipedia Edit-a-thon topic proposals

IRC: #curatecamp_avpres_2 If using an IRC client the server is chat.freenode.net, or you can use your browser and connect to webchat.freenode.net. If you are unfamiliar with IRC, take a look at this brief introduction.

Day-of tracking/worksheet here! Notes and tips doc

We’ll be hosting a concurrent Wikipedia edit-a-thon, which will focus on topics related to digital preservation & access for audiovisual materials. While we encourage non-engineers to participate in the hack day portion, there’s a lot of work to be done to describe topics relevant to our community on Wikipedia as well. (via AMIA Announcement) Below are loose ideas for Wiki projects or topics to edit! If you're interested in one of the project stubs below, sign up for a wiki login and add your thoughtful comments, or contact the proposer via twitter or email.

Haven't edited a Wiki before? No problem! We will have a brief crash course early in the day, with help available anytime! It's easy to learn, we promise.

Have a new idea? Use this format:

  • Topic: [link to existing Wikipedia page goes here, or new topic]
  • Interested?
  • Sign up to edit this topic (OK to sign up for multiple topics):

AV Artifact Atlas

Team Members

  • Kristin MacDonough
  • Travis Wagner
  • Marleigh Chiles
  • Josephine McRobbie

Objective: Obtain and implement user feedback to improve the user experience

  • Proposal: The AVAA is intended for all levels of a/v knowledge, from experts to non-experts. While the resource provides a lot of useful information from experts, it would benefit from more feedback and edits from general users. @super_kmac

Activities:

  • Review and discuss navigation and usability
  • Edit visual elements, interconnectivity, and links to external information and resources

Digital Preservation

Team Members:

Objective: Update and improve Digital Preservation Wikipedia entry

  • Proposal: Check out the great efforts to maintain and expand the Digital Preservation wiki page! "The scope of this project is to reorganize and revise the content of the current Digital Preservation article so that it reflects the current state of the field and is better suited to ongoing updating and editing. We will also review related articles to determine their content and relationship to the main article. A further goal of this effort is to include links to relevant standards and best practices in the field of digital preservation." Thanks to Lauren Sorensen and Andrea Goethals for suggesting. -- kgronsbell @k_grons

Activities:

FFmpeg Guides

Team Members

    • Rebecca Fraimow
    • Erica Titkemeyer

Objective: Make ffmpeg easier to understand and use

  • Final doc from last year here
  • Link to last year's project page with helpful links here
  • This is like a proof of concept but I'd really like to make it AWESOME! It also needs to be totally refactored into a non-Rails app - Ashley here

Activities