Revision as of 20:58, 28 October 2013

Association of Moving Image Archivists & Digital Library Federation Hack Day 2013

Logistics

When, Where, What time?

  • Date: November 6, 2013
  • Time: ~9am-5pm (with option of continued work projects throughout the conference in our Developer Lounge)
  • Location: Crowne Plaza Richmond Downtown in Richmond, VA
  • hashtag: #AVhack13

How can I participate?

Sign up! As this will be a highly participatory event, registration is limited to those willing to get their hands dirty, so no onlookers please.

If you are unsure whether you can or want to participate in the hack day itself, you can still see the results by attending the AMIA closing plenary, where hack day projects will be presented and the audience will have an opportunity to vote on their favorites.

What will be the format of the event?

Before the hack day, project ideas will be collected through the registration form and the event wiki, and participants will review and discuss the submitted ideas. On the day, we’ll break into groups of technologists and practitioners, each selecting an idea to work on together for the day and (if desired) for the rest of the AMIA conference in the Developer Lounge.

9am – Introductions!

Noon-1pm – Lunch on your own.

Closing plenary & prizes

Projects will be presented during the conference closing plenary, Saturday November 9 at 9:30am. Projects will be judged by a panel as well as by conference attendees.

Summary

In association with its annual conference, the Association of Moving Image Archivists will host its first-ever hack day on November 6, 2013 in Richmond, VA. The event will be a unique opportunity for practitioners and managers of digital audiovisual collections to join with developers and engineers for an intense day of collaboration to develop solutions for digital audiovisual preservation and access. It will be fun and practical…and there will be prizes!

This year's hack day is a partnership between AMIA and the Digital Library Federation. A robust and diverse community of practitioners who advance research, teaching and learning through the application of digital library research, technology and services, DLF brings years of experience creating and hosting events designed to foster collaboration and develop shared solutions for common challenges.

What if I’m not a developer?

Content managers and preservation practitioners are as central to the success of the event as keen developers. YOU will be responsible for setting the agenda and the outcomes. The goal is to foster collaboration between audiovisual preservation specialists and technologists, to solve problems together and share expertise.

Background

What is a hack day?

A hack day or hackathon is an event that brings together computer technologists and practitioners for an intense period of problem solving through computer programming. Within digital preservation and curation communities, hack days provide an opportunity for archivists, collection managers, and others to work together with technologists to develop software solutions for digital collections management needs. Hack days have been held independently by groups such as the Open Planets Foundation, as well as in association with preservation and access oriented conferences including Open Repositories and Museums and the Web.

The manifesto of a recent event at the Open Repositories conference framed the benefits this way: “Transparent, fun, open collaboration in diversely constituted teams...The creation of new professional networks over the ossification of old ones. Effective engagement of non-developers (researchers, repository managers) in development...Work done at the conference over presentation of something prepared earlier.”

Why an AMIA hack day?

An audiovisual preservation-themed CURATEcamp was held in April 2013, drawing over 120 registrants from at least 3 continents for a day of great conversations and lightning talks. CURATEcamp is a series of unconference-style events focused on connecting practitioners and technologists interested in digital curation. The event generated a lot of documentation and articulated many shared concerns. Topics covered included digitization of video, film scanning, digital storage strategies, proprietary digital video files in collections, and technical metadata for preservation. The participants agreed that more work needed to be done and action taken, so the idea for an AMIA hack day was born.

Discussions between managers of audiovisual collections and solutions developers provided a fruitful starting point for hack day project ideas, including:

  • Simple fixity tools to use when transferring files from one storage medium to another
  • Technical metadata extraction and making use of these reports (MediaInfo, ffprobe)
  • Simple cataloging tools for AV, with an eye towards contemporary frameworks/schema
  • Discovery tools/UX for audiovisual collections, access at scale
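
As an illustration of the first idea above (fixity checking when files move between storage media), here is a minimal sketch. It is not a tool produced by or for the hack day: the function names `file_checksum` and `verify_transfer` are hypothetical, MD5 is assumed as the default algorithm, and the copy is assumed to mirror the source directory layout.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_dir, dest_dir, algorithm="md5"):
    """Compare every file under source_dir with its copy under dest_dir.

    Returns the relative paths that are missing from dest_dir or whose
    checksums differ. An empty list means every file matched.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    problems = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        copy = dest_dir / relative
        if not copy.is_file():
            problems.append(str(relative))
        elif file_checksum(source_file, algorithm) != file_checksum(copy, algorithm):
            problems.append(str(relative))
    return problems
```

Calling `verify_transfer("/path/to/source", "/path/to/copy")` (placeholder paths) after a transfer returns the files that need attention. Files that exist only in the destination are not reported by this sketch.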

Project proposals

Please register for the hack day (we're currently at capacity, but forming a wait list) and we will start adding your ideas here for voting in advance of the Hack Day!

Possible topics projects could touch on: fixity checking; transcoding; metadata validation; automating file movement; altering fdupes so that it shows the user the MD5 checksum hash; altering Archivematica 1.0 code to bypass zipping the AIP.

Loose metadata project ideas: segmentation and time-based annotation of video segments on the web (maybe leveraging Media Fragments?); XSLT mapping; turning CSV fields into PREMIS XML; using geolocation information to facilitate new access pathways to video; RDFing PBCore, potentially to leverage in Fedora 4.
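
To make the Media Fragments idea more concrete, the sketch below builds a URI with a temporal fragment (`#t=start,end`, in seconds) of the kind defined by the W3C Media Fragments URI specification. This is only an illustration: the function name `temporal_fragment_uri` and the example URL are hypothetical, and only the plain-seconds form of the time syntax is handled.

```python
from urllib.parse import urldefrag


def _format_seconds(seconds):
    """Render seconds as a plain decimal with at most millisecond precision."""
    return ("%.3f" % seconds).rstrip("0").rstrip(".")


def temporal_fragment_uri(media_url, start=None, end=None):
    """Return media_url with a temporal media fragment such as '#t=10,20'.

    Either bound may be omitted: 't=10' runs to the end of the media,
    and 't=,20' starts from the beginning.
    """
    if start is None and end is None:
        raise ValueError("at least one of start or end is required")
    if start is not None and end is not None and end <= start:
        raise ValueError("end must be greater than start")
    base_url = urldefrag(media_url).url  # drop any existing fragment
    fragment = "t=" + ("" if start is None else _format_seconds(start))
    if end is not None:
        fragment += "," + _format_seconds(end)
    return base_url + "#" + fragment
```

An annotation tool could store such a URI alongside each note so that a link points at a segment rather than at the whole video.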

Please submit your project ideas using the format below. Remember, the more specific the better. Have a look at the project descriptions from Open Repositories 2013 for inspiration.

Template: Sample project idea

Language describing the project scope, goal, and functional requirements

Possible starting points

Links to source code, standards, etc. that we can build on

Related skillsets

Programming/markup languages or expertise needed for this project

Data set required

What types of files or data would be required for testing this project?

Submitted by

[your name here]

Interested team members/participant roles

Who wants to work on this project? What roles will they fill? (e.g. coder/user interface testing/content expert/etc.)

1. Extraction of EIA-608/line 21 closed caption information

Ability to extract and reuse closed caption information from NTSC video.

Possible starting points

Maybe: http://ccextractor.sourceforge.net/ Also: http://dev.w3.org/html5/webvtt/

Related skillsets

Knowledge of ffmpeg, familiarity with .srt and .vtt subtitle formats
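
For orientation on the two subtitle formats named above: at the level of cue timing, .srt and .vtt differ mainly in that WebVTT starts with a `WEBVTT` header line and uses a period rather than a comma before the milliseconds. The sketch below converts already-extracted SRT text to WebVTT under that assumption. It does not extract captions from video, the function name `srt_to_vtt` is hypothetical, and well-formed SRT input is assumed.

```python
import re

# SRT cue timing lines look like "00:00:01,000 --> 00:00:04,000".
_SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text):
    """Convert SRT subtitle text to WebVTT text.

    The numeric SRT cue counters are kept; they serve as cue identifiers
    in the WebVTT output.
    """
    # Normalize Windows line endings and drop a leading byte order mark.
    lines = srt_text.replace("\r\n", "\n").lstrip("\ufeff").split("\n")
    converted = [_SRT_TIMING.sub(r"\1.\2 --> \3.\4", line) for line in lines]
    return "WEBVTT\n\n" + "\n".join(converted).strip() + "\n"
```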

Data set required

Uncompressed video files that contain line 21 closed caption information

Submitted by

Steven Villereal

Interested team members/participant roles

Who wants to work on this project?

2. Integration of mediainfo generated metadata into a forensic imaging workflow

Would like to generate mediainfo key/value pairs and include them in the DFXML for forensic disk images that contain audio or video files. This could be accomplished through the FIWalk utility's DGI interface.
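
One possible first step is turning a MediaInfo report into flat key/value pairs. The sketch below assumes the input is MediaInfo's default plain-text report (section names such as "General" or "Video" on their own lines, followed by "Key : Value" lines). The function name `parse_mediainfo_report` and the "Section.Key" labeling are hypothetical, and how the pairs are handed to fiwalk or written into DFXML is not shown.

```python
def parse_mediainfo_report(report_text):
    """Turn a MediaInfo plain-text report into a list of (key, value) pairs.

    Keys are prefixed with the section they appear in, for example
    "Video.Format", so that identical keys in different sections stay distinct.
    """
    pairs = []
    section = None
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Split on the first colon only; values may contain colons themselves.
        key, separator, value = line.partition(":")
        if separator:
            name = key.strip()
            label = f"{section}.{name}" if section else name
            pairs.append((label, value.strip()))
        else:
            # A non-empty line without a colon starts a new section.
            section = line
    return pairs
```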

Possible starting points

http://www.sleuthkit.org/sleuthkit/
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.149.5362&rep=rep1&type=pdf
https://raw.github.com/dfxml-working-group/dfxml_schema/v1.1.0/dfxml.xsd
http://mediaarea.net/en/MediaInfo

Data set required

Forensic disk images containing audio and video files

Submitted by

Donald Mennerich

Interested team members/participant roles

Jason Evans Groth (NCSU, jevansg@ncsu.edu) / TBD based on others' experience? (am looking for experience with fiwalk, am a newbie with Mediainfo, have XML/XSLT experience [not specifically DFXML], have born digital collections responsibilities so could come at it with that background, too?)

3. Moving Image Research Collections Digital Video Repository

Several potential ideas for improving this DVR that can hopefully be integrated into other sites…
- Timecode-based tagging in videos or other ways to allow for user-generated metadata
- A way to connect related video material
- scripts for transcoding video (modifying an existing script)
- Issues in XACML restrictions / easy way to make records public/non-public

Possible starting points

DVR: http://mirc.sc.edu
Git: https://github.com/DGI-USC

Related skillsets

Drupal knowledge, Fedora/Islandora, ffmpeg, Python

Data set required

Video files, records, scripts, the DVR itself? (Providable.)

Submitted by

Ashley Blewer

Interested team members/participant roles

Who wants to work on this project? What roles will they fill? TBD?

4. RDFing PBCore

Let's see if we can come up with an RDF expression for PBCore. Could be useful for things like the upcoming Fedora 4.

Possible starting points

http://pbcore.org/index.php
http://www.w3.org/TR/REC-rdf-syntax/
http://dublincore.org/documents/dc-rdf/

Related skillsets

Any of: knowledge of pbcore, XML/RDF, OWL, metadata schema in general

Data set required

Sample PBCore (to be provided)

Submitted by

Kara Van Malssen (idea by Karen Cariani)

Interested team members/participant roles

Who wants to work on this project?

5. Reconciling filenames with embedded technical metadata/named parameters

I'd like to explore whether it is possible to compare embedded technical metadata (file/MIME type/external signature) to existing media filenames, to ensure that all files in a given directory are what their extensions say they are. Messages or a report could be generated if any files do not match your named parameters.

Potential User Story: As a CONTENT MANAGER, I need to verify that files with an "mov" extension in a named directory (*.mov) are QuickTime files so that I can ensure filenames accurately represent embedded technical metadata.

Pre-conditions: Specifications of files are already determined (i.e. all access files are QuickTime-wrapped .mov); associated utilities are available to read metadata.

Post-conditions: Filenames include accurate extensions; the content manager is delivered a report of any/all inaccurately named files in the directory.
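
The user story above could be prototyped roughly as follows. This is a sketch under stated assumptions, not the proposed tool: it inspects only the first 12 bytes of each file against a tiny, hand-written signature table (not taken from PRONOM), and the names `sniff_container`, `EXPECTED_FAMILY`, and `report_mismatches` are hypothetical. A real implementation would more likely rely on MediaInfo, Exiftool, or a format registry.

```python
from pathlib import Path

# Illustrative mapping from extension to the container family we expect.
EXPECTED_FAMILY = {
    ".mov": "quicktime",
    ".mp4": "iso-bmff",
    ".wav": "wav",
    ".avi": "avi",
}


def sniff_container(header):
    """Guess a container family from the first 12 bytes of a file, or None."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "avi"
    if header[4:8] == b"ftyp":
        # The major brand "qt  " marks QuickTime; other brands are treated
        # here as generic ISO base media files (MP4 and relatives).
        return "quicktime" if header[8:12] == b"qt  " else "iso-bmff"
    if header[4:8] in (b"moov", b"mdat", b"wide", b"free", b"skip"):
        # QuickTime files without an ftyp atom (simplifying assumption).
        return "quicktime"
    return None


def report_mismatches(directory, expected=EXPECTED_FAMILY):
    """Return (filename, detected_family) for files whose content
    does not match what their extension promises."""
    mismatches = []
    for path in sorted(Path(directory).iterdir()):
        wanted = expected.get(path.suffix.lower())
        if wanted is None or not path.is_file():
            continue  # extension not covered by the table
        with open(path, "rb") as handle:
            detected = sniff_container(handle.read(12))
        if detected != wanted:
            mismatches.append((path.name, detected))
    return mismatches
```

Each entry in the returned list names a file and what it appears to be instead (`None` when the header is not recognized), which could feed the report described in the post-conditions.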

Possible starting points

Registries for extension associations (ex. PRONOM: http://www.nationalarchives.gov.uk/PRONOM/Default.aspx)

MediaInfo: http://mediaarea.net/en/MediaInfo

Exiftool: http://www.sno.phy.queensu.ca/~phil/exiftool/

Related skillsets

Python (probably a decent amount of parsing to be done)

Data set required

Require media files with matching or non-matching extensions.

Submitted by

Kathryn Gronsbell

Interested team members/participant roles

Who wants to work on this project? Might be a good opportunity for content managers to bring in some media files to scan!