Clean Collections: Using OpenRefine to Manage Messy Metadata

Speaker: Kathryn Gronsbell
Kathryn focuses on helping develop customized digital preservation and collection management strategies. She specializes in lifecycle management to support ongoing preservation and access to audiovisual material, with a focus on production methodologies and content creator support. Recent projects include taxonomy development and management for the cultural heritage sector, content modeling, DAMS configuration and implementation, and requirements gathering for technology selection. Kathryn also designs educational materials for digital preservation tools and concepts, including the freely available ExifTool and Fixity tutorials. Her instructional outreach includes workshops, presentations, and customized training for organizations.


Before joining AVPreserve, Kathryn earned her MA in Moving Image Archiving and Preservation from New York University after receiving a BA in Film & Media Arts from Temple University.

Full Description

Registration for this program has reached full capacity. Please contact Laura Forshay at lforshay@metro.org to be placed on our waiting list. You will be notified if any space becomes available.

Messy, inconsistent metadata makes collection management tasks challenging, yet it is the unfortunate reality for most of us. In this workshop, participants will learn the basics of using OpenRefine (formerly Google Refine), "a free, open source power tool for working with messy data," to analyze, normalize, and clean up collections metadata so that datasets can be better integrated into workflows and across systems. The workshop is designed for practitioners who are interested in accessing, cleaning up, and modifying data with freely available tools. We will explore and explain how OpenRefine provides options for navigating challenging data and for normalizing both the formatting and the data itself.

Participants will walk through several practical exercises that apply common metadata transformation techniques to sample collections metadata. We'll explore approaches to transformation, such as text clustering and writing basic expressions, to get your data into its ideal state. Advanced OpenRefine topics, such as reconciliation of datasets against Freebase and other external datasets and web services, will be discussed, but not in depth. This is an introductory workshop, ideal for those who are new to OpenRefine and are interested in exploring its simple yet powerful features.
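For readers who want a feel for text clustering before the session, the sketch below approximates in Python the key-collision ("fingerprint") style of grouping that OpenRefine offers. It is an illustration only, not OpenRefine's own code: the function names and the sample values are hypothetical, and the normalization steps are a simplified version of the idea (trim, lowercase, strip accents and punctuation, then sort and de-duplicate the words).

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalized key so near-duplicates collide."""
    text = value.strip().lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = "".join(ch for ch in text if ch not in string.punctuation)
    # Sort and de-duplicate the words so word order no longer matters.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values that share a fingerprint; keep groups with 2+ variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical sample values, standing in for a messy metadata column.
sample_values = [
    "New York University",
    "new york university ",
    "University, New York",
    "Temple University",
]

for group in cluster(sample_values):
    print(group)
```

Each printed group is a set of variants that a person would then review and merge into one preferred form, which is the same review step OpenRefine asks for when it proposes clusters.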

 

IMPORTANT: After following the link to the right, please select the appropriate pricing option and submit your registration information. On the information summary page that follows, click the Register button at the bottom of the screen to complete your registration. All registrants will receive an automated confirmation email.


Please take a moment to review METRO's Community Expectations. Questions about registration or cancellation? Visit METRO's Registration Info page for policies and procedures.

When?

Wed, Aug. 12, 2015
3 p.m. - 5 p.m. US/Eastern

How Much?

Event has ended

Where?

METRO Training Center
57 East 11th Street, 4th Floor
New York, NY 10003