by Davis Erin Anderson

On a temperate day last week, METRO hosted a workshop we’d planned as part of NYC Open Data Week. The themed week had long since passed; our initial program was postponed due to a snow day in early March — the second of three weekly nor’easters that bludgeoned our area at the end of a long winter.

Librarians and archivists, reporters and investigators, and NYC Open Data Coordinators — many still wearing scarves — met for an afternoon workshop to unpack the mysteries contained within a crucial asset: the data dictionary.

By regulation, data dictionaries appear alongside every data set hosted in NYC’s Open Data portal. They are documents that define the terms that appear in the data itself. As Julia Marden, our workshop facilitator, said, “the purpose of a data dictionary is to explain what all the variable names and values in your spreadsheet really mean.”
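For readers who like to see an idea in code, here is a minimal Python sketch of what checking a spreadsheet against its data dictionary could look like. The column names and definitions are invented for illustration and are not taken from any real NYC Open Data set, and `find_gaps` is a hypothetical helper rather than part of any portal tooling.

```python
# Hypothetical data dictionary: column name -> plain-language definition.
# These names are made up for illustration, not copied from a real data set.
data_dictionary = {
    "boro": "Borough where the hotspot is located",
    "type": "Access type, such as free or limited free",
    "provider": "",  # listed in the dictionary, but with no definition
}

# Hypothetical column headers from the spreadsheet itself.
dataset_columns = ["boro", "type", "provider", "ssid"]


def find_gaps(columns, dictionary):
    """Return (columns missing from the dictionary, columns with blank definitions)."""
    undocumented = [c for c in columns if c not in dictionary]
    blank = [c for c in columns if c in dictionary and not dictionary[c].strip()]
    return undocumented, blank


undocumented, blank = find_gaps(dataset_columns, data_dictionary)
print("Not in the dictionary:", undocumented)
print("Listed but undefined:", blank)
```

A check like this only catches terms that are missing or blank. Whether a definition makes sense to someone without insider knowledge still takes a human reader.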

But the majority of these resources tend to feature terms that require insider knowledge. Our mission last week was to try out a framework for working with Open Data Coordinators to address these issues. We ran a QA process in three steps: identifying research questions relevant to example data sets, working through those questions, and noting the challenges people face when they try to use individual data dictionaries.

My group ran a QA process on the data dictionaries for both the Wifi Hotspots Location data set and the Building Footprints data set. Dominic Mauro, an Open Data Coordinator from DOITT, was on hand to answer questions regarding the data sets under his purview.

The Building Footprints data set raised a lot of questions for me about land use in New York City, and Dominic recommended a couple of sources for further research. One is New York City’s Zoning & Land Use Map, which provides a wealth of information about our city’s buildings. The other is NYC Then & Now, which overlays historic maps of our neighborhoods to show how our city has developed over time. Both of these fascinating resources sent me down a path of discovery, which is why I wanted to share them here.

If you missed this workshop and are eager to dive into the treasure that is the NYC Open Data Portal, be sure to check out our Introduction to NYC Open Data workshop with Julia Marden on Tuesday, May 22 from 4:00 p.m. until 7:00 p.m.