Government Data and Needle

Groups like the Sunlight Foundation, the Center for Responsive Politics, and FactCheck.org are doing a lot to make our governments more transparent and more effective. Needle could be a valuable tool in this work, both for civil servants who want their data to be truly open and accessible and for researchers and citizens who want to make sense of the data that is already available. Three new Needle domains demonstrate this: Federal Disaster Declarations, White House Nominations and Appointments, and data.gov's Raw Data Catalog.

Government data comes in a variety of formats, and getting the whole picture means integrating these disparate sources. For example, data.gov has a table of official disaster declarations since 1953, but that file does not include declarations for the current year. To fill the gap, we can use Needle to scrape the fema.gov website and combine the two sources.
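To make the idea of combining two sources concrete, here is a minimal sketch in plain Python. It is not Needle's own syntax, and every record, field name, and identifier in it is a made-up placeholder rather than real FEMA or data.gov data. It assumes each declaration carries a unique ID, so that rows appearing in both sources can be collapsed into one.

```python
# Placeholder records: IDs, states, and years are invented for illustration only.
historical_file = [
    {"declaration_id": "X-001", "state": "GA", "year": 1953},
    {"declaration_id": "X-002", "state": "TX", "year": 1953},
]
scraped_current_year = [
    {"declaration_id": "X-900", "state": "MA", "year": 2010},
    {"declaration_id": "X-002", "state": "TX", "year": 1953},  # also in the file
]


def combine_sources(*sources):
    """Merge record lists, keeping the first record seen for each declaration_id."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["declaration_id"], record)
    # Sort by year, then ID, so the combined table reads chronologically.
    return sorted(merged.values(), key=lambda r: (r["year"], r["declaration_id"]))


combined = combine_sources(historical_file, scraped_current_year)
for record in combined:
    print(record["declaration_id"], record["state"], record["year"])
```

Keying on a shared identifier is only one way to reconcile overlapping sources; if the two sources did not share an ID, some other matching rule would be needed.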

Open government needs to be about more than files sitting on a server somewhere: the data needs to be understood. The Nominations and Appointments domain is an example of how Needle can help here, too. While the White House's list of pending nominations allows for some rudimentary searching and sorting, with Needle you can define new columns to tell who's been waiting the longest, or which agencies have the longest average wait. Data in Needle isn't just static lists or tables: it's a network you can explore along any path, and the results of your searches can themselves be structured sets. For example, to determine which individuals have received multiple nominations, we group nominations by the person nominated and then keep only those groups with more than one nomination.
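The grouping and derived-column steps described above can be sketched in plain Python. This is an illustration of the logic only, not Needle's query language; the nominees, agencies, dates, and function names are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical nominations; names, agencies, and dates are placeholders.
nominations = [
    {"nominee": "Person A", "agency": "Agency X", "nominated_on": date(2010, 1, 5)},
    {"nominee": "Person B", "agency": "Agency Y", "nominated_on": date(2010, 3, 1)},
    {"nominee": "Person A", "agency": "Agency Y", "nominated_on": date(2010, 4, 12)},
]


def days_waiting(nomination, today):
    """Derived column: days elapsed since the nomination was made."""
    return (today - nomination["nominated_on"]).days


def multiple_nominees(rows):
    """Group nominations by nominee and keep groups with more than one entry."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["nominee"]].append(row)
    return {name: group for name, group in groups.items() if len(group) > 1}


def average_wait_by_agency(rows, today):
    """Average days waiting across each agency's nominations."""
    waits = defaultdict(list)
    for row in rows:
        waits[row["agency"]].append(days_waiting(row, today))
    return {agency: sum(days) / len(days) for agency, days in waits.items()}


today = date(2010, 5, 1)  # fixed reference date so the example is repeatable
repeat_nominees = multiple_nominees(nominations)
longest_wait = max(nominations, key=lambda n: days_waiting(n, today))
averages = average_wait_by_agency(nominations, today)
```

Note that the result of `multiple_nominees` is itself a structured set: each key maps to the full list of that person's nominations, which can be explored further.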

These two faces make Needle unique: an interface for web-scraping and data ingest, combined with a data explorer. One more example demonstrates this well. The data.gov site is the centerpiece of the federal government's drive for transparency. It provides a decent search interface for finding a particular data set, but if you're more interested in the kinds of data sets available, or the popularity of data from different agencies, it's less helpful. Ironically for an open government platform, data.gov does not advertise whether its own metadata is available in a machine-readable form. With Needle we can still scrape the website, assimilate the data, and see the popularity of different file types, or determine the agency with the most downloads.
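As a rough sketch of the kind of summary meant here, the following plain Python totals downloads by file type and by agency. It is not Needle's interface, and the catalog rows, agency names, and download counts are invented placeholders rather than real data.gov figures.

```python
from collections import Counter

# Hypothetical catalog rows as they might look after scraping; all values are made up.
catalog = [
    {"agency": "Agency X", "file_type": "CSV", "downloads": 120},
    {"agency": "Agency Y", "file_type": "XML", "downloads": 45},
    {"agency": "Agency X", "file_type": "XML", "downloads": 30},
    {"agency": "Agency Z", "file_type": "CSV", "downloads": 80},
]


def downloads_by(rows, field):
    """Total the download counts for each distinct value of the given field."""
    totals = Counter()
    for row in rows:
        totals[row[field]] += row["downloads"]
    return totals


by_type = downloads_by(catalog, "file_type")
by_agency = downloads_by(catalog, "agency")
top_agency, top_downloads = by_agency.most_common(1)[0]
```

The same helper answers both questions because "popularity of file types" and "agency with the most downloads" are the same aggregation over a different column.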

This is just a start. To see what Needle can do for you, sign up for the preview release.

 

Mass Technology Leadership Council - 2010 Finalist

Copyright © 2010-2011 ITA Software, Inc.