Open Knowledge

United States / 2014 70% open

United States is ranked #8 in the 2014 Index
United States's overall Index ranking is down from #2 in 2013

Rank Dataset Breakdown Location (URL) Format Info Prev. Score
1 National Statistics
Location (URL): http://www.census.gov/ | Format: XLS, ... | Prev.: #1 100% | Score: 100%
URL: http://www.census.gov/
Format: XLS, CSV, PDF

Census.gov has data on a wide variety of national statistics, including all major demographic and economic indicators (population, GDP, etc.). Tabular data is generally provided in machine-readable formats such as Excel (an open format such as CSV would be preferable, but we still consider this acceptable). The data is also available "in bulk" in the sense that there are complete files, and data is available via automated access on an FTP site at http://www2.census.gov/ (where CSV files are often available). Note that there are other sources of such data within the US government, including the Bureau of Economic Analysis (http://bea.gov) and the Bureau of Labor Statistics (http://bls.gov/), among others.

The data are in the public domain.


1 National Map
Location (URL): http://viewer.nationalmap.gov/... | Format: shape | Prev.: #1 100% | Score: 100%
URL: http://viewer.nationalmap.gov/viewer/
Format: shape

Use The National Map Viewer and Download Platform to visualize, inspect, and download our most current topographic base map data and products for free. Managed by the USGS National Geospatial Program (NGP), the National Map Viewer provides access to all eight primary data themes of The National Map to include US Topo and historical topographic map products. The viewer platform is extended upon the National Geospatial Intelligence Agency's (NGA) Palanterra x3 Viewer. Data include: Elevation, Orthoimagery, Hydrography, Geographic Names, Boundaries, Transportation, Structures, and Land Cover, while products include: US Topo and Historical Topo Maps. The National Map Viewer also allows visualization and identification queries (but not downloads) of Other Featured Data, to include Scanned Topo Maps, Ecosystems, Protected Areas, Gap Analysis Program Land Cover, Wetlands, Public Land Survey System, and National Park Service Boundaries. Also included is a Natural Hazards panel to view hazards-related information, such as for earthquakes, floods, wildfires, and weather, along with the U.S. National Grid for emergency response.

Observation: the expert review was done on the 27th and 28th of November, and on both dates the map viewer did not load in either Chrome or Firefox.


1 Legislation
Location (URL): http://www.gpo.gov/fdsys/bulkd... | Format: XML | Prev.: #1 100% | Score: 100%
URL: http://www.gpo.gov/fdsys/bulkdata/BILLS
Format: XML

The link provided is to bulk XML data for the United States Code, provided by the Office of the Law Revision Counsel of the US House of Representatives. The data has been available since July 2013 (see the announcement at http://www.speaker.gov/press-release/opengov-house-representatives-makes-us-code-available-bulk-xml).

Regarding open licensing, we assume that the US Code is in the public domain. In addition, no copyright assertion is mentioned on the site, and the congress.gov site is run by the Library of Congress and, one would anticipate, is subject to standard public domain provisions (though there is a legal section whose copyright portion is unfortunately rather unenlightening - http://beta.congress.gov/legal/#copyright).

There is a variety of additional (machine-readable) data from a variety of sources, not least the new Congress.gov website (which will be completely replacing http://thomas.loc.gov/ from November 2013). Other resources include:

  • Bulk data from the GPO in XML format including Congressional Bills, Commerce Business Daily etc. (Does not seem to be updated since Jan 2013). Announced Jan 10 2013 - see http://www.gpo.gov/pdfs/news-media/press/13news01.pdf

  • The full US Code on the GPO at: http://www.gpo.gov/fdsys/browse/collectionUScode.action?collectionCode=USCODE (PDF)

  • The Federal Register https://www.federalregister.gov/ which includes "Regulations are issued by federal agencies, boards, or commissions [which] explain how [an] agency intends to carry out a law." (Data is provided in HTML, CSV and JSON and there is a full API - see https://www.federalregister.gov/developers/api/v1 and https://www.federalregister.gov/blog/learn/developers)

  • Resources listed on http://speaker.gov/open including http://docs.house.gov/ (which includes XML versions of laws being considered) and House floor activities at http://clerk.house.gov/floorsummary/floor-download.aspx

In addition, it is worth noting various unofficial sites that provide excellent material, such as:

  • https://www.govtrack.us/

  • http://opencongress.org/

It may also be interesting to read how expensive some of this material once was; see e.g. Carl Malamud's comments in http://radar.oreilly.com/2009/03/bulk-data-downloads-government-transparency-breakthrough.html


1 Government Budget
Location (URL): http://www.whitehouse.gov/omb/... | Format: XLS, ... | Prev.: #1 100% | Score: 100%
URL: http://www.whitehouse.gov/omb/budget/
Format: XLS, CSV, PDF

The current budget is available at http://www.whitehouse.gov/omb/budget/. Formal budget documents are in PDF format and can be found at http://www.whitehouse.gov/omb/budget/Overview. Machine-readable data in Excel and CSV files can be found in the supplemental material at http://www.whitehouse.gov/omb/budget/Supplemental.

Furthermore, the Whitehouse.gov copyright notice states all material is CC-By (see http://www.whitehouse.gov/copyright).

  • Bulk: whilst the data is split across multiple files, there is a good core set of data in the "Public Budget Database", which consists of only 3 substantial CSV files. As such, we have marked Bulk as "Yes".

  • Machine-readable: whilst the formal budget is in PDF, we believe that all relevant data is contained in the Excel and CSV files, so Machine-readable is "Yes" (note also that the CSV files are well formatted).

  • Older budgets: the OMB budget site only contains data for the latest budget with past budgets available on the GPO website at: http://www.gpo.gov/fdsys/browse/collectionGPO.action?collectionCode=BUDGET. The GPO site only provides PDFs. However, the OMB site does contain a historical tables section at http://www.whitehouse.gov/omb/budget/Historicals and the Public Budget Database contains data back to 1962.


8 Location datasets
Location (URL): http://www.census.gov/geo/maps... | Format: TSV, SHP | Prev.: #1 100% | Score: 90%
URL: http://www.census.gov/geo/maps-data/data/docs/gazetteer/Gaz_zcta_national.zip
Format: TSV, SHP

This dataset in the US presents some challenges as there is a mixture of data available.

The US Census Bureau provides zip-code centroids as part of its 2010 Gazetteer, located at http://www.census.gov/geo/maps-data/data/gazetteer2010.html, in the form of the Zip-Code Tabulation Areas (ZCTAs):

http://www.census.gov/geo/maps-data/data/docs/gazetteer/Gaz_zcta_national.zip

This data is open data (openly licensed, machine readable, etc.).
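As a rough illustration of how a tab-separated Gazetteer file of this kind might be consumed, the sketch below parses a small sample into ZCTA centroids. It assumes the column names GEOID, INTPTLAT and INTPTLONG (our understanding of the 2010 Gazetteer layout; check the header of the actual file), and the sample rows are hypothetical values made up for illustration, not real Census data.

```python
import csv
import io

# Hypothetical sample in the ASSUMED layout of the ZCTA Gazetteer file:
# tab-separated, one header row, centroid in INTPTLAT / INTPTLONG.
# These rows are illustrative only and are NOT real Census values.
SAMPLE = (
    "GEOID\tPOP10\tINTPTLAT\tINTPTLONG\n"
    "00001\t100\t10.500000\t-20.250000\n"
    "00002\t200\t11.000000\t-21.750000 \n"
)


def read_centroids(text):
    """Return {zcta: (lat, lon)} from tab-separated Gazetteer-style text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    centroids = {}
    for row in reader:
        # Strip keys and values in case fields are padded with spaces.
        clean = {key.strip(): value.strip() for key, value in row.items()}
        # Keep the ZCTA code as a string so leading zeros are preserved.
        centroids[clean["GEOID"]] = (
            float(clean["INTPTLAT"]),
            float(clean["INTPTLONG"]),
        )
    return centroids


centroids = read_centroids(SAMPLE)
print(centroids["00001"])
```

Keeping the code as a string rather than an integer matters here, since many zip codes begin with zero.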

However, this data has some limitations, as it is based on the census, the last of which was in 2010. Over time, this database therefore falls out of date relative to the authoritative data held by the USPS. Given this, the data is marked as not timely.

In addition, the ZCTA isn’t as complete as the USPS zip-code list, as it focuses on geographical areas and so its coverage of e.g. PO boxes is more limited. However, given the specifications of this dataset, this should not be considered a major limitation and so we have not docked points anywhere else.

More on Zip-Codes and the USPS

The full database of zip codes from the USPS has not yet been found, although it looks like products are available for sale, and there are multiple datasets that together can help to provide a fuller picture.

The ZCTA file produced by the US Census Bureau is useful but is not the full USPS Database of Zip Codes and lookup tables. Instead, it is a census-related product produced every 10 years.

"You may have noticed that Census Bureau products refer to “ZIP Code tabulation areas (ZCTAs)” and not simply to ZIP Codes™. The reason that we cannot tabulate data for ZIP Codes is that they do not have distinct geographic boundaries. Designed by the U.S. Postal Service for use in mail delivery, ZIP Codes represent carrier routes made up of individual addresses. A true representation of ZIP Codes would separate out individual housing units and releasing data for them would risk disclosing personally identifiable information". https://ask.census.gov/faq.php?id=5000&faqId=10488

The ZCTAs are not the same as the (regularly updated) USPS postal code lookup files or the postal code database with addresses and Lat Long. Those data are for sale and a number of USPS services are resold.

Related USPS resources include:

  • A USPS API for companies to integrate their systems - https://www.usps.com/business/web-tools-apis/address-information-api.htm

  • Some data exchange for companies - https://www.usps.com/postalone/program.htm

  • Segmentation direct mail tools - https://www.usps.com/business/pdf/Segmentation_WP.pdf

  • Lots of info about databases and services - https://about.usps.com/publications/pub32/pub32_terms.htm

  • Lots of private sector tools such as http://www.zip-codes.com/zip-code-database.asp?gclid=CLG5_NWPncICFWSK2wodgIQAnA

In addition, there are also some crosswalk files made by HUD and posted on data.gov: http://catalog.data.gov/dataset/hud-usps-zip-code-crosswalk-files


13 Pollutant Emissions
Location (URL): http://www.epa.gov/ttn/chief/e... | Format: CSV, ... | Prev.: #1 100% | Score: 70%
URL: http://www.epa.gov/ttn/chief/eiinformation.html
Format: CSV, dbf, mdm, wk4, xls, xlw, txt

Additional data can be easily accessed through the Data.gov Energy community at http://Energy.data.gov. The primary dataset with this material is the Toxic Release Inventory, but other datasets provide insight into additional environmental pollutants. Air pollution data exists at the given link. As the data is provided from federal websites and is therefore in the public domain (at least in the US), we consider it openly licensed.

There were no terms of use or license related to these data; however, they are presumed to be in the public domain, as is common for these federal agencies. A tweet was sent. The URL provided was for the entire EPA site, which did not point to terms of use.

I updated the file formats, as there are many more.


22 Election Results
Location (URL): http://www.fec.gov/pubrec/elec... | Format: XLS, PDF | Prev.: #1 100% | Score: 70%
URL: http://www.fec.gov/pubrec/electionresults.shtml
Format: XLS, PDF

Data prior to 2004 is not machine readable, although recent results are; this is marked as "machine readable" as this is the present data and direction of travel. Numbers are for "primary, runoff and general election results for the U.S. Senate, the U.S. House of Representatives and, when applicable, U.S. President." They are "obtained from each state’s election office and other official sources." Raw data for the 2000, 2004, 2008, and 2012 Presidential General Elections are also available through the National Atlas.

For elections on the state, local and territory level, please see the Combined Federal/State Disclosure and Election Directory. It provides contact information and links to the elections offices of the 50 states, the District of Columbia, American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands.

Also, see the efforts to standardize local government election data being conducted by: http://openelections.net/

There are restrictions on the commercial re-use of the data: "Reports and statements filed by political committees may be inspected and copied by anyone. The names and addresses of individual contributors, however, may not be sold or used for any commercial purpose or to solicit any type of contribution or donation, such as political or charitable contributions. 2 U.S.C. 438(a)(4); 11 CFR 104.15. This restriction applies to Federal reports and statements filed in Washington, as well as in each state. Any person who violates this restriction is subject to the penalties of 2 U.S.C. 437g" (http://www.fec.gov/pages/brochures/sale_and_use_brochure.pdf).


24 Transport Timetables
Location (URL): http://www.amtrak.com/train-sc... | Format: n/a | Prev.: #6 75% | Score: 45%
URL: http://www.amtrak.com/train-schedules-timetables


National Railroad Passenger Corporation, doing business as Amtrak, is a publicly funded railroad service operated and managed as a for-profit corporation. Amtrak provides the most meaningful national-scale transportation timetables, as the United States has no national bus service that is publicly funded. For information on local transportation data, see: http://www.citygoround.org/agencies/us/?public=all

Additional transportation data is available from the US Department of Transportation at http://catalog.data.gov/dataset?organization=dot-gov and more at the Data.gov Safety community at http://safety.data.gov. This includes transportation venues, locations, safety records, on-time arrivals, etc. (see http://data.gov). However, this question is about national bus and train timetables, which are harder to find. There is also a question as to whether national bus and train services in the US are in fact government or privately run.


66 Company Register
Location (URL): n/a | Format: n/a | Prev.: #58 5% | Score: 15%


In the US, corporate registration happens at the state level. The timeliness, availability, and licensing of this data varies among all 50 states. There is no federal dataset that contains all corporate registrations. It would be possible to create a unified open registry for all US corporations (even if only via aggregation from state ones) but this does not exist at this time.

Across those states, performance varies widely, and in many cases data is not available in bulk, is not machine readable, is not openly licensed, etc. For more detail, see the per-state summary on Open Corporates.


15 Government Spending
Location (URL): n/a | Format: n/a | Prev.: #4 90% | Score: 10%


2014 comments (taken from Hudson Hollister email): The Treasury Department receives and processes payment requests from nearly every agency (with the prominent exception of DoD). These payment requests collectively are known as the Payments Information Repository, or PIR. Treasury promised in 2012 testimony to the Senate that it would publish the PIR as open data, but has not fulfilled the promise (and the DATA Act doesn't require it to).

Details on Treasury's PIR promise are here: http://datacoalition.blogspot.com/2012/07/senate-hearing-illuminates-need-for.html

Becky Sweger (National Priorities Project): After a GAO audit (http://www.gao.gov/products/GAO-14-476) of USASpending.gov was published this summer, HHS agreed to start submitting aggregated Medicare spending data. I can confirm that Medicare benefits payments are showing up now (they didn't as of a few months ago), but didn't confirm the data are complete. If Medicare benefits are now represented in USASpending.gov, it's likely no longer true that the site represents "only a small part of federal spending data." I'd reword to "So while the data are available, they represent an incomplete picture of federal spending that does not include government salaries, benefits, and other operating expenditures."

Agency procedures for reporting money paid to (or on behalf of) individuals are inconsistent. The Social Security Administration has always done a great job of reporting. HHS finally decided to aggregate and report Medicare. USDA, however, doesn't accurately report food stamp benefits in USASpending.gov. Since this type of payment represents the largest chunk of U.S. spending, it's important to have consistent and enforced guidelines for reporting.

2013 comments: The data at USASpending.gov accounts for only a fraction of all government spending, and it is organized in a way that makes it hard to understand and use. So while the data are available, they are only a small part of federal spending data. They do not include expenses on government salaries and operating expenditures, or information on Medicare, the nation's government-sponsored medical insurance for the elderly (~20% of total spending). There are no government-wide spending records of the kind that would actually be more helpful; that is, the government doesn't collect the type of data about its own spending that would be useful to the open government world.

The data available at USAspending.gov comes from a variety of sources and is presumably licensed based on the original source licenses: http://usaspending.gov/learn?tab=Sources%20of%20Data


Contributors

Reviewers

  • Tracey P. Lauriault
  • Georg Neumann
  • Daniela Mattern
  • Gil Zaretzer
  • Katelyn Rogers
  • Kamil Gregor
  • Neal Bastek
  • Mor Rubinstein
  • Zach Christensen
  • anonymous
  • Nisha Thompson
  • Rebecca Sentance
  • Tryggvi Björgvinsson
  • Codrina Maria Ilie

Submitters

  • Daniela Mattern
  • Rufus Pollock
  • Blessing Jee
  • Mor Rubinstein
  • Rebecca Williams - XFB
  • anonymous
  • Tracey P. Lauriault