NYC building footprint data in TileMill

25 May 2013

The NYC OpenData site has a very appealing-sounding dataset here (as of 5/25/2013):

Building Footprints - NYC OpenData

Upon downloading it you'll notice it's a directory building_1012.gdb. What?

It's an ESRI File Geodatabase (in GDAL parlance, "FileGDB") and the commenters on the NYC OpenData site are not pleased:


So you're welcome to follow the yellow brick road and perform the pagan ritual of compiling GDAL with FileGDB support, which is beautifully outlined here. But DON'T BOTHER.

The proprietary binaries ESRI distributes to access FileGDBs only work with files created by ArcCatalog 10.0 or later. The ones published are made using ArcCatalog 9.3.

For the old format you need ArcCatalog apparently. I've saved you the pain and made a Shapefile available here:

Once you have this as a Shapefile, there are still a few caveats.

I chose to put the data finally into PostGIS, since a) that's where the rest of my data is and b) with a large datasource like this (~300MB) Mapnik seemed to render faster reading from Postgres as opposed to directly from the Shapefile. More to investigate there.

This is what I used to convert the Shapefile to PostGIS:

./bin/ogr2ogr -f "PostgreSQL" PG:"dbname=plg user=Bdon" ~/Downloads/NYC_Building_Footprints/building_1012.shp -skipfailures

You'll need the -skipfailures flag because some of the geometries exported from ArcCatalog are MultiPolygons. So you'll be dropping a few buildings on the floor, but for my use case this doesn't really matter. I'm sure there's a way to get ogr2ogr to play nicely with those.

UPDATE: @dangerscarf sez, "for shp > ogr2ogr > postgres, -nlt PROMOTETOMULTI is the secret to make postgres play nicely with the Polygon/MultiPolygon mix" (on the command line the option is spelled -nlt PROMOTE_TO_MULTI).
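Here's a rough sketch of what that import might look like with the promote-to-multi option instead of skipping failures. I haven't re-run the full import this way, so treat it as a starting point. build_import_cmd is a made-up helper that only prints the command (a dry run), and the database name, user and path are just the ones used above.

```shell
#!/bin/sh
# Dry-run sketch: print the import command rather than running it.
# build_import_cmd is a hypothetical helper, not part of GDAL.
build_import_cmd() {
  shp="$1"   # path to the Shapefile (assumed to contain no spaces)
  db="$2"    # target database name
  user="$3"  # database user
  # -nlt PROMOTE_TO_MULTI asks ogr2ogr to store every Polygon as a
  # MultiPolygon, so a mixed layer fits a single geometry type.
  printf 'ogr2ogr -f "PostgreSQL" PG:"dbname=%s user=%s" %s -nlt PROMOTE_TO_MULTI\n' \
    "$db" "$user" "$shp"
}

build_import_cmd "$HOME/Downloads/NYC_Building_Footprints/building_1012.shp" plg Bdon
```

Pipe the output to sh once it looks right. Note that if you go this route, the geometry column you add afterwards would presumably need to be declared 'MULTIPOLYGON' rather than 'POLYGON'.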

The building_1012 table has a 'bin' column. This is the Building Identification Number used by the Department of Buildings. You can do a BIN lookup here to find all the metadata: NYC Department of Buildings

Next, do our favorite thing of loading the PostGIS functions and spatialrefsys table into our database:

psql -d plg -f /usr/local/Cellar/postgis/2.0.3/share/postgis/postgis.sql
psql -d plg -f /usr/local/Cellar/postgis/2.0.3/share/postgis/spatial_ref_sys.sql

Add a geometry column to our building_1012 table:

plg=# SELECT AddGeometryColumn ('public','building_1012','geom',900914,'POLYGON',2,false);
                     addgeometrycolumn                      
------------------------------------------------------------
 public.building_1012.geom SRID:900914 TYPE:POLYGON DIMS:2 
(1 row)

Now convert your WKB geometries:

plg=# update building_1012 set geom = ST_GeomFromEWKB(wkb_geometry);
UPDATE 1053713

You'll notice that the exported projection is SRID=900914:

plg=# select * from spatial_ref_sys where srid = 900914;
srid      | 900914
auth_name | 
auth_srid | 
srtext    | ...
proj4text | +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs 

So to visualize this in TileMill you can use the 'Custom' SRS with the aforementioned proj4text.
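If you'd rather not copy the proj4 string out of the expanded psql output by hand, something like the following might help. srs_query is a hypothetical helper that just prints the lookup SQL for a given SRID; it assumes the same spatial_ref_sys table and SRID as above.

```shell
#!/bin/sh
# srs_query is a hypothetical helper: it prints the SQL that looks up
# the proj4 definition for one SRID in spatial_ref_sys.
srs_query() {
  printf 'select proj4text from spatial_ref_sys where srid = %d;\n' "$1"
}

# Print the query for the SRID used by the building footprints.
srs_query 900914
```

Piping it into psql with the unaligned, tuples-only switches (srs_query 900914 | psql -d plg -At) should print just the proj4 string, ready to paste into TileMill's 'Custom' SRS field.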

tilemill screenshot