The NYC OpenData site has a very appealing-sounding dataset here (as of 5/25/2013):
Upon downloading it you'll notice it's a directory building_1012.gdb. What?
It's an ESRI File Geodatabase (in GDAL parlance, "FileGDB") and the commenters on the NYC OpenData site are not pleased:
So you're welcome to follow the yellow brick road and perform the pagan ritual of compiling GDAL with FileGDB support, which is beautifully outlined here. But DON'T BOTHER.
The proprietary binaries ESRI distributes to access FileGDBs only work with files created by ArcCatalog 10.0 or later. The ones published are made using ArcCatalog 9.3.
For the old format you need ArcCatalog apparently. I've saved you the pain and made a Shapefile available here:
Once you have this as a Shapefile, there are still a few caveats.
I ultimately chose to put the data into PostGIS, since a) that's where the rest of my data is and b) with a large data source like this (~300MB), Mapnik seemed to render faster reading from Postgres than directly from the Shapefile. More to investigate there.
This is what I used to convert the Shapefile to PostGIS:
./bin/ogr2ogr -f "PostgreSQL" PG:"dbname=plg user=Bdon" ~/Downloads/NYC_Building_Footprints/building_1012.shp -skipfailures
You'll need -skipfailures because some of the exported geometries from ArcCatalog are MultiPolygons. So you'll be dropping a few buildings on the floor, but for my use case this doesn't really matter. I'm sure there's a way to get ogr2ogr to play nicely with those.
UPDATE: @dangerscarf sez, "for shp > ogr2ogr > postgres, -nlt PROMOTE_TO_MULTI is the secret to make postgres play nicely with the Polygon/MultiPolygon mix"
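Putting that tip to work, here's a minimal sketch of the import that promotes every Polygon to a MultiPolygon instead of skipping the failures. Note that ogr2ogr spells the value PROMOTE_TO_MULTI, with underscores. The function name import_buildings and the OGR2OGR override variable are made up for illustration (the override just lets you dry-run with echo), and I'm assuming a GDAL build recent enough to know the flag (1.10 or later, as far as I know).

```shell
#!/bin/sh
# Sketch: import the Shapefile into PostGIS, promoting Polygons to
# MultiPolygons so the mixed geometry types fit in a single column.
# OGR2OGR is a hypothetical override (not a GDAL feature) for dry runs.
import_buildings() {
  shp="$1"      # path to building_1012.shp
  pg_conn="$2"  # OGR connection string, e.g. "PG:dbname=plg user=Bdon"
  "${OGR2OGR:-ogr2ogr}" -f "PostgreSQL" "$pg_conn" "$shp" -nlt PROMOTE_TO_MULTI
}
```

You'd call it as import_buildings ~/Downloads/NYC_Building_Footprints/building_1012.shp "PG:dbname=plg user=Bdon". One thing to watch if you go this route: every row comes out as a MultiPolygon, so a geometry column declared as 'POLYGON' would likely reject them, and 'MULTIPOLYGON' would be the matching type.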
The building_1012 table has a 'bin' column. This is the Building Identification Number used by the Department of Buildings. You can do a BIN lookup here to find all the metadata: NYC Department of Buildings
Next, do our favorite thing of loading the PostGIS functions and the spatial_ref_sys table into our database:
psql -d plg -f /usr/local/Cellar/postgis/2.0.3/share/postgis/postgis.sql
psql -d plg -f /usr/local/Cellar/postgis/2.0.3/share/postgis/spatial_ref_sys.sql
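Before going further it's worth a quick sanity check that both scripts actually took. This is a rough sketch: check_postgis and the PSQL override variable are hypothetical names of mine, while postgis_full_version() and the spatial_ref_sys table come with PostGIS itself.

```shell
#!/bin/sh
# Sketch: confirm the PostGIS functions and the spatial_ref_sys table exist.
# PSQL is a hypothetical override (not a psql feature) for dry runs.
check_postgis() {
  db="$1"  # database name, e.g. plg
  # Fails if the PostGIS functions were not loaded.
  "${PSQL:-psql}" -d "$db" -c "SELECT postgis_full_version();" &&
  # Fails if the table is missing; a count of zero means it is empty.
  "${PSQL:-psql}" -d "$db" -c "SELECT count(*) FROM spatial_ref_sys;"
}
```

Running check_postgis plg should print a version string followed by a row count. An error from either query means the corresponding .sql file didn't load.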
Add a geometry column to our building_1012 table:
plg=# SELECT AddGeometryColumn ('public','building_1012','geom',900914,'POLYGON',2,false);
                     addgeometrycolumn
------------------------------------------------------------
 public.building_1012.geom SRID:900914 TYPE:POLYGON DIMS:2
(1 row)
Now convert your WKB geometries:
plg=# update building_1012 set geom = ST_GeomFromEWKB(wkb_geometry);
UPDATE 1053713
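To see what actually landed in the new column, a quick tally of geometry types and SRIDs can help. This is only a sketch, assuming the geom column and building_1012 table from the steps above. summarize_geoms and the PSQL override are hypothetical names, while GeometryType and ST_SRID are standard PostGIS functions.

```shell
#!/bin/sh
# Sketch: count rows per geometry type and SRID in building_1012.geom.
# PSQL is a hypothetical override (not a psql feature) for dry runs.
summarize_geoms() {
  db="$1"  # database name, e.g. plg
  "${PSQL:-psql}" -d "$db" -c "
    SELECT GeometryType(geom) AS type, ST_SRID(geom) AS srid, count(*) AS n
    FROM building_1012
    GROUP BY 1, 2
    ORDER BY n DESC;"
}
```

Rows whose geom is NULL show up as their own group with an empty type, which is a handy way to spot anything that didn't convert.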
You'll notice that the exported projection is SRID=900914:
plg=# select * from spatial_ref_sys where srid = 900914;
srid      | 900914
auth_name |
auth_srid |
srtext    | ...
proj4text | +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
So to visualize this in TileMill you can use the 'Custom' SRS with the aforementioned proj4text.
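If you'd rather check the projection before firing up TileMill, one option is to reproject a few building centroids to plain longitude/latitude and eyeball them. This sketch assumes the geom column is populated as above and that spatial_ref_sys contains both 900914 and EPSG:4326. sample_lonlat and the PSQL override are hypothetical names of mine. The proj4 parameters also look to me like those of EPSG:2263 (NAD83 / New York Long Island, US feet), though that equivalence is my reading and not something I've verified against this dataset.

```shell
#!/bin/sh
# Sketch: print a handful of building centroids as WGS84 lon/lat points.
# PSQL is a hypothetical override (not a psql feature) for dry runs.
sample_lonlat() {
  db="$1"  # database name, e.g. plg
  # -A: unaligned output, -t: tuples only (no headers or footers)
  "${PSQL:-psql}" -d "$db" -At -c "
    SELECT ST_AsText(ST_Transform(ST_Centroid(geom), 4326))
    FROM building_1012
    WHERE geom IS NOT NULL
    LIMIT 5;"
}
```

For New York City the points should come out somewhere around longitude -74 and latitude 40.7. Anything wildly different suggests the SRID or the proj4text is off.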