NYC PLUTO Dataset
(NYC buildings by age. Thanks to @Caged for the color scheme from his Portland map)
The PLUTO dataset - the most extensive land use dataset published by the NYC city government - was made free last week (h/t @SR_Spatial).
I'll walk through the steps I took to join three datasets - Building footprint, PLUTO, and PAD to generate the above map.
- BIN: The Building Identification Number assigned by the Dept of Buildings.
- PLUTO: a CSV file (one for each borough) that contains all tax blocks/lots.
- PAD: An address database. Notably, this can contain multiple rows for one unique combination of (BIN, Boro, Tax Block, Tax Lot)
Data quality notes:
- There is a significant amount of missing data in the PAD dataset. Notably, all the null-valued buildings for each borough have a BIN of 1000000, 2000000, 3000000, etc.
- PLUTO uses boro like 'MN' 'BX' BK', while PAD uses 1,2,3.
- You should reduce tax lots and blocks to numeric valued columns, since what is 405 in PLUTO is 00405 in PAD.
From my last post - you can consume the Shapefiles directly with ogr2ogr into a PostGIS table.
Then add a date column like follows:
alter table building_1012 add column year numeric(4,0);
The CREATE TABLE statement is quite large, it's in the GitHub repo here.
After you've created the table, import all 5 boroughs like this:
(the Queens dataset had a corrupted row which I corrected in vi).
copy nyc_pluto_13v1 from '/Users/Bdon/workspace/nyc_pluto_13v1/mn13v1.csv' delimiter ',' header csv;
copy nyc_pluto_13v1 from '/Users/Bdon/workspace/nyc_pluto_13v1/bk13v1.csv' delimiter ',' header csv;
copy nyc_pluto_13v1 from '/Users/Bdon/workspace/nyc_pluto_13v1/qn13v1.csv' delimiter ',' header csv;
copy nyc_pluto_13v1 from '/Users/Bdon/workspace/nyc_pluto_13v1/bx13v1.csv' delimiter ',' header csv;
copy nyc_pluto_13v1 from '/Users/Bdon/workspace/nyc_pluto_13v1/si13v1.csv' delimiter ',' header csv;
The CREATE TABLE statement is in this gist.
I reduced the PAD dataset to only unique bins, because two rows that have the same BIN just specified an alias address.
copy bobaadr from '/Users/Bdon/workspace/pad13a/bobaadr.txt' delimiter ',' header csv;
Then I normalized the data as follows:
Delete any rows from PAD where the BIN is not known.
delete from bobaadr where bin = '1000000' or bin = '2000000' or bin = '3000000' or bin = '4000000' or bin = '5000000';
Create a secondary table which will be unique over BINs.
create table bobaadr_onlyvalid_uniquebin as select * from bobaadr where 1=2;
insert into bobaadr_onlyvalid_uniquebin select distinct on (bin) * from bobaadr;
Make PAD joinable over the `borough` column on nyc_pluto_13v1.
alter table bobaadr_onlyvalid_uniquebin add column borocode char(2);
update bobaadr_onlyvalid_uniquebin set borocode = 'MN' where boro = '1';
update bobaadr_onlyvalid_uniquebin set borocode = 'BX' where boro = '2';
update bobaadr_onlyvalid_uniquebin set borocode = 'BK' where boro = '3';
update bobaadr_onlyvalid_uniquebin set borocode = 'QN' where boro = '4';
update bobaadr_onlyvalid_uniquebin set borocode = 'SI' where boro = '5';
Now to do the join to populate the building dataset, you will need to do a large join:
update building_1012 b set year = n.year_built from bobaadr_onlyvalid_uniquebin a, nyc_pluto_13v1 n
where a.bin = b.bin and a.borocode = n.borough and a.block = n.block_numeric and a.lot = n.lot_numeric;
This will be extremely slow - you should add some composite indices onto nyc_pluto_13v1 on (block, lot).
create index bobaadr_onlyvalid_uniquebin_bin_idx on bobaadr_onlyvalid_uniquebin(bin);
create index nyc_pluto_13v1_block_lot_idx on nyc_pluto_13v1(block,lot);
It's also recommended to make sure your local PostGIS is tuned well. on my 8GB macbook air I gave 2GB of shared_buffers to postgresql.
These two blog posts are the ones I usually refer to when tuning PostgreSQL: link
The final result is a building dataset that you can color-code by age, as shown above.
Some interesting regions on the building age map:
Brooklyn Heights - the oldest buildings in the city.
Borough Park has an interesting pattern of building ages -
Lefferts Manor historic district.