nyc-db

Let's research the landlord! New York City is in a housing crisis. Some landlords leave their buildings in disrepair and let their tenants suffer without heat in winter. Others evict their tenants, legally or illegally, in order to flip buildings and profit off of gentrification. Affordable housing is a scarce resource.

Residents, lawyers, tenants, and organizers who want to use data in their struggle either turn to proprietary databases and resources designed for the real estate industry, like PropertyShark, or contend with CSVs and printouts from city websites. NYC-DB aims to give a leg up to technologists and researchers who volunteer their time to help community groups defend the city against the real estate industry, by providing a ready-to-use database filled with housing data.

NYC-DB builds a PostgreSQL database containing the following datasets:

  • Department of City Planning's Pluto: versions 15v1, 16v2, 17v1, 18v1, and 18v2
  • DOB Job Filings
  • DOB Complaints
  • HPD Violations
  • HPD Registrations
  • HPD Complaints
  • Department of Finance Rolling Sales
  • Tax bills - Rent Stabilization Unit Counts (John Krauss's data)
  • ACRIS
  • 2017 Marshal Evictions
  • ECB / Oath Hearings
  • Property Address Directory
  • J-51 Exemptions

NYC-DB is a Python 3 command-line program that downloads datasets and loads them into Postgres.

(Easy) Get a copy

Just want a copy of the database?

Here are the latest versions available to download:

License: CC BY-NC-SA 4.0

It's ~3 GB compressed and ~25 GB decompressed.

Load the db: bzcat nyc-db-2019-07-24.sql.bz2 | psql -d database-name
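
The same load step can be scripted. The following is a minimal sketch, assuming bzcat and psql are installed and on the PATH; the function names load_commands and load_dump are hypothetical helpers for illustration and are not part of nycdb.

```python
import subprocess


def load_commands(dump_path, database):
    """Return the two argv lists equivalent to `bzcat DUMP | psql -d DATABASE`."""
    return ["bzcat", dump_path], ["psql", "-d", database]


def load_dump(dump_path, database):
    """Stream the decompressed dump into psql; return 0 only if both steps succeed."""
    bzcat_cmd, psql_cmd = load_commands(dump_path, database)
    with subprocess.Popen(bzcat_cmd, stdout=subprocess.PIPE) as bzcat:
        # psql reads the SQL text directly from bzcat's stdout, so the
        # ~25 GB decompressed dump never has to be written to disk.
        psql = subprocess.run(psql_cmd, stdin=bzcat.stdout)
    # Leaving the `with` block closes the pipe and waits for bzcat to exit.
    return psql.returncode or bzcat.returncode
```

Calling load_dump("nyc-db-2019-07-24.sql.bz2", "database-name") would run the same pipeline as the command above. The target database is assumed to exist already.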

(Easy) Use our copy of the database

The Housing Data Coalition (HDC) hosts its own copy ("instance") of nycdb.

If you are not a member of HDC, please contact [email protected]

If you are a member of HDC, access credentials are in the description at the top of the Slack channel "nycdb-hackers". Take note of the hostname, user, and password. You will be using the database "nycdb". Make sure that you do NOT have "http://" in front of the hostname.

The easiest way to connect is with a graphical interface like Postico, DBeaver, or Falcon SQL. You will be "connecting to a server". If you have the option, select PostgreSQL, which is the specific kind of SQL database that we are using.

Another option is to connect from the command line. Installing PostgreSQL gives you the command-line tool "psql". Use it as follows, replacing "hostname" and "user" with the actual credentials:

psql -h hostname -U user -d nycdb

It will prompt you for the password.
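
Since a stray "http://" in the hostname is an easy mistake to make, the connection command can be assembled by a small helper that strips it. This is a sketch only: clean_hostname and psql_argv are hypothetical names, and the hostname and user shown in the usage note are placeholders, not real credentials.

```python
def clean_hostname(hostname):
    """Strip an accidental URL scheme and trailing slash from a hostname."""
    for scheme in ("http://", "https://"):
        if hostname.startswith(scheme):
            hostname = hostname[len(scheme):]
    return hostname.rstrip("/")


def psql_argv(hostname, user, database="nycdb"):
    """Build the psql argument list; psql itself still prompts for the password."""
    return ["psql", "-h", clean_hostname(hostname), "-U", user, "-d", database]
```

For example, psql_argv("http://db.example.org/", "alice") yields an argument list with the bare host db.example.org, which can be passed to subprocess.run.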

Adding New Datasets (Advanced)

Guide Here

Build it yourself! (Advanced)

nycdb cli

To manage and create copies of the database yourself, you can use the nycdb command line tool, available on PyPI: pip3 install nycdb

See src/README.md for more information on using the command line tool.

Using the Makefile to build the database

As a convenience you can create the database in one go using this command:

make nyc-db DB_HOST=localhost DB_DATABASE=nycdb DB_USER=databaseuser DB_PASSWORD=mypassword
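
If the build is driven from a script, the make invocation can be assembled from named settings. A minimal sketch, assuming the Makefile accepts the variables shown in the command above; make_argv is a hypothetical helper, not something the project provides.

```python
def make_argv(target, **variables):
    """Build a make command line with NAME=value variable overrides."""
    # Keyword arguments keep their insertion order, so the overrides
    # appear in the same order they were passed.
    return ["make", target] + [f"{name}={value}" for name, value in variables.items()]
```

make_argv("nyc-db", DB_HOST="localhost", DB_DATABASE="nycdb", DB_USER="databaseuser", DB_PASSWORD="mypassword") reproduces the command above as a list suitable for subprocess.run.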

Using Docker

You can also use Docker to both use and develop nycdb. This can be useful because you only need to install Docker--you don't need to worry about installing the proper version of Python, Postgres, or any other tools.

To proceed, first install Docker and then run:

docker-compose up

After Docker downloads and builds some things, it will start a Postgres server on port 7777 of your local machine, which you can connect to via a desktop client if you like. You can also press CTRL-C at any point to stop the server.
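
Before pointing a desktop client at the Docker database, it can help to confirm that something is listening on port 7777. The following sketch uses only the Python standard library; port_is_open is a hypothetical helper name, and the check only shows that the TCP port accepts connections, not that Postgres has finished starting up.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the containers running, port_is_open("localhost", 7777) should report whether the server is reachable from your machine.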

In a separate terminal, you can run:

docker-compose run app bash

At this point you are inside a bash shell in a container that has everything already set up for you. The initial working directory will be /nycdb, which is mapped to the root of the project's repository. From here you can run nycdb to access the command-line tool.

To develop on nycdb itself:

  • You can run pytest to run the test suite.
  • Any changes you make to the tool's source code will automatically be reflected in future invocations of nycdb and/or the test suite.
  • If you don't have a desktop Postgres client, you can always run nycdb --dbshell to interactively inspect the database with psql.

You can leave the bash shell with exit.

If you ever want to wipe the database, run docker-compose down -v.

Set up the database and API on a cloud server

See the folder /ansible for Ansible playbooks to set up the database on a server.

Acknowledgments

Future datasets to add:

  • census data

LICENSE: AGPLv3

NYC-DB - Postgres database of NYC housing data
Copyright (C) 2016-2018 Ziggy Mintz

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

The database files provided on this page are licensed CC BY-NC-SA 4.0.
