AFFORDABLE HOUSING

UC BERKELEY

COLLEGE OF ENVIRONMENTAL DESIGN

CITY + REGIONAL PLANNING

CP290E

May 12, 2017

DATA SCIENCE FOR

URBAN ANALYTICS LAB

PRESENTATION

AGENDA

WORKSHOP

Introduction

Implications for Planning and Policy

Questions and Discussion

Explaining Variations in Rent

Developing a Housing Market Monitoring System

Purpose

  • Provide near real-time rent estimates for metro areas across the country.

  • Build and host interactive web tools for housing affordability indicators.

Background

Developing a Housing Market Monitoring System

  • Craigslist Scraper

  • Visualization Tools

  • Redfin

PART 1

History of UAL Scrapers

  • First developed by Geoff Boeing for Boeing and Waddell (2016)**
  • Refactored by Samuel Maurer in Summer '16
  • Updated by Max Gardner in Fall '16 to automate scrapes and use the Charity Engine distributed computing platform

** Boeing, Geoff, and Paul Waddell. "New Insights into Rental Housing Markets across the United States: Web Scraping and Analyzing Craigslist Rental Listings." Journal of Planning Education and Research (2016): 0739456X16664789.

Charity Engine

  • Based on UC Berkeley BOINC
  • Sells donated, distributed computing resources to organizations
  • Donates proceeds to partner charities
  • Added benefit: distributed IP addresses allow us to scale our searches

Median Advertised Rent for Two-Bedroom Units

Craigslist Listings: April 2017

Research to Date:

Craigslist Apts/Housing

Policy Topics:

  • Collect Up-to-Date Housing Costs (Allows for Improved Fair Market Rent Analysis)
  • Spatial & Temporal Trends

Policy Topics:

  • How large a role do rooms and shares play in the overall housing market? Are they more common in higher-cost regions?
  • How do prices compare to apts/housing? What is the shared 'discount'?
  • Spatial & Temporal Trends

Our Research:

Rooms & Shared Listings

Referred to as Apartments

Referred to as Shared Units

Information Collected:

  • Price
  • Location
  • Room Characteristics
  • Body Text

 

Data Cleaning:

  • De-duplication
  • Filtering of incomplete posts and outliers for analysis
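
The two cleaning steps above can be sketched as follows. This is a minimal illustration, not the lab's actual pipeline: the field names (`rent`, `bedrooms`, `lat`, `lon`), the duplicate key, and the rent thresholds are all assumptions made for the example.

```python
def clean_listings(listings, min_rent=100, max_rent=20000):
    """Drop incomplete posts, implausible rents, and repeated posts."""
    seen = set()
    cleaned = []
    for post in listings:
        # Incomplete: missing price or location.
        if post.get("rent") is None or post.get("lat") is None or post.get("lon") is None:
            continue
        # Outlier: rent outside a plausible range (thresholds are assumptions).
        if not (min_rent <= post["rent"] <= max_rent):
            continue
        # Duplicate: same rent, bedroom count, and rounded coordinates (assumed key).
        key = (post["rent"], post.get("bedrooms"),
               round(post["lat"], 4), round(post["lon"], 4))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(post)
    return cleaned


# Hypothetical sample posts.
sample = [
    {"rent": 2300, "bedrooms": 1, "lat": 37.8715, "lon": -122.2730},
    {"rent": 2300, "bedrooms": 1, "lat": 37.8715, "lon": -122.2730},  # repost
    {"rent": 1, "bedrooms": 2, "lat": 37.80, "lon": -122.27},         # placeholder price
    {"rent": 1800, "bedrooms": 1, "lat": None, "lon": None},          # no location
    {"rent": 3100, "bedrooms": 2, "lat": 37.8044, "lon": -122.2712},
]
cleaned = clean_listings(sample)
```

Keying duplicates on a few structured fields is only one option; matching on body text would catch reposts whose price has changed.
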

| Craigslist Region | Median Rent, 1-Bedroom and Studio Apts. | Median Rent Per Bedroom, 2+ BR Apts. | Median Rent, Shared Listings | Per-Bedroom Difference ("Rent Gap") |
|---|---|---|---|---|
| New York City | $1995 | $1133 | $950 | $183 |
| San Francisco Bay Area | $2325 | $1350 | $1000 | $350 |
| Los Angeles | $1920 | $1172 | $800 | $372 |
| Washington, D.C. | $1672 | $872 | $777 | $95 |
| Boston | $1974 | $1075 | $850 | $225 |
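
The "rent gap" column is the per-bedroom median for 2+ bedroom apartments minus the median for shared listings. The short sketch below recomputes it from the figures in the table; the percentage "discount" is an added illustration, not a number from the slides.

```python
# (region, median per-bedroom rent for 2+ BR apartments, median shared-listing rent)
# Values are taken from the table above.
regions = [
    ("New York City", 1133, 950),
    ("San Francisco Bay Area", 1350, 1000),
    ("Los Angeles", 1172, 800),
    ("Washington, D.C.", 872, 777),
    ("Boston", 1075, 850),
]


def rent_gap(per_bedroom_rent, shared_rent):
    """Dollar difference between a bedroom in a full unit and a shared listing."""
    return per_bedroom_rent - shared_rent


gaps = {name: rent_gap(apt, shared) for name, apt, shared in regions}

# Shared "discount" as a share of the per-bedroom apartment rent (illustrative).
discounts = {name: (apt - shared) / apt for name, apt, shared in regions}

largest_gap_region = max(gaps, key=gaps.get)
```
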

Exploring Rent Differences at Smaller Scales

Full Unit Versus Shared Craigslist Listings: April 2017
Full Unit Listings
Shared Listings

Future Research?

  • Amenities and price
  • Deeper text analysis
    • Who uses Craigslist for their apartment search and what are they looking for in a roommate?
    • Who gets screened out of the Craigslist market? (e.g., "No Section 8")
    • Lifestyle preferences

Explaining Variations in Rent

  • Accessibility Measures

  • Rent Prediction Models

PART 2

Understanding rental housing access to businesses

Quantifying Accessibility to Understand Effects on the Housing Market

RENTAL LISTINGS

ACCESSIBILITY

Open Street Map Networks

RENTAL LISTING

Quantifying Accessibility to Understand Effects on the Housing Market

Craigslist Rental Listings

Business Data

Infogroup Business Analyst Listings

Accessibility Metrics

ACCESSIBILITY

OSMnx tool

Web Scraper

Pandana Package

Accessibility Measures

NAICS Industry Codes

Number of Businesses

Number of Employees

Annual Sales Volumes

Accessibility Measures

Businesses within 1 km
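
A measure such as "businesses within 1 km" counts businesses reachable along the street network, not within a straight-line radius. The sketch below shows the idea on a toy graph using a plain shortest-path search. It is a stand-in written for illustration and does not use the Pandana or OSMnx APIs; the graph, node names, and business counts are hypothetical.

```python
import heapq


def network_distances(graph, source, max_dist):
    """Shortest-path distance (meters) from source to each node within max_dist."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd <= max_dist and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist


def businesses_within(graph, businesses_at_node, source, max_dist=1000):
    """Sum the businesses attached to every node reachable within max_dist."""
    reachable = network_distances(graph, source, max_dist)
    return sum(businesses_at_node.get(node, 0) for node in reachable)


# Toy street network: node -> [(neighbor, edge length in meters)]
graph = {
    "A": [("B", 400), ("C", 700)],
    "B": [("A", 400), ("D", 500)],
    "C": [("A", 700), ("D", 900)],
    "D": [("B", 500), ("C", 900), ("E", 300)],
    "E": [("D", 300)],
}
businesses_at_node = {"A": 2, "B": 5, "C": 1, "D": 4, "E": 10}

count_from_a = businesses_within(graph, businesses_at_node, "A")
```

From node A, node E is 1,200 m away along the network, so its businesses fall outside the 1 km threshold.
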

VISUALIZING

AMENITIES & RENTS

 

Businesses within 5,000 meters in the San Francisco Bay Area

Median Rent within 5,000 meters

 

Exploring prediction models for housing rents

Geographically weighted regression

Gradient boosting classification

 

Feature selection

  • Housing characteristics: number of bedrooms, bathrooms, square footage, etc.
  • Accessibility measures: number of retail businesses in block group, number of jobs within 3 km, etc.
  • Neighborhood characteristics: median income, median rent, housing tenure, year built, etc.
  • Built environment factors: population density, dwelling unit density, etc.

GWR - a set of spatially varying coefficients
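
To make "spatially varying coefficients" concrete, the sketch below fits a separate weighted regression at each target location, weighting nearby observations more heavily with a Gaussian kernel. It is a one-predictor toy under stated assumptions: the data are synthetic, the bandwidth is fixed by hand, and a real GWR would include many predictors and calibrate the bandwidth.

```python
import math


def local_fit(points, target, bandwidth):
    """Weighted least squares of rent on bedrooms, centered on `target`.

    points: list of (east_km, north_km, bedrooms, rent)
    Returns (intercept, slope) for the local model at `target`.
    """
    weights = []
    for east, north, _, _ in points:
        d = math.hypot(east - target[0], north - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))  # Gaussian kernel
    sw = sum(weights)
    x_bar = sum(w * p[2] for w, p in zip(weights, points)) / sw
    y_bar = sum(w * p[3] for w, p in zip(weights, points)) / sw
    sxy = sum(w * (p[2] - x_bar) * (p[3] - y_bar) for w, p in zip(weights, points))
    sxx = sum(w * (p[2] - x_bar) ** 2 for w, p in zip(weights, points))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic listings along a 9 km line: each extra bedroom adds $800 in the
# west (km 0-4) and $1,500 in the east (km 5-9).
bedrooms = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
points = []
for i, b in enumerate(bedrooms):
    per_bedroom = 800 if i < 5 else 1500
    points.append((float(i), 0.0, b, 500 + per_bedroom * b))

west_intercept, west_slope = local_fit(points, (0.0, 0.0), bandwidth=1.0)
east_intercept, east_slope = local_fit(points, (9.0, 0.0), bandwidth=1.0)
```

A single global regression would report one blended bedroom coefficient; the local fits recover roughly $800 per bedroom at the west end and roughly $1,500 at the east end.
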

Implications for Planning and Policy

  • Airbnb

  • HUD Fair Market Rent

PART 3

WEB SCRAPING 

 

  • Airbnb started in 2008 and has grown rapidly

  • Its growth has created controversy over its positive and negative impacts

  • The question: is Airbnb driving up rents?

Motivations

 

  • Web scraping: we tested 3 methods

 

Data Collection

Neighborhoods

Zip Codes

Bounding Box
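
The slides do not say how the bounding-box search was run. One common approach, assumed here purely for illustration, is to split a large box into smaller tiles and search each tile separately, since search pages often cap the number of results returned per query. The coordinates below are rough, hypothetical values for San Francisco.

```python
def split_bounding_box(west, south, east, north, rows, cols):
    """Divide a bounding box into rows x cols tiles of (west, south, east, north)."""
    dx = (east - west) / cols
    dy = (north - south) / rows
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append((
                west + c * dx,
                south + r * dy,
                west + (c + 1) * dx,
                south + (r + 1) * dy,
            ))
    return tiles


# Hypothetical box roughly covering San Francisco, split into a 2 x 2 grid.
tiles = split_bounding_box(-122.52, 37.70, -122.35, 37.83, rows=2, cols=2)
```

Listings that appear in more than one tile (for example, on a shared edge) would then be removed in the de-duplication step.
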

 

  • How can we answer this question?

  • We need data. Web scraping is our strategy.

Data Collection

 
  • City

  • Room type

  • Nightly rate

  • Neighborhood

  • Address

  • Overall satisfaction rating

  • Approximate latitude & longitude coordinates

  • Number of people who can be accommodated

  • Number of bedrooms and bathrooms

  • Number of reviews

  • Minimum stay

Data Collection

INFORMATION COLLECTED

 
  • Duplicate Listings

  • Outliers

  • Locations

  • Fake Profiles

Data Collection

DATA CLEANING

 

National Analysis

 

Regional Analysis 

 

San Francisco 

 

Data Analysis

National

In some areas, a significant fraction of hosts have many listings

Data Analysis

Multiple Rentals

National

In most areas, a large fraction of listings are for entire homes
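
Both national indicators (the share of hosts with more than one listing, and the share of listings that are entire homes) are simple ratios over the scraped listings. A minimal sketch, assuming hypothetical `host_id` and `room_type` fields and made-up records:

```python
from collections import Counter


def multi_listing_host_share(listings):
    """Fraction of distinct hosts who have more than one listing."""
    per_host = Counter(item["host_id"] for item in listings)
    return sum(1 for n in per_host.values() if n > 1) / len(per_host)


def entire_home_share(listings, label="Entire home/apt"):
    """Fraction of listings whose room type is an entire home."""
    return sum(1 for item in listings if item["room_type"] == label) / len(listings)


# Made-up records; the room-type labels are assumptions.
listings = [
    {"host_id": "h1", "room_type": "Entire home/apt"},
    {"host_id": "h1", "room_type": "Entire home/apt"},
    {"host_id": "h1", "room_type": "Private room"},
    {"host_id": "h2", "room_type": "Private room"},
    {"host_id": "h3", "room_type": "Entire home/apt"},
    {"host_id": "h4", "room_type": "Entire home/apt"},
]

host_share = multi_listing_host_share(listings)
home_share = entire_home_share(listings)
```
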

Data Analysis

 Airbnb claims that their “hosts” are mostly just people occasionally renting out a spare room to help pay their mortgage costs.

Regional

Data Analysis

Total Airbnb listings per 1000 residents
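
Normalizing by population makes counties of different sizes comparable. A small sketch of the calculation, using invented county names and counts:

```python
def listings_per_1000(listing_count, population):
    """Airbnb listings per 1,000 residents."""
    return 1000 * listing_count / population


# Invented figures: county -> (listing count, resident population)
counties = {
    "County A": (1200, 800000),
    "County B": (300, 150000),
}
rates = {name: listings_per_1000(n, pop) for name, (n, pop) in counties.items()}
```

Here County B has a quarter of County A's listings but a higher rate per resident.
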

Counties

Regional

Data Analysis

Whole unit listings dominate key Airbnb markets


Regional

Data Analysis

In some areas, a significant fraction of hosts have many listings

San Francisco

Data Analysis

 Neighborhoods with high Airbnb concentration

Airbnb Density

Number of Rentals

San Francisco

Data Analysis

Multiple Listings

Regression

Does Airbnb drive up rents? 

A one-unit increase in Airbnb density (at the block group level) is associated with a $146 increase in monthly rent.
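
A coefficient of this kind is read as the slope of a fitted line: the expected change in rent for each additional unit of density. The sketch below fits a bivariate least-squares line to synthetic numbers. It does not reproduce the study's data, controls, or estimate, and the density units are left unspecified because the slides do not define them.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic block groups: Airbnb density and median monthly rent.
density = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
rent = [2000, 2050, 2100, 2160, 2210, 2260]

slope, intercept = ols_slope_intercept(density, rent)
# `slope` is the change in rent associated with a one-unit change in density.
```

An association like this is not proof that Airbnb causes higher rents: dense Airbnb areas may also be areas that renters already find more desirable.
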

 

Spatial Analysis

Exploring Fair Market Rent Areas using Craigslist data

Motivation

  • Use highly spatially disaggregate and current data to contextualize federal housing policy, particularly HUD's Housing Choice Voucher program, billed as the “centerpiece of the federal low-income housing assistance arsenal” (Grigsby 2004).
  • Identify on-the-ground challenges relative to federal housing subsidy programs
  • Extend the analysis to the national scale
  • Exploratory analysis using Craigslist data to assess de facto rental deserts for many Housing Choice Voucher families
  • To compare rent differences between FMR levels and what Craigslist implies about local rental markets
  • To explore key dimensions of such rental deserts: prices, unit availability, and voucher availability

Purpose of Analysis

HCV is “the federal government’s major program for assisting very low-income families, the elderly, and the disabled to afford decent, safe, and sanitary housing in the private market.” (HUD web site)

  • In addition (community perspective): HUD's goal of deconcentrating poverty

 

Goals of the HCV Program

  • Household pays up to 30-40% of their income in rent
  • Roughly 2,500 area-specific, annually updated FMR areas in the US
  • HUD pays up to local FMR level
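
The three bullets above imply a simple split of the rent between household and HUD. The sketch below is a deliberate simplification of the actual HCV rules (it ignores utility allowances, income adjustments, and local payment standards) and treats the 30% base share and 40% cap as assumed parameters.

```python
def voucher_split(monthly_income, gross_rent, fmr, base_share=0.30, max_share=0.40):
    """Split rent between household and subsidy under simplified HCV rules.

    Returns (subsidy, tenant_payment, within_cap).
    """
    base_payment = base_share * monthly_income
    # The subsidy covers rent above the household's base share, but only up to the FMR.
    subsidy = max(0.0, min(gross_rent, fmr) - base_payment)
    tenant_payment = gross_rent - subsidy
    within_cap = tenant_payment <= max_share * monthly_income
    return subsidy, tenant_payment, within_cap


# Hypothetical household earning $2,000 per month in an area with a $1,500 FMR.
at_fmr = voucher_split(2000, gross_rent=1400, fmr=1500)
above_fmr = voucher_split(2000, gross_rent=1800, fmr=1500)
```

With rent below the FMR, the household pays its base share and the voucher covers the rest. With rent $300 above the FMR, the household's payment rises past the 40% cap, so the unit is effectively out of reach.
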

Fair Market Rents key to HCV program

  • Craigslist, Nov 2016-March 2017
  • HUD FMRs: national shapefile with FMRs for 1,974 counties and 624 CBSAs, for a total of 2,598 local estimates for different bedroom sizes
  • File counting voucher recipients by area (1.8 million nationally)
  • For HCV eligibility estimates: Census ACS PUMS

Data

  • Substantial areas nationally are off limits to low-income households because of:
    • Dimension I: High prices,
    • Dimension II: Inadequate count of units, or
    • Dimension III: Lack of available vouchers
  • Maps to follow show difference in FMR rent and “actual” Craigslist rent
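
The comparison behind those maps can be sketched per FMR area: take the median Craigslist rent, subtract the FMR, and count how many listings fall at or below the FMR. The areas, rents, and FMR values below are invented, and the bedroom size is assumed to match between listings and FMR.

```python
from statistics import median


def compare_to_fmr(rents, fmr):
    """Summarize how advertised rents in one area compare with its FMR."""
    med = median(rents)
    at_or_below = sum(1 for r in rents if r <= fmr)
    return {
        "median_rent": med,
        "gap": med - fmr,                               # Dimension I: prices
        "units_at_or_below_fmr": at_or_below,           # Dimension II: unit count
        "share_at_or_below_fmr": at_or_below / len(rents),
    }


# Invented areas with advertised rents for one bedroom size.
areas = {
    "Area A": {"fmr": 1500, "rents": [1400, 1450, 1600, 1700, 1550]},
    "Area B": {"fmr": 2000, "rents": [2600, 2900, 3100, 1950, 3300]},
}
summary = {name: compare_to_fmr(a["rents"], a["fmr"]) for name, a in areas.items()}
```

A large positive gap combined with few listings at or below the FMR is the pattern the slides describe as a rental desert.
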

Finding: Emergence of Rental Deserts

San Francisco off limits

NorCal Distribution of list vs FMR price

State-level Differences

Craigslist - HUD FMR Rate Differences

Recap: Issues with HUD’s FMR areas

  • Good program, as far as it goes
  • FMR areas are large, concealing substantial within-area variation
  • Outcome: red areas on the map jeopardize HUD’s goal of deconcentrating voucher holders; the typical response has been to move to 50th-percentile FMRs instead of 40th
  • As of November 2016, regions can opt in for small area FMRs instead

Got voucher?

Most eligible households don’t get vouchers

Conclusion

  • HUD has been grappling with concentration issues for decades
  • It remains to be seen whether the latest approach of smaller-area FMRs will help
  • Better, more current data is needed to monitor actually existing local markets so the program can be effective. Data such as Craigslist listings are tremendously helpful
  • The count of vouchers is far too low relative to demand
  • Obviously, local housing shortages cannot be addressed by Washington