
What Price Should I List My Apartment For On Airbnb?

A Study Of Airbnb Data - Boston, MA Metro Area

December 2014


This site hosts the final project for CS109, Fall 2014.

Contributors To This Website:

Note that the dashboard below has four tabs, each containing distinct visualizations.


Airbnb is a web-based marketplace for people to list, discover, and book unique accommodations around the world. It has over 800,000 listings in more than 34,000 cities and 190 countries. Every listed property has an online profile with information about the property (such as amenities, space, and reviews by previous guests) as well as information about the host. Airbnb gives hosts a way to monetize their extra space and gives travelers an alternative to hotels.


As a host on Airbnb, we wanted to optimize our listing by investigating the following:

We wanted to study this data, visualize it, and see whether we could glean insights beyond what is available on Airbnb.

Methodology / Table Of Contents

1) Scraping / Data Collection: visit the GitHub repository for the code used to scrape Airbnb.

Description: The scraping code, along with instructions on how to run it (the readme file), is located in this project's GitHub repository. We built a scraper to collect data for over 2,000 listings in the Boston Metro area. The scraped data includes space (property type, number of bedrooms, number of bathrooms, etc.), amenities (kitchen, TV, internet, etc.), prices (cleaning fee, etc.), reviews, location (longitude, latitude, and location review), host information, and the listing description. The code is generalized and can be used to scrape listings for any location.
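To make the shape of the scraped data concrete, here is a minimal sketch of how one listing could be stored as a record and written out as CSV. The `Listing` class, its field names, and the sample values are all hypothetical and chosen only to mirror the categories described above; the actual scraper's schema is in the repository and may differ.

```python
import csv
import io
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Listing:
    """Hypothetical record for one scraped listing (not the project's schema)."""
    listing_id: str
    property_type: str
    bedrooms: Optional[float] = None
    bathrooms: Optional[float] = None
    amenities: List[str] = field(default_factory=list)
    price: Optional[float] = None
    cleaning_fee: Optional[float] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def to_row(listing):
    """Flatten a Listing into a dict suitable for one CSV row."""
    row = asdict(listing)
    # Store the amenity list as a single pipe-delimited string.
    row["amenities"] = "|".join(listing.amenities)
    return row


def listings_to_csv(listings):
    """Serialize a non-empty list of Listing records to CSV text."""
    buffer = io.StringIO()
    fieldnames = list(to_row(listings[0]).keys())
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for listing in listings:
        writer.writerow(to_row(listing))
    return buffer.getvalue()


# Made-up placeholder values, for illustration only.
sample = [
    Listing("example-1", "Apartment", bedrooms=1, bathrooms=1,
            amenities=["kitchen", "internet"], price=100.0),
    Listing("example-2", "House", bedrooms=3, amenities=["TV"]),
]
csv_text = listings_to_csv(sample)
print(csv_text)
```

Fields that could not be scraped stay `None` and are written as empty cells, which leaves the decision about how to handle missing values to the cleaning step.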

2) Data Cleaning: the GitHub repository also contains the functions that we used to "clean" the data after scraping. The name of this file is

3) Data Analysis: We first explored the data with plots in both matplotlib and Tableau. We then attempted to cluster properties using PCA and K-means clustering, and used the variable importance feature of Random Forests to better understand which variables are most strongly related to price. You can view the file here; it is also available in the GitHub repository under the name AirbnbWrapUp.ipynb
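The analysis steps described in item 3 can be sketched roughly as follows. This is an illustrative outline using scikit-learn, not the code from AirbnbWrapUp.ipynb: the data is synthetic, and the feature names, the number of clusters, and the price formula are assumptions made only so the example runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned listings table (NOT the project's data).
rng = np.random.default_rng(0)
n = 300
feature_names = ["bedrooms", "bathrooms", "num_reviews", "cleaning_fee"]
X = np.column_stack([
    rng.integers(1, 5, n),     # bedrooms: 1 to 4
    rng.integers(1, 3, n),     # bathrooms: 1 to 2
    rng.integers(0, 200, n),   # num_reviews
    rng.uniform(0, 100, n),    # cleaning_fee
]).astype(float)
# Hypothetical price, driven mostly by bedrooms plus noise.
price = 60 * X[:, 0] + 0.2 * X[:, 3] + rng.normal(0, 10, n)


def cluster_listings(X, n_clusters=3, seed=0):
    """Standardize features, project onto 2 principal components, run K-means."""
    scaled = StandardScaler().fit_transform(X)
    components = PCA(n_components=2, random_state=seed).fit_transform(scaled)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(components)
    return components, labels


def rank_features(X, y, names, seed=0):
    """Fit a random forest on price and sort features by importance."""
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    pairs = zip(names, forest.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


components, labels = cluster_listings(X)
ranking = rank_features(X, price, feature_names)
for name, score in ranking:
    print(f"{name:>12}: {score:.3f}")
```

Standardizing before PCA keeps a wide-ranging column such as the review count from dominating the components. The importances only indicate which variables the forest relied on to predict price; they do not by themselves show that changing a variable would change the price.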

4) Visualization: We created visualizations in Tableau to help us further explore the data and glean interesting insights from it. One of the dashboards is dedicated to helping one of our team members, Hamel Husain, decide how best to price his listing. The dashboard is embedded at the top of this page, but might be better viewed here. Please note that the dashboard contains four tabs at the top, which display distinct sets of information.

5) Video: We made a short video describing our motivation and problem statement for this project. It can be viewed here.


Other Useful Documents