webcrawler package

Submodules

webcrawler.analyze module

The analyze module was designed to analyze CSV data and plot it as scatter and box plot graphs.

webcrawler.analyze.graph_boxplot(frame)[source]

Create a box plot from the CSV data. Plots the house price-to-size ratios (price/size), grouped by year.

webcrawler.analyze.graph_scatter(frame)[source]

Create a scatter plot from the CSV data, plotting house price against size.

webcrawler.analyze.main()[source]

Analyze the CSV data and create the scatter and box plots.
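A minimal end-to-end sketch of the analyze workflow, assuming the CSV written by the extract module can be loaded into a pandas DataFrame and passed to the two plotting functions (the file name used here is a placeholder):

>>> import pandas as pd
>>> from webcrawler import analyze
>>> frame = pd.read_csv("houses.csv")  # placeholder file name
>>> analyze.graph_scatter(frame)       # price vs. size
>>> analyze.graph_boxplot(frame)       # price/size ratio grouped by year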

webcrawler.crawler module

Python script to download house information for Elizabeth City from Zillow and save it to a CSV file. Each record contains the address, year, size, and price. The CSV file should contain at least 100 records.
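For illustration only, a record in the CSV might look like the line below; the header names, column order, and values are placeholders rather than the script's actual output:

address,year,size,price
"100 Example Ave, Elizabeth City, NC",1998,1600,175000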

class webcrawler.crawler.WebCrawler(url=None, header=None, data=None, page=None)[source]

Web Crawler

address(address)[source]
Returns:
prop_address: list - updated list of the property addresses.
Example:
>>> prop_address = crawler.address(address)
data_clear()[source]

Clear the data stored in the web crawler. Example:

>>> crawler.data_clear()
get_addresses()[source]
Returns:
address: list - current list of the property addresses.
Example:
>>> address = crawler.get_addresses()
get_data()[source]

Return the URL page data. Returns:

data: object - the data of the current URL page being read.
Example:
>>> data = crawler.get_data()
get_header()[source]

Return the header. Returns:

req_headers: dict - the header.
Example:
>>> req_headers = crawler.get_header()
get_links()[source]
Returns:
links: list - current list of the property links.
Example:
>>> links = crawler.get_links()
get_page()[source]
Returns:
page: object - the current page being read from the selected url.
Example:
>>> page = crawler.get_page()
get_prices()[source]
Returns:
price: list - current list of the property prices.
Example:
>>> price = crawler.get_prices()
get_sizes()[source]
Returns:
size: list - current list of the property sizes.
Example:
>>> size = crawler.get_sizes()
get_url()[source]

Return the URL. Returns:

url: string - the URL.
Example:
>>> url = crawler.get_url()
get_years()[source]
Returns:
year: list - current list of the property years.
Example:
>>> year = crawler.get_years()
price(price)[source]
Returns:
prop_price: list - updated list of the property prices.
Example:
>>> prop_price = crawler.price(price)
size(size)[source]
Returns:
prop_size: list - updated list of the property sizes.
Example:
>>> prop_size = crawler.size(size)
year(year)[source]
Returns:
prop_year: list - updated list of the property years.
Example:
>>> prop_year = crawler.year(year)
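A hedged usage sketch tying the accessors above together. The URL and header values are placeholders, and the scraping step that populates the lists (via address(), year(), size(), and price()) is omitted because it depends on the page markup:

>>> from webcrawler.crawler import WebCrawler
>>> crawler = WebCrawler(url="https://www.zillow.com/elizabeth-city-nc/",
...                      header={"User-Agent": "Mozilla/5.0"})
>>> # ...crawl one or more result pages, then read back the collected lists:
>>> addresses = crawler.get_addresses()
>>> years = crawler.get_years()
>>> sizes = crawler.get_sizes()
>>> prices = crawler.get_prices()
>>> crawler.data_clear()  # reset the stored data before the next crawl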

webcrawler.extract module

Extract data from Zillow and store it in a CSV file to be graphed.

webcrawler.extract.main()[source]

Extract housing information from Zillow using the web crawler.
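A minimal sketch of the kind of CSV writing this step performs, assuming the crawler instance from the sketch above holds parallel lists of addresses, years, sizes, and prices; the output file name and header row are placeholders:

>>> import csv
>>> rows = zip(crawler.get_addresses(), crawler.get_years(),
...            crawler.get_sizes(), crawler.get_prices())
>>> with open("houses.csv", "w", newline="") as f:
...     writer = csv.writer(f)
...     writer.writerow(["address", "year", "size", "price"])
...     writer.writerows(rows)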

Module contents