Collecting Web Data - APIs & Web Scraping


The internet is a vast, rich, and (sometimes) intimidating source of data.  Beyond downloading datasets that are already available as spreadsheets or other convenient file formats, there are two additional ways you may be able to collect data from individual webpages, though neither is always available: application programming interfaces (APIs) and web scraping.

The information in this guide is meant to give you some starting points for learning about and using these methods.  It is not, however, exhaustive.
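To make the difference between the two methods concrete, here is a minimal Python sketch.  It is only an illustration: the sample JSON, the sample HTML, the class names ("book", "title", "price"), and the function names are all made up for this example, and it uses only the Python standard library rather than any particular scraping package.  An API typically returns structured data that can be read directly, while scraping means pulling the same values out of a page's HTML markup.

```python
import json
from html.parser import HTMLParser

# Hypothetical API response: the data arrives already structured as JSON.
API_RESPONSE = '{"books": [{"title": "Dune", "price": 9.99}, {"title": "Emma", "price": 4.5}]}'

# Hypothetical webpage: the same data is embedded in HTML meant for people to read.
PAGE_HTML = """
<ul>
  <li class="book"><span class="title">Dune</span> <span class="price">9.99</span></li>
  <li class="book"><span class="title">Emma</span> <span class="price">4.50</span></li>
</ul>
"""


def books_from_api(text):
    """Read (title, price) pairs straight out of the JSON structure."""
    return [(b["title"], float(b["price"])) for b in json.loads(text)["books"]]


class BookParser(HTMLParser):
    """Collect (title, price) pairs by watching for the tags that hold them."""

    def __init__(self):
        super().__init__()
        self._field = None   # which span we are currently inside, if any
        self._current = {}   # fields gathered for the current <li>
        self.books = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("title", "price"):
            self._field = cls

    def handle_data(self, data):
        # Text outside a tracked span (such as whitespace) is ignored.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.books.append(
                (self._current["title"], float(self._current["price"]))
            )
            self._current = {}


def books_from_html(html):
    """Scrape (title, price) pairs out of the HTML markup."""
    parser = BookParser()
    parser.feed(html)
    return parser.books


if __name__ == "__main__":
    print(books_from_api(API_RESPONSE))
    print(books_from_html(PAGE_HTML))
```

Both functions return the same list of pairs, but the scraping version depends on the page's exact markup and would need to be rewritten if the page layout changed.  That fragility is one reason an API, when one exists, is usually the easier starting point.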


Workshop Files

The link below leads to the workshop files used in the "Introduction to Web Scraping" session.  The workshop focuses on using Python code and packages to complete web scraping tasks, but R users can use the accompanying R code to achieve nearly identical outcomes.  Both code files are organized in the same way.