Enhancing Your Extractors

This section explains different methods for enhancing your extractor so that you can obtain the data you want from websites.

Section 1. Dealing with multiple pages

How to account for differences between multiple pages and still extract the data you need from each of them.

How to extract data from sites that appear to scroll down forever.

How to extract data from websites that require login details.

How to access information that websites often hide behind menus and buttons.

How to use manual XPath to extract information using the inspection tool of your browser.
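
As a rough illustration of what an XPath expression selects, the sketch below runs two expressions against a small sample page. It uses Python's standard library, which supports only a limited subset of XPath and needs well-formed markup; the sample page, the class names and the extract_text helper are assumptions made up for this example, not part of the extractor itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page; real pages will differ.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="product">
      <h2 class="title">Blue Kettle</h2>
      <span class="price">19.99</span>
    </div>
    <div class="product">
      <h2 class="title">Red Toaster</h2>
      <span class="price">34.50</span>
    </div>
  </body>
</html>
"""


def extract_text(page, xpath):
    """Return the stripped text of every element matched by the XPath."""
    root = ET.fromstring(page.strip())
    return [(node.text or "").strip() for node in root.findall(xpath)]


# The kind of expression you might build after inspecting an element
# in the browser: "any h2 with class 'title' inside a product div".
titles = extract_text(SAMPLE_PAGE, ".//div[@class='product']/h2[@class='title']")
prices = extract_text(SAMPLE_PAGE, ".//span[@class='price']")

print(titles)
print(prices)
```

Anchoring the expression on a class name rather than on an element's position in the page tends to survive small layout changes better, which is usually why a manual XPath is worth writing.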

The different error codes, what they mean, and how to adjust your extractor to deal with them.

How to set a regular expression in your output table to improve your data accuracy.
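
To give a feel for how a regular expression can tidy up an extracted column, here is a minimal sketch in Python. The raw cell values, the pattern and the clean_price helper are hypothetical; the regular expression syntax accepted by your output table may differ slightly from Python's.

```python
import re

# Assumed pattern: a dollar sign followed by digits (with optional
# thousands separators) and exactly two decimal places.
PRICE_PATTERN = re.compile(r"\$([\d,]+\.\d{2})")


def clean_price(raw):
    """Return just the numeric part of a price cell, or None if absent."""
    match = PRICE_PATTERN.search(raw)
    return match.group(1).replace(",", "") if match else None


# Hypothetical raw values as they might appear in an extracted column.
raw_cells = ["Price: $1,299.00 (incl. tax)", "Now only $45.50!", "Out of stock"]
cleaned = [clean_price(cell) for cell in raw_cells]

print(cleaned)
```

The capture group keeps only the part of the cell you care about, so surrounding labels and notes do not end up in the output.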

How to extract data from multiple layers of a website.

A checklist of actions you can take to improve your extractor.