Enhancing Your Extractors
This section explains different methods for enhancing your extractor so that you can obtain the data you want from websites.
Section 1. Dealing with multiple pages
How to account for differences between multiple pages and still extract the data you need.
Section 2. Dealing with infinite scroll
How to extract data from sites that appear to scroll down forever.
Section 3. Getting data from behind a login
How to extract data from websites that require login details.
Section 4. Dealing with hidden elements
Websites often hide information behind menus and buttons. This section considers how to get access to this hidden data.
Section 5. Using manual XPath
How to use manual XPath to extract information using the inspection tool of your browser.
Section 6. Understanding error codes
What the different error codes mean and how to adjust your extractor to deal with them.
Section 7. Set regular expression
How to set a regular expression in your output table to improve your data accuracy.
Section 8. Linking extractors (chaining)
How to extract data from multiple layers of a website.
Section 9. Checklist for improving your extractor
A checklist of actions you can take to improve your extractor.