Linking extractors
This page covers how to link extractors together, for example by combining a list page extractor with a detail page extractor.
Step 1. Select a detail page
Select a detail page you want to extract. In this example I am going to use a BBC news article.
Step 2. Create an extractor from this webpage
Create an extractor from the webpage, so it outputs some data, like the information shown below:
Now that we have our detail extractor, we need to create a list extractor.
Step 3. Create a list extractor
For this I am going to use the BBC news home page, and create an extractor that extracts all of the links from the first page.
This produces a list of URLs.
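As a rough illustration of what a list extractor does behind the scenes, the sketch below collects every link from a page and resolves it to a full URL. It uses only the Python standard library; the sample markup, the `example.com` address, and the names `LinkCollector` and `extract_links` are assumptions made for this example and are not part of the extractor tool itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:  # keep the first occurrence only
                self.links.append(absolute)


def extract_links(html, base_url):
    """Return the de-duplicated list of absolute URLs found in the HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical sample markup, not real BBC content.
SAMPLE_HTML = """
<a href="/news/article-1">First story</a>
<a href="/news/article-2">Second story</a>
<a href="/news/article-1">First story again</a>
"""

links = extract_links(SAMPLE_HTML, "https://example.com/")
print(links)
```

The resulting list plays the role of the URL column that the detail extractor will later read from.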
Step 4. Enter the dashboard and change how the detail page collects URLs
Now we can link the two extractors together, so that the detail extractor runs on every URL produced by the list extractor. To do this, go into the dashboard, navigate to the detail page extractor, and change "explicit URLs" to "URLs from another extractor".
Step 5. Link the two extractors together
This will bring up a new box asking for an extractor to link to. Type in the name of the list extractor, and a second box will appear asking for the column that contains the URLs. Choose that column, then run the extractor.
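Conceptually, linking the two extractors means running the detail extractor once for each value in the chosen URL column of the list extractor's output. The sketch below shows that idea in plain Python. The rows, the column name `url`, and the functions `detail_extractor` and `run_linked` are all hypothetical stand-ins; the real linking is done through the dashboard as described above.

```python
# Hypothetical output of the list extractor: one row per link found.
list_rows = [
    {"title": "First story", "url": "https://example.com/news/article-1"},
    {"title": "Second story", "url": "https://example.com/news/article-2"},
]


def detail_extractor(url):
    """Stand-in for the detail extractor.

    A real extractor would fetch and parse the page; this one only
    derives a placeholder headline from the last part of the URL.
    """
    return {"url": url, "headline": "Headline for " + url.rsplit("/", 1)[-1]}


def run_linked(rows, url_column, detail):
    """Run the detail extractor once for every URL in the chosen column."""
    return [detail(row[url_column]) for row in rows if row.get(url_column)]


detail_rows = run_linked(list_rows, "url", detail_extractor)
for row in detail_rows:
    print(row)
```

Choosing the column by name mirrors the second box in the dashboard: the list extractor may output several columns, and only one of them holds the URLs to visit.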
Step 6. Creating the data
Running the linked extractors produces this data (opened in Excel):
This extractor could be set to run every day, so that with a single click you can get information about the newest articles on the BBC front page. Now imagine this extractor combined with a few from other sites: you would be able to quickly gather what is happening in the world without having to open each article one by one.
Note: further expanding the chain
It is worth noting that this is a chain of two extractors. However, if the output of the second extractor also contained links, you could select another extractor to run from it, then another extractor from that, creating a long chain of extractors.
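The chaining idea can be sketched as a loop in which the URLs output by one stage become the input of the next. This is only an illustration under stated assumptions: the function `run_chain`, the three stage functions, and the `example.com` URLs are invented for the example, and each stage simply fabricates rows instead of fetching real pages.

```python
def run_chain(start_urls, stages):
    """Feed the URLs produced by each stage into the next stage.

    Each stage is a function that takes one URL and returns a list of
    rows (dicts). Every stage except the last should put the next URL
    to visit under the "url" key of each row.
    """
    urls = list(start_urls)
    rows = []
    for stage in stages:
        rows = [row for url in urls for row in stage(url)]
        urls = [row["url"] for row in rows if "url" in row]
    return rows


# Hypothetical three-level chain: section page -> article pages -> author pages.
def section_stage(url):
    return [{"url": f"{url}/article-{n}"} for n in (1, 2)]


def article_stage(url):
    return [{"url": f"{url}/author", "headline": f"Headline at {url}"}]


def author_stage(url):
    return [{"author_page": url}]


result = run_chain(
    ["https://example.com/news"],
    [section_stage, article_stage, author_stage],
)
for row in result:
    print(row)
```

Adding a further link in the chain only requires appending another stage to the list, which is the same pattern as linking one more extractor to the output of the previous one.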