Checklist for improving extractor performance
This checklist provides a list of possible actions to improve your extractor.
1. Train more URLs
By training more URLs, you can provide more information for your extractor to learn from and thus enable it to extract more effectively.
2. Look at the error codes
The log files show the error codes returned by the extractor, which can help you work out why extractions are failing. For more information on the error codes, please click here.
3. Use manual XPath
Manual XPath is an advanced tool which uses the elements of a webpage to select specific data.
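To give a feel for how an XPath expression selects data from page elements, here is a minimal sketch using Python's standard library rather than the extractor itself. The page fragment, the class names and the select_text helper are all hypothetical, and ElementTree supports only a subset of XPath, so treat this as an illustration of the idea and not as the syntax the extractor accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <div class="product">
      <span class="name">Widget</span>
      <span class="price">9.99</span>
    </div>
    <div class="product">
      <span class="name">Gadget</span>
      <span class="price">19.50</span>
    </div>
  </body>
</html>
"""


def select_text(page: str, xpath: str) -> list:
    """Return the stripped text of every element matched by the XPath."""
    root = ET.fromstring(page.strip())
    return [(el.text or "").strip() for el in root.findall(xpath)]


# Select the price inside each product block by element name and class.
prices = select_text(PAGE, ".//div[@class='product']/span[@class='price']")
print(prices)
```

The expression walks the element tree: any div whose class is "product", then the child span whose class is "price". Real web pages are often not well-formed XML, so a dedicated HTML parser would usually be needed outside of a toy example like this one.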
4. Split extractors
Sometimes the simplest way to solve an extraction problem is to create a second extractor and run only the failed results through it.
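The split approach can be sketched as a simple two-pass loop. This is only an illustration under stated assumptions: the two extractor functions below are hypothetical stand-ins (plain Python callables that return None on failure), and the example.com URLs are placeholders, not a real extractor API.

```python
from typing import Callable, Optional

# Assumption: an extractor is modelled as a function that returns a dict
# of extracted fields, or None when extraction fails for that URL.
Extractor = Callable[[str], Optional[dict]]


def run_with_fallback(urls, primary: Extractor, secondary: Extractor):
    """Run every URL through `primary`; retry only the failures with `secondary`."""
    results = {}
    failed = []
    for url in urls:
        data = primary(url)
        if data is None:
            failed.append(url)
        else:
            results[url] = data

    still_failed = []
    for url in failed:  # only the failed results reach the second extractor
        data = secondary(url)
        if data is None:
            still_failed.append(url)
        else:
            results[url] = data
    return results, still_failed


# Hypothetical stand-in extractors, each handling one page layout.
def product_extractor(url: str) -> Optional[dict]:
    return {"layout": "product"} if "/product/" in url else None


def item_extractor(url: str) -> Optional[dict]:
    return {"layout": "item"} if "/item/" in url else None


urls = [
    "https://example.com/product/1",
    "https://example.com/item/2",
    "https://example.com/about",
]
results, still_failed = run_with_fallback(urls, product_extractor, item_extractor)
print(results)
print(still_failed)
```

Keeping the list of URLs that fail both passes makes it easy to see which pages still need attention, for example by training more URLs or writing a manual XPath.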
5. Turn off the website styles and scripts
Certain websites have data which can only be accessed if the styles or scripts are turned off.