For an explanation of this script, click here.
For more information about the robots exclusion protocol, see
http://en.wikipedia.org/wiki/Robots.txt
This software is in the early stages of development. If you have any feedback, please send your comments to
newmedia@sonologic.nl
or
gelmer@ryerson.ca
Depth of crawl:
This box works the same way as the issuecrawler Harvester: just paste some text into it and any URLs will be extracted from it.
Note: only strings of the form www.* or http://* will be recognized as URLs.
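As a rough illustration of the kind of extraction described above, here is a minimal Python sketch. It is not this tool's actual code: the function name `extract_urls`, the regular expression, and the decision to trim trailing punctuation and drop duplicates are all assumptions made for the example. It only shows one way text could be scanned for strings starting with www. or http://.

```python
import re

# Assumed rule: a URL starts with "http://" or "www." and runs until the
# next whitespace, quote, or angle bracket. This pattern is illustrative,
# not the one the tool itself uses.
URL_PATTERN = re.compile(r"\b(?:http://|www\.)[^\s<>\"']+", re.IGNORECASE)


def extract_urls(text):
    """Return URL-like strings found in text, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Trim punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample = "See http://example.org/page, and www.example.com. Not matched: example.net"
print(extract_urls(sample))
```

In this sketch, `example.net` is skipped because it has neither prefix, which mirrors the note above about what is recognized.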