usage: crawler.py [-h] [-c CATEGORIES] [-p PAGES] [-o OUTPUT_LINKS] [-O OUTPUT_DETAILS] [-n NUM_PROCS] [-t TIMEOUT_Q]
options:
-h, --help show this help message and exit
-c CATEGORIES, --categories CATEGORIES
file containing categories, one per line
-p PAGES, --pages PAGES
number of search pages to fetch and analyze
-o OUTPUT_LINKS, --output-links OUTPUT_LINKS
file to write the search result links to
-O OUTPUT_DETAILS, --output-details OUTPUT_DETAILS
file to write the detailed results to
-n NUM_PROCS, --num-procs NUM_PROCS
number of processes
-t TIMEOUT_Q, --timeout-q TIMEOUT_Q
timeout when fetching from queues, in seconds
Example: by default the program reads categories.txt; this behavior can be customized with the --categories option (short option -c).
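A minimal argparse sketch that would reproduce the interface above. This is an assumption based on the help text, not the actual source of crawler.py; in particular, the types (int for page/process counts, float for the timeout) and the categories.txt default are inferred.

```python
import argparse

def build_parser():
    # Sketch of a parser matching the usage text above; defaults and
    # types are assumptions, except the categories.txt default, which
    # the example in the help text mentions.
    parser = argparse.ArgumentParser(prog="crawler.py")
    parser.add_argument("-c", "--categories", default="categories.txt",
                        help="file containing categories, one per line")
    parser.add_argument("-p", "--pages", type=int,
                        help="number of search pages to fetch and analyze")
    parser.add_argument("-o", "--output-links",
                        help="file to write the search result links to")
    parser.add_argument("-O", "--output-details",
                        help="file to write the detailed results to")
    parser.add_argument("-n", "--num-procs", type=int,
                        help="number of processes")
    parser.add_argument("-t", "--timeout-q", type=float,
                        help="timeout when fetching from queues, in seconds")
    return parser

# Example invocation: crawler.py -c cats.txt -p 3
args = build_parser().parse_args(["-c", "cats.txt", "-p", "3"])
```

Note that argparse converts hyphenated long options to underscored attribute names, so --output-links is read back as args.output_links.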