This is a web crawler written in Python using asyncio, aiohttp, and BeautifulSoup to find and save forms on web pages.
- Python 3.7 or higher
- Libraries: `aiohttp`, `beautifulsoup4`
- Lets you select a text file containing a list of URLs, which are then crawled to a limited depth.
- Extracts and saves URLs of forms found on web pages into a text file.
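The form-extraction step could look like the following sketch. This is an illustrative assumption, not the actual `crawler.py` code: the helper name `extract_form_urls` and the choice to resolve `action` attributes against the page URL are mine.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_form_urls(html, base_url):
    """Return the absolute action URL of every <form> found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # A form without an explicit action attribute submits back to the page itself.
    return [urljoin(base_url, form.get("action") or base_url)
            for form in soup.find_all("form")]


page = '<form action="/login"></form><form></form>'
print(extract_form_urls(page, "https://example.com/page"))
# → ['https://example.com/login', 'https://example.com/page']
```

Resolving with `urljoin` keeps relative actions like `/login` usable after they are saved to the output file.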
- Clone the repository or download the `crawler.py` file.
- Make sure you have Python and the required libraries installed (`aiohttp`, `beautifulsoup4`). You can install the dependencies by running `pip install aiohttp beautifulsoup4`.
- Run the `crawler.py` script.
- Choose the text file containing the URLs to be crawled.
- Wait while the web crawler crawls the URLs and saves the forms found.
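The exact layout of the URL file isn't specified here; a plausible format, assuming one URL per line, would be:

```text
https://example.com
https://example.org/contact
```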
- `crawler.py`: The main script.
- `README.md`: This documentation file.
- Crawling is limited to a maximum depth of 3.
- Request rate is limited to avoid overloading target servers.
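The two limits above could be combined roughly as follows. This is a hypothetical sketch, not the actual implementation: the concurrency cap of 5, the `fetch_links` placeholder, and the recursive structure are all assumptions, with a depth counter enforcing the maximum of 3 and an `asyncio.Semaphore` throttling in-flight requests.

```python
import asyncio

MAX_DEPTH = 3          # stop descending beyond this depth
CONCURRENCY_LIMIT = 5  # assumed cap on simultaneous requests


async def fetch_links(url):
    # Placeholder for the real aiohttp fetch + BeautifulSoup parse.
    await asyncio.sleep(0)
    return []


async def crawl(url, depth, semaphore, seen):
    if depth > MAX_DEPTH or url in seen:
        return
    seen.add(url)
    async with semaphore:  # at most CONCURRENCY_LIMIT fetches run at once
        links = await fetch_links(url)
    await asyncio.gather(*(crawl(link, depth + 1, semaphore, seen)
                           for link in links))


async def main(start_url):
    seen = set()
    await crawl(start_url, 0, asyncio.Semaphore(CONCURRENCY_LIMIT), seen)
    return seen


print(asyncio.run(main("https://example.com")))
```

The `seen` set also prevents re-crawling pages that link to each other, which would otherwise loop forever regardless of the depth limit.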
Contributions are welcome! If you encounter issues or have suggestions for improvements, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License.