This is an automated web scraper/data extractor for the 1000 Genomes Project (https://www.internationalgenome.org/). It is written in Python 3.9 and uses the Selenium WebDriver and pandas packages. The script helps automate the extraction of VCF and FASTQ files from the project's database.
-
Get started by installing the Selenium and pandas packages for Python 3.x: pip install selenium pandas
-
Add an output CSV file.
-
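As a minimal sketch of this step, the output CSV could be created with pandas before the script runs. The column names (`sample_id`, `file_type`, `url`), the helper `create_output_csv`, and the file name `output.csv` are assumptions for illustration only; the actual script may expect different fields.

```python
import pandas as pd

# Hypothetical column names; adjust them to whatever the script records.
COLUMNS = ["sample_id", "file_type", "url"]


def create_output_csv(path, rows=None):
    """Write rows (a list of dicts) to path as CSV.

    With no rows, only the header line is written, which gives the
    script an empty output file to fill in later.
    """
    frame = pd.DataFrame(rows or [], columns=COLUMNS)
    frame.to_csv(path, index=False)  # index=False omits the row-number column
    return frame


if __name__ == "__main__":
    # Creates an empty output file containing only the header row.
    create_output_csv("output.csv")
```

Calling `create_output_csv("output.csv")` once before running the script leaves a header-only file in the working directory.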
Run the script.
NOTES: The script isn't fully functional yet and is still in development. Contributions to the project are welcome. A few simple additions are still needed before the script runs correctly.