
CommunityScrapers

This is a public repository containing scrapers created by the Stash Community.

❗ Make sure to read ALL of the instructions here before requesting any help in the Discord channel. For a more user-friendly step-by-step guide, check out the Guide to Scraping

When asking for help, do not forget to mention which version of Stash you are using, the scraper that is failing, the URL you are attempting to scrape, and your current Python version (but only if the scraper requires Python).

Note that some scrapers (notably ThePornDB for Movies and ThePornDB for JAV) require extra configuration. As of v0.24.0 this is not possible through the web interface so you will need to open these in a text editor and read the instructions to add the necessary fields, usually an API key or a cookie.

Installing scrapers

With the v0.24.0 release of Stash you no longer need to install scrapers manually: if you go to Settings > Metadata Providers you can find the scrapers from this repository in the Community (stable) feed and install them without ever needing to copy any files manually. Note that some scrapers still require manual configuration.

If you prefer to manage your scrapers manually, that is still supported as well, using the same steps as before. Manually installed scrapers and ones installed through Stash can both be used at the same time.

Installing scrapers (manually)

To download all of the scrapers at once you can clone the git repository. If you only need some of the scrapers they can be downloaded individually.

When downloading directly, click the .yml file you want, then make sure to click the Raw button, and then save the page as a file from the browser to preserve the correct format of the .yml file.

Any scraper file has to be stored in the path you've configured as your Scrapers Path in Settings > System > Application Paths, which is ~/.stash/scrapers by default. You may recognize ~/.stash as the folder where the config and database file are located.

After manually updating the contents of the scrapers folder or editing a scraper file, you need to reload the scrapers (Scrape with... -> Reload scrapers) and refresh the scene/performer edit page.

Some sites block content if the user agent is not valid. If you get some kind of blocked or denied message, make sure to configure the Scraping -> Scraper User Agent setting in Stash. Valid strings for Firefox, for example, can be found at https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent/Firefox. Scrapers for those sites should have a comment mentioning this along with a tested and working user agent string.
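Individual scrapers in this repository can also pin a user agent via the driver section of their YAML, as a number of them already do. A sketch (the UA string below is only an example and should be replaced with a current, tested one):

```yaml
driver:
  headers:
    - Key: User-Agent
      Value: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0
```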

Scrapers with useCDP set to true require that you have properly configured the Chrome CDP path setting in Stash. If you decide to use a remote instance the headless chromium docker image from https://hub.docker.com/r/chromedp/headless-shell/ is highly recommended.
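As an illustration, a remote headless-shell instance can be started with Docker like this (9222 is the image's default CDP port):

```shell
# Start a headless Chromium that exposes the Chrome DevTools Protocol on port 9222
docker run -d --rm --name headless-shell -p 9222:9222 chromedp/headless-shell
```

With this running, set the Chrome CDP path in Stash to a remote address such as http://localhost:9222/json/version instead of a local executable path.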

Python scrapers

Some scrapers require external programs to function, usually Python. All scrapers are tested with the newest stable release of Python, currently 3.12.2.

Depending on your operating system you may need to install both Python and the scrapers' dependencies before they will work. For Windows users we strongly recommend installing Python using the installers from python.org instead of through the Windows Store, and also installing it outside of the Users folder so it is accessible to the entire system: a commonly used option is C:\Python312.

After installing Python you should install the most commonly used dependencies by running the following command in a terminal window:

python -m pip install stashapp-tools requests cloudscraper beautifulsoup4 lxml

You may need to replace python with py in the command if you are running on Windows.

If Stash does not detect your Python installation you can set the Python executable path in Settings > System > Application Paths. Note that this needs to point to the executable itself and not just the folder it is in.

Manually configured scrapers

Some scrapers need extra configuration before they will work. This is awkward if you install them through the web interface, as any update will overwrite your changes.

  • ThePornDBMovies and ThePornDBJAV need to be edited to have your API key in them. Make sure you do not remove the Bearer part when you add your key.
  • Python scrapers that need to communicate with your Stash (to create markers, for example, or to search your file system) might need to be configured to talk to your local Stash: by default they use http://localhost:9999/graphql with no authentication, but if your setup differs you can find py_common/config.ini and set your own values.
  • Python scrapers that can be configured will (usually) create a default configuration file called config.ini in their respective directories the first time you run them.
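As a sketch, such a config.ini might look like the following; the section and key names here are illustrative, so check the generated file for the actual ones:

```ini
; illustrative example -- the generated file documents the real keys
[stash]
url = http://localhost:9999/graphql
api_key =
```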

Scrapers

You can find a list of sites that currently have a scraper in SCRAPERS-LIST.md.

💥 For most scrapers you have to provide the scene/performer URL

Stable build (>=v0.11.0)
Once you populate the URL field with an appropriate url, the scrape URL button will be active.

Clicking on that button brings up a popup that lets you select which fields to update.

Some scrapers support the Scrape with... function so you can use that instead of adding a URL. Scrape with... usually works with either the Title field or the filename, so make sure that they provide enough data for the scraper to work with.

A Query button is also available for scrapers that support that. Clicking the button allows you to edit the text that the scraper will use for your queries.

In case of errors or no results during scraping, make sure to check Stash's log section (Settings -> Logs -> Log Level: Debug) for more info.

For more info please check the scraping help section

Contributing

Contributions are always welcome! Use the Scraping Configuration help section to get started and stop by the Discord #scrapers channel with any questions.

The last line of a scraper definition (.yml file) must be the last updated date, in the following format:
# Last Updated Month Day, Year
  • Month = Full month name (October)
  • Day = Day of month, with leading zero (04, 16)
  • Year = Full year (2020)
Example: # Last Updated October 04, 2020
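The required date line can be produced with Python's strftime codes, as a quick illustration (not part of any scraper):

```python
from datetime import date

# %B = full month name, %d = day of month with leading zero, %Y = full year
last_updated = f"# Last Updated {date(2020, 10, 4):%B %d, %Y}"
print(last_updated)  # → # Last Updated October 04, 2020
```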

Validation

The scrapers in this repository can be validated against a schema and checked for common errors.

First, install the validator's dependencies - inside the ./validator folder, run: yarn.

Then, to run the validator, use node validate.js in the root of the repository.
Specific scrapers can be checked using: node validate.js scrapers/foo.yml scrapers/bar.yml

Docker option

Instead of installing NodeJS, Docker can be used to run the validator:

docker run --rm -v .:/app node:alpine /bin/sh -c "cd /app/validator && yarn install --silent && cd .. && node validate.js --ci"


communityscrapers's Issues

Scraper suggestion wikidata:

Wikidata has a searchable database of entities with all sorts of properties.
One of the properties is the occupation Pornographic Actor: https://www.wikidata.org/wiki/Q488111

Example query:

https://query.wikidata.org/#SELECT%20%3Fpornographic_actor%20%3Fdate_of_birth%20%3Fmass%20%3Fheight%20%3FTwitter_username%20%3Fsex_or_gender%20%3Fsex_or_genderLabel%20%3FInstagram_username%20%3FPornhub_ID%20%3FIAFD_female_performer_ID%20%3FAdult_Film_Database_actor_ID%20%3FFacebook_ID%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%22.%20%7D%0A%20%20%3Fpornographic_actor%20wdt%3AP106%20wd%3AQ488111.%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP569%20%3Fdate_of_birth.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP2067%20%3Fmass.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP2048%20%3Fheight.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP2002%20%3FTwitter_username.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP21%20%3Fsex_or_gender.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP2003%20%3FInstagram_username.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP5246%20%3FPornhub_ID.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP3869%20%3FIAFD_female_performer_ID.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP3351%20%3FAdult_Film_Database_actor_ID.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpornographic_actor%20wdt%3AP2013%20%3FFacebook_ID.%20%7D%0A%7D%0ALIMIT%20100
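Decoded from the URL above, the query reads:

```sparql
SELECT ?pornographic_actor ?date_of_birth ?mass ?height ?Twitter_username
       ?sex_or_gender ?sex_or_genderLabel ?Instagram_username ?Pornhub_ID
       ?IAFD_female_performer_ID ?Adult_Film_Database_actor_ID ?Facebook_ID
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  ?pornographic_actor wdt:P106 wd:Q488111.
  OPTIONAL { ?pornographic_actor wdt:P569 ?date_of_birth. }
  OPTIONAL { ?pornographic_actor wdt:P2067 ?mass. }
  OPTIONAL { ?pornographic_actor wdt:P2048 ?height. }
  OPTIONAL { ?pornographic_actor wdt:P2002 ?Twitter_username. }
  OPTIONAL { ?pornographic_actor wdt:P21 ?sex_or_gender. }
  OPTIONAL { ?pornographic_actor wdt:P2003 ?Instagram_username. }
  OPTIONAL { ?pornographic_actor wdt:P5246 ?Pornhub_ID. }
  OPTIONAL { ?pornographic_actor wdt:P3869 ?IAFD_female_performer_ID. }
  OPTIONAL { ?pornographic_actor wdt:P3351 ?Adult_Film_Database_actor_ID. }
  OPTIONAL { ?pornographic_actor wdt:P2013 ?Facebook_ID. }
}
LIMIT 100
```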

Scrape from Plex

Hi,

following idea/request: I have a lot of content already filled in Plex and want to import it into Stash. I could imagine that a match via file would be possible (automatically), or by entering the URL manually. What do you think?

Scrape/Import from Kodi/Plex/Emby NFO metadata files

For conversion from Kodi or similar, the ability to obtain details like the scene name, description, and actors from an NFO file matching the file name of the media in question would be ideal. I'll duplicate this issue in stashapp/stash since it probably falls outside of the scope of the community scraper project.

This is distinct from stashapp/stash#428 in that it involves actual import of metadata from a Kodi-esque library.
This is distinct from #413 in that it involves scraping from Plex/Kodi, not the export of data from Stash.
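For reference, a minimal Kodi-style NFO sitting next to the media file (e.g. scene.mkv + scene.nfo) might look like the sketch below; the field names follow the common Kodi movie schema and the values are placeholders:

```xml
<!-- illustrative sketch, not a complete Kodi schema -->
<movie>
  <title>Scene Title</title>
  <plot>Scene description.</plot>
  <premiered>2020-10-04</premiered>
  <actor>
    <name>Performer Name</name>
  </actor>
</movie>
```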

Digital Playground

When I use the Digital Playground scraper (MindgeekAPI) via URL search, all the individual tags get a comma appended after the tag, for example "Bathroom,". Can this be fixed?

realitykings scraper can't find chrome

When scraping a realitykings scene URL (https://www.realitykings.com/scene/2940891/makeup-titorial), I get the following error:
exec google-chrome: executable not found in $PATH

Stash is running in a Docker container (with sudo), which presumably does not have access to my user path.

google-chrome is in my user path:

$ which google-chrome
/usr/bin/google-chrome

How can I make Stash (running in a Docker container) see the path for $USER?

Scrape from Shoko Server with its api

A script scraper for Shoko Server using its API should make it a lot easier to scrape hentai.
Documentation: with get-episode-by-filename
URI Parameters:
filename : filename.mkv
Headers:
apikey : APIKEY
it is possible to get

{
  "type": "ep",
  "season": "1x1",
  "eptype": "Episode",
  "epnumber": 1,
  "aid": 4563,
  "eid": 63217,
  "id": 1,
  "name": "Dear Diary",
  "summary": "Brilliant but bored high school student Light Yagami suddenly finds himself holding the power of life and death in his hands—the power of the Death Note.\nSource: crunchyroll",
  "year": "2010",
  "air": "2010-04-13",
  "rating": "8.70",
  "votes": "123",
  "art": {
    "fanart": [
      {
        "url": "/api/v3/image/TvDB/Thumb/1"
      },
      {
        "url": "/api/v2/image/support/plex_404.png"
      }
    ],
    "thumb": [
      {
        "url": "/api/v2/image/support/plex_404.png"
      }
    ]
  }
}

by using the "id": 1, with get-the-series-that-owns-the-episode
URI Parameters:
id : 1
Headers:
apikey : APIKEY

{
  "type": "serie",
  "aid": 4563,
  "season": "1",
  "id": 1,
  "name": "Death Note",
  "titles": [
    {
      "Type": "main",
      "Language": "x-jat",
      "Title": "Death Note"
    },
    {
      "Type": "synonym",
      "Language": "ja",
      "Title": "デスノート"
    },
    {
      "Type": "synonym",
      "Language": "it",
      "Title": "Quaderno della Morte"
    },
    {
      "Type": "synonym",
      "Language": "pl",
      "Title": "Notes Śmierci"
    },
    {
      "Type": "synonym",
      "Language": "ar",
      "Title": "مدونة الموت"
    },
    {
      "Type": "synonym",
      "Language": "ar",
      "Title": "مذكرة الموت"
    },
    {
      "Type": "synonym",
      "Language": "pt-BR",
      "Title": "Caderno da Morte"
    },
    {
      "Type": "synonym",
      "Language": "he",
      "Title": "מחברת המוות"
    },
    {
      "Type": "synonym",
      "Language": "lt",
      "Title": "Mirties Užrašai"
    },
    {
      "Type": "synonym",
      "Language": "tr",
      "Title": "Ölüm Defteri"
    },
    {
      "Type": "synonym",
      "Language": "bg",
      "Title": "Тетрадка на Смъртта"
    },
    {
      "Type": "synonym",
      "Language": "uk",
      "Title": "Записник Смерті"
    },
    {
      "Type": "synonym",
      "Language": "sr",
      "Title": "Sveska Smrti"
    },
    {
      "Type": "synonym",
      "Language": "sr",
      "Title": "Свеска Смрти"
    },
    {
      "Type": "synonym",
      "Language": "fa",
      "Title": "دفترچه مرگ"
    },
    {
      "Type": "synonym",
      "Language": "ur",
      "Title": "موت نوٹ"
    },
    {
      "Type": "short",
      "Language": "x-jat",
      "Title": "DN"
    },
    {
      "Type": "official",
      "Language": "ja",
      "Title": "DEATH NOTE"
    },
    {
      "Type": "official",
      "Language": "en",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "de",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "fr",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "it",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "es",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "ru",
      "Title": "Тетрадь cмерти"
    },
    {
      "Type": "official",
      "Language": "ko",
      "Title": "데스노트"
    },
    {
      "Type": "official",
      "Language": "pl",
      "Title": "Notatnik śmierci"
    },
    {
      "Type": "official",
      "Language": "ar",
      "Title": "كـتـاب الـموت"
    },
    {
      "Type": "official",
      "Language": "pt",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "pt-BR",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "ca",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "cs",
      "Title": "Death Note - Zápisník smrti"
    },
    {
      "Type": "official",
      "Language": "fi",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "el",
      "Title": "Τετράδιο Θανάτου"
    },
    {
      "Type": "official",
      "Language": "hu",
      "Title": "Death Note - A halállista"
    },
    {
      "Type": "official",
      "Language": "ro",
      "Title": "Death Note - Carnetul morţii"
    },
    {
      "Type": "official",
      "Language": "sk",
      "Title": "Death Note - Zápisník smrti"
    },
    {
      "Type": "official",
      "Language": "bg",
      "Title": "Бележник на Смъртта"
    },
    {
      "Type": "official",
      "Language": "zh-Hans",
      "Title": "死亡笔记"
    },
    {
      "Type": "official",
      "Language": "es-LA",
      "Title": "Death Note"
    },
    {
      "Type": "official",
      "Language": "fa",
      "Title": "دفترچه یادداشت مرگ"
    },
    {
      "Type": "official",
      "Language": "mn",
      "Title": "Үхлийн Тэмдэглэл"
    },
    {
      "Type": "official",
      "Language": "hi",
      "Title": "डेथ नोट"
    },
    {
      "Type": "TvDB",
      "Language": "EN",
      "Title": "Death Note: The Abridged Series"
    }
  ],
  "summary": "Bored with his deteriorating world and the laconic way of his fellows, shinigami http://anidb.net/ch413 [Ryuk] drops his Death Note on Earth and watches to see if it stirs up anything interesting. His plan succeeds beyond his wildest expectations when the Death Note is found by brilliant high school senior http://anidb.net/ch301 [Yagami Light], who is also bored with a world he considers rotten.\nAlthough initially he regards the book as a prank, Light soon discovers, through experimentation, that the book's claim is true: picture a person in your mind as you write the person's name in the Death Note, and that person dies 40 seconds later of of a heart attack (although a different time frame and manner of death can be specified). Armed with that power, Light sets out on a quest he sees as noble: make the world a better place by eliminating all its criminals using the Death Note.\nSoon cast as the mysterious \"Kira\" (a Japanese pronunciation of the English \"killer\") in the media and on the Internet, some take exception to his playing god, most notably the police and the enigmatic master detective http://anidb.net/ch299 [L], who resolves to do everything in his power to stop Kira. Light counters by doing everything in his power to prevent people from identifying or interfering with him, even if that means getting rid of people investigating him.",
  "year": "2006",
  "air": "2006-10-04",
  "size": 82,
  "localsize": 37,
  "total_sizes": {
    "Episodes": 37,
    "Specials": 2,
    "Credits": 6,
    "Trailers": 37
  },
  "local_sizes": {
    "Episodes": 37
  },
  "watched_sizes": {},
  "rating": "8.6",
  "votes": "18993",
  "roles": [
    {
      "character": "L Lawliet",
      "character_image": "/api/v3/image/Shoko/Character/1",
      "staff": "Yamaguchi Kappei",
      "staff_image": "/api/v3/image/Shoko/Staff/1",
      "role": "Main Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yagami Light",
      "character_image": "/api/v3/image/Shoko/Character/2",
      "staff": "Miyano Mamoru",
      "staff_image": "/api/v3/image/Shoko/Staff/2",
      "role": "Main Character",
      "type": "Seiyuu"
    },
    {
      "character": "Mello",
      "character_image": "/api/v3/image/Shoko/Character/3",
      "staff": "Sasaki Nozomu",
      "staff_image": "/api/v3/image/Shoko/Staff/3",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Near",
      "character_image": "/api/v3/image/Shoko/Character/4",
      "staff": "Hidaka Noriko",
      "staff_image": "/api/v3/image/Shoko/Staff/4",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Amane Misa",
      "character_image": "/api/v3/image/Shoko/Character/5",
      "staff": "Hirano Aya",
      "staff_image": "/api/v3/image/Shoko/Staff/5",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Mikami Teru",
      "character_image": "/api/v3/image/Shoko/Character/6",
      "staff": "Matsukaze Masaya",
      "staff_image": "/api/v3/image/Shoko/Staff/6",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Takada Kiyomi",
      "character_image": "/api/v3/image/Shoko/Character/7",
      "staff": "Okamura Masumi",
      "staff_image": "/api/v3/image/Shoko/Staff/7",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Takada Kiyomi",
      "character_image": "/api/v3/image/Shoko/Character/7",
      "staff": "Sakamoto Maaya",
      "staff_image": "/api/v3/image/Shoko/Staff/8",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yagami Souichirou",
      "character_image": "/api/v3/image/Shoko/Character/8",
      "staff": "Uchida Naoya",
      "staff_image": "/api/v3/image/Shoko/Staff/9",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Matsuda Touta",
      "character_image": "/api/v3/image/Shoko/Character/9",
      "staff": "Naitou Ryou",
      "staff_image": "/api/v3/image/Shoko/Staff/10",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Aizawa Shuuichi",
      "character_image": "/api/v3/image/Shoko/Character/10",
      "staff": "Fujiwara Keiji",
      "staff_image": "/api/v3/image/Shoko/Staff/11",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Mogi Kanzou",
      "character_image": "/api/v3/image/Shoko/Character/11",
      "staff": "Nakai Kazuya",
      "staff_image": "/api/v3/image/Shoko/Staff/12",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Ide Hideki",
      "character_image": "/api/v3/image/Shoko/Character/12",
      "staff": "Ishikawa Hideo",
      "staff_image": "/api/v3/image/Shoko/Staff/13",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Ukita Hirokazu",
      "character_image": "/api/v3/image/Shoko/Character/13",
      "staff": "Kiuchi Hidenobu",
      "staff_image": "/api/v3/image/Shoko/Staff/14",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Kitamura Koreyoshi",
      "character_image": "/api/v3/image/Shoko/Character/14",
      "staff": "Ikeda Masaru",
      "staff_image": "/api/v3/image/Shoko/Staff/15",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yagami Sayu",
      "character_image": "/api/v3/image/Shoko/Character/15",
      "staff": "Kudou Haruka",
      "staff_image": "/api/v3/image/Shoko/Staff/16",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yagami Sachiko",
      "character_image": "/api/v3/image/Shoko/Character/16",
      "staff": "Satou Ai",
      "staff_image": "/api/v3/image/Shoko/Staff/17",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Raye Penber",
      "character_image": "/api/v3/image/Shoko/Character/18",
      "staff": "Ishikawa Hideo",
      "staff_image": "/api/v3/image/Shoko/Staff/13",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Misora Naomi",
      "character_image": "/api/v3/image/Shoko/Character/19",
      "staff": "Matsui Naoko",
      "staff_image": "/api/v3/image/Shoko/Staff/18",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Tierry Morrello",
      "character_image": "/api/v3/image/Shoko/Character/20",
      "staff": "Kirimoto Takuya",
      "staff_image": "/api/v3/image/Shoko/Staff/19",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Quillsh Wammy",
      "character_image": "/api/v3/image/Shoko/Character/21",
      "staff": "Kobayashi Kiyoshi",
      "staff_image": "/api/v3/image/Shoko/Staff/20",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Merrie Kenwood",
      "character_image": "/api/v3/image/Shoko/Character/22",
      "staff": "Nagasawa Miki",
      "staff_image": "/api/v3/image/Shoko/Staff/21",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Matt",
      "character_image": "/api/v3/image/Shoko/Character/23",
      "staff": "Nishimura Tomohiro",
      "staff_image": "/api/v3/image/Shoko/Staff/22",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Roger Ruvie",
      "character_image": "/api/v3/image/Shoko/Character/24",
      "staff": "Ootake Hiroshi",
      "staff_image": "/api/v3/image/Shoko/Staff/23",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Shimura Suguru",
      "character_image": "/api/v3/image/Shoko/Character/25",
      "staff": "Yokoo Hiroyuki",
      "staff_image": "/api/v3/image/Shoko/Staff/24",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Kida Masahiko",
      "character_image": "/api/v3/image/Shoko/Character/26",
      "staff": "Aizawa Masaki",
      "staff_image": "/api/v3/image/Shoko/Staff/25",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Hatori Arayoshi",
      "character_image": "/api/v3/image/Shoko/Character/27",
      "staff": "Tokumoto Yukitoshi",
      "staff_image": "/api/v3/image/Shoko/Staff/26",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Higuchi Kyousuke",
      "character_image": "/api/v3/image/Shoko/Character/28",
      "staff": "Futamata Issei",
      "staff_image": "/api/v3/image/Shoko/Staff/27",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Namikawa Reiji",
      "character_image": "/api/v3/image/Shoko/Character/29",
      "staff": "Nojima Hirofumi",
      "staff_image": "/api/v3/image/Shoko/Staff/28",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Ooi Takeshi",
      "character_image": "/api/v3/image/Shoko/Character/30",
      "staff": "Yanada Kiyoyuki",
      "staff_image": "/api/v3/image/Shoko/Staff/29",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Midou Shingo",
      "character_image": "/api/v3/image/Shoko/Character/31",
      "staff": "Hanawa Eiji",
      "staff_image": "/api/v3/image/Shoko/Staff/30",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Takahashi Eiichi",
      "character_image": "/api/v3/image/Shoko/Character/32",
      "staff": "Nishi Rintarou",
      "staff_image": "/api/v3/image/Shoko/Staff/31",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "John McEnroe",
      "character_image": "/api/v3/image/Shoko/Character/34",
      "staff": "Koyanagi Motoi",
      "staff_image": "/api/v3/image/Shoko/Staff/32",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Halle Lidner",
      "character_image": "/api/v3/image/Shoko/Character/36",
      "staff": "Watanabe Akeno",
      "staff_image": "/api/v3/image/Shoko/Staff/33",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Stephen Gevanni",
      "character_image": "/api/v3/image/Shoko/Character/37",
      "staff": "Takahashi Hiroki",
      "staff_image": "/api/v3/image/Shoko/Staff/34",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Anthony Rester",
      "character_image": "/api/v3/image/Shoko/Character/38",
      "staff": "Aizawa Masaki",
      "staff_image": "/api/v3/image/Shoko/Staff/25",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Rod Ross",
      "character_image": "/api/v3/image/Shoko/Character/39",
      "staff": "Aizawa Masaki",
      "staff_image": "/api/v3/image/Shoko/Staff/25",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Jack Neylon",
      "character_image": "/api/v3/image/Shoko/Character/40",
      "staff": "Matsuyama Takashi",
      "staff_image": "/api/v3/image/Shoko/Staff/35",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Demegawa Hitoshi",
      "character_image": "/api/v3/image/Shoko/Character/41",
      "staff": "Chafuurin",
      "staff_image": "/api/v3/image/Shoko/Staff/36",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Steve Mason",
      "character_image": "/api/v3/image/Shoko/Character/42",
      "staff": "Ogata Mitsuru",
      "staff_image": "/api/v3/image/Shoko/Staff/37",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "David Hoope",
      "character_image": "/api/v3/image/Shoko/Character/43",
      "staff": "Tahara Aruno",
      "staff_image": "/api/v3/image/Shoko/Staff/38",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Shibuimaru Takuo",
      "character_image": "/api/v3/image/Shoko/Character/44",
      "staff": "Nishimura Tomohiro",
      "staff_image": "/api/v3/image/Shoko/Staff/22",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Osoreda Kiichiro",
      "character_image": "/api/v3/image/Shoko/Character/46",
      "staff": "Houki Katsuhisa",
      "staff_image": "/api/v3/image/Shoko/Staff/39",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Lind L. Tailor",
      "character_image": "/api/v3/image/Shoko/Character/47",
      "staff": "Tokumoto Yukitoshi",
      "staff_image": "/api/v3/image/Shoko/Staff/26",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Ryuk",
      "character_image": "/api/v3/image/Shoko/Character/48",
      "staff": "Nakamura Shidou",
      "staff_image": "/api/v3/image/Shoko/Staff/40",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Rem",
      "character_image": "/api/v3/image/Shoko/Character/49",
      "staff": "Saitou Kimiko",
      "staff_image": "/api/v3/image/Shoko/Staff/41",
      "role": "Minor Character",
      "type": "Seiyuu"
    },
    {
      "character": "Gelus",
      "character_image": "/api/v3/image/Shoko/Character/50",
      "staff": "Matsuyama Ken'ichi",
      "staff_image": "/api/v3/image/Shoko/Staff/42",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Shidou",
      "character_image": "/api/v3/image/Shoko/Character/51",
      "staff": "Yao Kazuki",
      "staff_image": "/api/v3/image/Shoko/Staff/43",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Deridovely",
      "character_image": "/api/v3/image/Shoko/Character/52",
      "staff": "Gotou Tetsuo",
      "staff_image": "/api/v3/image/Shoko/Staff/44",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Gukku",
      "character_image": "/api/v3/image/Shoko/Character/53",
      "staff": "Oonishi Takeharu",
      "staff_image": "/api/v3/image/Shoko/Staff/45",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Zellogi",
      "character_image": "/api/v3/image/Shoko/Character/54",
      "staff": "Tokumei Kibo",
      "staff_image": "/api/v3/image/Shoko/Staff/46",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Ryuuga Hideki",
      "character_image": "/api/v3/image/Shoko/Character/55",
      "staff": "Nishi Kensuke",
      "staff_image": "/api/v3/image/Shoko/Staff/47",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yuri",
      "character_image": "/api/v3/image/Shoko/Character/56",
      "staff": "Koshimizu Ami",
      "staff_image": "/api/v3/image/Shoko/Staff/48",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Kyouko",
      "character_image": "/api/v3/image/Shoko/Character/57",
      "staff": "Kondou Kanako",
      "staff_image": "/api/v3/image/Shoko/Staff/49",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yuujin",
      "character_image": "/api/v3/image/Shoko/Character/58",
      "staff": "Kimoto Orie",
      "staff_image": "/api/v3/image/Shoko/Staff/50",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Yoshi",
      "character_image": "/api/v3/image/Shoko/Character/59",
      "staff": "Shioyama Yuka",
      "staff_image": "/api/v3/image/Shoko/Staff/51",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Aizawa Yumi",
      "character_image": "/api/v3/image/Shoko/Character/61",
      "staff": "Arishima Moyu",
      "staff_image": "/api/v3/image/Shoko/Staff/52",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Misa no Kouhai",
      "character_image": "/api/v3/image/Shoko/Character/62",
      "staff": "Kondou Kanako",
      "staff_image": "/api/v3/image/Shoko/Staff/49",
      "role": "Background Character",
      "type": "Seiyuu"
    },
    {
      "character": "Aizawa Eriko",
      "character_image": "/api/v3/image/Shoko/Character/60",
      "staff": "Inagaki Miwako",
      "staff_image": "/api/v3/image/Shoko/Staff/1566",
      "role": "Background Character",
      "type": "Seiyuu"
    }
  ],
  "tags": [
    "maintenance tags",
    "insane",
    "detective",
    "contractor",
    "shounen",
    "bishounen",
    "dynamic",
    "target audience",
    "themes",
    "fetishes",
    "setting",
    "elements",
    "time",
    "place",
    "Earth",
    "alternative present",
    "contemporary fantasy",
    "university",
    "Japan",
    "plot continuity",
    "manga",
    "United States",
    "Americas",
    "Asia",
    "fantasy",
    "thriller",
    "romance",
    "school life",
    "law and order",
    "police",
    "unsorted",
    "real-world location",
    "manipulation",
    "journalism",
    "strategy",
    "everybody dies",
    "deception",
    "world domination",
    "killing criminals",
    "vile protagonist",
    "unusual weapons -- TO BE SPLIT AND DELETED",
    "grail in the garbage",
    "dark",
    "police are useless",
    "adapted into JDrama",
    "mind games",
    "psychological",
    "murder",
    "main character dies",
    "antihero",
    "time skip",
    "following one's dream",
    "adults are useless",
    "rivalry",
    "Weekly Shounen Jump",
    "secret identity",
    "battle of wits",
    "predominantly adult cast",
    "mundane made awesome",
    "mystery",
    "just as planned",
    "TO BE MOVED TO CHARACTER",
    "technical aspects",
    "cast",
    "character related tags which need deleting or merging",
    "tropes",
    "death",
    "speculative fiction",
    "adapted into Japanese movie",
    "adapted into other media"
  ],
  "art": {
    "thumb": [
      {
        "url": "/api/v3/image/AniDB/Poster/4563"
      }
    ]
  }
}

Move "Reload Scrapers" to the top of the "Scrape With" dropdown

In the Scene -> Edit menu, there's a "Scrape With" dropdown that lists all the existing scrapers, with a "Reload Scrapers" option at the very end. With the current growth of the scraping community, finding the "Reload Scrapers" option now requires scrolling through a lot of scrapers. This is annoying when developing new scrapers, and somebody not really familiar with the UI may even miss it. Wouldn't it be better to place it at the top of the dropdown, so that it is always visible by default?

Blocked Scrapers

MGStage

Draft PR: None yet

Sites:

https://www.mgstage.com/

Why Is it blocked?:

The site shows an "Are you above 18 years old?" age gate that blocks the page. A cookie confirming the age check needs to exist for the scraper to work.

What might unblock this?:

Either the ability to set a specific cookie in a scraper, or the ability to click a site button within the scraper.
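For reference, newer Stash versions do let a scraper set cookies via a `driver` section in the scraper YAML, which is roughly what this request asks for. A sketch of what that might look like here — the cookie name `adc` and value `1` are assumptions, so verify the actual age-gate cookie in your browser's dev tools:

```yaml
# Hypothetical driver section for an MGStage scraper.
# Stash sends these cookies with every request the scraper makes.
driver:
  cookies:
    - CookieURL: "https://www.mgstage.com"
      Cookies:
        - Name: "adc"          # assumed age-check cookie name -- verify on the site
          Domain: ".mgstage.com"
          Value: "1"
          Path: "/"
```

If the site only checks for the cookie's presence (and not a server-issued value), this is usually enough to get past the gate without CDP.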


Title

Draft PR:

Sites:

Why Is it blocked?:

What might unblock this?:

(Fixed) ThePornDB and AutoTag not pulling any data

This is my first dance with Stash so very new to it - could be user error. To get a feel on how this all works I've set up Stash (on a Linux system that has working internet access), set up one folder and put in one movie. Stash is showing me the scenes but if I go to Edit and tell it to scrape with ThePornDB I get:
Error
scraper ThePornDB: http error 400:Bad Request

If I use AutoTag, I get:
Error
must not be null

I have Chrome and Lynx on the server and it has solid internet connectivity. Am I doing something wrong or missing something obvious? Again total newbie to Stash so anything's possible.

Failing to get any .py scraper to work

Every time I try to use a scraper that uses python i get:
Error running scraper script: exec: "python": executable file not found in %PATH%
Python is installed in its default location on Windows, but it seems I'm missing a setting somewhere. I put the py_common folder in .stash\scrapers but it didn't make a difference.

indexxx.com

Could you make a scraper for this site?

https://www.indexxx.com/home

In Stash, the performer cards could then load all of a performer's appearances across websites.

PS: the site is protected by Cloudflare.

PureTaboo Scraper

Getting the following error with the PureTaboo Scraper: "exec: "google-chrome": executable file not found in %PATH%".

Thanks

t

Currently resides within RealityKingsOL.yml but is not pulling any data through, probably due to a recent website update.

Broken Scrapers

Any issues with scrapers not working should be mentioned here.
The name of the scraper and the xpath or part not working would be appreciated.


Known Issues

  • IAFD may need a couple of tries to scrape (CF detection issues)
  • nhentai scraper is broken (blocked/detected by the site / CF?)

updated 2022-09-25

Scraping a URL inside a URL

It would be really helpful and I think it will solve some missing scene info problems if we are able to grab a url using xpath, visit that url and grab an html element in there.

Here is a complex scenario just to demonstrate the power of this:

iafd.com/title.rme/title=horny+vixens/year=2020/horny-vixens.htm

Consider the link above. If I could scrape the page, find the letsdoit URL (which in this case redirects to the site's main page), search for the title scraped from the IAFD page, and send a POST request, I could then scrape the scene with all the needed info.

Obviously the example I provided requires several other things, such as:

  1. inheritance of the data being scraped (in the example I used the title to demonstrate that)
  2. being able to press enter or post a form on a page
  3. being able to browse the resulting page and scrape it

I think this should be the start for getting all the other stuff I suggested done, as it represents the title of this issue.
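For what it's worth, Stash's xpath scrapers eventually gained a `subScraper` post-process action that covers the simpler half of this request: following a URL found on the first page and running another selector against the fetched result. A hypothetical sketch — the scraper name and all selectors are illustrative only, not taken from any real site:

```yaml
# Illustrative only: grab a link from the first page, then scrape the page behind it.
xPathScrapers:
  exampleScraper:
    scene:
      Details:
        # Selector on the *first* page that yields a URL...
        selector: //a[@class="external-link"]/@href
        postProcess:
          # ...then fetch that URL and apply a second selector to the response.
          - subScraper:
              selector: //div[@class="description"]/text()
```

This doesn't cover form posts or carrying scraped data into a search query, but it handles the plain "URL inside a URL" case from the issue title.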

Possible to have "Unnamed" non-results marked as unmatched?

Some scrapers (for instance, ThePornDB.yml) return non-matches as "Unnamed" results with no further data. Would it be possible to correct this? I'm not familiar enough with the scraper syntax to figure this out, if it's even possible in the scraper rather than in the main app.

Use Python webdriver against the anti-scraping websites.

I know AutoIt only runs on Windows. Yet it is easy to program, and it can drive a full instance of Internet Explorer to get the content you need. It is like CDP, but actually more powerful and versatile.
For example, IAFD.com is protected by CloudFlare, so it's hard to scrape profiles from it, but with AutoIt it would be a piece of cake. So I am going to build such a scraper and see how it works with Stash.

I think the biggest issue here is: is it OK to use an AutoIt script instead of Python and JS? It's not cross-platform and requires you to download extra software to run it.

Just found out that Python with Selenium's webdriver can essentially do the same thing, and it's cross-platform. So it's better to give Python/webdriver a try instead. I can also learn Python along the way.

The best way to scrape a protected website is to drive a full browser against it. Basically it will be What-You-See-Is-What-You-Get style.

What's your opinion about this? Please let me know, thank you!

Feature Request: Add Support for plex

Hey, I would love to use these scrapers with my Plex server — maybe as an organizer before adding files to Plex, and/or as a Plex metadata agent that uses these scrapers. Maybe someone else wants this too and has some time for it. I would really love this feature for my Plex. Thanks!

Scraper does not open

To start: I put the scrapers in the "scrapers" folder and restarted the program, and they are present

sshot-1

I go to the performer tab, but from there I cannot open and view the scrapers

sshot-2

MyDirtyHobby scraper retrieves data in "wrong" language

The MyDirtyHobby scraper retrieves all data (especially title & description) in English, although I want to scrape data from a German performer with German titles & descriptions.

It seems MyDirtyHobby implemented some sort of auto-translation, because if you change the language on the website the title & description are shown in the other languages but poorly translated.

Is it possible to add an option to the scraper (e.g. via comment/uncomment) to let the user choose the language in which the data is retrieved?
Or alternatively, if that is impossible: is it possible to release a German version of the MyDirtyHobby scraper and rename the original one to "English version" or "Standard version"?

That would be highly appreciated!! Although I retrieve some text now, it is just weird for a German performer who only speaks German. :)

[Discussion] RealityKingsOL & MindGeek

I think we need to discuss these scrapers.

As stated by @bnkai :

Imo we should keep the 2 separate scrapers and if needed make a 3rd.
Verifying/maintaining a scraper with a huge list of urls is more difficult than a couple of scrapers with half the urls.

I am OK with separate long lists of URLs, but I want to know what determines where to put each URL.
Should we make:

  • Main MindGeek scraper with main sites, like Babes, Mofos, Fakehub,.. ?
  • Secondary MindGeek with series site like momslickteens, welivetogether, iknowthatgirl,... ?

@budislov made a comment in a previous PR:

I noticed that the site fakehub.com should be moved from RealityKings/MindGeek to RealityKingsOL.

I would like to know why.

I think we need to rename the RealityKingsOL too.

Scraping clip information from DVD sites

Proposing to add a way to scrape specific scenes from a movie page.

example: https://www.hotmovies.com/video/320614/Babysitting-The-Baumgartners/

I would like to scrape clip 10 to get the actors, the very specific tags that this site provides and the scene number!

One way to do it is by appending the scene ID as a URL fragment after the link: https://www.hotmovies.com/video/320614/Babysitting-The-Baumgartners/#2193787

But obviously this does not get picked up by the scraper, and I am guessing development is needed before that part can be picked up.

Another way would be to provide the clip number as a fragment in the link: https://www.hotmovies.com/video/320614/Babysitting-The-Baumgartners/#10 -- to indicate I want to scrape scene 10's info.

Scrapers Request

updated on April 22, 2024

This issue is to remain open as an ongoing list.
I will cross out entries once we get them made.

Any of these can be worked on at any time.

Please request sites using this style:
name of site | url | free/paid/premium (pay for more) | requires subscription to view (yes/no) | notes

| Site | URL | Free/Paid/Premium | Sub Req (Yes/No) | Notes | Worked On By |
| --- | --- | --- | --- | --- | --- |
| abdreams | https://abdreams.com/ | paid | yes | none | |
| abplayhouse | http://www.abplayhouse.com/ | paid | no | none | |
| Allover30 | https://www.allover30.com/ | paid | yes | none | |
| bailey jay | https://www.ts-baileyjay.com/ | paid | no | none | |
| diapermess | https://www.diapermess.com/ | paid | no | none | |
| erodougazo | https://erodougazo.com/ | free | no subscription | detailed Performer & film metadata, JAV only | |
| Gelbooru | https://gelbooru.com | free | no | mostly images, but also some scenes | |
| Max Hardcore | https://www.max-hardcore.com/ | paid | no | Scenes and models | |
| natalie mars | https://nataliemars.com/ | paid | no | none | |
| Pascals Sub Sluts | https://www.pascalssubsluts.com/submissive/ | paid | yes | Scenes and models | |
| punishedindiapers | http://punishedindiapers.com/ | paid | no | none | |
| redtube | https://www.redtube.com/ | premium | no | none | |
| Suicide Girls | https://www.suicidegirls.com | paid | yes | probably galleries and scenes | |
| Triga Films | https://trigafilms.com/ | paid | no | Doesn't require a subscription to view details, but does require a free login | |
| youtube | https://youtube.com | premium | no | none | |
| 21roles | https://www.21roles.com/ | paid | no | none | halorrr |
| adult time studios | https://www.adulttime.com/ | paid | yes | none | Belleyy |
| angela white | http://angelawhite.com/ | paid | no | none | halorrr |
| avmoo | https://avmoo.online/ | free | no subscription | Scenes and Performers, images for both, changes domain name frequently, JAV only | MortonBridges |
| Broke Straight Boys | https://brokestraightboys.com/ | paid | no | Mostly clips | Maista6969 |
| Cadinot | https://www.cadinot.fr/en | paid | no | none | Maista6969 |
| Carnal+ | https://carnalplus.com/ | paid | no | none | Maista6969 |
| Chanta's Bitches | https://www.twistedfactory.com/ | free | no | Scenes and Performers | MortonBridges |
| CockyBoys | https://cockyboys.com/ | paid | no | none | Maista6969 |
| Colby Knox | https://www.colbyknox.com/ | paid | no | Scenes and Performers | DogmaDragon |
| Colette | https://www.colette.com/ | paid | no | none | bnkai |
| dickdrainers | http://dickdrainers.com/ | paid | yes | just the scenes | niemands |
| dorcelclub | https://www.dorcelclub.com/ | paid | no | none | bnkai |
| Facial Abuse | https://tour5m.facialabuse.com | paid | no | scenes | MortonBridges |
| Fetish Network | http://www.fetishnetwork.com/ | paid | no | sub sites can be scraped from the main site | niemands |
| FuckedHard18 | http://fuckedhard18.com/membersarea/ | paid | no | no | |
| Gay Erotic Video Index | https://gayeroticvideoindex.com/ | free | no | none | Maista6969 |
| girlsway | https://www.girlsway.com/ | paid | no | none | halorrr |
| Gloryhole Secrets | https://www.gloryholesecrets.com/ | paid | no | none | niemands |
| Guys in Sweatpants | https://guysinsweatpants.com | paid | no | none | Maista6969 |
| HarmonyVision | https://harmonyvision.com/ | paid | no | none | bnkai |
| hobby.porn | https://hobby.porn/ | free | no subscription | Scenes and Performers, contains a lot of deleted information from PornHub | IAmKontrast |
| hucows | https://www.hucows.com/ | paid | no | none | bnkai |
| Hung Young Brit | https://www.hungyoungbrit.com/ | paid | no for Scenes, yes for Performers | none | MortonBridges |
| incestflix | http://www.incestflix.com/ | free | no | none | plz12345 |
| Jacquie & Michel TV | https://www.jacquieetmicheltv.net | paid | no | blocked | niemands |
| javdatabase | https://www.javdatabase.com/ | free | no | Scenes & Performers, images available for both, JAV only | MortonBridges |
| lordaardvark | http://www.lordaardvark.com/ | free | no | none | nrg101 |
| Mandy Flores | https://mandyflores.com/ | paid | no | none | niemands |
| metart | https://www.metart.com/ | premium | no | none | halorrr |
| motherless | https://motherless.com/ | free | no | none | halorrr |
| My Dirty Hobby | https://www.mydirtyhobby.com/ | paid | no | Scenes and Models | bnkai |
| NYSEED | https://nyseedxxx.com/ | paid | no | Scenes and Performers | MortonBridges |
| Older4Me | www.older4me.com/home | paid | no | AdultSiteRunner network | Maista6969 |
| rachel steele | https://rachel-steele.com/ | paid | no | none | VillageIdiot |
| Raw Fuck Club | https://www.rawfuckclub.com/ | premium | no | Scenes and Performers | |
| shiny's bound sluts | https://shinysboundsluts.com/ | paid | yes | browsable without membership | DogmaDragon |
| spankbang | https://spankbang.com/ | free | no | none | halorrr |
| strokies | http://www.strokies.com/ | paid | no | none | DogmaDragon |
| studiofow | https://studiofow.com/ | free | no | none | bnkai |
| Tadpolexstudio | https://www.tadpolexstudio.com | paid | no | non-VR scenes for tadpole studio | Maista6969 |
| TMDB | https://www.themoviedb.org/ | free | no | Scenes and Performers | KennyG |
| Trailer Trash Boys | https://trailertrashboys.com/ | paid | no | Mostly clips | MortonBridges |
| traxxx | https://traxxx.me | free | no | none | stg-annon |
| Treasure Island Media | https://www.treasureislandmedia.com/ | premium | no | Scenes and Performers | MortonBridges |
| tushyraw | https://www.tushyraw.com/ | paid | no | none | ueaslsef |
| tushy | https://www.tushy.com/ | paid | no | none | ueaslsef |
| Twisted Factory | https://www.twistedfactory.com/ | free | no | Scenes and Performers | MortonBridges |
| watchmygf | https://www.watchmygf.me | free | no | none | halorrr |
| xhamster | http://xhamster.com/ | premium | no | none | bnkai |
| xnxx | https://www.xnxx.com/ | premium | no | none | niemands |
| xtube | https://www.xtube.com/ | premium | no | none | niemands |
| WTFPass | https://wtfpass.com/ | paid | no | none | niemands |
| xvideos | https://www.xvideos.com/ | premium | no | none | niemands |
| youporn | https://youporn.com | free | no | none | halorrr |

notes

  • none as of yet.

Fine tune FreeonesCommunity xpath performer scraper

Since the NewFreeones scraper is now used internally in stash it needs some more fine tuning.

As mentioned in the Discord channel, when the birthdate was missing the country was also left out,
e.g https://www.freeones.xxx/alexa-thomas/profile

Making a couple of changes like

Birthdate:
   selector: //div[p[text()='Personal Information']]//div//p/a/span[contains(text(),'Born On')]

and

Country: //div[p[text()='Personal Information']]//div//p//a[@data-test="link-country"]

seems to fix this behaviour

More feedback is welcome

Scraper location

Having an issue where the community scrapers are not showing up in the web GUI. I have moved the .yml files to the \TOWER\cache\appdata\stash\scrapers directory. Am I doing something wrong?

403:Forbidden from CzechVR.com

When trying to scrape CzechVR scenes using the CzechVR.yml scraper I get a "http error 403:Forbidden" error.

This only happens specifically with czechvr URLs, czechvrcasting and czechvrfetish ones work fine.

I have set a valid string for my Scraper User Agent.

Scraper Target CSS Pseudo-Elements

Sorry if this isn't the appropriate place to ask this, but I've been searching off-and-on for about a week and can't find an answer.

I'm trying to create a Scraper for Strokies. I'm a web developer but not experienced with Python in general. I got it mostly working. The only problem is most of the DIVs don't have classes or IDs, so I was wondering if I could target them with something like "last-of-child" or "nth-of-type", etc.

Here's what I got and only the Title works. Hopefully someone knows whether or not this is possible and the proper syntax to do it.

name: "Strokies"
sceneByURL:
  - action: scrapeXPath
    url:
      - strokies.com/video/
    scraper: strokiesScraper
xPathScrapers:
  strokiesScraper:
    scene:
      Title:
        selector: //h1/text()
      Date:
        selector: //div[@class="video-info"]/div/p:nth-of-type(3)/text()
        postProcess:
          - parseDate: Jan 2, 2006
      Details:
        selector: //div[contains(@class, "video-text")]/div:nth-of-type(4)/p
        concat: "\n\n"
      Performers:
        Name: //div[contains(@class, "video-text")]/div:nth-of-type(2)/a
      Tags:
        Name: //div[contains(@class, "video-text")]/div:nth-of-type(3)/a
      Image:
        selector: //img[@class="vjs-tech"]
        postProcess:
          - replace:
              - regex: .+(?:poster=)([^"]*)
                with: $1
      Studio:
        Name:
          fixed: Strokies

Any help with that syntax would be greatly appreciated.

TIA!
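Not a pseudo-element answer, but XPath has its own positional predicates that do the same job as CSS `:nth-of-type`: `p[3]` selects the third `p` among its matching siblings, and `div[last()]` the last `div`. A sketch of how the failing selectors above might look rewritten that way — the indices are taken from the original CSS and are guesses, not verified against the site:

```yaml
# XPath positional predicates in place of CSS :nth-of-type (indices unverified)
Date:
  selector: //div[@class="video-info"]/div/p[3]/text()
  postProcess:
    - parseDate: Jan 2, 2006
Details:
  selector: //div[contains(@class, "video-text")]/div[4]/p
  concat: "\n\n"
Performers:
  Name: //div[contains(@class, "video-text")]/div[2]/a
Tags:
  Name: //div[contains(@class, "video-text")]/div[3]/a
```

Note that the `p:nth-of-type(3)` syntax is CSS, which Stash's `scrapeXPath` action doesn't understand; the bracketed XPath form is the direct equivalent.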

How to scrape blacked?

Hello, sorry if I sound very ignorant, but I am ignorant in these matters.
How can I scrape from the blacked page?
I downloaded the community scrapers pack, and all the sites have worked for me so far by putting the URL in each scene.
But with Blacked or Vixen it just doesn't work, do I have to do anything else?

I can't do it with Brazzers either, researching I found out that you need something called CDP and a "Scraper User Agent".
I don't know what either is. I'm an idiot when it comes to anything programming or these things. Honestly, I just wanted to get my porn organized and on the internet I came across this program (or whatever), and found it interesting, but it requires more than I am able to give it to get the most out of it.

[Bug] [iafd.com] Scraper

Performer scrapers

When looking up an actress with the performer scrapers: if there are several actresses with the same name, several names are displayed in the list, but no matter which entry in the list you select, the same actress is always loaded.

Desktop (please complete the following information):

OS: Linux
Browser Firefox, Chrome


Support nationality to country post-processing option

Right now, some websites include the country of the performers (Spain, Peru...) whereas others include the nationality (Spanish, Peruvian). The flag in the performer page only seems to parse countries and not nationalities. Would it make sense to add a post-processing option (similar to feetToCm or lbToKg) that would convert nationalities to countries, for more standardization?
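In the meantime, the existing `map` post-process action can do this on a per-scraper basis, at the cost of listing each nationality by hand. A sketch — the selector and the three mappings are illustrative, not from any real scraper:

```yaml
# Per-scraper workaround: translate nationalities to countries with map
Country:
  selector: //span[@class="nationality"]/text()   # hypothetical selector
  postProcess:
    - map:
        Spanish: Spain
        Peruvian: Peru
        German: Germany
```

A built-in `nationalityToCountry` option would obviously scale better than maintaining such lists in every scraper.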

Scraping with IAFD

I have problems with IAFD scraping,

I set the user agent, but it doesn't return any results.

sshot-1

sshot-2

Firefox 78.14.0esr (64 bit)

"ParseDateStringAsTime failed: dateString <>"

On every scan, the following entries are logged:

time="2020-09-13T20:07:59-03:00" level=info msg="Starting scan of 1107 files. 1 New files found"
time="2020-09-13T20:08:01-03:00" level=error msg="ParseDateStringAsTime failed: dateString <>"
time="2020-09-13T20:08:02-03:00" level=info msg="Finished scan"
time="2020-09-13T20:08:02-03:00" level=info msg="Finished gallery association"

The '1 New files found' entry reappears on each scan.
Running version 0.3.0.
The "Set name, date, details from metadata (if present)" option (disabled or enabled) does not change this outcome.

Attempting to set generated file naming hash to oshash gives the error "GraphQL error: some oshash values are missing on scenes. Run Scan to populate" which will not go away, presumably because of the lingering 1 "new" file.

Even with log verbosity set to Debug or Trace, the logs do not indicate which file or path is new and causing the date parsing failure.

Is it possible to scrape and create a performer/uploader for PH videos?

Installed Stash yesterday and testing atm.

Currently the PH scraper is able to automatically fetch the url, description, upload date, cover and all the tags from properly named videos downloaded via youtube-dl/dlp.

Manually creating and fetching performer details via the scraper is also working.

Is it possible to fetch and create the performer/uploader automatically for PH videos?

Thanks for your hard work 😊

! Stash xPath scraper code refactoring !

As of commit 2b92157 (PR stashapp/stash#616) in the dev branch of stash, the scraper code was refactored, adding new functionality and enhancing the existing one.
Although the xpath scraper code is backwards compatible, scrapers added from now on should use the new postProcess field as stated in the PR. Extra functionality like the map post-processing action and the fixed field, along with some examples, is also explained in the PR.
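For illustration, a minimal fragment using the new postProcess field together with the map action and a fixed value — the scraper name and selectors are made up, not from any real scraper:

```yaml
# Illustrative use of the refactored postProcess field, map, and fixed
xPathScrapers:
  exampleScraper:
    scene:
      Date:
        selector: //span[@class="date"]/text()
        postProcess:
          - parseDate: 2006-01-02
      Studio:
        Name:
          fixed: Example Studio   # constant value, no selector needed
    performer:
      Gender:
        selector: //span[@class="sex"]/text()
        postProcess:
          - map:                  # translate the site's values to Stash's
              M: Male
              F: Female
```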

Javlibrary broken

Get this:
"http error 403:Forbidden"

Seems to be a recent thing because it worked a couple of days ago.
