danielnieto / scrapman
Retrieve real (JavaScript-executed) HTML from a URL; ultra fast, with support for loading multiple web pages in parallel
License: MIT License
Hi,
I am looking for something similar. Just wanted to know if it's possible yet to set and use cookies in the Electron app across webviews through the module?
Thanks
Using parallel requests shows erratic link counts for https://www.nytimes.com. When using 5 concurrent requests with the code below, the following link counts were generated:
358
358
659
659
656
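For anyone trying to reproduce counts like these, here is a minimal sketch of how they could be gathered. It is not the snippet originally posted with this issue. The `load(url, cb)` parameter is an assumption: any callback-style loader that calls `cb(html)` once, in the manner of the `scrapman.load` callback mentioned in this thread. `countLinks` and `countLinksInParallel` are hypothetical helper names, and the regex is only an approximation of what an HTML parser would count.

```javascript
// Count <a> tags that carry an href attribute in an HTML string.
// Regex-based, so it is an approximation rather than a real parse.
function countLinks(html) {
  const matches = html.match(/<a\s[^>]*\bhref\s*=/gi);
  return matches ? matches.length : 0;
}

// Start `times` loads of the same URL at once and collect the link counts.
// `load(url, cb)` is assumed to call cb(html) exactly once per request.
function countLinksInParallel(load, url, times) {
  const runs = [];
  for (let i = 0; i < times; i++) {
    runs.push(
      new Promise((resolve) => load(url, (html) => resolve(countLinks(html))))
    );
  }
  return Promise.all(runs);
}
```

If the counts differ between runs of the same URL, comparing the raw HTML of a low-count and a high-count run would show whether the page was captured before its scripts finished.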
"If you build it, they will come." - Field of Dreams
Scrapman appears to hang after making requests on Ubuntu 16.04. Checking the processes (ps aux | grep electron) shows that electron is still running (see picture below). Sending a SIGTERM to the electron processes (killall electron) makes the scrapman script exit. Please let me know if you need any more info.
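To make a hang like this visible in a script instead of waiting forever, one option is to wrap the load call in a timeout. This is only a sketch under assumptions: `loadWithTimeout` is a hypothetical helper, and `load(url, cb)` stands in for any callback-style loader that calls `cb(html)` on success. It reports the missing callback but does not stop any electron processes that are still running.

```javascript
// Wrap a callback-style loader so that a callback that never fires
// surfaces as a rejected promise after `ms` milliseconds.
function loadWithTimeout(load, url, ms) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`no callback from ${url} after ${ms} ms`)),
      ms
    );
    load(url, (html) => {
      clearTimeout(timer); // the callback arrived in time
      resolve(html);
    });
  });
}
```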
Does scrapman fully render web pages with JavaScript? There are about 300 more links when scraping www.nytimes.com with Nightmare, even though both packages use Electron:
Scrapman.js.txt
Nightmare.js.txt
Nice idea on parallel requests btw.
Hi Daniel, I am having trouble getting Scrapman to work on Ubuntu (16.04): it just hangs and never invokes the callback when calling scrapman.load. I noticed in a prior post you mentioned that you had not tested it on Linux (although that was back in 2017). I have checked that I have a Linux Electron install, so that is not the issue. Thoughts welcome, and thanks.