amalrajan / learncpp-download
An advanced web scraper tool that seamlessly fetches and combines over 350 online tutorials into a convenient offline PDF format.
License: MIT License
(process:18596): GLib-GIO-WARNING **: 21:51:57.865: Unexpectedly, UWP app `Microsoft.ScreenSketch_11.2309.16.0_x64__8wekyb3d8bbwe' (AUMId `Microsoft.ScreenSketch_8wekyb3d8bbwe!App') supports 29 extensions but has no verbs
(process:18596): GLib-GIO-WARNING **: 21:51:57.909: Unexpectedly, UWP app `Clipchamp.Clipchamp_2.8.1.0_neutral__yxz26nhyzhsrt' (AUMId `Clipchamp.Clipchamp_yxz26nhyzhsrt!App') supports 41 extensions but has no verbs
It stops at 99.7%.
Empty Download folder.
When following the instructions in the README.md with a fresh Python 3 environment, main.py fails to execute. The following modules need to be installed (using pip):
Is the step after wkhtmltopdf: python3 main.py [-h] [-o OUTPUT] [--nopdf], or something else?
I ran the wkhtmltopdf application, but it only opens for an instant.
docker run --rm --name=learncpp-download --mount type=bind,destination=/app/downloads,source=/home/siemens/learncpp/learncpp-download/source/downloads --shm-size=1.14gb amalrajan/learncpp-download
2023-11-21 13:41:04,311 WARNING services.py:1826 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 1224069120 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=8.98gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2023-11-21 13:41:04,434 INFO worker.py:1636 -- Started a local Ray instance.
Traceback (most recent call last):
File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/usr/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/usr/lib/python3.10/http/client.py", line 941, in connect
self.sock = self._create_connection(
File "/usr/lib/python3.10/socket.py", line 824, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/source/main.py", line 27, in <module>
instance = render.WeasyRender()
File "/app/source/helper/render.py", line 99, in __init__
self.urls = self.get_urls(cooldown)
File "/app/source/helper/render.py", line 27, in get_urls
return scraper.get_urls(cooldown)
File "/app/source/helper/scraper.py", line 20, in get_urls
sauce = urllib.request.urlopen(req).read()
File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.10/urllib/request.py", line 519, in open
response = self._open(req, data)
File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/usr/lib/python3.10/urllib/request.py", line 1377, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
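The traceback above ends in a name-resolution failure, so the request never left the container. A minimal sketch of a pre-flight check that could be run in the same environment before starting the scraper is shown here; `can_resolve` is a hypothetical helper, not part of learncpp-download, and the `resolver` parameter exists only so the check can be exercised without a network.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` is injectable (an assumption of this sketch) so the
    function can be tested without touching the network.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # The same exception type that ends the first traceback above.
        return False
```

Calling `can_resolve("www.learncpp.com")` inside the container and getting `False` would point at the container's DNS or network configuration rather than at the scraper itself.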
WeasyPrint could not import some external libraries. Please carefully follow the installation steps before reporting an issue:
[Installation Steps](https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#installation)
[Troubleshooting Guide](https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#troubleshooting)
Traceback (most recent call last):
File "C:\Users\WDAGUtilityAccount\Documents\learncpp-download-master\source\main.py", line 5, in <module>
from helper import render
File "C:\Users\WDAGUtilityAccount\Documents\learncpp-download-master\source\helper\render.py", line 10, in <module>
import weasyprint
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\weasyprint\__init__.py", line 387, in <module>
from .css import preprocess_stylesheet # noqa isort:skip
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\weasyprint\css\__init__.py", line 25, in <module>
from . import computed_values, counters, media_queries
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\weasyprint\css\computed_values.py", line 11, in <module>
from ..text.ffi import ffi, pango, units_to_double
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\weasyprint\text\ffi.py", line 428, in <module>
gobject = _dlopen(
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\weasyprint\text\ffi.py", line 417, in _dlopen
return ffi.dlopen(names[0]) # pragma: no cover
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\cffi\api.py", line 150, in dlopen
lib, function_cache = _make_ffi_library(self, name, flags)
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\cffi\api.py", line 832, in _make_ffi_library
backendlib = _load_backend_lib(backend, libname, flags)
File "C:\Users\WDAGUtilityAccount\AppData\Local\Programs\Python\Python310\lib\site-packages\cffi\api.py", line 827, in _load_backend_lib
raise OSError(msg)
OSError: cannot load library 'gobject-2.0-0': error 0x7e. Additionally, ctypes.util.find_library() did not manage to locate a library called 'gobject-2.0-0'
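The error message itself says that `ctypes.util.find_library()` could not locate `gobject-2.0-0`. A small sketch that repeats that lookup before `weasyprint` is imported is given below; `missing_native_libs` is a hypothetical helper, and the only library name taken from this page is the one in the traceback (WeasyPrint may need others that are not listed here).

```python
import ctypes.util


def missing_native_libs(names, finder=ctypes.util.find_library):
    """Return the library names that `finder` cannot locate.

    `finder` defaults to ctypes.util.find_library, the same lookup the
    error message above refers to; it is injectable for testing.
    """
    return [name for name in names if finder(name) is None]
```

For example, `missing_native_libs(["gobject-2.0-0"])` returning a non-empty list suggests following the WeasyPrint installation steps linked above before running main.py.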
$ docker run --rm --name=learncpp-download -v learncpp-download:/app/downloads --shm-size=10.17gb amalrajan/learncpp-download
[======================================================------] 90.0% ...
It reaches 90% almost instantly and then simply exits. The mounted download folder is empty, and the process appears to exit with exit code 0.
The log is as follows:
(download_file pid=70080) ERROR:root:unable to download: http://www.learncpp.com/cpp-tutorial/introduction-to-these-tutorials#FAQ
(download_file pid=70080) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/configuring-your-compiler-compiler-extensions/
(download_file pid=70081) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/introduction-to-cplusplus/
(download_file pid=70081) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/configuring-your-compiler-warning-and-error-levels/
(download_file pid=70081) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/introduction-to-iostream-cout-cin-and-endl/
(download_file pid=70084) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/installing-an-integrated-development-environment-ide/
(download_file pid=70084) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/comments/
(download_file pid=70084) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/introduction-to-expressions/
(download_file pid=70083) ERROR:root:unable to download: http://www.learncpp.com/cpp-tutorial/introduction-to-these-tutorials#FAQ
(download_file pid=70083) ERROR:root:unable to download: https://www.learncpp.com/cpp-tutorial/compiling-your-first-program/
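One detail visible in this log is that the same URL, carrying a `#FAQ` fragment, is attempted by two different workers. Whether that contributes to the failure is not established here; the sketch below only illustrates how a URL list could be stripped of fragments and de-duplicated before being handed to the download workers. `unique_pages` is a hypothetical helper, not the project's code.

```python
from urllib.parse import urldefrag


def unique_pages(urls):
    """Drop #fragments and return each page URL once, in first-seen order."""
    seen = set()
    pages = []
    for url in urls:
        # urldefrag splits "page#fragment" into (page, fragment).
        page, _fragment = urldefrag(url)
        if page not in seen:
            seen.add(page)
            pages.append(page)
    return pages
```

The scheme is left untouched on purpose: the log mixes `http://` and `https://` links, and rewriting one into the other would be a further assumption.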
Hello,
I think your program does not work with Python versions 3.8 and above.
WeasyPrint raises an error on import.
I found a possible solution: https://stackoverflow.com/questions/63449770/oserror-cannot-load-library-gobject-2-0-error-0x7e
But for this solution I have to download GTK. Is there a way I can use it without GTK?
My Python version is 3.9.7.
The web scraper tries to scrape local JavaScript functions, such as the "show solution" button in quizzes, which inevitably returns void.
I do not know much about web scraping, but it would be nice if the bot opened those functions and copied the solution instead, or something like that.
P.S. You may tell me to use the HTML documents, but it would be nice to also have the answers visible in the PDF.
When following instructions, the "ray" dependency isn't automatically installed. I had to manually install ray. Not a big deal though, thank you for your work!
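A quick way to spot dependencies like this before running main.py is to ask Python which modules it can find. The following is a minimal sketch; `missing_modules` is a hypothetical helper, and the module names `ray` and `weasyprint` are taken from the reports and tracebacks on this page, not from the project's requirements file.

```python
import importlib.util


def missing_modules(names, find_spec=importlib.util.find_spec):
    """Return the top-level module names Python cannot locate.

    find_spec only locates a top-level module; it does not import it,
    so this check does not trigger import-time errors.
    """
    return [name for name in names if find_spec(name) is None]
```

Any name returned by `missing_modules(["ray", "weasyprint"])` would then be installed with pip, as the reporter did for ray.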
Hi,
Thank you for writing this software,
but I have a problem:
it doesn't download anything and only displays this output,
and it just creates the downloads directory.
2023-10-15 09:11:30,546 INFO worker.py:1636 -- Started a local Ray instance.
(process:1140): GLib-GIO-WARNING **: 11:13:56.138: Unexpectedly, UWP app `Microsoft.ScreenSketch_11.2305.26.0_x64__8wekyb3d8bbwe' (AUMId `Microsoft.ScreenSketch_8wekyb3d8bbwe!App') supports 29 extensions but has no verbs
Traceback (most recent call last):
File "C:\Users\abhayhm\Desktop\learncpp\learncpp-download\source\main.py", line 23, in <module>
ray.init(log_to_driver=False)
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\worker.py", line 1534, in init
_global_node = ray._private.node.Node(
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\node.py", line 287, in __init__
self.start_head_processes()
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\node.py", line 1164, in start_head_processes
self.start_monitor()
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\node.py", line 1067, in start_monitor
process_info = ray._private.services.start_monitor(
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\services.py", line 1957, in start_monitor
process_info = start_ray_process(
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\services.py", line 904, in start_ray_process
ray._private.utils.set_kill_child_on_death_win32(process)
File "C:\Users\abhayhm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\ray_private\utils.py", line 916, in set_kill_child_on_death_win32
raise OSError(ctypes.get_last_error(), "AssignProcessToJobObject() failed")
OSError: [Errno 0] AssignProcessToJobObject() failed
I have followed all the given steps but am still unable to download:
python3 main.py
2023-11-21 19:18:56,214 INFO worker.py:1636 -- Started a local Ray instance.
[============================================================] 99.7% ...