Comments (5)
The problem you see is that the new detail pages from tvgids.nl are slow and can therefore time out. My guess at present is that this happens because they are now created on demand. When I first tested these new pages, they often failed the first time but always succeeded the second time.
I also see you have a problem creating part of the error message text. I will check this out.
from tvgrabpyapi.
There are alternative JSON detail pages, but they lack some of the information, such as season/episode data.
from tvgrabpyapi.
Any news on this issue, or suggestions for work-arounds?
The global_timeout setting is set to 10 seconds, which seems a reasonable time for tvgids.nl to generate pages on demand. Does TVGrabAPI only fetch the actual details page, or also everything it links to (i.e. images, ads, ...)? If I manually navigate to an arbitrary detail page on tvgids.nl, the page loads very fast with Ghostery enabled, but takes much longer if Ghostery is disabled (and thus the browser needs to load all ads and such). So if the grabber also fetches all images and ads, that could explain a time-out.
Are you sure though this issue is caused by time-outs, and not for example due to tvgids.nl returning a temporary error page or so? I think it would be helpful if the grabber would output more details on errors like these, like the full page source if there was an error parsing the data, or a message saying that the page could not be fetched due to a time-out.
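To illustrate the kind of error reporting meant here, below is a minimal sketch using only the Python standard library. It is not the grabber's actual code: `describe_fetch_failure` and `fetch_detail_page` are hypothetical names, and whether TVGrabAPI fetches pages this way is an assumption. Note that `urllib.request.urlopen` requests only the URL it is given; it does not load images, ads, or scripts referenced by the page.

```python
import socket
import urllib.error
import urllib.request


def describe_fetch_failure(exc, url):
    """Turn an exception raised while fetching `url` into a readable message."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "%s: server answered with HTTP status %d" % (url, exc.code)
    # URLError wraps the underlying cause in `.reason`; a bare socket.timeout
    # has no such attribute, so fall back to the exception itself.
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, socket.timeout):
        return "%s: request timed out" % url
    return "%s: fetch failed (%s)" % (url, reason)


def fetch_detail_page(url, timeout=10.0):
    """Return (body, None) on success or (None, message) on failure.

    Hypothetical helper; the 10 second default mirrors the global_timeout
    value mentioned in this thread.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except (urllib.error.URLError, socket.timeout) as exc:
        return None, describe_fetch_failure(exc, url)
```

With messages like these in the log, a time-out can be told apart from an HTTP error status. A page that loads but fails to parse would still need to be reported separately, for example by logging the page source.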
I have created a small test set-up with empty cache database, only 2 channels enabled, fetching only 2 days, and all (detail) sources disabled apart from tvgids.nl. Indeed on the first run I see some errors like originally described, and on the second run (even after clearing the cache database) no errors are being reported.
In my regular set-up, I run the grabber every morning at 8:20, and this morning it took a little over 2.5 hours to complete. As such running the grabber twice isn't really an option I think; it simply takes too long, and by the time you run the grabber for the second time, tvgids.nl may have already expired the pages that were cached on the first run.
As for alternative detail sources, my set-up is dependent on information like season/episode, so I would prefer to fetch and combine as much information as possible from the available sources. So if at all possible, I would prefer to keep using the tvgids.nl HTML pages instead of JSON endpoints.
However, if there is no good work-around for this issue, and if other sources provide episode information as well, it could make sense to switch to the JSON endpoints for tvgids.nl. So two questions:
- Are there any other detail sources (my setup lists tvgids.tv, npo.nl, primo.eu) that provide episode information for the main Dutch channels (NPO, RTL, SBS, maybe BBC 1/2 and Een/Canvas)?
- I see that https://github.com/tvgrabbers/sourcematching/blob/master/sources/source-tvgids.nl.json already defines detail2 for the JSON-based details. Is there any configuration setting that allows us to use this source by default, instead of the HTML-based details source?
from tvgrabpyapi.
Sorry for reacting slowly; lately I have been short on time, as I recently moved and my new house is taking up a lot of it. My workspace is also not yet up and running; it is still temporary and limited.
When you test the pages through tv_grab_test_source.py you get more details. The first try almost always fails on a time-out; trying again always succeeds. At first I did not recognize the significance of this for production, but now I think I will have to switch to the JSON page. This simply is not working.
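For reference, the "fails first, succeeds on retry" behaviour could in principle be handled by retrying a timed-out fetch immediately rather than running the whole grabber twice. The following is only a sketch under stated assumptions, not code from tvgrabpyapi: `fetch_with_retry` is a hypothetical name, and `fetch` stands for any callable that takes a URL and raises `socket.timeout` when the request times out.

```python
import socket
import time


def fetch_with_retry(fetch, url, attempts=2, pause=1.0):
    """Call fetch(url); after a time-out, wait `pause` seconds and try again.

    `fetch` is a hypothetical callable assumed to raise socket.timeout on a
    time-out. Other exceptions are not retried and propagate unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except socket.timeout as exc:
            last_error = exc
            # Only sleep if another attempt is still to come.
            if attempt + 1 < attempts:
                time.sleep(pause)
    # Every attempt timed out: re-raise the last time-out.
    raise last_error
```

The trade-off is that every page that times out costs at least one extra timeout period, so on a slow source the total run time grows accordingly.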
In a few months when I have things here more organized I'll look deeper into solutions to utilize the html data.
from tvgrabpyapi.
Last week I moved the detail fetches for tvgids.nl from the HTML pages to the JSON pages. My nightly fetch now takes less than half the time.
from tvgrabpyapi.