Comments (5)
Thanks for the detailed report. There are a couple of issues, some on my end, some on the Archipelago end :)
-
The AbortErrors are all the same issue, which appears to be related to ref counting with large WARCs. Ref counting probably should not be enabled for WARC loading at all, and I'll disable it for now. I believe this should fix all of the AbortErrors.
-
Looking at https://webarchive.archipelago.nyc//do/4/iiif/51b281b4-093e-494c-9820-9eeeb03a4c6e/full/full/0/default.warc, it appears that the current nginx setup does not support byte range requests.
This will be needed for WACZ support as well. ReplayWeb.page checks to see if it can make range requests, and if so, optimizes to read data on-demand later. If it cannot, it will try to store everything, which exacerbates the AbortError in this case (but they shouldn't happen either way).
nginx should handle range requests automatically for static files, and there is also this module: https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/
However, if you're proxying from S3, it should be possible to just get the ranges from S3 directly.
If you can get range requests working, that should address a lot of the issues, and I will also add an additional fix to the 1.3.0 release. Let me know if you run into any questions, happy to look at the nginx config.
-
The initial ui.js not found is fine, it will then load it from the CDN as a fallback, this is expected.
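For anyone wanting to verify range support on their own server, here is a minimal sketch of a probe in Python. It is not how ReplayWeb.page performs its check; the function names (`parse_content_range`, `is_range_response`, `probe`) are made up for illustration. It assumes a server that honors ranges answers a `Range: bytes=0-0` request with status 206 and a `Content-Range` header, as described in the HTTP range request specification.

```python
import re
import urllib.request


def parse_content_range(value):
    """Parse 'bytes 0-0/1234' into (start, end, total).

    total is None when the server reports an unknown length ('*').
    Returns None if the header does not have the expected shape.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if not match:
        return None
    start, end, total = match.groups()
    return int(start), int(end), None if total == "*" else int(total)


def is_range_response(status, headers):
    """True if a reply to 'Range: bytes=0-0' actually honored the range."""
    if status != 206:
        # A server that ignores Range typically replies 200 with the full body.
        return False
    content_range = headers.get("Content-Range")
    return content_range is not None and parse_content_range(content_range) is not None


def probe(url, timeout=10):
    """Hypothetical helper: ask for the first byte only and inspect the reply."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_range_response(response.status, response.headers)
```

Calling `probe(...)` with the WARC URL should return `True` once range requests are working end to end, including through any proxy in front of the storage.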
from replayweb.page.
@DiegoPino Released v1.3.0, which should fix the AbortError issue, even without range request support -- it will load slowly, but it should load w/o errors now.
from replayweb.page.
@ikreymer thanks for your detailed response. We are having some trouble finding the right balance between "security", flexibility and speed right now. I managed to get HEAD requests for the WACZ implementation working, but I am hitting some resource limits when trying to seek/deliver the range request afterwards (PHP/AWS S3 SDK are playing with my patience on streams and memory usage).
I have some options (like delivering a presigned URL) that would alleviate this in the short term but may get me in trouble with caching (the local one, since presigned URLs are made to last less than the HTML caching), but I will get there! Some explanation (probably out of context) here: esmero/archipelago-deployment#75
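As a rough illustration of the seek/deliver step, here is a minimal sketch of serving a single byte range in fixed-size chunks, so memory use stays bounded by the chunk size rather than the range size. It is written in Python, not the PHP/AWS SDK stack discussed here, and the names `parse_range` and `iter_range` as well as the 64 KiB chunk size are assumptions for the example. It only handles a single `bytes=start-end` range.

```python
import io
import re


def parse_range(header, size):
    """Parse a single 'bytes=start-end' header into an inclusive (start, end).

    Returns None if the header is malformed, lists several ranges,
    or cannot be satisfied for a resource of `size` bytes.
    """
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form 'bytes=-N': the last N bytes of the resource.
        count = int(last)
        if count == 0:
            return None
        return max(size - count, 0), size - 1
    start = int(first)
    end = min(int(last), size - 1) if last else size - 1
    if start >= size or start > end:
        return None
    return start, end


def iter_range(fileobj, start, end, chunk_size=64 * 1024):
    """Yield bytes start..end (inclusive) from a seekable file-like object."""
    fileobj.seek(start)
    remaining = end - start + 1
    while remaining > 0:
        chunk = fileobj.read(min(chunk_size, remaining))
        if not chunk:
            break  # source ended early; stop instead of looping forever
        remaining -= len(chunk)
        yield chunk
```

With `data = bytes(range(256)) * 4`, `b"".join(iter_range(io.BytesIO(data), 10, 300, chunk_size=64))` yields the same bytes as `data[10:301]`, read 64 bytes at a time.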
I will test V1.3.0
There is still one issue affecting us: some CSS/images are being handled differently (no clue why) on WARC files and end up not being served. E.g. here:
https://webarchive.archipelago.nyc/do/db17b0d6-886b-4ee4-bfb9-0edf9ce404b5
In this case missing CSS is consistent for the landing page
But for the first example I shared (1 Gbyte WARC)
Safari can load it, Chrome not.
Will do my homework first and get Ranges working without killing the server!
Thanks!
from replayweb.page.
> There is still one issue affecting us: some CSS/images are being handled differently (no clue why) on WARC files and end up not being served. E.g. here:
> https://webarchive.archipelago.nyc/do/db17b0d6-886b-4ee4-bfb9-0edf9ce404b5
Try updating to 1.3.0 -- The AbortError would cause certain resources to not be loaded, hopefully this is fixed now.
Sorry to hear about the difficulties with streaming!
I would definitely recommend using the S3 presigned URLs with a reasonable duration (a day?), then you do not need to worry about local caching at all! That should work pretty well, you'll just need to configure CORS settings on the bucket, which I can also help you with.
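To make the presigned URL and CORS suggestion concrete, here is a minimal sketch in Python. The specific values are assumptions, not a tested configuration for this bucket: the allowed origin, the list of exposed headers, and the helper names `build_cors_config` and `presign_warc_url` are all illustrative. The presigning helper relies on the third-party `boto3` package and is imported lazily so the CORS part runs on its own.

```python
import json


def build_cors_config(allowed_origins, max_age_seconds=3600):
    """Build an S3 CORS configuration that allows ranged GET/HEAD requests.

    The exposed headers are an assumption about what a browser-side
    reader needs to see in order to work with byte ranges.
    """
    return {
        "CORSRules": [
            {
                "AllowedOrigins": list(allowed_origins),
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedHeaders": ["Range"],
                "ExposeHeaders": [
                    "Accept-Ranges",
                    "Content-Range",
                    "Content-Length",
                    "Content-Encoding",
                ],
                "MaxAgeSeconds": max_age_seconds,
            }
        ]
    }


def presign_warc_url(bucket, key, expires_in=24 * 60 * 60):
    """Return a presigned GET URL valid for one day by default.

    Requires boto3 and AWS credentials; bucket and key are placeholders.
    """
    import boto3  # third-party, imported lazily on purpose

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


if __name__ == "__main__":
    # The origin below is assumed to be the site embedding the player.
    config = build_cors_config(["https://webarchive.archipelago.nyc"])
    print(json.dumps(config, indent=2))
```

The resulting dictionary has the shape accepted by boto3's `put_bucket_cors(Bucket=..., CORSConfiguration=...)`; the same rules can also be entered as JSON in the bucket's CORS settings.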
Here's the WARC you shared loading from DigitalOcean CDN, it takes some time, but does load:
https://replayweb.page/?source=https%3A%2F%2Fdh-preserve.sfo2.cdn.digitaloceanspaces.com%2Fmisc%2Fnyarc.warc
from replayweb.page.
I think all the issues mentioned here have now been resolved and the embedding is working!
Closing for now, please re-open if anything is unresolved, or open a new issue for any new errors.
from replayweb.page.