Comments (5)
The idea I currently have in mind regarding file sizes is to have a default max file size, applicable to all types of media (audio, video, images), and then either minimize (images) or exclude (audio/video) files that exceed that limit. The file size limit will be adjustable via flags.
from monolith.
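The policy described in the comment above could be sketched roughly as follows. This is only an illustration, not monolith's actual code: `MediaKind`, `Action`, `decide`, and `DEFAULT_MAX_BYTES` are hypothetical names, the 1 MiB default is an arbitrary placeholder, and "minimize" is assumed here to mean downscaling the image.

```rust
/// Broad media categories the proposed policy distinguishes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MediaKind {
    Image,
    Audio,
    Video,
}

/// What to do with an asset once its size is known.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    /// Embed the file unchanged.
    Embed,
    /// Shrink the file before embedding (assumed meaning of "minimize").
    Downscale,
    /// Leave the file out of the saved page.
    Exclude,
}

/// Hypothetical default limit; not a value defined by monolith.
const DEFAULT_MAX_BYTES: u64 = 1024 * 1024;

/// Applies one size limit to every media type: files within the limit are
/// embedded as-is, oversized images are downscaled, and oversized audio or
/// video files are excluded.
fn decide(kind: MediaKind, size_bytes: u64, max_bytes: u64) -> Action {
    if size_bytes <= max_bytes {
        return Action::Embed;
    }
    match kind {
        MediaKind::Image => Action::Downscale,
        MediaKind::Audio | MediaKind::Video => Action::Exclude,
    }
}

fn main() {
    // In a real tool the limit would come from a command-line flag.
    let limit = DEFAULT_MAX_BYTES;
    let samples = [
        (MediaKind::Image, 200 * 1024),
        (MediaKind::Image, 3 * 1024 * 1024),
        (MediaKind::Audio, 5 * 1024 * 1024),
        (MediaKind::Video, 512 * 1024),
    ];
    for (kind, size) in samples {
        println!("{:?} ({} bytes): {:?}", kind, size, decide(kind, size, limit));
    }
}
```

Keeping the decision in one pure function means the limit can be overridden by a flag without touching the code that fetches or embeds assets.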
Another approach to this might be using WebP to re-encode and optimize images for size. While PNG is still the dominant format, WebP files can often be a lot smaller, especially with compression set high enough.
This PNG image:
compresses from 83K to 36K [41%] with the strongest level of lossless WebP compression, and further still, to 24K [29%], with lossy compression at 60% quality.
Considering that images account for a large share of the file size of current monoliths, this might be worth putting time into. That said, it doesn't look like there's much support for WebP in Rust yet, so maybe this is best put on the back burner?
from monolith.
Let's not forget that the purpose of this tool is to archive the content, not to alter it.
IMHO turning images off is a good idea, but doing image optimization isn't. I'd rather have the original information than save 2 MB per page.
from monolith.
I think it's a good idea to have images converted into a single lossless format that takes up the least amount of space, as long as it's supported by all major browsers and there are no issues such as patents surrounding it. And I agree with @Alch-Emi that while scaling most image formats down to save space can be done with what Rust's ecosystem has to offer right now, converting them all into WebP might have to be delayed until Rust gets a good WebP crate.
from monolith.
@grerrg what you're saying sounds very reasonable. I'd rather follow the Unix philosophy and keep this tool as lightweight as possible. The idea of squeezing bytes out of archived pages still seems appealing; perhaps it could be done by running some other tool after the page is saved. That is outside the scope of this program, so I'm closing this.
from monolith.
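As a rough illustration of the "other tool" idea: since monolith embeds assets as data URLs, a separate post-processing step could start by measuring how much space each embedded asset takes in a saved page. The sketch below is only that first measuring step, under stated assumptions: `embedded_assets` is a hypothetical helper, it handles only base64 data URLs, and the decoded size is an estimate that assumes standard padded base64.

```rust
/// Scans saved HTML for base64 data URLs and returns, for each one, its
/// media type and an estimate of the decoded payload size in bytes.
/// Hypothetical helper for illustration; not part of monolith.
fn embedded_assets(html: &str) -> Vec<(String, usize)> {
    const SCHEME: &str = "data:";
    const MARKER: &str = ";base64,";
    let mut assets = Vec::new();
    let mut rest = html;

    while let Some(pos) = rest.find(SCHEME) {
        let after_scheme = &rest[pos + SCHEME.len()..];
        // Always advance past this "data:" so the loop terminates.
        rest = after_scheme;

        // No base64 marker anywhere ahead: nothing more to find.
        let Some(marker_pos) = after_scheme.find(MARKER) else { break };

        // A media type never contains quotes or whitespace; if this span
        // does, the marker belongs to a later URL, so skip this match.
        let media_type = &after_scheme[..marker_pos];
        if media_type.contains(|c: char| c == '"' || c == '\'' || c.is_whitespace()) {
            continue;
        }

        // The payload runs until the first character outside the base64 alphabet.
        let payload = &after_scheme[marker_pos + MARKER.len()..];
        let len = payload
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '+' || c == '/' || c == '='))
            .unwrap_or(payload.len());
        let encoded = &payload[..len];

        // Every 4 base64 characters decode to 3 bytes, minus one per '=' pad.
        let padding = encoded.bytes().rev().take_while(|&b| b == b'=').count();
        let decoded_bytes = (len / 4 * 3).saturating_sub(padding);

        assets.push((media_type.to_string(), decoded_bytes));
        rest = &payload[len..];
    }
    assets
}

fn main() {
    // Tiny stand-in for a saved page; a real run would read the file from disk.
    let html = r#"<img src="data:image/png;base64,aGVsbG8="><p>text</p>"#;
    for (media_type, bytes) in embedded_assets(html) {
        println!("{media_type}: about {bytes} bytes");
    }
}
```

A tool built on this could list the largest embedded assets and leave the decision to re-encode or drop them to the user, so the archiver itself never alters content.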