voku/HtmlMin: HTML Compressor and Minifier via PHP
License: MIT License
The compressed output does not contain any vertical whitespace (e.g. newlines \r or \n) but does contain many runs of multiple horizontal whitespace characters.
<form>
<button>foo</button>
<input type="hidden" name="bar" value="baz">
</form>
Becomes:
<form><button>foo</button> <input name=bar type=hidden value=baz></form>
Note that there are two spaces instead of one between </button> and <input.
This problem is endemic: it's all over the compressed file and not limited to just two spaces; sometimes there are three or more. Clearly all such cases should be compressed to a single space.
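The clean-up requested above could be sketched as follows. This is a minimal illustration, not HtmlMin's implementation: the function name collapseHorizontalWhitespace is hypothetical, and the sketch assumes it is acceptable to touch whitespace everywhere, which is not true inside <pre> or <textarea>.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of HtmlMin: collapse every run of two or
// more spaces/tabs into a single space. Newlines are left untouched.
function collapseHorizontalWhitespace(string $html): string
{
    $result = preg_replace('/[ \t]{2,}/', ' ', $html);

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

A real fix would have to skip whitespace-sensitive elements, so this only shows the collapsing rule itself.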
Hello. Which option makes the CSS minify to one line? Everything else works fine, but the CSS stays in its original formatting and is not compressed. Thanks.
<dl>
<dt>foo
<dd><span class="bar"></span>
</dl>
<a></a>
Becomes
<dl><dt>foo <dd><span class=bar></span></dl><a class=avatar></a>
This is invalid because the indentation between the closing </dl> and opening <a> creates a whitespace text node which is rendered as a space on screen. Since there is no longer any whitespace between these two tags in the compressed version, this affects the display.
While spaces and tabs are preserved, newlines inside <pre> tags are gone.
string(23) "<pre>foo bar zoo</pre>"
$html = '<pre>
foo
bar
zoo
</pre>';
$htmlMin = new voku\helper\HtmlMin();
$result = $htmlMin->minify($html);
var_dump($result);
Unsure
Using 3.1.3 installed via composer on PHP 7.2.10.
HtmlMin cuts off the HTML code at the end of the page.
Example page without HtmlMin:
https://sector.biz.ua/docs/tiworker_exe_bitcoin_miner_riched32_dll/page-no-htmlmin.phtml
Example page with HtmlMin:
https://sector.biz.ua/docs/tiworker_exe_bitcoin_miner_riched32_dll/page-with-htmlmin.phtml
Compare the page sources to see the difference.
I don't know.
By default, the minifier actually creates more content by enforcing a closing li tag (</li>), even when one does not exist in the source. HTML5 does not require a closing li tag; the element is closed implicitly when the tag is omitted. The minifier should not inject closing </li> tags.
Similarly, <td> tags are being closed after minification, even when they were not closed in the source. I do not think a minifier should be adding tags, even closing tags; that is not its job.
Script like this:
<p>Text 1</p><script>$(".second-column-mobile-inner").wrapAll("<div class='collapse' id='second-column'></div>");</script><p>Text 2</p>
after HtmlMin turns into:
<p>Text 1</p><script>$(".second-column-mobile-inner").wrapAll("<div class='collapse' id='second-column'></script></div>");<p>Text 2</p>
Add this script anywhere:
<script>$(".second-column-mobile-inner").wrapAll("<div class='collapse' id='second-column'></div>");</script>
I don't know
Version 3.1.3 via composer
Php 7.0.28
Some closing tags are vanishing. In my example, the </p> is suddenly gone.
string(94) "<div class=rating><p style="margin: 0;"><span style="width: 100%;"></span> (2 reviews) </div>"
$html = '
<div class="rating">
<p style="margin: 0;">
<span style="width: 100%;"></span>
</p>
(2 reviews)
</div>
';
$htmlMin = new voku\helper\HtmlMin();
$result = $htmlMin->minify($html);
var_dump($result);
Dunno.
Using voku/html-min 4.3.0 and voku/simple_html_dom 4.7.14.
On a legacy project, I'm facing issues with a client who (I don't really know how) has inserted base64-encoded images.
Furthermore, those images aren't optimized at all (uncompressed PNG for photos...), resulting in huge base64 strings (multiple images, up to 2 MB):
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAA7EAAAJyCAYAAAFlL3dhAAAAAXNSR0IArs4c6QAAAARnQU1B....." />
<!-- base64 string is shortened for clarity 😅 -->
When minifying such HTML, I end up with an empty string because this replacement fails:
HtmlMin/src/voku/helper/HtmlMin.php
Lines 1375 to 1381 in 4f70058
The replacement fails because the input exceeds pcre.backtrack_limit, and PCRE reports a PREG_BACKTRACK_LIMIT_ERROR.
I'm fully aware this is all wrong: wrong image format for photo and more importantly, large images sources should be inserted as URLs.
However, technically speaking, all this crap is still valid HTML (the page displays when unminified) and I think such cases (even if it's kind of an edge case) should not fail.
Thus, I wonder if a sanity check should be inserted somewhere to prevent the script from failing and returning an empty string? Maybe catch errors on all preg_ calls and return the unminified HTML if an error occurred? Or check the string length against the pcre.backtrack_limit value?
Wrapping preg_ calls the way Guzzle wraps json_decode could be a nice solution (because it would handle not only backtrack-limit errors but all kinds of preg_ errors)?
https://github.com/guzzle/guzzle/blob/74ca2cb463a7a99a0b99f195ca809cc4ba6c3147/src/Utils.php#L281-L301
try {
$html = (string) Utils::preg_replace_callback(
'#<([^/\s<>!]+)(?:\s+([^<>]*?)\s*|\s*)(/?)>#',
static function ($matches) {
return '<' . $matches[1] . \preg_replace('#([^\s=]+)(=([\'"]?)(.*?)\3)?(\s+|$)#su', ' $1$2', $matches[2]) . $matches[3] . '>';
},
$html
);
} catch (\Exception $e) {
return $html;
}
What do you think?
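The wrapper idea proposed above can be sketched without any library. The names safePregReplaceCallback and minifyOrPassThrough are hypothetical; they are not part of HtmlMin or Guzzle, and the sketch only assumes that preg_replace_callback() returns null when PCRE fails on a string subject.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper, not part of HtmlMin or Guzzle: turn silent PCRE
// failures into exceptions so the caller can decide what to do.
function safePregReplaceCallback(string $pattern, callable $callback, string $subject): string
{
    $result = preg_replace_callback($pattern, $callback, $subject);

    if ($result === null) {
        // preg_last_error() holds the PREG_*_ERROR code of the failed call.
        throw new RuntimeException('preg_replace_callback() failed', preg_last_error());
    }

    return $result;
}

// Hypothetical fallback: return the input unchanged when minification fails.
function minifyOrPassThrough(string $html, callable $minify): string
{
    try {
        return $minify($html);
    } catch (RuntimeException $e) {
        return $html;
    }
}
```

With this shape, a backtrack-limit failure and any other preg_ error take the same path, and the caller gets the unminified HTML instead of an empty string.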
Try to minify some HTML whose length exceeds the pcre.backtrack_limit.
I think a couple of hours should be enough. I can submit a PR, but I would like your feedback on the proposed solutions before working something out.
Example html before minification:
<span> Click <a href="/">here</a> to see my other page. </span>
Expected:
<span>Click <a href="/">here</a> to see my other page.</span>
Actual:
<span>Click <a href="/">here</a>to see my other page.</span>
Load the library through Composer and put the above HTML snippet into the minification call.
Not a clue. I haven't started debugging the library code yet; I'm just reporting this to see whether it might be a quick fix.
Any help would be very much appreciated. If I get time I'll try to have a look as well, and I'll link a pull request if I manage to solve this one.
Hey @voku,
thanks for the great script!
However, I have a small issue with the latest version. Single white spaces are removed between span elements. In my case only the first white space is preserved; the spaces between all subsequent span elements are removed. The problem is that the buttons are styled via CSS, and after the optimization they are glued together.
Original:
<p><span class="label-icons">XXX</span> <span class="label-icons label-free">FREE</span> <span class="label-icons label-pro">PRO</span> <span class="label-icons label-popular">POPULAR</span> <span class="label-icons label-community">COMMUNITY CHOICE</span></p>
Minified:
<p><span class="label-icons">XXX</span> <span class="label-free label-icons">FREE</span><span class="label-icons label-pro">PRO</span><span class="label-icons label-popular">POPULAR</span><span class="label-community label-icons">COMMUNITY CHOICE</span>
Option doRemoveSpacesBetweenTags() is not set.
Thanks!
It sucks.
If I do this.
Compress("
</script>
<script async src="cdnjs"></script>
");
I use a different package to compress JS, so I minify the page in fragments. HtmlMin suddenly inserts a <head> tag at the top, which breaks the behaviour of the whole document.
matthiasmullie/minify>=1.3.60
voku/html-min>=3.0.3
use voku\helper\HtmlMin;
use MatthiasMullie\Minify;
$minifier = new Minify\JS();
$htmlMin = new HtmlMin();
echo $htmlMin->minify("
<script defer>
");
echo $minifier->minify("
//some js code
");
echo $htmlMin->minify("
</script>
<script async src="some cdnjs"></script>
");
## Result will be:
<head><script defer>//some js</script><script async src=some cdnjs></script>
Just remove the frikin autocomplete then we're done :D
Hi voku, thanks for this library ;-).
I have a question about minifying HTML inside my <script type="text/template">. The output returned by the compressor escapes the slash of my closing tag.
$htmlMin = new \voku\helper\HtmlMin();
$htmlMin->doOptimizeViaHtmlDomParser(true);
return $htmlMin->minify('
<script id="comment-loader" type="text/x-handlebars-template">
<nocompress>
<i class="fas fa-spinner fa-pulse"></i> Loading ...
</nocompress>
</script>
');
This code returns the following (the closing </i> is escaped as <\/i>):
<script id="comment-loader" type="text/x-handlebars-template">
<i class="fas fa-spinner fa-pulse"><\/i>
Loading ...
</script>
Shouldn't it return the closing tag without escaping the slash character? It is inside a script[type=text/template].
Also, the <nocompress> tag does not seem to work.
I used https://github.com/nochso/html-compress-twig for years which worked great, but it doesn't support Twig 3 and is not maintained anymore, so I switched to HtmlMin for Twig. Migration was easy and it works perfectly, but I really miss inline Javascript minification.
According to #42 it looks like such an option is kind of supported? Would it be possible to minify inline JS in the next release of HtmlMin?
Hello,
I often use the construct <script id="some-id" type="text/html"> some HTML code </script> to inject HTML code in the DOM. The HTML code between <script> and </script> is incorrectly processed by HtmlMin.
Source code :
<!doctype html>
<html lang="fr">
<head>
<title>Test</title>
</head>
<body>
A Body
<script id="elements-image-1" type="text/html">
<div class="place badge-carte">Place du Village<br>250m - 2mn à pied</div>
<div class="telecabine badge-carte">Télécabine du Chamois<br>250m - 2mn à pied</div>
<div class="situation badge-carte"><img src="https://domain.tld/assets/frontOffice/kneiss/template-assets/assets/dist/img/08ecd8a.png" alt=""></div>
</script>
</body>
</html>
Expected behaviour :
<!DOCTYPE html><html lang="fr"><head><title>Test</title></head><body>A Body<script id="elements-image-1" type="text/html">
<div class="place badge-carte">Place du Village<br>250m - 2mn à pied</div>
<div class="telecabine badge-carte">Télécabine du Chamois<br>250m - 2mn à pied</div>
<div class="situation badge-carte"><img src="https://domain.tld/assets/frontOffice/kneiss/template-assets/assets/dist/img/08ecd8a.png" alt=""></div>
</script></body></html>
Actual behaviour :
<!DOCTYPE html><html lang="fr"><head><title>Test</title></head><body>A Body<script id="elements-image-1" type="text/html">
<div class="place badge-carte">Place du Village<br>250m - 2mn à pied
<div class="telecabine badge-carte">Télécabine du Chamois<br>250m - 2mn à pied
<div class="situation badge-carte"><img src="https://domain.tld/assets/frontOffice/kneiss/template-assets/assets/dist/img/08ecd8a.png" alt="">
</script></body></html>
Use the above source code.
Not sure about that. Maybe minutes to ignore <script type="text/html"> content ?
Thanks for your work :)
HtmlMin should not add whitespace when omitting </p>
This was mentioned in issue #50 (after it was closed) so I think there was no follow up.
The following HTML
<div>
<p>Text</p>
</div>
should minify to <div><p>Text</div> and NOT to <div><p>Text </div> (which is the current implementation).
If you pass these to a browser, you will notice the DOM output is not the same for the original and for the minified version: "Text " != "Text" (notice the trailing whitespace in the DOM for the minified version that is not there in the non-minified HTML's DOM output).
It probably requires some work.
One way to test for this and similar fidelity issues is to add a headless browser to the unit tests.
The tests would feed the non-minified HTML and the minified version of that HTML separately into the headless browser.
If the DOM output of the two versions is not identical, the tests would fail.
Such a testing suite should pick up many similar fidelity issues (if there are any).
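As a rough illustration of that kind of check, the sketch below compares the text content of the first <p> in both versions. It uses PHP's DOMDocument as a stand-in for a headless browser, which is an assumption made for brevity: DOMDocument is not an HTML5-compliant parser, so a real test suite would still need a browser. The helper name firstParagraphText is hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical test helper: parse a fragment and return the text content
// of its first <p>. DOMDocument stands in for a real browser here.
function firstParagraphText(string $html): string
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $p = $dom->getElementsByTagName('p')->item(0);

    return $p === null ? '' : $p->textContent;
}

$original = "<div>\n<p>Text</p>\n</div>";
$minified = '<div><p>Text </div>';

// A fidelity test would fail when the two values differ.
var_dump(firstParagraphText($original) === firstParagraphText($minified));
```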
Hello, I have been using your script "voku/html-min": "^3.0" for some time and it is working great.
Recently we integrated vue.js into our Laravel project and an unexpected bug appeared.
After some investigation I realised that the minifier removes the Vue.js attributes that start with @.
Let me show you an example. This code:
<select v-model="filter" @change="getGraphData" :class="['c-chart__label']" name="filter">
If you minify it through:
$htmlMin = new HtmlMin();
$htmlMin->doOptimizeViaHtmlDomParser(true);
$htmlMin->doRemoveWhitespaceAroundTags(true);
$htmlMin->doRemoveSpacesBetweenTags(true);
return $htmlMin->minify($content);
You get this in the frontend:
<select v-model="filter" :class="['c-chart__label']" name="filter">
That means that HtmlMin strips out attributes that start with @.
It shouldn't be a hard fix for you @voku.
Thanks.
I have just created a small plugin https://packagist.org/packages/studiomitte/html-min which is mainly a wrapper for your package for the Open Source CMS TYPO3.
However, for uncached parts of the content TYPO3 inserts HTML comments which are later replaced by uncached code. For various reasons I would like to minify the HTML markup before this replacement, which no longer works because those HTML comments are stripped away.
It would be cool to have an option to keep comments, e.g. those starting with <!--INT_SCRIPT
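The requested behaviour could look roughly like the sketch below. It is not an existing HtmlMin option; the function name stripCommentsExcept and its $keepPrefixes parameter are hypothetical and only show the filtering rule.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not an HtmlMin option: remove HTML comments except
// those whose body starts with one of the given prefixes.
function stripCommentsExcept(string $html, array $keepPrefixes): string
{
    $result = preg_replace_callback(
        '/<!--(.*?)-->/s',
        static function (array $m) use ($keepPrefixes): string {
            foreach ($keepPrefixes as $prefix) {
                if (strncmp($m[1], $prefix, strlen($prefix)) === 0) {
                    return $m[0]; // protected comment: keep verbatim
                }
            }

            return ''; // ordinary comment: drop it
        },
        $html
    );

    // Fall back to the input if PCRE reports an error.
    return $result ?? $html;
}
```

Calling stripCommentsExcept($html, ['INT_SCRIPT']) would keep the TYPO3 placeholders and drop every other comment.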
I got no clue :(
The minified HTML has more errors than before in the W3C Validator.
https://dpaste.de/i72Q gives 8 warnings/errors in the Validator.
After minifying I have 11 warnings/errors.
Used following code
$html = (new \voku\helper\HtmlMin())->minify($html)
No idea
Expected behaviour is a single line of HTML.
Actual behaviour is:
<img alt="PAIRE DE SILENCIEUX TYPE MEGATON Lg 440 mm" class="img-responsive" height="170" itemprop="image" sizes="(max-width: 768x) 354px,
(max-width: 992px) 305px,
212px" src="https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg" srcset="https://cdn.gmp-classic.com/cache/images/product/5ee4535311159aaf1c4ae44fbebd83c2-p1000223_3800.jpg 768w,
https://cdn.gmp-classic.com/cache/images/product/82e8bafbecab56f932720490e7fc2f85-p1000223_3800.jpg 992w,
https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg 1200w" width="212"></a><div class="col-sm-6 product-info">
Try to minify this code :
<img alt="PAIRE DE SILENCIEUX TYPE MEGATON Lg 440 mm" class="img-responsive" height="170" itemprop="image" sizes="(max-width: 768x) 354px, (max-width: 992px) 305px, 212px" src="https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg" srcset="https://cdn.gmp-classic.com/cache/images/product/5ee4535311159aaf1c4ae44fbebd83c2-p1000223_3800.jpg 768w, https://cdn.gmp-classic.com/cache/images/product/82e8bafbecab56f932720490e7fc2f85-p1000223_3800.jpg 992w,https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg 1200w" width="212"></a><div class="col-sm-6 product-info">
Not sure.
Thanks !
Hello Lars Moelleken,
Thank you for sharing this great library, it's very useful ! :)
HtmlMin doesn't work properly when using the <nocompress></nocompress> tag.
use voku\helper\HtmlMin;
$html = "
<html>
\r\n\t
<body>
<ul style=''>
<li style='display: inline;' class='foo'>
\xc3\xa0
</li>
<nocompress><!-- Protect me --></nocompress>
<li class='foo' style='display: inline;'>
\xc3\xa1
</li>
<!-- Remove me -->
</ul>
</body>
\r\n\t
</html>
";
$htmlMin = new HtmlMin();
echo $htmlMin->minify($html);
This code returns
<html><body><ul><li style="display: inline;" class="foo">
à
</li>
<nocompress><!-- Protect me --></nocompress>
<li class="foo" style="display: inline;">
á
</li>
<!-- Remove me --></ul>
I don't know
The code should return
<html><body><ul><li class=foo style="display: inline;"> à <!-- Protect me --><li class=foo style="display: inline;"> á </ul>
or at least
<html><body><ul><li class=foo style="display: inline;"> à <nocompress><!-- Protect me --></nocompress><li class=foo style="display: inline;"> á </ul>
Having multiple <code> tags results in a fatal error.
Warning: Couldn't fetch DOMElement. Node no longer exists in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 485
Call Stack:
0.0001 386032 1. {main}() scripts/test.php:0
0.0240 2647432 2. voku\helper\HtmlMin->minify() scripts/test.php:7
0.0241 2647616 3. voku\helper\HtmlMin->minifyHtmlDom() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1082
0.0397 3298824 4. voku\helper\HtmlMin->protectTags() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1207
0.0415 3397496 5. voku\helper\SimpleHtmlDom->parentNode() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1277
Notice: Undefined property: DOMElement::$parentNode in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 485
Call Stack:
0.0001 386032 1. {main}() scripts/test.php:0
0.0240 2647432 2. voku\helper\HtmlMin->minify() scripts/test.php:7
0.0241 2647616 3. voku\helper\HtmlMin->minifyHtmlDom() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1082
0.0397 3298824 4. voku\helper\HtmlMin->protectTags() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1207
0.0415 3397496 5. voku\helper\SimpleHtmlDom->parentNode() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1277
Fatal error: Uncaught TypeError: Argument 1 passed to voku\helper\SimpleHtmlDom::__construct() must be an instance of DOMNode, null given, called in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 485 and defined in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 61
TypeError: Argument 1 passed to voku\helper\SimpleHtmlDom::__construct() must be an instance of DOMNode, null given, called in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 485 in vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php on line 61
Call Stack:
0.0001 386032 1. {main}() scripts/test.php:0
0.0240 2647432 2. voku\helper\HtmlMin->minify() scripts/test.php:7
0.0241 2647616 3. voku\helper\HtmlMin->minifyHtmlDom() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1082
0.0397 3298824 4. voku\helper\HtmlMin->protectTags() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1207
0.0415 3397496 5. voku\helper\SimpleHtmlDom->parentNode() vendor/voku/html-min/src/voku/helper/HtmlMin.php:1277
0.0461 3398192 6. voku\helper\SimpleHtmlDom->__construct() vendor/voku/simple_html_dom/src/voku/helper/SimpleHtmlDom.php:485
$html = '<code>foo</code> and <code>bar</code>';
$htmlMin = new voku\helper\HtmlMin();
$htmlMin->minify($html);
Unsure.
Using 3.1.3 installed via composer on PHP 7.2.10.
Any HTML that's inside a <script type=text/html> tag is turned invalid. Similar to issue #26. It seems that voku/simple_html_dom@e6e597e#diff-c9aceb5e54e67d5f19805df3f679fb1bR389 only works with double or single quotes (type="text/html" or type='text/html') and not with no quotes at all (type=text/html).
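A pattern that accepts all three quoting styles might look like the sketch below. This is not the pattern used in voku/simple_html_dom; the function name isHtmlTemplateScript is hypothetical and the regex is only one possible way to make the quotes optional.

```php
<?php
declare(strict_types=1);

// Hypothetical check, not the library's actual pattern: does the markup
// open a <script> whose type is text/html, quoted or not?
function isHtmlTemplateScript(string $openingTag): bool
{
    // (["\']?) captures an optional quote; \1 requires the same one to close.
    // The lookahead stops values such as text/html5 from matching.
    $pattern = '~<script\b[^>]*\stype\s*=\s*(["\']?)text/html\1(?=[\s>])[^>]*>~i';

    return preg_match($pattern, $openingTag) === 1;
}
```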
string(81) "<script type=text/html><p>Foo
<div class="alert alert-success">
Bar</script>"
$html = '
<script type=text/html>
<p>Foo</p>
<div class="alert alert-success">
Bar
</div>
</script>
';
$htmlMin = new voku\helper\HtmlMin();
$result = $htmlMin->minify($html);
var_dump($result);
Unsure
Using 3.1.3 installed via composer on PHP 7.2.10.
I started using HtmlMin and a couple of days later noticed such an error. It's better to look at it by example. Trying to minimize html:
<blockquote class="bg-gray primary">
<p class="text-monospace">
Malwarebytes<br>
www.malwarebytes.com<br>
User: User-\<wbr>u00d0\<wbr>u009f\<wbr>u00d0\<wbr>u009a\<wbr>User<br>
<br>
Windows (WMI): 0<br>
(end)<br>
</p>
</blockquote>
As a result, we get extra tags in the end.
<blockquote class="bg-gray primary"><p class=text-monospace> Malwarebytes<br> www.malwarebytes.com<br> User: User-\<wbr>u00d0\<wbr>u009f\<wbr>u00d0\<wbr>u009a\<wbr>User<br> <br> Windows (WMI): 0<br> (end)<br> </wbr></wbr></wbr> </blockquote>
And if the HTML has about 123 or more <wbr> tags, the result ends with 123 or more </wbr> tags, and then the HTML is truncated and the rest is lost.
As a result, I lose the end of the page, you can see here: https://sector.biz.ua/docs/tiworker_exe_bitcoin_miner_riched32_dll/tiworker_exe_bitcoin_miner_riched32_dll.phtml
Whitespace before <strong> tags is removed, giving incorrect rendering:
Try to minify the following code :
<!DOCTYPE html>
<html lang="fr">
<head><title>Test</title></head>
<body>
<p>Visitez notre boutique <strong>eBay</strong> : <a href="https://stores.ebay.fr/CAFE-RACER-OLD-SPARES" target="_blank">https://stores.ebay.fr/CAFE-RACER-OLD-SPARES</a></p>
<p><strong>ID Vintage</strong>, spécialiste de la vente de pièces et accessoires pour motos tout- terrain classiques :<a href="https://id-vintage.com" target="_blank">https://id-vintage.com</a></p>
<p>Magazine <strong>Café-Racer</strong> : <a href="https://www.cafe-racer.fr" target="_blank">https://www.cafe-racer.fr</a></p>
<p><strong>Julien Lecointe</strong> : <a href="https://julien-lecointe.blogspot.com" target="_blank">https://julien-lecointe.blogspot.com</a></p>
</body>
</html>
Not sure
Using version 3.1.30 installed with composer on PHP 7.1
Thanks !
I have a table with specifications.
<table> <tr> <td><3</td> </tr></table>
And i use:
$htmlMin->doRemoveOmmitedHtmlTags(false); $htmlMin->minify($table);
I want:
<table><tr><td><3</td></tr></table>
I got:
<table><tr><td></td></tr></table>
How can I save <3 in html?
Given the following input:
<dl>
<dt>foo
<dd><span class="bar"></span>
</dl>
<a class="baz"></a>
Becomes:
<dl><dt>foo <dd><span class=bar></span> </dl><a class=baz></a>
This is incorrect because the space between the closing </dl> and opening <a> has completely disappeared.
$html = "
<!doctype html>
<html lang=\"nl\">
<head>
</head>
<body>
<div class=\"price-box price-tier_price\" data-role=\"priceBox\" data-product-id=\"1563\" data-price-box=\"product-id-1563\">
</div>
<script type=\"text/x-custom-template\" id=\"tier-prices-template\">
<ul class=\"prices-tier items\">
<% _.each(tierPrices, function(item, key) { %>
<% var priceStr = '<span class=\"price-container price-tier_price\">'
+ '<span data-price-amount=\"' + priceUtils.formatPrice(item.price, currencyFormat) + '\"'
+ ' data-price-type=\"\"' + ' class=\"price-wrapper \">'
+ '<span class=\"price\">' + priceUtils.formatPrice(item.price, currencyFormat) + '</span>'
+ '</span>'
+ '</span>'; %>
<li class=\"item\">
<%= 'some text %1 %2'.replace('%1', item.qty).replace('%2', priceStr) %>
<strong class=\"benefit\">
save <span class=\"percent tier-<%= key %>\"> <%= item.percentage %></span>%
</strong>
</li>
<% }); %>
</ul>
</script>
<div data-role=\"tier-price-block\"></div>
</body>
</html>
";
echo $this->htmlMin->minify($html);
<!DOCTYPE html><html lang=nl><head> <body><div class="price-box price-tier_price" data-price-box=product-id-1563 data-product-id=1563 data-role=priceBox></div> <script id=tier-prices-template type=text/x-magento-template><ul class="items prices-tier"><____simple_html_dom__voku__percent____ _.each="" ____simple_html_dom__voku__percent____="" function="" key=""><____simple_html_dom__voku__percent____ ____simple_html_dom__voku__plus____="" class=price-wrapper data-price-amount="' + priceUtils.formatPrice(item.price, currencyFormat) + '" data-price-type="" pricestr='<span class="price-container price-tier_price">' var="">' + '<span class=price>' + priceUtils.formatPrice(item.price, currencyFormat) + '</span>' + '' + ''; %> <li class=item><____simple_html_dom__voku__percent____ ____simple_html_dom__voku__percent____="" ____simple_html_dom__voku__percent____1="" ____simple_html_dom__voku__percent____2="" item.qty="" pricestr="" text=""><strong class=benefit> save <span class="%> key percent tier-<%="> <____simple_html_dom__voku__percent____ ____simple_html_dom__voku__percent____="" item.percentage=""></____simple_html_dom__voku__percent____></span>% </strong> </____simple_html_dom__voku__percent____></li> <____simple_html_dom__voku__percent____ ____simple_html_dom__voku__percent____=""></____simple_html_dom__voku__percent____></____simple_html_dom__voku__percent____></____simple_html_dom__voku__percent____></ul> </script><div data-role=tier-price-block></div>
Google supports script tags for structured data. Whitespace inside those script tags is not removed:
<script type=application/ld+json>
{
"@context": "https://schema.org",
"@type": "LocalBusiness",
}
</script>
Is it possible to handle that case, since there is always a type attribute in the script tag?
Allow removing of https: as well as http: prefixes.
I believe this should be done by the existing function. A single function that removes both schemes is optimal IMO.
Also remove the prefix from the srcset attribute as well as the src attribute.
This is technically a separate issue but it's all to do with the same hunk of code.
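A single rule covering both schemes and both attributes could be sketched as follows. The helper names stripScheme and stripSchemeFromSrcset are hypothetical, this is not HtmlMin's code, and the srcset handling assumes that candidates are separated by commas and that the URLs themselves contain no commas.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers, not HtmlMin's implementation.

// Remove a leading "http:" or "https:" from a single URL, keeping "//".
function stripScheme(string $url): string
{
    return preg_replace('~^\s*https?:(?=//)~i', '', $url) ?? $url;
}

// Apply the same rule to every candidate in a srcset value.
// Assumes candidates are comma-separated and URLs contain no commas.
function stripSchemeFromSrcset(string $srcset): string
{
    $candidates = array_map('trim', explode(',', $srcset));

    return implode(', ', array_map('stripScheme', $candidates));
}
```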
It takes a few seconds to fix.
Please update Twig to v3 in voku/html-compress-twig.
expected:
closing tag of a paragraph should still exist in HTML output after compressing:
actual:
closing tag is removed after compression
use voku\helper\HtmlMin;
$html = "
<p>
This is a simple test to show compressing paragraphs does not work.
#compressing html
</p>
";
$htmlMin = new HtmlMin();
echo $htmlMin->minify($html);
output:
<p> This is a simple test to show compressing paragraphs does not work. #compressing html
not sure
It also affects https://github.com/voku/html-compress-twig
Minification is breaking JSON placed in data-* attributes. Actually, it doesn't "break" the JSON part, it replaces single quotes with double quotes.
<div data-json='{"key":"value"}'></div>
Becomes:
<div data-json="{"key":"value"}"></div>
Unfortunately, you can't use single quotes in JSON. But you can use single quotes in HTML.
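One way to switch an attribute to double quotes without breaking JSON is to escape the embedded quotes as entities. The sketch below shows that technique in plain PHP; the helper name attribute is hypothetical and this is not a claim about how HtmlMin serialises attributes.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: build a double-quoted attribute whose value may
// itself contain double quotes (as JSON always does).
function attribute(string $name, string $value): string
{
    // ENT_QUOTES escapes both " and ', so either quote style would be safe.
    return $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
}

$json = json_encode(['key' => 'value']);
echo '<div ' . attribute('data-json', $json) . '></div>';
// prints: <div data-json="{&quot;key&quot;:&quot;value&quot;}"></div>
```

A browser decodes &quot; back to a double quote when it reads the attribute, so the JSON is intact again by the time a script parses it.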
<?php
require_once 'vendor/autoload.php';
use voku\helper\HtmlMin;
$htmlMin = new HtmlMin();
ob_start();
?>
<html>
<body>
<div data-json='{"key":"value"}'></div>
</body>
</html>
<?php
$html = ob_get_clean();
echo $htmlMin->minify($html);
I have no idea, I didn't dig into code.
Given the following code:
<p>Foo <em>bar</em> baz.</p>
By default the minifier outputs:
<p>Foo<em>bar</em>baz.</p>
This means "foo bar baz" now reads "foobarbaz" on screen. This is not equivalent and therefore invalid.
It may be worth noting this is not even consistent, depending on other markup that may be used within the p element. For example:
<p>Foo <br> bar <em>baz</em> bat.</p>
Becomes:
<p>Foo<br> bar <em>baz</em>bat.</p>
This means "foo bar baz bat" now reads "foo bar bazbat" on screen. That is, insertion of the <br> element causes some whitespace to be preserved that otherwise, due to this bug, would not be.
I need to remove all data attributes inside a tag. Could you please add this as an option?
For example:
<img src="http://path/to/png" data-src="http://path/to/png" data-slide="2" />
=>
<img src="http://path/to/png" />
Hey,
If doRemoveHttpPrefixFromAttributes is set to true (default), your current website protocol is https, and you have external links in your website without https, the links will break, because the minifier removes the http or https part of the link and leaves only //.
Example site where your library is deployed -- https://www.example.com
External link in website is -- http://www.mirror.com
After minify link is transformed into //www.mirror.com and when you click it, it actually takes you to -- https://www.mirror.com
Therefore the link breaks, because www.mirror.com works only on http and does not respond on https.
Probably adding an extra condition (after the rel external and target _blank conditions) to check the current protocol against the protocol of the link (so proceed only if protocols match) will make this feature a lot safer.
Example:
on https protocol, link with https: remove the protocol and leave only //
on https protocol, link with http: don't remove the protocol because the link will break
on http protocol, link with http: remove the protocol and leave only //
on http protocol, link with https: don't remove the protocol because while some websites will redirect to https, others might not and the link will break
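The four cases above reduce to one rule: strip the scheme only when it matches the scheme of the containing page. A minimal sketch of that rule follows; the function name makeProtocolRelative and its $pageScheme parameter are hypothetical, and how HtmlMin would learn the page's scheme is left open.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of HtmlMin: strip the scheme only when the
// link uses the same scheme as the page that contains it.
function makeProtocolRelative(string $url, string $pageScheme): string
{
    $linkScheme = parse_url($url, PHP_URL_SCHEME);

    if (!is_string($linkScheme) || strcasecmp($linkScheme, $pageScheme) !== 0) {
        return $url; // different (or missing) scheme: leave the link alone
    }
    if (!in_array(strtolower($linkScheme), ['http', 'https'], true)) {
        return $url; // only http and https links are candidates
    }

    // Drop "http:" / "https:" and keep the leading "//".
    return substr($url, strlen($linkScheme) + 1);
}
```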
Thanks
PS: I know, I'm to blame for not using rel="external" on actual external links but sometimes I forget, okay? 😆
The issue happens with &lt; and &gt;.
<?php
use voku\helper\HtmlMin;
$minifier = new HtmlMin();
echo $minifier->minify('<span>&lt;</span>');
// prints: <span><</span>
// expects: <span>&lt;</span>
However, a lonely entity is not decoded.
<?php
use voku\helper\HtmlMin;
$minifier = new HtmlMin();
echo $minifier->minify('&lt;');
// prints: &lt;
// expects: &lt;
I would say minutes.
Voku version 3.0.4
There are no spaces between tags in this input.
<span class="foo"><span title="bar"></span><span title="baz"></span><span title="bat"></span></span>
HtmlMin injects spaces between all these tags, but only when doRemoveWhitespaceAroundTags is set to false.
<span class=foo><span title=bar></span> <span title=baz></span> <span title=bat></span> </span>
HtmlMin should absolutely not be introducing whitespace where none existed, no matter what settings are used, because this affects the content.
Given the following input:
<span class="title">
1.
<a>Foo</a>
</span>
HtmlMin removes whitespace after the closing </a> tag despite doRemoveWhitespaceAroundTags(false) being set. This affects how the content is displayed.
<span class=title> 1. <a>Foo</a></span>
This seemingly basic HTML results in some very weird output:
<p><html-min--voku--saved-content data-html-min--voku--saved-content="0"></html-min--voku--saved-content></p>
<p>
</p>
<h3>Vestibulum eget velit arcu.</h3>
Vestibulum eget velit arcu. Phasellus eget scelerisque dui, nec elementum ante. <code>aoaoaoao</code>
Namely:
The first <p> has been replaced with some internal code.
The second <p> has been instantly closed, leaving the content below missing a parent.
<?php
use voku\helper\HtmlMin;
$html = '
<p>
foo <code>bar</code>. ZIiiii zzz <code>1.1</code> Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</p>
<p>
<h3>Vestibulum eget velit arcu.</h3>
Vestibulum eget velit arcu. Phasellus eget scelerisque dui, nec elementum ante. <code>aoaoaoao</code>
</p>
';
$htmlMin = new voku\helper\HtmlMin();
echo $htmlMin->minify($html);
No clue, hopefully minutes. :)
voku/html-min 4.4.8 HTML Compressor and Minifier
voku/simple_html_dom 4.7.29 Simple HTML DOM package.
Before compression:
<span>foo</span>
<a href="bar">baz</a>
<span>bat</span>
After compression:
<span>foo</span><a href=bar>baz</a><span>bat</span>
Notice that all whitespace has been removed, even though there was whitespace between the <span> and <a> elements in the source. This is invalid because it changes the way content is displayed after compression.
To use AMP properly, one has to use a specific html tag:
<html ⚡>
Unfortunately HtmlMin removes the important bit, leaving only:
<html>
Run the minifier over a sample document with the tag shown above.
Depending on the source of the problem and how familiar you are with your code: hours to days.
Not found UTF8 library file.
Deprecated MIME type not omitted on inline scripts:
<script type="text/javascript">alert("Hello");</script>
minifies to:
<script type=text/javascript>alert("Hello");</script>
should minify to:
<script>alert("Hello");</script>
https://validator.w3.org/ flags this as
Warning: The type attribute is unnecessary for JavaScript resources.
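The requested clean-up could be sketched with a single regular expression. This is an illustration, not HtmlMin's implementation; the function name dropJavaScriptType is hypothetical, and a DOM-based approach would be more robust than a regex for real documents.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of HtmlMin: remove type=text/javascript
// (quoted or unquoted) from opening <script> tags.
function dropJavaScriptType(string $html): string
{
    // $1 keeps everything before the attribute; \2 matches the closing quote.
    $pattern = '~(<script\b[^>]*?)\s+type\s*=\s*(["\']?)text/javascript\2(?=[\s>])~i';

    return preg_replace($pattern, '$1', $html) ?? $html;
}

echo dropJavaScriptType('<script type="text/javascript">alert("Hello");</script>');
// prints: <script>alert("Hello");</script>
```

Other type values, such as text/html or application/ld+json, are left untouched because they change how the script content is interpreted.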
The void element "hr" gets a closing tag when the HTML code is minified. It seems to happen when executing this function:
HtmlMin/src/voku/helper/HtmlMin.php
Lines 1074 to 1077 in c7ad429
This makes the minifier produce bad HTML markup that does not pass W3C validation, which throws this error: Stray end tag “hr”.
Try to minify <hr> or <hr/>
in src/voku/helper/HtmlMin.php, line 247
TODO comments are left in the code when a feature (or a bug) isn't completely developed (or fixed). You should complete the implementation and remove the comment.
if (
($element->tag === 'script' || $element->tag === 'style')
&&
!isset($attributs['src'])
) {
// TODO: protect inline css / js
}
*/
$attrs = array();
foreach ((array)$attributs as $attrName => $attrValue) {
Posted from SensioLabsInsight
Without </source>
Try to minify:
$html = "<audio controls>\r\n
\t<source src=\"horse.ogg\" type=\"audio/ogg\">\r\n
\t<source src=\"horse.mp3\" type=\"audio/mpeg\">\r\n
\tYour browser does not support the audio element.\r\n
</audio>";
It produces additional </source> tags:
<audio controls><source src=horse.ogg type=audio/ogg><source src=horse.mp3 type=audio/mpeg> Your browser does not support the audio element. </source></source></audio>
The HtmlMin class doesn't have the doRemoveOmittedQuotes method mentioned in the docs.
When an <img> tag contains "sizes" or "srcset" attributes with multiline values, the tag is not collapsed onto a single line; some line breaks remain.
Try to minify the following code
<!DOCTYPE html>
<html lang="fr">
<head><title>Test</title></head>
<body>
<article class="row" itemscope itemtype="http://schema.org/Product">
<a href="https://www.gmp-classic.com/echappement_311_echappement-cafe-racer-bobber-classique-etc_paire-de-silencieux-type-megaton-lg-440-mm-__gmp11114.html" itemprop="url" tabindex="-1" class="product-image overlay col-sm-3">
<img width="212" height="170"
itemprop="image"
srcset="https://cdn.gmp-classic.com/cache/images/product/5ee4535311159aaf1c4ae44fbebd83c2-p1000223_3800.jpg 768w,
https://cdn.gmp-classic.com/cache/images/product/82e8bafbecab56f932720490e7fc2f85-p1000223_3800.jpg 992w,
https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg 1200w"
sizes="(max-width: 768x) 354px,
(max-width: 992px) 305px,
212px"
src="https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg"
class="img-responsive"
alt="PAIRE DE SILENCIEUX TYPE MEGATON Lg 440 mm">
</a>
</article>
</body>
</html>
You'll get:
<!DOCTYPE html><html lang="fr"><head><title>Test</title></head><body><article class="row" itemscope itemtype="http://schema.org/Product"><a class="col-sm-3 overlay product-image" href="https://www.gmp-classic.com/echappement_311_echappement-cafe-racer-bobber-classique-etc_paire-de-silencieux-type-megaton-lg-440-mm-__gmp11114.html" itemprop="url" tabindex="-1"> <img alt="PAIRE DE SILENCIEUX TYPE MEGATON Lg 440 mm" class="img-responsive" height="170" itemprop="image" sizes="(max-width: 768x) 354px,
(max-width: 992px) 305px,
212px" src="https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg" srcset="https://cdn.gmp-classic.com/cache/images/product/5ee4535311159aaf1c4ae44fbebd83c2-p1000223_3800.jpg 768w,
https://cdn.gmp-classic.com/cache/images/product/82e8bafbecab56f932720490e7fc2f85-p1000223_3800.jpg 992w,
https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg 1200w" width="212"></a></article></body></html>
instead of:
<!DOCTYPE html><html lang="fr"><head><title>Test</title></head><body><article class="row" itemscope itemtype="http://schema.org/Product"><a class="col-sm-3 overlay product-image" href="https://www.gmp-classic.com/echappement_311_echappement-cafe-racer-bobber-classique-etc_paire-de-silencieux-type-megaton-lg-440-mm-__gmp11114.html" itemprop="url" tabindex="-1"> <img alt="PAIRE DE SILENCIEUX TYPE MEGATON Lg 440 mm" class="img-responsive" height="170" itemprop="image" sizes="(max-width: 768x) 354px,(max-width: 992px) 305px, 212px" src="https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg" srcset="https://cdn.gmp-classic.com/cache/images/product/5ee4535311159aaf1c4ae44fbebd83c2-p1000223_3800.jpg 768w,https://cdn.gmp-classic.com/cache/images/product/82e8bafbecab56f932720490e7fc2f85-p1000223_3800.jpg 992w,https://cdn.gmp-classic.com/cache/images/product/93c869f20df68d3e531f7e9c3e603e5e-p1000223_3800.jpg 1200w" width="212"></a></article></body></html>
Don't know.
Using version 3.1.30 installed with composer on PHP 7.1
Thanks !
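To make the expected behaviour concrete, here is a minimal Python sketch that collapses whitespace runs (including line breaks) inside double-quoted `srcset` and `sizes` values. It is an illustration only, not HtmlMin's code, and the function name is an assumption. It keeps a single space after each comma, so it differs slightly from the hand-written expected output above.

```python
import re

# Double-quoted srcset/sizes attributes; the lookbehind skips names such as
# data-srcset, which only end in "srcset".
_CANDIDATE_ATTR = re.compile(
    r'(?<![\w-])(srcset|sizes)\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def collapse_multiline_candidates(html: str) -> str:
    """Collapse every run of whitespace in srcset/sizes values to one space."""

    def repl(match):
        name, value = match.group(1), match.group(2)
        # str.split() with no argument splits on any whitespace, newlines included.
        return f'{name}="{" ".join(value.split())}"'

    return _CANDIDATE_ATTR.sub(repl, html)
```

Only whitespace is touched, so URLs and media conditions are otherwise passed through unchanged.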
You might find it interesting to read this. I suggest you pick another verb for your naming scheme such as enable or set, e.g. enableRemoveHttpPrefixFromAttributes.