equalitie / bundler
Site bundling system for use in Ceno and DDeflect
Just got this error - I'm not sure what triggered it, but this error indicates a need for type checking regardless:
Got request for undefined
url.js:107
throw new TypeError("Parameter 'url' must be a string, not " + typeof url)
^
TypeError: Parameter 'url' must be a string, not undefined
at Url.parse (url.js:107:11)
at Object.urlParse [as parse] (url.js:101:5)
at Object.module.exports.sameHostPredicate (/opt/bundler/utils.js:56:29)
at Server.handleRequests (/opt/bundler/proxyserver.js:103:26)
at Server.emit (events.js:98:17)
at HTTPParser.parser.onIncoming (http.js:2108:12)
at HTTPParser.parserOnHeadersComplete [as onHeadersComplete] (http.js:121:23)
at Socket.socket.ondata (http.js:1966:22)
at TCP.onread (net.js:527:27)
I'll see if I can get anything useful about what caused it now.
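A minimal sketch of the kind of type check this calls for, assuming the target is passed as a `url` query parameter as in the requests shown elsewhere in this tracker. `extractTargetUrl` is a hypothetical helper name, not part of bundler:

```javascript
'use strict';

// Hypothetical helper: pull the `url` query parameter out of a request path.
// Returns null instead of throwing when the parameter is missing or empty.
function extractTargetUrl(requestPath) {
  if (typeof requestPath !== 'string') return null;
  let parsed;
  try {
    // The base is only needed so that a bare path like "/?url=..." parses.
    parsed = new URL(requestPath, 'http://placeholder.invalid');
  } catch (err) {
    return null;
  }
  const target = parsed.searchParams.get('url');
  return target ? target : null;
}
```

A request handler could call this first and answer with a 400 when it returns null, so that an undefined value never reaches the URL parsing in utils.js.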
Currently the remap stuff lives in the proxyserver.js application - it'd be great if it could be moved to its own configuration file, in a similar simple JSON format. It'd also be nice if the location of this file could be configured via the psconfig.json file.
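One way this could look, as a rough sketch only. The `remapsFile` key is a hypothetical name for the new psconfig.json option, and `loadRemaps` is not an existing bundler function:

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Reads the proxy config, then loads the remap table it points to.
// `remapsFile` is a hypothetical key, not an existing psconfig.json option.
function loadRemaps(configPath) {
  const config = JSON.parse(fs.readFileSync(configPath, 'utf8'));
  if (!config.remapsFile) return {}; // no remaps configured
  // Resolve relative paths against the directory holding the config file.
  const remapsPath = path.resolve(path.dirname(configPath), config.remapsFile);
  const remaps = JSON.parse(fs.readFileSync(remapsPath, 'utf8'));
  if (remaps === null || typeof remaps !== 'object' || Array.isArray(remaps)) {
    throw new TypeError('remaps file must contain a JSON object');
  }
  return remaps;
}
```

Resolving the path relative to psconfig.json rather than the working directory keeps the behaviour the same no matter where the proxy is started from.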
Despite me previously requesting its removal, I think that having lots of logging in bundler is better than having none at all.
A few options/ideas:
I'd like a way to configure the bundling of non-local resources to the requested domain. Currently if I fetch site.com and site.com has a resource on gstatic.com (let's say), the resource on gstatic.com will be bundled. This is obviously great for CeNo!'s purposes, but for DDeflect this is not optimal - I'd like a means to specify that if I ask for site.com, only resources under site.com be bundled.
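A sketch of the check such an option might apply to each resource before bundling it. `isSameSite` is a hypothetical name, and treating subdomains of the requested host as "under" it is an assumption that may not match what DDeflect needs:

```javascript
'use strict';

// Returns true when resourceUrl lives on the requested host or on a
// subdomain of it. Relative references are resolved against the page,
// so they always count as local.
function isSameSite(pageUrl, resourceUrl) {
  let pageHost;
  let resourceHost;
  try {
    pageHost = new URL(pageUrl).hostname;
    resourceHost = new URL(resourceUrl, pageUrl).hostname;
  } catch (err) {
    return false; // unparseable URLs are never bundled
  }
  return resourceHost === pageHost || resourceHost.endsWith('.' + pageHost);
}
```

With a check like this, a fetch of site.com would skip the gstatic.com resource while still bundling anything served from site.com itself.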
When useProxy is set to true, the Host header is no longer rewritten appropriately.
Fetching the following using the bundler-proxy works fine: http://127.0.0.1:9008/?url=http://learn.equalit.ie/wiki/Main_Page
However, when using a remap that points learn.distributed.deflect.ca at learn.equalit.ie's origin (contact me off-thread for the IP address), I get some really weird behaviour which ultimately results in a 500 error. Some debugging and checking the logs on the origin reveals that Bundler is attempting to fetch the following files:
[10/Apr/2015:17:33:55 +0200] "GET /mw/)!ie}.mw-help-field-data{display:block;background-color: HTTP/1.1" 301 625 "-" "-"
[10/Apr/2015:17:33:55 +0200] "GET /mw/)!ie;background-position:left%20center;background-repeat:no-repeat;cursor:pointer;font-size:.8em;text-decoration:underline;color: HTTP/1.1" 301 783 "-" "-"
[10/Apr/2015:17:33:55 +0200] "GET /mw/)%20} HTTP/1.1" 301 516 "-" "-"
This appears to be a failure to finish loading or encoding style sheets or something of the sort.
Why would Bundler be behaving differently when in theory all that should have changed is the Host header and the URL?
Currently we return errors to the client only in some cases - could we have catch-alls in place to ensure that, no matter what happens, the client gets a response of some sort rather than a timeout?
For example:
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false,
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false, Origin=Bundler
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
error: Error making request to http://localhost/favicon.ico; Error: Error: connect ECONNREFUSED
at errnoException (net.js:904:11)
at Object.afterConnect [as oncomplete] (net.js:895:1
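A rough sketch of what such a catch-all could look like, assuming a plain Node `(req, res)` request handler. `withFallback` is a hypothetical wrapper, not something bundler provides:

```javascript
'use strict';

// Wraps a request handler so that a thrown error or a rejected promise
// still produces a response instead of leaving the client to time out.
function withFallback(handler) {
  return function (req, res) {
    const fail = function (err) {
      if (res.headersSent) {
        res.end(); // too late for a status code, just close the response
        return;
      }
      const reason = err && err.message ? err.message : 'unknown error';
      res.writeHead(500, { 'Content-Type': 'text/plain' });
      res.end('Bundling failed: ' + reason);
    };
    try {
      // Promise.resolve lets the wrapper cover sync and async handlers alike.
      Promise.resolve(handler(req, res)).catch(fail);
    } catch (err) {
      fail(err);
    }
  };
}
```

This only covers errors that surface as a throw or a rejection from the handler; an error passed to a callback deeper in the bundling chain would still need to be routed to the response by hand.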
Now that this code is in npm, I think it makes sense to split off the proxy server, or at least a copy of it, given the number of changes that will be happening to it over the next while. @redwire - what are your thoughts?
There are some differences between my locally bundled version of learn.equalit.ie and the currently proxied-to instance at http://learn.distributed.deflect.ca/. They are not currently serving the same origin but they host the same content for the most part. It seems some of the issues are related to the content types of the CSS files.
The bundling process that is applied to images is precisely the same as is used for CSS files and JavaScript files. The latter two types of resources are being bundled just fine, and tests on various websites confirm that dynamic functionality and styles work and are applied correctly. However, images do not appear to be bundled properly. In displaying the bundle for a page like imgur.com, we observe that images are not displayed but the CSS and JavaScript around them work fine.
Upon viewing a bundled page's source code, we see that images are indeed being bundled into valid-looking data URIs. However, clicking the data URI of an image informs us that the image cannot be displayed because it contains errors. I also notice Firefox's source view displaying some error-indicating red text where escape codes for special characters appear. My suspicion is that some replacement of special characters is happening and corrupting the image data, or that this encoding is interfering with the display of the site.
Figured out by @scottstamp
Proof of concept:
index.html
<!DOCTYPE html>
<html>
<head>
<title>Hi</title>
<link rel="stylesheet" href="main.css" />
</head>
<body>
<p> Hello world! </p>
</body>
</html>
main.css
@import url("secondary.css");
body {
background-color: red;
width: 100%;
height: 100%;
}
secondary.css
@import url("main.css");
h1 {
color: red;
}
Node.js code
index.js
"use strict";
let b = require('equalitie-bundler');
let bundler = new b.Bundler('http://localhost:8000/index.html');
// Debug output: print the module's exports.
console.log(b);
// Inline stylesheets in the original document, then follow their imports.
bundler.on('originalReceived', b.replaceCSSFiles);
bundler.on('resourceReceived', b.bundleCSSRecursively);
bundler.bundle((err, content) => {
  if (err) {
    console.log(err.message);
  } else {
    console.log(content);
  }
});
If you now run a simple Python server in the directory with these files and run the bundler code, it will try to bundle CSS recursively forever, because main.css and secondary.css import each other.
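The usual way to break a loop like this is to remember which stylesheets have already been visited. The sketch below is not bundler's implementation: `inlineImports` is a hypothetical function, `fetchCss` stands in for whatever downloads a stylesheet, and the regular expression only handles the `@import url("...");` form used in the proof of concept.

```javascript
'use strict';

// fetchCss is a caller-supplied function (url) => CSS text.
// seen holds every stylesheet URL inlined so far in this traversal.
function inlineImports(cssUrl, fetchCss, seen) {
  seen = seen || new Set();
  if (seen.has(cssUrl)) return ''; // already inlined: break the cycle
  seen.add(cssUrl);
  const css = fetchCss(cssUrl);
  const importRule = /@import\s+url\(\s*["']?([^"')]+)["']?\s*\)\s*;/g;
  return css.replace(importRule, function (match, ref) {
    // Imports are resolved relative to the stylesheet that contains them.
    const target = new URL(ref, cssUrl).href;
    return inlineImports(target, fetchCss, seen);
  });
}
```

Because the set is shared across the whole traversal, a stylesheet imported from two different places is also inlined only once.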
Just got this after a few repeated failures - still trying to reproduce but the fact there's a crash-inducing error is a bit worrying. I plan on setting up monit for Bundler regardless but either way:
error: Failed to call a resource response hook. Error: socket hang up
Failed to create bundle for /?url=https%3A%2F%2Fdistributed.deflect.ca%2F
Error: socket hang up
events.js:85
throw er; // Unhandled 'error' event
^
Error: write after end
at ServerResponse.OutgoingMessage.write (_http_outgoing.js:413:15)
at /opt/bundler/applications/proxyserver.js:96:11
at fs.js:336:14
at FSReqWrap.oncomplete (fs.js:99:15)
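The trace suggests a write to a response that had already been ended, from inside a file-system callback. A minimal guard, assuming that is what happened, could look like this. `safeEnd` is a hypothetical helper:

```javascript
'use strict';

// Sends body on res only if the response is still open.
// Returns true when something was written, false when res had already ended.
function safeEnd(res, statusCode, body) {
  // writableEnded exists on newer Node versions; finished is the older flag.
  if (res.writableEnded || res.finished) return false;
  if (!res.headersSent) res.writeHead(statusCode);
  res.end(body);
  return true;
}
```

This avoids the crash but hides the underlying double-response, so logging the cases where it returns false would help in tracking down the real cause.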
When a redirect like
document.location = '/some/path';
is encountered, the bundler should be configurable to allow that redirect to be followed.
At the moment logging is hardcoded to use "./log/info.log" and "../log/error.log". This is a bit inconsistent generally, but that's somewhat parallel to a larger question imo:
http://127.0.0.1:9008/?url=http://equalit.ie/ bundles perfectly okay, but http://127.0.0.1:9008/?url=http://equalit.ie/portfolio/np1sec doesn't.
This is because bundling fails when it gets a "Error: getaddrinfo ENOTFOUND" (something we might need to have a think about halting on generally). Regardless, this error is raised due to images in CSS being (de)referenced into this URL: "https://www.equalit.iewp-content/themes/agile/images/prettyPhoto/default/sprite_x.png". I can't quite figure out what's unique about this image (and a few other images that suffer the same fate), as other images in CSS load without incident, and I can't figure out where the URL is being reconstructed that fails to add the leading slash so I'm brain-dumping here until I can figure it out. Any counsel on this would be greatly appreciated.
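The broken URL has the shape you get from joining a host and a slash-less relative path as plain strings: "https://www.equalit.ie" + "wp-content/..." gives exactly "https://www.equalit.iewp-content/...". Whether that is what the bundler does, and what the reference looks like in the stylesheet, is not known here, so the following is only a sketch of resolving references with the URL parser instead. `resolveResource` is a hypothetical helper:

```javascript
'use strict';

// Resolves a reference found in a stylesheet or page against the URL of the
// document that contains it. Returns null if either part cannot be parsed.
function resolveResource(ref, baseUrl) {
  try {
    return new URL(ref, baseUrl).href;
  } catch (err) {
    return null;
  }
}
```

Passing the stylesheet's own URL as the base matters, since relative references inside CSS are relative to the stylesheet rather than to the page.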
It'd be awesome if we could simply return a bundle if some elements fail to load, but that's another task I suspect.
I believe there is an SSL problem somewhere in Bundler's remapping - despite extensive debugging, I can't figure out where. When not using remaps, there are zero issues with any of the domains in question.
In my branch (#4) I have added deflect.ca as an origin for distributed.deflect.ca, in addition to fulltimeinter.net as an origin for nosmo.me. The latter works just fine, but the former fails to remap as follows:
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false,
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false, Origin=Bundler
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
Remapping URL to http://deflect.ca/. Hostname is distributed.deflect.ca
error: Error making request to http://distributed.deflect.ca; Error: Error: socket hang up
at createHangUpError (http.js:1472:15)
at Socket.socketOnEnd [as onend] (http.js:1568:23)
at Socket.g (events.js:180:16)
at Socket.emit (events.js:117:20)
at _stream_readable.js:929:16
at process._tickCallback (node.js:419:13) socket hang up
Failed to create bundle for /?url=http://distributed.deflect.ca
Error: socket hang up
Deflect hosts' SSL setups are not exactly unique but they have some specifics - most obviously disabling SSLv2 and SSLv3 completely. There is a restricted but still quite generous cipher list in use also. In this particular case deflect.ca redirects to https://deflect.ca. The HTTPS redirect itself is not an issue as I've tested this in other places.
It looks like this could be related to nodejs/node-v0.x-archive/issues/5360, but I'm not certain. Signs definitely point that way, as using other sites that claimed to be experiencing similar issues as a remap also produces this error.
Getting this with node v0.10.33 and v0.12.0.
With the following remaps.json file:
{
  "distributed.deflect.ca": "deflect.ca",
  "nosmo.me": "fulltimeinter.net"
}
a request to the URL
http://127.0.0.1:9008/?url=https://nosmo.me
yields the following output:
debug: Calling originalRequest hook with options url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false
debug: Calling originalRequest hook with options url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false,
debug: Calling originalRequest hook with options url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false, Origin=Bundler
debug: Calling originalRequest hook with options url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
###### OPTIONS = { url: 'https://fulltimeinter.net/',
strictSSL: false,
rejectUnauthorized: false,
headers: { Origin: 'Bundler', Host: 'nosmo.me' },
followRedirect: true,
followAllRedirects: false,
maxRedirects: 10 }
###### OPTIONS = { url: 'https://fulltimeinter.net/horse_ebooks_large_verge_medium_landscape.jpg',
encoding: null,
headers: { Origin: 'Bundler', Host: 'nosmo.me' } }
error: Failed to call a resource response hook. Error: DEPTH_ZERO_SELF_SIGNED_CERT
Failed to create bundle for /?url=https://nosmo.me
Error: DEPTH_ZERO_SELF_SIGNED_CERT
run on OS X 10.9 with Node 0.10
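One thing visible in the output above: the options for the page request carry strictSSL: false and rejectUnauthorized: false, while the options for the image request carry neither. Whether that is the cause of the DEPTH_ZERO_SELF_SIGNED_CERT error is an assumption, but if so, copying the TLS settings onto resource requests would look roughly like this. `inheritTlsOptions` is a hypothetical helper:

```javascript
'use strict';

// Option names taken from the logged request options above.
const TLS_KEYS = ['strictSSL', 'rejectUnauthorized'];

// Returns a copy of resourceOptions with any TLS settings from pageOptions
// filled in, without overriding values the resource request already sets.
function inheritTlsOptions(pageOptions, resourceOptions) {
  const merged = Object.assign({}, resourceOptions);
  TLS_KEYS.forEach(function (key) {
    if (key in pageOptions && !(key in merged)) {
      merged[key] = pageOptions[key];
    }
  });
  return merged;
}
```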
chardet library.
Accessing https://www.deflect.ca/stats/ with bundler leads to a pageful of garbled absolute nonsense. However, so does accessing the (non-bundled) page with curl. I suspect that the page is offering either a faulty encoding or no encoding whatsoever. Bundler should try its best to compensate for pages that misbehave in this manner, if possible. I don't have any suggestions for the "how" of this for now, but I'm creating this ticket for reference.