
bundler's People

Contributors

nosmo

bundler's Issues

Error when passed request for undefined

Just got this error - I'm not sure what triggered it, but this error indicates a need for type checking regardless:

Got request for undefined

url.js:107
    throw new TypeError("Parameter 'url' must be a string, not " + typeof url)
          ^
TypeError: Parameter 'url' must be a string, not undefined
    at Url.parse (url.js:107:11)
    at Object.urlParse [as parse] (url.js:101:5)
    at Object.module.exports.sameHostPredicate (/opt/bundler/utils.js:56:29)
    at Server.handleRequests (/opt/bundler/proxyserver.js:103:26)
    at Server.emit (events.js:98:17)
    at HTTPParser.parser.onIncoming (http.js:2108:12)
    at HTTPParser.parserOnHeadersComplete [as onHeadersComplete] (http.js:121:23)
    at Socket.socket.ondata (http.js:1966:22)
    at TCP.onread (net.js:527:27)

I'll see if I can get anything useful about what caused it now.
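A minimal sketch of the kind of type check this calls for. The function name `parseRequestUrl` and the shape of its return value are hypothetical and not taken from Bundler's code; the idea is only that a missing `url` parameter should yield an error value rather than an uncaught TypeError.

```javascript
"use strict";

// Hypothetical guard (not Bundler's actual code): validate the value before
// handing it to a URL parser so a missing `url` parameter cannot throw.
function parseRequestUrl(rawUrl) {
    if (typeof rawUrl !== 'string' || rawUrl.length === 0) {
        return { ok: false, error: 'Missing or non-string url parameter' };
    }
    try {
        // The WHATWG URL constructor throws on malformed input.
        return { ok: true, url: new URL(rawUrl) };
    } catch (err) {
        return { ok: false, error: 'Malformed url: ' + rawUrl };
    }
}
```

A caller could check `result.ok` and answer with a 400 instead of letting the process die.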

Move remap config to individual config file

Currently the remap stuff lives in the proxyserver.js application - it'd be great if it could be moved to its own configuration file, in a similar simple JSON format. It'd also be nice if the location of this file could be configured via the psconfig.json file.
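One way this could look, as a sketch only. The `remapsFile` key in psconfig.json is an assumption made up for illustration, as is the `loadRemaps` helper; the remap file itself is assumed to be a flat JSON object mapping requested hostnames to origins.

```javascript
"use strict";

const fs = require('fs');
const path = require('path');

// Assumption: psconfig.json carries a hypothetical "remapsFile" key that
// names the JSON file holding the remaps.
function loadRemaps(configPath) {
    const config = JSON.parse(fs.readFileSync(configPath, 'utf8'));
    if (typeof config.remapsFile !== 'string') {
        return {}; // No remap file configured: behave as if there are no remaps.
    }
    // Resolve relative paths against the directory of the config file.
    const remapsPath = path.resolve(path.dirname(configPath), config.remapsFile);
    return JSON.parse(fs.readFileSync(remapsPath, 'utf8'));
}
```

Resolving against the config file's directory rather than the working directory keeps the behaviour the same no matter where the server is started from.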

Reintroduce logging to Bundler

Despite me previously requesting its removal, I think that having lots of logging in bundler is better than having none at all.

A few options/ideas:

  • Having a configurable output location
  • Having an info and an error log file
  • Having a -v option to log to stderr/stdout at a higher level of output
  • Configurable format (single-line vs. JSON output or something) would be nice but this is low-priority
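To make the options above concrete, here is a rough sketch of a logger with a verbosity threshold and a choice between single-line and JSON output. Everything in it (`createLogger`, the option names `level`, `format` and `write`) is hypothetical; a `-v` flag could simply map to `level: 'debug'`.

```javascript
"use strict";

const LEVELS = { error: 0, info: 1, debug: 2 };

// Hypothetical logger factory; option names are illustrative only.
function createLogger(options) {
    const opts = options || {};
    const known = Object.prototype.hasOwnProperty.call(LEVELS, opts.level);
    const threshold = known ? LEVELS[opts.level] : LEVELS.info;
    // Default sink is stderr; a file writer could be passed in instead.
    const write = opts.write || function (line) { process.stderr.write(line + '\n'); };
    const asJson = opts.format === 'json';

    function logAt(level) {
        return function (message) {
            if (LEVELS[level] > threshold) {
                return; // Below the configured verbosity: drop the message.
            }
            write(asJson
                ? JSON.stringify({ level: level, message: message })
                : level + ': ' + message);
        };
    }

    return { error: logAt('error'), info: logAt('info'), debug: logAt('debug') };
}
```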

Resource bundling for non-external resources?

I'd like a way to configure whether resources hosted outside the requested domain are bundled. Currently, if I fetch site.com and site.com has a resource on (let's say) gstatic.com, the resource on gstatic.com will be bundled. This is obviously great for CeNo!'s purposes, but for DDeflect it is not optimal: I'd like a means to specify that if I ask for site.com, only resources under site.com are bundled.
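A sketch of the check such an option might apply to each resource. `isUnderRequestedHost` is a hypothetical helper, not the existing sameHostPredicate in utils.js, and treating subdomains as "under" the requested site is my assumption about the desired behaviour.

```javascript
"use strict";

// Hypothetical predicate: true when the resource lives on the requested host
// or on one of its subdomains. Relative resource URLs resolve against the
// requested page and therefore always count as local.
function isUnderRequestedHost(resourceUrl, requestedUrl) {
    const resourceHost = new URL(resourceUrl, requestedUrl).hostname;
    const requestedHost = new URL(requestedUrl).hostname;
    return resourceHost === requestedHost ||
        resourceHost.endsWith('.' + requestedHost);
}
```

Matching on `'.' + requestedHost` rather than a bare suffix avoids treating a host like evilsite.com as part of site.com.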

Using a remap to retrieve learn.equalit.ie via Bundler results in a parsing failure that breaks bundling

Fetching the following using the bundler-proxy works fine: http://127.0.0.1:9008/?url=http://learn.equalit.ie/wiki/Main_Page

However, when using a remap that points learn.distributed.deflect.ca at learn.equalit.ie's origin (contact me off-thread for the IP address), I get some really weird behaviour which ultimately results in a 500 error. Some debugging and checking the logs on the origin reveals that Bundler is attempting to fetch the following files:

[10/Apr/2015:17:33:55 +0200] "GET /mw/)!ie}.mw-help-field-data{display:block;background-color: HTTP/1.1" 301 625 "-" "-"
[10/Apr/2015:17:33:55 +0200] "GET /mw/)!ie;background-position:left%20center;background-repeat:no-repeat;cursor:pointer;font-size:.8em;text-decoration:underline;color: HTTP/1.1" 301 783 "-" "-"
[10/Apr/2015:17:33:55 +0200] "GET /mw/)%20} HTTP/1.1" 301 516 "-" "-"

This appears to be a failure to finish loading or encoding style sheets or something of the sort.

Why would Bundler be behaving differently when in theory all that should have changed is the Host header and the URL?

Need to return errors to client in all cases

Currently our error returning is somewhat selective - could we have catch-alls in place to ensure that no matter what happens we give the client a return of some sort, in lieu of a timeout?

For example:

debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false,
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false, Origin=Bundler
debug: in Bundler.send/async.reduce, memo = url=http://localhost/favicon.ico, strictSSL=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
error: Error making request to http://localhost/favicon.ico; Error: Error: connect ECONNREFUSED
    at errnoException (net.js:904:11)
    at Object.afterConnect [as oncomplete] (net.js:895:1
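A catch-all could be sketched as a wrapper around the request handler. `withErrorResponse` is a hypothetical name, the handler is assumed to either throw or return a promise, and the 502 status is just one plausible choice for "the origin could not be reached".

```javascript
"use strict";

// Hypothetical wrapper: whatever the handler does, the client gets an answer.
function withErrorResponse(handler) {
    return function (req, res) {
        function fail(err) {
            if (res.headersSent) {
                // Too late to change the status; just close the response.
                if (!res.writableEnded) {
                    res.end();
                }
                return;
            }
            const reason = err && err.message ? err.message : 'unknown error';
            res.writeHead(502, { 'Content-Type': 'text/plain' });
            res.end('Bundling failed: ' + reason);
        }

        try {
            // Covers both synchronous throws and rejected promises.
            Promise.resolve(handler(req, res)).catch(fail);
        } catch (err) {
            fail(err);
        }
    };
}
```

Errors delivered through plain callbacks would still need to be routed to the same `fail` path by the handler itself.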

Split proxy server off into its own repo

Now that this code is in npm, I think it makes sense to split off the proxy server or at least a copy of it given the amount of changes that will be happening to it over the next while. @redwire - what are your thoughts?

Issues Bundling pages

There are some differences between my locally bundled version of learn.equalit.ie and the currently proxied-to instance at http://learn.distributed.deflect.ca/. They are not currently serving the same origin but they host the same content for the most part. It seems some of the issues are related to the content types of the CSS files.

Images contain errors

The bundling process that is applied to images is precisely the same as is used for CSS files and Javascript files. The latter two types of resources are being bundled just fine, and tests on various websites confirm that dynamic functionality and styles work and are applied correctly. However, images do not appear to be being bundled properly. In displaying the bundle for a page like imgur.com, we observe that images are not displayed but the CSS and javascript around them works fine.

Upon viewing a bundled page's source code, we see that images are indeed being bundled into valid-looking data URIs. However, Firefox's source view also displays error-indicating red text where escape codes for special characters appear. My suspicion is that some replacement of special characters is happening and corrupting the encoded data, or that this encoding is interfering with the display of the site. Opening the data URI of an image directly produces a message that the image cannot be displayed because it contains errors.

Infinite loop in CSS self-referencing import

Figured out by @scottstamp

Proof of concept:

index.html

<!DOCTYPE html>
<html>
<head>
    <title>Hi</title>
    <link rel="stylesheet" href="main.css" />
</head>
<body>
    <p> Hello world! </p>
</body>
</html>

main.css

@import url("secondary.css");

body {
    background-color: red;
    width: 100%;
    height: 100%;
}

secondary.css

@import url("main.css");

h1 {
    color: red;
}

Node.js code

index.js

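"use strict";

// Load the published bundler module.
let b = require('equalitie-bundler');

// Point the bundler at the page served by the local test server.
let bundler = new b.Bundler('http://localhost:8000/index.html');

console.log(b); // Debug output: show what the module exports.
// Register the CSS hooks whose recursion this report is about.
bundler.on('originalReceived', b.replaceCSSFiles);
bundler.on('resourceReceived', b.bundleCSSRecursively);

bundler.bundle((err, content) => {
    if (err) {
        console.log(err.message);
    } else {
        console.log(content);
    }
});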

If you now run a simple Python server in the directory with these files, and try to run the bundler code, it will try to recursively bundle CSS forever.
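One possible fix is to remember which stylesheets have already been inlined and skip any that come up again. The sketch below is not Bundler's implementation: `inlineImports` is hypothetical, and `fetchCss` stands in for a real HTTP fetch by being a synchronous lookup function.

```javascript
"use strict";

// Hypothetical recursive inliner with a visited set to break import cycles.
// `fetchCss(url)` is an assumed synchronous stand-in for a network request.
function inlineImports(url, fetchCss, seen) {
    const visited = seen || new Set();
    if (visited.has(url)) {
        return '/* import of ' + url + ' skipped: already bundled */';
    }
    visited.add(url);

    const css = fetchCss(url);
    const importPattern = /@import\s+url\(\s*["']?([^"')]+)["']?\s*\)\s*;/g;
    return css.replace(importPattern, function (match, target) {
        // Resolve the import relative to the stylesheet that contains it.
        const absolute = new URL(target, url).href;
        return inlineImports(absolute, fetchCss, visited);
    });
}
```

With the main.css and secondary.css files above, the second visit to main.css is replaced by a comment and the recursion stops.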

Crash after repeated socket hang ups

Just got this after a few repeated failures - still trying to reproduce but the fact there's a crash-inducing error is a bit worrying. I plan on setting up monit for Bundler regardless but either way:

error: Failed to call a resource response hook. Error: socket hang up
Failed to create bundle for /?url=https%3A%2F%2Fdistributed.deflect.ca%2F
Error: socket hang up
events.js:85
      throw er; // Unhandled 'error' event
            ^
Error: write after end
    at ServerResponse.OutgoingMessage.write (_http_outgoing.js:413:15)
    at /opt/bundler/applications/proxyserver.js:96:11
    at fs.js:336:14
    at FSReqWrap.oncomplete (fs.js:99:15)
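The "write after end" part could be avoided by checking the response state before writing from a late callback. This is a sketch assuming a modern Node.js release, where `writableEnded` and `destroyed` are available on the response; `safeWrite` is a hypothetical helper, and it does not address the underlying socket hang ups.

```javascript
"use strict";

// Hypothetical helper: write only if the response can still accept data.
// Returns true when the chunk was written, false when it was dropped.
function safeWrite(res, chunk) {
    if (res.writableEnded || res.destroyed) {
        return false; // end() was already called or the socket is gone.
    }
    res.write(chunk);
    return true;
}
```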

Follow Javascript redirects

When a redirect like

document.location = '/some/path';

is encountered, the bundler should be configurable to allow that redirect to be followed.
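Detecting the simplest form of this could be sketched with a regular expression. This is a heuristic only: it catches literal string assignments like the one above, not redirects built at runtime, and `findScriptRedirect` is a hypothetical name.

```javascript
"use strict";

// Matches assignments of a string literal to document.location,
// window.location, or their .href property.
const REDIRECT_PATTERN =
    /(?:document|window)\.location(?:\.href)?\s*=\s*(["'])([^"']+)\1/;

// Hypothetical helper: returns the absolute redirect target, or null.
function findScriptRedirect(scriptSource, baseUrl) {
    const match = REDIRECT_PATTERN.exec(scriptSource);
    if (!match) {
        return null;
    }
    return new URL(match[2], baseUrl).href;
}
```

A configuration flag could then decide whether the bundler fetches the returned URL instead of the original page.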

Logging should be configurable and optional

At the moment logging is hardcoded to use "./log/info.log" and "../log/error.log". The two paths are inconsistent with each other, but that is secondary to a larger question, imo:

  • Logging should be optional - if someone wants to use bundler without any logging it should be doable.
  • We should be able to pass the location that the logs are written to
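Both points could be covered by building the log writers from options, as in this sketch. `createFileLogger` and the option names `infoLog` and `errorLog` are hypothetical; leaving a path out turns that log into a no-op.

```javascript
"use strict";

const fs = require('fs');

// Hypothetical factory: each log is written only if a path was supplied.
function createFileLogger(options) {
    const opts = options || {};

    function writer(filePath) {
        if (!filePath) {
            return function () {}; // Logging disabled for this stream.
        }
        return function (message) {
            fs.appendFileSync(filePath, message + '\n');
        };
    }

    return { info: writer(opts.infoLog), error: writer(opts.errorLog) };
}
```

Calling `createFileLogger()` with no arguments gives a logger that writes nothing, so callers never need to check whether logging is enabled.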

Images referenced in CSS files with relative positions sometimes break bundling

http://127.0.0.1:9008/?url=http://equalit.ie/ bundles perfectly okay, but http://127.0.0.1:9008/?url=http://equalit.ie/portfolio/np1sec doesn't.

This is because bundling fails when it gets an "Error: getaddrinfo ENOTFOUND" (whether we should halt on this at all is something we might need to think about generally). The error is raised because images in CSS are being (de)referenced into this URL: "https://www.equalit.iewp-content/themes/agile/images/prettyPhoto/default/sprite_x.png". I can't quite figure out what's unique about this image (and a few other images that suffer the same fate), as other images in CSS load without incident. I also can't figure out where the URL is being reconstructed without the leading slash, so I'm brain-dumping here until I can. Any counsel on this would be greatly appreciated.
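For reference, resolving the asset path against the stylesheet's own URL, rather than concatenating host and path strings, cannot lose the slash. The sketch below uses the WHATWG URL constructor; `resolveCssAsset` is a hypothetical helper and the stylesheet location in the usage note is assumed, since I don't know which CSS file references the sprite.

```javascript
"use strict";

// Hypothetical helper: resolve a url(...) reference found in a stylesheet
// against the URL the stylesheet was fetched from.
function resolveCssAsset(assetPath, stylesheetUrl) {
    return new URL(assetPath, stylesheetUrl).href;
}
```

For example, `resolveCssAsset('images/prettyPhoto/default/sprite_x.png', 'https://www.equalit.ie/wp-content/themes/agile/style.css')` gives a URL under https://www.equalit.ie/wp-content/themes/agile/, with the separator intact.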

It'd be awesome if we could simply return a bundle if some elements fail to load, but that's another task I suspect.

Potential SSL/TLS issue when remapping

I believe there is an SSL problem somewhere in Bundler's remapping - despite extensive debugging, I can't figure out where. When not using remaps, there are zero issues with any of the domains in question.

In my branch (#4) I have added deflect.ca as an origin for distributed.deflect.ca, in addition to fulltimeinter.net as an origin for nosmo.me. The latter works just fine, but the former fails to remap as follows:

debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false,
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false, Origin=Bundler
debug: in Bundler.send/async.reduce, memo = url=http://distributed.deflect.ca, strictSSL=false, agent=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
Remapping URL to http://deflect.ca/. Hostname is distributed.deflect.ca
error: Error making request to http://distributed.deflect.ca; Error: Error: socket hang up
    at createHangUpError (http.js:1472:15)
    at Socket.socketOnEnd [as onend] (http.js:1568:23)
    at Socket.g (events.js:180:16)
    at Socket.emit (events.js:117:20)
    at _stream_readable.js:929:16
    at process._tickCallback (node.js:419:13) socket hang up
Failed to create bundle for /?url=http://distributed.deflect.ca
Error: socket hang up

Deflect hosts' SSL setups are not exactly unique but they have some specifics - most obviously disabling SSLv2 and SSLv3 completely. There is a restricted but still quite generous cipher list in use also. In this particular case deflect.ca redirects to https://deflect.ca. The HTTPS redirect itself is not an issue as I've tested this in other places.

It looks like this could be related to nodejs/node-v0.x-archive/issues/5360, but I'm not certain. Signs definitely point that way, as using other sites that claimed to be experiencing similar issues as a remap also produces this error.

Getting this with node v0.10.33 and v0.12.0.

Error when encountering self-signed certificates

With the following remaps.json file:

{
  "distributed.deflect.ca": "deflect.ca",
  "nosmo.me": "fulltimeinter.net"
}

a request to the URL

http://127.0.0.1:9008/?url=https://nosmo.me

yields the following output:

debug: Calling originalRequest hook with options  url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false
debug: Calling originalRequest hook with options  url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false,
debug: Calling originalRequest hook with options  url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false, Origin=Bundler
debug: Calling originalRequest hook with options  url=https://nosmo.me, strictSSL=false, rejectUnauthorized=false, Origin=Bundler, followRedirect=true, followAllRedirects=false, maxRedirects=10
###### OPTIONS =  { url: 'https://fulltimeinter.net/',
  strictSSL: false,
  rejectUnauthorized: false,
  headers: { Origin: 'Bundler', Host: 'nosmo.me' },
  followRedirect: true,
  followAllRedirects: false,
  maxRedirects: 10 }
###### OPTIONS =  { url: 'https://fulltimeinter.net/horse_ebooks_large_verge_medium_landscape.jpg',
  encoding: null,
  headers: { Origin: 'Bundler', Host: 'nosmo.me' } }
error: Failed to call a resource response hook. Error: DEPTH_ZERO_SELF_SIGNED_CERT
Failed to create bundle for /?url=https://nosmo.me
Error: DEPTH_ZERO_SELF_SIGNED_CERT

Run on OS X 10.9 with Node 0.10.

DDeflect wishlist

  • A configuration file - It can be JSON/YAML/INI, something readable that is easily understood cross-language. My preference is for YAML or JSON.
  • Daemonisation - ability to run as a service. Not a high priority while dev is in progress.
  • Privilege dropping - process.setgid/process.setuid
  • Polite logging - ability to verbosely log to console or to actionably log to syslog. Explanatory logging once a failure is detected (currently all we have is Abort: fail from PhantomJS 💩)
  • Parallelised fetching
  • Encoding awareness - probably not a major issue with Node but we have encountered some issues with coercion of strings breaking encodings. I assume there is a Javascript replacement for Python’s chardet library.
  • Remaps - the ability to read a data structure that will say www.site.com is actually at 10.0.0.1, and then remap the origin based on the Host header of the incoming request (in addition to other header manipulations/passing elsewhere)
  • Proxy support - the ability to use a proxy server for all requests to the origin site
  • Header-passing support - incoming requests will have headers that will need to be passed on to the origin (cookies, other headers) and some that will not need to be passed on (Cache-Control etc). A white/blacklist system for this would be great.
  • Redirect handling - Initially support for returning redirects, later support for all HTTP status codes
  • Polite behaviours - passing a proper Via header, setting a user agent
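As an illustration of the header-passing item, a blacklist filter might look like the sketch below. `filterHeaders` is hypothetical, and the default list contains only Cache-Control because that is the one header named above; a real list would need more thought.

```javascript
"use strict";

// Placeholder default: only the header the wishlist names explicitly.
const DEFAULT_BLOCKED = ['cache-control'];

// Hypothetical filter: copy incoming headers except the blocked ones.
// Header names are compared case-insensitively.
function filterHeaders(incoming, blocked) {
    const blockedSet = new Set(
        (blocked || DEFAULT_BLOCKED).map(function (name) {
            return name.toLowerCase();
        })
    );
    const forwarded = {};
    Object.keys(incoming).forEach(function (name) {
        if (!blockedSet.has(name.toLowerCase())) {
            forwarded[name] = incoming[name];
        }
    });
    return forwarded;
}
```

A whitelist variant would invert the membership test and start from an explicit list of headers to pass on.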

Potential encoding issue with misbehaving pages

Accessing https://www.deflect.ca/stats/ with bundler leads to a pageful of garbled absolute nonsense. However, so does accessing the (non-bundled) page with curl. I suspect that the page is offering either a faulty encoding or no encoding whatsoever. Bundler should try its best to compensate for pages that misbehave in this manner, if possible. I don't have any suggestions for the "how" of this for now, but I'm creating this ticket for reference.
