
chrome-har's People

Contributors

juvirez, lallenlowe, marty90, michaelcypher, mikedijkstra, miro-balaz, soulgalore, starrify, tobli, yurynix

chrome-har's Issues

Redirects in main document are not detected as new page (explicit redirect or navigation)

Hi everybody !

First of all, thanks for this very useful tool!

I want to suggest an improvement for multi-page HAR output, and I will try to provide as many details as possible.

Use case:

page1.html: loading 3 images + 1 js file, redirecting to page2.html on DOMContentLoaded + 500ms
page2.html: loading 2 images

With chrome-har:

  • the generated HAR object is "flat" (only page1.html is detected, and all entries are relative to this page => 1 page in har.log.pages)

1-page-detected

1-page-detected-pages-section

With Chrome Devtools > Save all as HAR with content:

  • the generated HAR object detects the 2 pages (page1.html and page2.html exist in har.log.pages, with entry.pageref relative to expected page id)

2-pages-detected

2-pages-detected-pages-section

Suspicion:

  • in chrome-har/index.js, method harFromMessages: add a page (maybe on "Page.frameScheduledNavigation" protocol event) when frameId == rootFrame ?

strong-suspicion

protocol-monitor

Alternative / extended behaviour:

  • possibility to add an extra "allowMultiPage" parameter to harFromMessages to generate multi page HAR when set to true (preserve default behaviour) ?
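A rough sketch of the kind of heuristic suggested above, not chrome-har's actual implementation: treat every non-redirect Document request on the root frame as the start of a new page. The helper name `findRootNavigations` and the sample events are hypothetical; only the field names (`type`, `frameId`, `redirectResponse`) come from the `Network.requestWillBeSent` event, and the rule "the first Document request belongs to the root frame" is an assumption.

```javascript
// Hypothetical helper (not part of chrome-har): returns the URLs of top-level
// navigations found in a list of recorded CDP events ({ method, params }).
function findRootNavigations(events) {
  const isDocumentRequest = (e) =>
    e.method === 'Network.requestWillBeSent' && e.params.type === 'Document';

  // Assumption: the first Document request belongs to the root frame.
  const first = events.find(isDocumentRequest);
  if (!first) return [];
  const rootFrameId = first.params.frameId;

  return events
    .filter(
      (e) =>
        isDocumentRequest(e) &&
        e.params.frameId === rootFrameId &&
        // HTTP redirect hops carry redirectResponse and stay on the same page.
        !e.params.redirectResponse
    )
    .map((e) => e.params.request.url);
}

// Made-up events loosely mirroring the use case: page1.html, then page2.html.
const navigationSample = [
  { method: 'Network.requestWillBeSent', params: { type: 'Document', frameId: 'root', request: { url: 'http://localhost/page1.html' } } },
  { method: 'Network.requestWillBeSent', params: { type: 'Image', frameId: 'root', request: { url: 'http://localhost/a.png' } } },
  { method: 'Network.requestWillBeSent', params: { type: 'Document', frameId: 'child', request: { url: 'http://localhost/iframe.html' } } },
  { method: 'Network.requestWillBeSent', params: { type: 'Document', frameId: 'root', request: { url: 'http://localhost/page2.html' } } },
];

console.log(findRootNavigations(navigationSample));
```

Each URL returned would correspond to one candidate entry in har.log.pages; an opt-in flag such as the proposed "allowMultiPage" could decide whether the split is applied.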

Thanks in advance!

Advice on capturing iframe or service worker events?

Hi, I'm not sure of the best place to ask this. I see that there are some PRs / issues around iframe support in the project, but I'm struggling to see how to hook it up.

I am using the Chrome Extension debugger API to attach to the top level tabId. I then receive a top level iframe loading network event, but I won't receive any events for scripts inside of that iframe. The same thing occurs for service workers.

I think that I can manually attach to the iframe as a target, but that leaves the problem of attaching immediately and enabling network before any requests are made in the target. Target.setAutoAttach comes back successful, but I only see events the time after I restart the debugger—which is not super helpful. Using functions like Target.setDiscoverTargets just says "not allowed", so those don't seem viable.

Any pointers here?

Add option to skip deleteInternalProperties()

It would be awesome if there was an option to keep the internal properties of the HAR entries, such as __requestId. This would allow us to make extensions to the HAR, e.g. to add the raw content of the requests. Is that something you'd consider adding? If yes, I can prepare a pull request.

chrome-har crashes with "Cannot set property 'receive' of undefined" when working on puppeteer output

When trying to work on my application using Puppeteer, I attempted to use chrome-har to convert the session events into a HAR file.
Everything works great, except for a few requests (3 requests, all served from the browser cache) which seem to cause chrome-har to crash with the following error message:

TypeError: Cannot set property 'receive' of undefined
    at harFromMessages (c:\tufin\tasks\2019_08_27_puppeteerPlayground\node_modules\chrome-har\index.js:373:29)
    at Object.<anonymous> (c:\tufin\tasks\2019_08_27_puppeteerPlayground\testEventsLeadingToErrorInChromHar.js:25:13)

This is how I created the browser session events used to build the HAR file that fails:

// `observe` is the array of CDP event names to record (its definition was not included in this report)
let events = [];
const client = await page.target().createCDPSession();
await client.send('Page.enable');
await client.send('Network.enable');
observe.forEach(method => {
	client.on(method, params => {
		events.push({ method, params });
	});
});

This is an example of the problematic events (from a single requestId) that failed:

{
		"method": "Network.requestWillBeSent",
		"params": {
			"requestId": "1000029348.4288",
			"loaderId": "86B2349DB4E665FBB7E8BADA9DF0D2F6",
			"documentURL": "https://10.100.9.28/securetrack/pages/homePage/dashboard/dashboardMain.faces",
			"request": {
				"url": "https://10.100.9.28/securetrack/fonts/fontawesome-webfont.woff2?v=4.5.0",
				"method": "GET",
				"headers": {},
				"mixedContentType": "none",
				"initialPriority": "VeryLow",
				"referrerPolicy": "no-referrer-when-downgrade"
			},
			"timestamp": 257791.674372,
			"wallTime": 1567574376.163183,
			"initiator": {
				"type": "parser",
				"url": "https://10.100.9.28/securetrack/pages/homePage/dashboard/dashboardMain.faces",
				"lineNumber": 42
			},
			"type": "Font",
			"frameId": "15AFFC9193B829A16FD799DE57028678",
			"hasUserGesture": false
		}
	},
	{
		"method": "Network.requestServedFromCache",
		"params": {
			"requestId": "1000029348.4288"
		}
	},
	{
		"method": "Network.loadingFinished",
		"params": {
			"requestId": "1000029348.4288",
			"timestamp": 257791.674396,
			"encodedDataLength": 0,
			"shouldReportCorbBlocking": false
		}
	}

When I remove all of the problematic requests (there are 3 requests, all related to this strange fontawesome-webfont URL, with 3 events each, 9 events in total), everything works great.
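The manual workaround described above could be automated along these lines. This is only a sketch under the assumption that the crashing requests are exactly those with a `Network.requestServedFromCache` event and no `Network.responseReceived` event, as in the sample; the helper name is hypothetical and not part of chrome-har.

```javascript
// Hypothetical pre-filter: drop every event whose requestId was served from
// cache and never produced a Network.responseReceived event.
function dropCachedRequestsWithoutResponse(events) {
  const servedFromCache = new Set();
  const hasResponse = new Set();
  for (const { method, params } of events) {
    if (method === 'Network.requestServedFromCache') servedFromCache.add(params.requestId);
    if (method === 'Network.responseReceived') hasResponse.add(params.requestId);
  }
  // Events without a requestId (e.g. Page.* events) are always kept.
  return events.filter(({ params }) => {
    const id = params && params.requestId;
    return !(servedFromCache.has(id) && !hasResponse.has(id));
  });
}

// The three-event pattern from the report, plus one ordinary request.
const cachedSample = [
  { method: 'Network.requestWillBeSent', params: { requestId: '1000029348.4288' } },
  { method: 'Network.requestServedFromCache', params: { requestId: '1000029348.4288' } },
  { method: 'Network.loadingFinished', params: { requestId: '1000029348.4288' } },
  { method: 'Network.requestWillBeSent', params: { requestId: '1.1' } },
  { method: 'Network.responseReceived', params: { requestId: '1.1' } },
  { method: 'Network.loadingFinished', params: { requestId: '1.1' } },
  { method: 'Page.loadEventFired', params: { timestamp: 257792.1 } },
];

console.log(dropCachedRequestsWithoutResponse(cachedSample).map((e) => e.method));
```

Note that this also removes the cached requests from the resulting HAR, so it only sidesteps the crash rather than fixing it.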

Discrepancies in HAR generated from browsertime and Chrome DevTools

There are discrepancies between the HAR generated by browsertime and the one from Chrome DevTools.

Find attached the two HAR files from runs on https://www.infosys.com

I used the command below to generate the HAR with browsertime:

browsertime https://www.infosys.com --chrome.chromedriverPath D:\programs\chromedriver_win32\89.0.4389.23\chromedriver.exe --iterations 1 --headless --browser chrome --chrome.includeResponseBodies all

My chrome browser version is 89.0.4389.114 and the compatible chrome driver I used is of version 89.0.4389.23

The Chrome DevTools HAR has a total of 143 HTTP requests, whereas the browsertime HAR captured only 137.

hars.zip
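To narrow down which requests differ between two HAR files, a small comparison like the following may help. It is a sketch only: `diffHarUrls` is a hypothetical helper, the inline HAR objects are made up, and it compares by `request.url` alone, so repeated requests to the same URL are not distinguished.

```javascript
// Hypothetical helper: list request URLs present in one HAR but not the other.
function diffHarUrls(harA, harB) {
  const urls = (har) => new Set(har.log.entries.map((entry) => entry.request.url));
  const a = urls(harA);
  const b = urls(harB);
  return {
    onlyInA: [...a].filter((url) => !b.has(url)),
    onlyInB: [...b].filter((url) => !a.has(url)),
  };
}

// Made-up miniature HARs standing in for the DevTools and browsertime exports.
const devtoolsHar = { log: { entries: [
  { request: { url: 'https://example.test/' } },
  { request: { url: 'https://example.test/app.js' } },
  { request: { url: 'https://example.test/font.woff2' } },
] } };
const browsertimeHar = { log: { entries: [
  { request: { url: 'https://example.test/' } },
  { request: { url: 'https://example.test/app.js' } },
] } };

console.log(diffHarUrls(devtoolsHar, browsertimeHar));
```

With real files, the two arguments would be the parsed JSON of each .har file, for example via JSON.parse(fs.readFileSync(path, 'utf8')).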

doesn't work with lighthouse devtoolslog.json?

So GoogleChrome/lighthouse#4005 implies this module will generate a HAR file from lighthouse output from the lighthouse cli like:
$ lighthouse --save-assets http://google.com

which generates several files, the interesting one being:
www.google.com_2018-11-28_11-31-08-0.devtoolslog.json

Trying to create a HAR file from that gives errors like:

m-c02x8099jgh7:~ m0t01b6$ DEBUG=* node node_modules/chrome-har/tools/harCreator.js ~/www.google.com_2018-11-28_11-31-08-0.devtoolslog.json 
chrome-har Request will be sent with requestId F64EEE5F8CA7C3C531714B9410C214AB that can't be mapped to any page at the moment. +0ms
chrome-har Couldn't find original request for redirect response: F64EEE5F8CA7C3C531714B9410C214AB +2ms
chrome-har Request will be sent with requestId F64EEE5F8CA7C3C531714B9410C214AB that can't be mapped to any page at the moment. +0ms
chrome-har Couldn't find original request for redirect response: F64EEE5F8CA7C3C531714B9410C214AB +0ms
chrome-har Request will be sent with requestId F64EEE5F8CA7C3C531714B9410C214AB that can't be mapped to any page at the moment. +0ms

Unhandled rejection TypeError: Cannot read property 'content' of undefined
  at Object.harFromMessages (/Users/m0t01b6/node_modules/chrome-har/index.js:326:28)
  at fs.readFileAsync.then.then.messages (/Users/m0t01b6/node_modules/chrome-har/tools/harCreator.js:22:28)
  at tryCatcher (/Users/m0t01b6/node_modules/bluebird/js/release/util.js:16:23)
  at Promise._settlePromiseFromHandler (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:512:31)
  at Promise._settlePromise (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:569:18)
  at Promise._settlePromise0 (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:614:10)
  at Promise._settlePromises (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:694:18)
  at Promise._fulfill (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:638:18)
  at Promise._resolveCallback (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:432:57)
  at Promise._settlePromiseFromHandler (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:524:17)
  at Promise._settlePromise (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:569:18)
  at Promise._settlePromise0 (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:614:10)
  at Promise._settlePromises (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:694:18)
  at Promise._fulfill (/Users/m0t01b6/node_modules/bluebird/js/release/promise.js:638:18)
  at /Users/m0t01b6/node_modules/bluebird/js/release/nodeback.js:42:21
  at FSReqWrap.readFileAfterClose [as oncomplete] (internal/fs/read_file_context.js:53:3)

Thoughts??? THANKS!!!

pageTimings empty

Hello,

I'm using puppeteer, puppeteer-har and chrome-har (the HAR below reports chrome-har 0.11.7).

const puppeteer = require('puppeteer');
const PuppeteerHar = require('puppeteer-har');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  console.log("create page");

  const har = new PuppeteerHar(page);
  await har.start({ path: 'results.har' });
  console.log('Har started');

  await page.goto('https://********/portail');

  await har.stop();
  await browser.close();
})();

How can I stop having empty "pageTimings" please ?

{"log":{"version":"1.2","creator":{"name":"chrome-har","version":"0.11.7","comment":"https://github.com/sitespeedio/chrome-har"},"pages":[{"id":"page_1","startedDateTime":"2020-04-29T14:00:06.681Z","title":"https://********/portail","pageTimings":{}}],"entries":[{"cache":{},"startedDateTime":"2020-04-29T14:00:06.682Z","_requestId":"5380B4DD08F8636FEB1F2EC3A5914CE6","_initialPriority":"VeryHigh","_priority":"VeryHigh","pageref":"page_1","request":{"method":"GET","url":

Create a new Release

Hello,

I would like to create a new release for chrome-har with the fix for initiator ( See #9 )

Do I have to make a Pull Request with a tag and ask you to merge it so that you can create the release?

Or do I have to open an Issue to create a new release ?

Thanks 😄

Wrong value type for entry request params

The value field should be a string according to: https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/HAR/Overview.html#sec-object-types-params, but instead it is an object.

This doesn't happen all the time, but sometimes it does. So if the value is JSON, I think it should be escaped, somehow, so it does not break the spec.

In the example below I only replaced the title and URL values.

value [string, optional] - value of a posted parameter or content of a posted file.

{
  "request": {
    "method": "GET",
    "postData": {
      "params": [
        {
          "name": "feedUrls",
          "value": {
            "Title": "xxTitlexx",
            "Url": "xxUrlxx",
            "ImageHandling": 1
          }
        }
      ]
    }
  }
}
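One way to bring such an entry back in line with the quoted spec text is to serialize non-string values before they are written out. The following is a minimal sketch, assuming `JSON.stringify` is an acceptable encoding; `normalizeParams` is a hypothetical helper, not something chrome-har provides.

```javascript
// Hypothetical post-processing step: make sure every postData param value
// is a string, as the HAR spec describes for the optional "value" field.
function normalizeParams(params) {
  return params.map((param) => {
    if (param.value === undefined || typeof param.value === 'string') {
      return param; // already valid (value is optional)
    }
    return { ...param, value: JSON.stringify(param.value) };
  });
}

// The param from the example above.
const feedParams = [
  { name: 'feedUrls', value: { Title: 'xxTitlexx', Url: 'xxUrlxx', ImageHandling: 1 } },
];

console.log(normalizeParams(feedParams));
```

The original objects are left untouched; the function returns new param objects only where a conversion was needed.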

Request order sometimes wrong for H2

or at least when testing Wikipedia. The request on upload.wikimedia... with DNS/Connect/SSL should come first, in these cases they are the third on that domain. Let me add trace logs.

screen shot 2017-03-10 at 22 03 15

screen shot 2017-03-10 at 22 02 07

Transfer sizes seem incorrect

Original issue (sitespeedio/sitespeed.io#1467)

==========

Hello there,

I've been trying to run sitespeed on a page behind login (I believe that's all good now) but I am seeing different results (on requests numbers for example) between doing it "manually" in my Chrome browser (and looking in the Network tab) and when doing it via the sitespeed command.

I've setup a test page with a login / password for you to have a look here: http://lumsites-sandbox.appspot.com/a/lumapps/home/sitespeed-homepage-test-private (see credentials in the preScript login script I use).

Here's the command I use:

docker-compose run sitespeed.io --preScript sitespeed-login.js http://lumsites-sandbox.appspot.com/a/lumapps/home/sitespeed-homepage-test-private -n 3 -b chrome --browsertime.chrome.args no-sandbox --browsertime.chrome.args disk-cache-dir=/dev/null --browsertime.chrome.args disk-cache-size=1 --browsertime.chrome.args media-cache-size=1 --browsertime.pageCompleteCheck 'return (function() {try { return (Date.now() - window.performance.timing.loadEventEnd) > 15000;} catch(e) {} return true;})()' --graphite.host=graphite --summary-detail

These are the versions it uses on my machine:

Versions OS: linux 4.9.8-moby nodejs: v6.9.1 sitespeed.io: 4.4.2 browsertime: 1.0.0-beta.25 coach: 0.31.0

Which gives me the following results:

Total requests 31
Image requests 3
CSS requests 4
Javascript requests 7
Font requests 0

When I do the same process manually (logging in, then going to the test page and looking up the numbers in Chrome > Network with cache disabled) I get this:

Total requests 51
Image requests 4
CSS requests 4
Javascript requests 21
Font requests 5

I suspect it has something to do with the 'warm cache' due to the login page visited during the preScript step, but if that's the case I can't see how to clear that cache completely before running the tests on my target page.

Thanks in advance for your help or any hint on what I may have done wrong.

For info, I get different results depending on the version of the image of sitespeed I specify in my docker-compose file. The 'Transfer size' values look especially different.

==========

sitespeed 4.0.3

Score / Metric Median
✗ Overall score 67
✗ Performance score 65
✗ Accessibility score 65
✗ Best Practice score 76
√ Fast Render advice 95
! Avoid scaling images advice 90
√ Compress assets advice 100
! Optimal CSS size advice 90
✗ Total size (transfer) 1.9 MB
√ Image size (transfer) 642.6 KB
✗ Javascript size (transfer) 1.1 MB
✗ CSS size (transfer) 97.0 KB
Total requests 43
Image requests 13
CSS requests 1
Javascript requests 11
Font requests 0
200 responses 43
Domains per page 7
Cache time 10 minutes
Time since last modification -1 second
RUM Speed Index 5121
First Paint 4840 ms
Backend Time 1530 ms
Frontend Time 7155 ms

==========

sitespeed 4.5.1

Score / Metric Median

✗ Overall score 69
✗ Performance score 68
✗ Accessibility score 65
✗ Best Practice score 76
√ Fast Render advice 95
! Avoid scaling images advice 90
✗ Compress assets advice 80
! Optimal CSS size advice 90
√ Total size (transfer) 987.1 KB
√ Image size (transfer) 664.9 KB
✗ Javascript size (transfer) 126.5 KB
✗ CSS size (transfer) 113.3 KB
Total requests 46
Image requests 17
CSS requests 4
Javascript requests 7
Font requests 0
200 responses 46
Domains per page 8
Cache time 45 minutes
Time since last modification -1 second
RUM Speed Index 4440
First Paint 2015 ms
Backend Time 1389 ms
Frontend Time 6617 ms

Receive Size wrong for cached entries

Hi,

We are using the chrome-har creator in our production environment. When we enable cached entries in code, the receive time for cached entries seems to be wrong.

www softwareishard com_har_viewer_ 2

If we get the HAR file manually from Chrome, the receive time is always within a few milliseconds. Not sure what the issue is. Does anyone have the same issue?

Make response.content.text full value available

Currently in the outputted HAR, the response.content object does not include the actual content, which is how the HAR typically looks when exported from the browser. Instead there is a compression value and no text value. Is there a way to enable this or implement in the code?

                "response": {
                    "httpVersion": "h2",
                    "redirectURL": "",
                    "status": 200,
                    "statusText": "",
                    "content": {
                        "mimeType": "text/html",
                        "size": 1135094,
                        "compression": 979339
                    },

requestIntercepted deprecated. Support for responseReceived?

It seems requestIntercepted has been deprecated. Is there any way for this library to work with responseReceived? It contains response and request information. All you would need for capture is:

client.on('Network.responseReceived', async (params: any) => {
  let method = "Network.responseReceived";
  harEvents.push({ method, params });
});

However when harFromMessages is called, it gets no events and no pages, basically empty. Thoughts?
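When only one event type is recorded, a quick sanity check is to compare the captured methods against the list that other capture scripts on this page subscribe to. This is a sketch: `missingEventTypes` is a hypothetical helper, and whether every listed event is strictly required by chrome-har is not confirmed here.

```javascript
// Event names copied from the capture scripts quoted in other reports on this page.
const EXPECTED_METHODS = [
  'Page.loadEventFired',
  'Page.domContentEventFired',
  'Page.frameStartedLoading',
  'Page.frameAttached',
  'Network.requestWillBeSent',
  'Network.requestServedFromCache',
  'Network.dataReceived',
  'Network.responseReceived',
  'Network.resourceChangedPriority',
  'Network.loadingFinished',
  'Network.loadingFailed',
];

// Hypothetical helper: which of those methods never appear in the capture?
function missingEventTypes(events) {
  const seen = new Set(events.map((e) => e.method));
  return EXPECTED_METHODS.filter((method) => !seen.has(method));
}

// A capture that only listens to Network.responseReceived, as described above.
const responseOnlyCapture = [
  { method: 'Network.responseReceived', params: { requestId: '1' } },
  { method: 'Network.responseReceived', params: { requestId: '2' } },
];

console.log(missingEventTypes(responseOnlyCapture));
```

If `Network.requestWillBeSent` and the Page events show up in the missing list, subscribing to them as well is the first thing to try.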

How to include the "_error" in the HAR file?

This issue is copied from sitespeedio/sitespeed.io#3132

Hi, my targeted website includes some pictures that are blocked by the firewall, so there are some errors like "_error": "net::ERR_TIMED_OUT" or "_error": "net::ERR_CONNECTION_RESET" in the HAR file generated by Chrome DevTools.

But I cannot find any such errors in the HAR file generated by sitespeed.io for the same pages, although this error is reported:

ERROR: Failed waiting on page to finished loading, timed out after 300000 ms BrowserError: Running page complete check

Is it possible to ask the sitespeed.io tool to include the "_error" in the HAR file?

Invalid time value

On version 0.13.2, getting this error fairly consistently:

RangeError: Invalid time value
at Date.toISOString (<anonymous>)
at M.m.toISOString (/app/node_modules/chrome-har/node_modules/dayjs/dayjs.min.js:1:6270)
at module.exports (/app/node_modules/chrome-har/lib/entryFromResponse.js:185:53)
at harFromMessages (/app/node_modules/chrome-har/index.js:410:15)

har.stop() - random error - "RangeError: Invalid time value"

My netsniffPrototype.js :

await this.page.goto(url, {timeout: 60000});
let cssSelectorconsent = '#consent-notice-agree-button';
let cookieAcceptBtn = await this.page.waitForSelector(cssSelectorconsent, { timeout: 60000, visible: true });
await cookieAcceptBtn.click();
await this.page.setCacheEnabled(false);
await this.harStart();
await this.page.reload({waitUntil: 'networkidle2', timeout: 60000});
await this.harStop();

Add console.log in chrome-har/lib/entryFromResponse.js

  const entrySecs =
    page.__wallTime + (timing.requestTime - page.__timestamp);
  console.log('page.__wallTime  = ' + page.__wallTime + ' timing.requestTime =  ' + timing.requestTime + ' page.__timestamp  = ' + page.__timestamp)
  entry.startedDateTime = dayjs.unix(entrySecs).toISOString();
  console.log('dayjs.unix(entrySecs).toISOString() =  ' + dayjs.unix(entrySecs).toISOString())

Stdout results (Python call nodejs script using subprocess.run() ) :

2020-02-24:12:39:28,092 INFO [netsniffPrototype.js:296] <Thread(Thread-1, started daemon 140015018862336)> Stop Har events

page.__wallTime = 1582544363.28644
timing.requestTime = 6971.141358
page.__timestamp = 6971.138063
dayjs.unix(entrySecs).toISOString() = 2020-02-24T11:39:23.289Z

page.__wallTime = undefined
timing.requestTime = 6971.149286
page.__timestamp = undefined

2020-02-24:12:39:28,101 ERROR [netsniffPrototype.js:303] <Thread(Thread-1, started daemon 140015018862336)> Process exitHarNotStopped event Har not stopped properly !
e.stack => RangeError: Invalid time value
at Date.toISOString (<anonymous>)
at h.d.toISOString (/home/pptruser/node_modules/dayjs/dayjs.min.js:1:6372)
at module.exports (/home/pptruser/node_modules/chrome-har/lib/entryFromResponse.js:161:53)
at harFromMessages (/home/pptruser/node_modules/chrome-har/index.js:310:15)
at PuppeteerHar.stop (/home/pptruser/node_modules/puppeteer-har/lib/PuppeteerHar.js:109:21)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
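The second log block explains the exception: with page.__wallTime and page.__timestamp both undefined, the arithmetic yields NaN, and formatting a NaN date throws. The sketch below reproduces this with the built-in Date instead of dayjs (an assumption made to keep it dependency-free); `toIsoOrNull` is a hypothetical guard, not chrome-har code.

```javascript
// Values copied from the failing log block above.
const failingPage = { __wallTime: undefined, __timestamp: undefined };
const failingTiming = { requestTime: 6971.149286 };

// Same formula as in entryFromResponse.js; undefined in arithmetic gives NaN.
const entrySecs =
  failingPage.__wallTime + (failingTiming.requestTime - failingPage.__timestamp);

let caught = null;
try {
  new Date(entrySecs * 1000).toISOString();
} catch (err) {
  caught = err; // RangeError: Invalid time value
}
console.log(Number.isNaN(entrySecs), caught instanceof RangeError);

// Hypothetical guard: only format finite timestamps (seconds since epoch).
function toIsoOrNull(secs) {
  return Number.isFinite(secs) ? new Date(secs * 1000).toISOString() : null;
}
console.log(toIsoOrNull(entrySecs));
```

So the question worth chasing is why the entry ends up attached to a page object that never received its wall time and timestamp.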

Cached resources do not appear in HAR

Hello,
I am using chrome-har as an npm package to extract HAR files from my Puppeteer scenarios.

However, I can't see the requests served from cache when I reload any page.
Here is a quick check-list :

  • I observe the correct events (mostly Network.requestServedFromCache for my problem I guess)
  • I record all the events
  • I output everything in a har file.

I tested with a simple scenario : a load of https://google.fr as a first step, and then a reload of the page.
You can find attached the waterfalls/har of both steps :
Google_test_waterfall_cache.zip

For example the google logo is not present in the second capture
(https://www.gstatic.com/images/branding/googlelogo/1x/googlelogo_color_92x36dp.png)

Here are some log details that could be useful :
chrome-har_cache.zip

Perhaps I misunderstood and chrome-har cannot manage requests served from cache?
The code seems to handle Network.requestServedFromCache, and I'm pretty sure I've already recorded a HAR which displayed cached resources!

Don't hesitate to ask if you want a full repro demo with Puppeteer.

Thanks a lot, have a good day.

SPA without Page navigation events?

I'm trying to automate a test with Puppeteer where I generate a HAR file after each interaction step.
Since it is an SPA, I do not get Page navigation events. Because of that, I do not get any results except for the initial page, where I do have a page event. During debugging I could see that the events are captured, but the number of pages is 0.
It also does not seem to be related to Puppeteer, because I checked the events when manually navigating the page.

Any hints what I can do?
I can maybe clone the code and throw out all page event handling, because effectively I do not need those events.
Any better option?

HAR file partially generated

Hi everyone,

I already described the situation in this thread:

sitespeedio/browsertime#1664 (comment)

but, to summarize the problem: I want to visit a page for 9 minutes, and this page plays a video served by an Nginx server with a DASH player.

Everything seems to work well in every scenario except HTTP/3 with tc (traffic control, Linux) and 10 Mbit up/down constraints.
In particular, I can see on the Nginx server side that browsertime correctly executes almost 150 requests, but in the HAR file I can see only the first 60.
I don't know why browsertime seems to stop recording requests after the first 60.
Looking at the pcap (not attached because of its size), the experiment also seems fine; the only problem is the missing data in the .har file.

To conclude, I attach the result obtained in one of these cases; the zip folder contains both the .har and the Chrome perfLogs.

I would be really grateful if anyone could help me.
Many thanks,
Gianluca

http3.zip

Early hints requests do not appear in HAR

On a page with verified early hints requests we can see requests initiated by early hints:

image

Running that page with npx browsertime --chrome.collectPerfLog -n 1 $URL and inspecting the HAR we can see:

  • Link headers on the main document indicating that early hint requests should be made
  • No early hints requests are present, e.g.: no .css requests
  • SVG requests initiated by .css stylesheets

Inspecting the devtools log I can see that the request made to master-cb9e9afd3d.css (request ID: 87395.8) has the expected Network.requestWillBeSent, Network.responseReceived, Network.dataReceived, Network.loadingFinished events.

Curiously, I noticed that fromDiskCache and fromEarlyHints are both true on the Network.responseReceived event:

image

In checking entryFromResponse.js we can see that fromDiskCache and __servedFromCache are used to gate functionality:

if (response.fromDiskCache === true) {

if (!entry.__servedFromCache) {

I suspect we'll want to check request entries to see if they've been loaded by early request hints.

In case we need it, there's a Network.responseReceivedEarlyHints event.
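To see which responses in a devtools log carry the early hints flag, something like the following could be run over the recorded events. It is a sketch: `requestsFromEarlyHints` is a hypothetical helper, the sample URL host is made up, and only the `fromEarlyHints` and `fromDiskCache` field names come from the observation above.

```javascript
// Hypothetical helper: list responses flagged as loaded via early hints,
// together with their disk-cache flag.
function requestsFromEarlyHints(events) {
  return events
    .filter(
      (e) =>
        e.method === 'Network.responseReceived' &&
        e.params.response &&
        e.params.response.fromEarlyHints === true
    )
    .map((e) => ({
      requestId: e.params.requestId,
      url: e.params.response.url,
      fromDiskCache: e.params.response.fromDiskCache === true,
    }));
}

// Made-up events; the request ID and file name follow the report, the host does not.
const earlyHintsSample = [
  { method: 'Network.responseReceived', params: { requestId: '87395.8', response: { url: 'https://example.test/master-cb9e9afd3d.css', fromDiskCache: true, fromEarlyHints: true } } },
  { method: 'Network.responseReceived', params: { requestId: '87395.9', response: { url: 'https://example.test/logo.svg', fromDiskCache: false } } },
  { method: 'Network.loadingFinished', params: { requestId: '87395.8' } },
];

console.log(requestsFromEarlyHints(earlyHintsSample));
```

Entries that come back with both flags set are the ones the fromDiskCache gate would currently treat as ordinary cache hits.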

Problems (I met) in Chrome Extension use

I'm using chrome-har in my Chrome Extension; all I want is to record a piece of network activity and generate a HAR file.

chrome.debugger.sendCommand({ tabId }, 'Network.enable', {});
chrome.debugger.sendCommand({ tabId }, 'Page.enable', {});

As above, I enabled the "Network" and "Page" debugger events, which, from what I investigated, are required to get HAR data.

Then

chrome.debugger.onEvent.addListener(debuggerEventListener);

I add a listener to listen to the events on the page

and then

const debuggerEventListener = (debuggeeId, method, params) => {
  // Only record events for the tab being debugged, and only while recording
  if (!recording || debuggeeId.tabId !== tabId) {
    return;
  }
  if (method !== 'Network.responseReceived' || !params.response) {
    networkEvents.push({ method, params });
    return;
  }
  // For responses, fetch the body first and attach it to the event
  chrome.debugger.sendCommand({ tabId: tabId }, "Network.getResponseBody", { requestId: params.requestId }, function (responseBody) {
    if (!responseBody) {
      // The original logged an undefined `error`; chrome.runtime.lastError holds the failure
      console.error('Error fetching response body:', chrome.runtime.lastError);
    } else if (responseBody.body) {
      params.response.body = responseBody.body;
    }
    networkEvents.push({ method, params });
  });
};

I collect all the events in an array and pass the array to the harFromMessages function:

const har = harFromMessages(eventsArrayData, {includeTextFromResponseBody: true, includeResourcesFromDiskCache: true});

The problem is that I got pages: [] and entries: [] in the resulting HAR. Why?

I printed eventsArrayData in my console, and there are obviously many events collected, but harFromMessages gives me two empty arrays in the result. It seems that harFromMessages does not process the events I pass in. I'm very confused about this.

Looking forward to your answer 🙏

New HAR page doesn't appear to be created upon navigation

This is the test code. It loads http://localhost:8080/ and then clicks on the only link on the page. The index.html page has one IMG tag, one IFRAME tag and one A tag.

const fs = require('fs');
const { promisify } = require('util');

const puppeteer = require('puppeteer');
const { harFromMessages } = require('chrome-har');

// list of events for converting to HAR
const events = [];

// event types to observe
const observe = [
  'Page.loadEventFired',
  'Page.domContentEventFired',
  'Page.frameStartedLoading',
  'Page.frameAttached',
  'Network.requestWillBeSent',
  'Network.requestServedFromCache',
  'Network.dataReceived',
  'Network.responseReceived',
  'Network.resourceChangedPriority',
  'Network.loadingFinished',
  'Network.loadingFailed',
];

function wait(ms) {
  return new Promise((resolve, reject) => setTimeout(resolve, ms));
}

(async () => {
  const launchOptions = {
    executablePath: "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe",
  };
  const browser = await puppeteer.launch(launchOptions);
  const page = await browser.newPage();

  // register events listeners
  const client = await page.target().createCDPSession();
  await client.send('Page.enable');
  await client.send('Network.enable');
  observe.forEach(method => {
    client.on(method, params => {
      events.push({ method, params });
    });
  });

  // perform tests

  await page.goto('http://localhost:8080/');

  (await page.waitForSelector('a', { visible: true })).click();

  await page.waitForNavigation({ waitUntil: 'networkidle2' });

  await wait(2000);

  await browser.close();

  // convert events to HAR file
  const har = harFromMessages(events);
  await promisify(fs.writeFile)('localhost.har', JSON.stringify(har));
})();

index.html

<p><img src="One_black_Pixel.png?index">
<p><iframe src="iframe.html"></iframe>
<p><a href="index2.html">Next</a>

iframe.html

<img src="One_black_Pixel.png?iframe">

index2.html

<p><img src="One_black_Pixel.png?index2">
<p><iframe src="iframe2.html"></iframe>

iframe2.html

<img src="One_black_Pixel.png?iframe2">

One_black_Pixel.png

Copied from here, https://en.wikipedia.org/wiki/File:One_black_Pixel.png.

HARs

The chrome-har captured HAR shows all the requests collected as a single page. But there should be two pages, one for index.html and one for index2.html.

image

The HAR captured from Chrome devtools after performing the test manually, correctly shows two pages (albeit in the wrong order here).

image

Site content can make the parsing fail

INFO - b'[2017-06-29 21:33:28] INFO: Testing url https://www.macys.com/ run 1\n'
INFO - b'[2017-06-29 21:33:39] ERROR: SyntaxError: Unexpected token \x1f in JSON at position 0\n'
INFO - b'SyntaxError: Unexpected token \x1f in JSON at position 0\n'
INFO - b' at Object.parse (native)\n'
INFO - b' at parsePostData (/usr/src/app/node_modules/chrome-har/index.js:570:37)\n'

See sitespeedio/sitespeed.io#1654
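The byte \x1f is the first byte of the gzip magic number, so one plausible reading (an assumption, not confirmed by the log) is that the post data was compressed and therefore not valid JSON text. Whatever the cause, wrapping the parse keeps one bad body from aborting the whole run. `parseJsonOrNull` is a hypothetical helper, not the library's parsePostData.

```javascript
// Hypothetical defensive wrapper: return null instead of throwing when the
// text is not valid JSON.
function parseJsonOrNull(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}

// A string starting with the gzip magic bytes 0x1f 0x8b, like the failing input.
const gzipLikeBody = '\x1f\x8b\x08\x00';

console.log(parseJsonOrNull(gzipLikeBody)); // null
console.log(parseJsonOrNull('{"a":1}'));    // { a: 1 }
```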

Missing requests in iFrame within iFrame

I've come across an issue where requests are being omitted from an iFrame in an iFrame.

I've created a reproducible case where an image (img_girl.jpg) is missing from the generated HAR, even though it is in the devtools logs.

How to reproduce

  1. Generate a devtools log for this URL: https://qduej.csb.app/
    See devtools log
  2. Generate a HAR using the devtools log
    See HAR
    Logging the request IDs you can see the following:

Going through the devtools log you can see that there are no Page.frameAttached events for the second iframe, despite a Network.requestWillBeSent event referring to the frameId 25A2EF16DA63D669E56F8639E5EFE72C.

  {
    "method": "Network.requestWillBeSent",
    "params": {
      "requestId": "A96E1426CD51C4AC5E90C43EE09F3C23",
      "loaderId": "A96E1426CD51C4AC5E90C43EE09F3C23",
      "documentURL": "https://zfh61.csb.app/",
      "request": {
        "url": "https://zfh61.csb.app/",
        "method": "GET",
        "headers": {
          "Upgrade-Insecure-Requests": "1",
          "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36",
          "Referer": "https://deploy-preview-26--michaeldijkstra-netlify-test.netlify.app/"
        },
        "mixedContentType": "none",
        "initialPriority": "VeryHigh",
        "referrerPolicy": "no-referrer-when-downgrade"
      },
      "timestamp": 162823.851228,
      "wallTime": 1590380010.679388,
      "initiator": {
        "type": "parser",
        "url": "https://deploy-preview-26--michaeldijkstra-netlify-test.netlify.app/",
        "lineNumber": 6
      },
      "type": "Document",
      "frameId": "25A2EF16DA63D669E56F8639E5EFE72C",
      "hasUserGesture": false
    },
    "source": {
      "targetId": "B6C52BA4191283DB2B514A0AA6507C5A",
      "sessionId": "1F88229A0F0DA12B6ADECDE9D3134F15"
    }
  }

This could well be a bug in Chromium as when I was trying to reproduce the issue I noticed that if the first iframe was on the same domain it would create the Page.frameAttached event but when it's on a different domain it doesn't.

I still think it'd be good to show these events in the HAR as they are happening.

The block of code that ultimately drops the entry runs while handling Network.responseReceived, when the response can't be matched to a page:

const frameId = rootFrameMappings.get(params.frameId) || params.frameId;
const page = pages.find((page) => page.__frameId === frameId);
if (!page) {
  debug(
    `Received network response for requestId ${params.requestId} that can't be mapped to any page.`
  );
  continue;
}

I noticed that when processing Network.requestWillBeSent the last page is used, which was added to support multi page hars in PR #30

const page = pages[pages.length - 1];

Do you think we can do something similar to this? Where we look for the page but fallback to the last page?

const page = pages.find((page) => page.__frameId === frameId) || pages[pages.length - 1];

I'm not sure of the reasoning behind filtering these events out, or what the repercussions of using this fallback would be.
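
To make the proposal concrete, here is a minimal, self-contained sketch of the lookup with a fallback. `findPageForFrame` and the sample data are hypothetical and only imitate the shapes used in the snippets above (`pages` entries carrying `__frameId`, `rootFrameMappings` as a Map); this is not chrome-har's actual code. The sketch also reports how the page was matched, so a caller could log or flag entries that were attached by fallback.

```javascript
// Hypothetical helper, not part of chrome-har: look up the page for a frame,
// falling back to the most recent page when no frame id matches.
function findPageForFrame(pages, rootFrameMappings, frameId) {
  // Resolve child frames to their root frame, as in the original snippet.
  const rootFrameId = rootFrameMappings.get(frameId) || frameId;
  const exact = pages.find((page) => page.__frameId === rootFrameId);
  if (exact) {
    return { page: exact, matchedBy: 'frameId' };
  }
  if (pages.length > 0) {
    // Proposed fallback: attribute the response to the last page.
    return { page: pages[pages.length - 1], matchedBy: 'lastPage' };
  }
  return { page: undefined, matchedBy: 'none' };
}

// Made-up sample data for illustration only.
const pages = [
  { id: 'page_1', __frameId: 'FRAME_A' },
  { id: 'page_2', __frameId: 'FRAME_B' },
];
const rootFrameMappings = new Map([['CHILD_OF_A', 'FRAME_A']]);

console.log(findPageForFrame(pages, rootFrameMappings, 'CHILD_OF_A'));
console.log(findPageForFrame(pages, rootFrameMappings, 'UNKNOWN_FRAME'));
```

Whether attributing an unmatched cross-domain iframe request to the last page is always correct is exactly the open question raised above; the `matchedBy` field is one way to keep that uncertainty visible.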

Again, happy to work on a PR for this if we think there's a good way forward!

How to determine if a request was preloaded?

Hi guys,

First of all, thanks for your amazing work! That's huge!

I would like to determine, from the generated HAR file, which requests were preloaded (or not) during page load.
Is there an easy way to do this?

Thanks for your help.

Best.

no har generated with Page.enable, Network.enable

I have sent Page.enable and Network.enable. When I dump all the events I received while loading example.com, they look like this:

[
  {
    "method": "Network.requestWillBeSent",
    "params": {
      "requestId": "B008822534A6FDC6F47E9345E71955AC",
      "loaderId": "B008822534A6FDC6F47E9345E71955AC",
      "documentURL": "https://example.com/",
      "request": {
        "url": "https://example.com/",
        "method": "GET",
        "headers": {
          "Upgrade-Insecure-Requests": "1",
          "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36",
          "Sec-Fetch-Mode": "navigate",
          "Sec-Fetch-User": "?1"
        },
        "mixedContentType": "none",
        "initialPriority": "VeryHigh",
        "referrerPolicy": "no-referrer-when-downgrade"
      },
      "timestamp": 178604.513679,
      "wallTime": 1573906240.497734,
      "initiator": {
        "type": "other"
      },
      "type": "Document",
      "frameId": "9D57A9C83BE59C00C8A87B300ADA30B8",
      "hasUserGesture": false
    }
  },
  {
    "method": "Network.responseReceivedExtraInfo",
    "params": {
      "requestId": "B008822534A6FDC6F47E9345E71955AC",
      "blockedCookies": [],
      "headers": {
        "status": "200",
        "content-encoding": "gzip",
        "accept-ranges": "bytes",
        "cache-control": "max-age=604800",
        "content-type": "text/html; charset=UTF-8",
        "date": "Sat, 16 Nov 2019 12:05:00 GMT",
        "etag": "\"3147526947\"",
        "expires": "Sat, 23 Nov 2019 12:05:00 GMT",
        "last-modified": "Thu, 17 Oct 2019 07:18:26 GMT",
        "server": "ECS (bsa/EB17)",
        "vary": "Accept-Encoding",
        "x-cache": "HIT",
        "content-length": "648"
      }
    }
  },
  {
    "method": "Network.responseReceived",
    "params": {
      "requestId": "B008822534A6FDC6F47E9345E71955AC",
      "loaderId": "B008822534A6FDC6F47E9345E71955AC",
      "timestamp": 178604.518433,
      "type": "Document",
      "response": {
        "url": "https://example.com/",
        "status": 200,
        "statusText": "",
        "headers": {
          "status": "200",
          "content-encoding": "gzip",
          "accept-ranges": "bytes",
          "cache-control": "max-age=604800",
          "content-type": "text/html; charset=UTF-8",
          "date": "Sat, 16 Nov 2019 12:05:00 GMT",
          "etag": "\"3147526947\"",
          "expires": "Sat, 23 Nov 2019 12:05:00 GMT",
          "last-modified": "Thu, 17 Oct 2019 07:18:26 GMT",
          "server": "ECS (bsa/EB17)",
          "vary": "Accept-Encoding",
          "x-cache": "HIT",
          "content-length": "648"
        },
        "mimeType": "text/html",
        "connectionReused": false,
        "connectionId": 0,
        "remoteIPAddress": "93.184.216.34",
        "remotePort": 443,
        "fromDiskCache": true,
        "fromServiceWorker": false,
        "fromPrefetchCache": false,
        "encodedDataLength": 0,
        "timing": {
          "requestTime": 178604.514472,
          "proxyStart": -1,
          "proxyEnd": -1,
          "dnsStart": -1,
          "dnsEnd": -1,
          "connectStart": -1,
          "connectEnd": -1,
          "sslStart": -1,
          "sslEnd": -1,
          "workerStart": -1,
          "workerReady": -1,
          "sendStart": 0.179,
          "sendEnd": 0.179,
          "pushStart": 0,
          "pushEnd": 0,
          "receiveHeadersEnd": 0.824
        },
        "protocol": "h2",
        "securityState": "secure",
        "securityDetails": {
          "protocol": "TLS 1.3",
          "keyExchange": "",
          "keyExchangeGroup": "P-256",
          "cipher": "AES_256_GCM",
          "certificateId": 0,
          "subjectName": "www.example.org",
          "sanList": [
            "www.example.org",
            "example.com",
            "example.edu",
            "example.net",
            "example.org",
            "www.example.com",
            "www.example.edu",
            "www.example.net"
          ],
          "issuer": "DigiCert SHA2 Secure Server CA",
          "validFrom": 1543363200,
          "validTo": 1606910400,
          "signedCertificateTimestampList": [],
          "certificateTransparencyCompliance": "unknown"
        }
      },
      "frameId": "9D57A9C83BE59C00C8A87B300ADA30B8"
    }
  },
  {
    "method": "Page.frameStartedLoading",
    "params": {
      "frameId": "9D57A9C83BE59C00C8A87B300ADA30B8"
    }
  },
  {
    "method": "Page.frameNavigated",
    "params": {
      "frame": {
        "id": "9D57A9C83BE59C00C8A87B300ADA30B8",
        "loaderId": "B008822534A6FDC6F47E9345E71955AC",
        "url": "https://example.com/",
        "securityOrigin": "https://example.com",
        "mimeType": "text/html"
      }
    }
  },
  {
    "method": "Network.dataReceived",
    "params": {
      "requestId": "B008822534A6FDC6F47E9345E71955AC",
      "timestamp": 178604.550809,
      "dataLength": 1256,
      "encodedDataLength": 0
    }
  },
  {
    "method": "Network.loadingFinished",
    "params": {
      "requestId": "B008822534A6FDC6F47E9345E71955AC",
      "timestamp": 178604.516411,
      "encodedDataLength": 0,
      "shouldReportCorbBlocking": false
    }
  },
  {
    "method": "Page.domContentEventFired",
    "params": {
      "timestamp": 178604.55525
    }
  },
  {
    "method": "Page.loadEventFired",
    "params": {
      "timestamp": 178604.557057
    }
  },
  {
    "method": "Page.frameStoppedLoading",
    "params": {
      "frameId": "9D57A9C83BE59C00C8A87B300ADA30B8"
    }
  }
]

However, the generated HAR contains no pages and no entries:

{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "chrome-har",
      "version": "0.11.4",
      "comment": "https://github.com/sitespeedio/chrome-har"
    },
    "pages": [],
    "entries": []
  }
}
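
When comparing a log like this with one that does produce entries, it can help to see the order in which each event type first appears. The following is only a diagnostic sketch: `firstOccurrences` is a hypothetical helper, and the event list is reduced to the `method` names from the dump above. It makes no claim about why chrome-har returns an empty result here.

```javascript
// Hypothetical diagnostic helper: map each event method to the index
// at which it first appears in the captured event list.
function firstOccurrences(events) {
  const seen = new Map();
  events.forEach((event, index) => {
    if (!seen.has(event.method)) {
      seen.set(event.method, index);
    }
  });
  return seen;
}

// Methods in the order they appear in the dump above (params omitted).
const events = [
  'Network.requestWillBeSent',
  'Network.responseReceivedExtraInfo',
  'Network.responseReceived',
  'Page.frameStartedLoading',
  'Page.frameNavigated',
  'Network.dataReceived',
  'Network.loadingFinished',
  'Page.domContentEventFired',
  'Page.loadEventFired',
  'Page.frameStoppedLoading',
].map((method) => ({ method }));

for (const [method, index] of firstOccurrences(events)) {
  console.log(`${index}: ${method}`);
}
```

In this capture the first Network.requestWillBeSent arrives before any Page.* event, which is at least a visible difference worth checking against a working capture.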

Information missing from requests and responses

I've come across an issue where headers are missing from requests and responses despite being sent by the browser.

There was an update in Chrome which introduced new protocol events that contain request information from the network service. This change added the Network.requestWillBeSentExtraInfo and Network.responseReceivedExtraInfo events to the devtools log.

Comparing a HAR generated by the Chrome browser with one from chrome-har, you can see that Chrome merges this extra information into the request.

Using the request for a CSS file as an example, the devtools log contains both a Network.requestWillBeSentExtraInfo and a Network.responseReceivedExtraInfo event, but this extra information is missing from the HAR generated by chrome-har, while it is present in the HAR generated by the Chrome web inspector.

I'm happy to put together a PR to include this information in chrome-har the same way it's included by the Chrome Browser.
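
As a rough illustration of what merging could look like, the sketch below groups the headers from the two ExtraInfo events by `requestId` and converts them to the name/value list that HAR uses for headers. `collectExtraInfoHeaders`, `toHarHeaders`, and the sample events are hypothetical; how chrome-har or Chrome itself combines these with the headers from the regular events is not shown here and would need to be checked.

```javascript
// Hypothetical helper: gather headers from ExtraInfo events, keyed by requestId.
function collectExtraInfoHeaders(events) {
  const byRequestId = new Map();
  for (const { method, params } of events) {
    const isRequest = method === 'Network.requestWillBeSentExtraInfo';
    const isResponse = method === 'Network.responseReceivedExtraInfo';
    if (!isRequest && !isResponse) {
      continue;
    }
    const entry = byRequestId.get(params.requestId) || { request: {}, response: {} };
    Object.assign(isRequest ? entry.request : entry.response, params.headers);
    byRequestId.set(params.requestId, entry);
  }
  return byRequestId;
}

// HAR stores headers as an array of { name, value } objects.
function toHarHeaders(headers) {
  return Object.entries(headers).map(([name, value]) => ({ name, value: String(value) }));
}

// Made-up events for illustration only.
const sampleEvents = [
  {
    method: 'Network.requestWillBeSentExtraInfo',
    params: { requestId: 'REQ_1', headers: { cookie: 'a=b', accept: 'text/css' } },
  },
  {
    method: 'Network.responseReceivedExtraInfo',
    params: { requestId: 'REQ_1', headers: { 'set-cookie': 'c=d' } },
  },
];

const extra = collectExtraInfoHeaders(sampleEvents).get('REQ_1');
console.log(toHarHeaders(extra.request));
console.log(toHarHeaders(extra.response));
```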

timings.receive invalid for cached items

When I include cached items in my HAR, I get very high receive times for anything cached. I think the value is somehow measured from when the request was first cached until now, because it increases every time I do a capture.

I fixed it in my project like so:

const timings = entry.timings || {};
if (entry.cache.beforeRequest) {
  timings.receive = 0;
} else {
  timings.receive = formatMillis(
    (params.timestamp - entry._requestTime) * 1000 -
      entry.__receiveHeadersEnd
  );
}

I am happy to PR this, although I am not familiar enough with this project to know if it's the right thing to do or not.
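
For reference, the same idea as a standalone sketch with made-up numbers. `receiveTimeMs` and its parameter names are hypothetical, and the rounding done by `formatMillis` in the snippet above is left out; timestamps are in seconds and `receiveHeadersEnd` is in milliseconds, matching the units implied by the arithmetic in the fix.

```javascript
// Hypothetical standalone version of the fix above: cached entries get a
// receive time of 0, everything else uses the original formula.
function receiveTimeMs({ fromCache, finishedTimestamp, requestTime, receiveHeadersEnd }) {
  if (fromCache) {
    return 0;
  }
  // Seconds to milliseconds, then subtract the time until headers were received.
  return (finishedTimestamp - requestTime) * 1000 - receiveHeadersEnd;
}

// Made-up values: the request took 250 ms in total, headers arrived after 50 ms.
const networkEntry = {
  fromCache: false,
  finishedTimestamp: 100.25,
  requestTime: 100.0,
  receiveHeadersEnd: 50,
};
const cachedEntry = { ...networkEntry, fromCache: true };

console.log(receiveTimeMs(networkEntry)); // 200
console.log(receiveTimeMs(cachedEntry)); // 0
```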

P.S. This project is amazing. Doing this by hand would have been such a nightmare, after seeing all of the intricacies here!

Not working with a blank / empty page?

Just curious.

When I try:

const puppeteer = require('puppeteer');
const PuppeteerHar = require('puppeteer-har');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const har = new PuppeteerHar(page);
  await har.start({ path: 'results.har' });

  await page.goto('https://www.mbcreation.net/har/index.html');
  await har.stop();
  await browser.close();
})();

The response is:

{"log":{"version":"1.2","creator":{"name":"chrome-har","version":"0.2.3","comment":"https://github.com/sitespeedio/chrome-har"},"pages":[],"entries":[]}}

The destination page is actually empty, but I thought I would get a load time for it anyway?

<title> Har </title>

Thanks for your answer / help
