chrome-php / chrome
Instrument headless chrome/chromium instances from PHP
License: MIT License
Thanks a lot!
Does this library support DOM interaction, for example click events?
Hey dude, really cool work!
Is it possible to wait for javascript evaluation (poll for a global variable) rather than navigation such as Page::NETWORK_IDLE?
I'm facing an interesting issue: I pull a lot of data via AJAX, and it takes more than 1000 ms to populate the data in the DOM (fancy charts etc.), so Page::NETWORK_IDLE evaluates to true and the image/PDF is captured before the next AJAX request fires for the final piece of content.
Thanks!
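One possible approach to the question above, sketched under assumptions: poll a JavaScript expression from PHP until it reports that the page is ready. The helper name waitUntil and the page flag window.chartsReady are hypothetical (the page itself would have to set such a flag); only evaluate() and getReturnValue() are taken from the snippets in these issues.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of chrome-php): call $probe repeatedly until
 * it returns true or $timeoutMs has elapsed. Returns whether the condition
 * was met before the deadline.
 */
function waitUntil(callable $probe, int $timeoutMs = 10000, int $intervalMs = 100): bool
{
    $deadline = microtime(true) + $timeoutMs / 1000;
    do {
        if ($probe() === true) {
            return true;
        }
        usleep($intervalMs * 1000); // pause between polls
    } while (microtime(true) < $deadline);

    return false;
}

// Possible usage (assumes $page exists and the page sets window.chartsReady):
// $ready = waitUntil(function () use ($page) {
//     return $page->evaluate('window.chartsReady === true')->getReturnValue() === true;
// }, 15000);
```

The probe is a plain callable, so the same loop could check for a DOM element or a counter instead of a global flag.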
[2018-12-26 22:32:53] DEBUG Factory: chrome version: Google Chrome 71.0.3578.98
[2018-12-26 22:32:53] DEBUG process: initializing
[2018-12-26 22:32:53] DEBUG process: using directory: /tmp/chromium-php-U0Asfm
[2018-12-26 22:32:53] DEBUG process: starting process: google-chrome --remote-debugging-port=0 --disable-background-networking --disable-background-timer-throttling --disable-client-side-phishing-detection --disable-default-apps --disable-extensions --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-translate --metrics-recording-only --no-first-run --safebrowsing-disable-auto-update --enable-automation --password-store=basic --use-mock-keychain --headless --disable-gpu --hide-scrollbars --mute-audio --user-data-dir=/tmp/chromium-php-U0Asfm
[2018-12-26 22:32:53] DEBUG process: waiting for 30 seconds for startup
[2018-12-26 22:32:53] DEBUG process: chrome output:mkdir: cannot create directory '/.local': Permission denied
touch: cannot touch '/.local/share/applications/mimeapps.list': No such file or directory
[2018-12-26 22:32:53] DEBUG process: ignoring output:mkdir: cannot create directory '/.local': Permission denied
[2018-12-26 22:32:53] DEBUG process: ignoring output:touch: cannot touch '/.local/share/applications/mimeapps.list': No such file or directory
[2018-12-26 22:32:53] DEBUG process: chrome output:Fontconfig warning: "/etc/fonts/fonts.conf", line 86: unknown element "blank"
[2018-12-26 22:32:53] DEBUG process: ignoring output:Fontconfig warning: "/etc/fonts/fonts.conf", line 86: unknown element "blank"
[2018-12-26 22:32:54] DEBUG process: chrome output:[1226/223254.000921:ERROR:gpu_process_transport_factory.cc(967)] Lost UI shared context.
DevTools listening on ws://127.0.0.1:38553/devtools/browser/0e5d4d2d-5d62-4666-bb55-69f729995a44
[2018-12-26 22:32:54] DEBUG process: ignoring output:[1226/223254.000921:ERROR:gpu_process_transport_factory.cc(967)] Lost UI shared context.
[2018-12-26 22:32:54] DEBUG process: ✓ accepted output
[2018-12-26 22:32:54] DEBUG process: connecting using ws://127.0.0.1:38553/devtools/browser/0e5d4d2d-5d62-4666-bb55-69f729995a44
[2018-12-26 22:32:54] DEBUG socket(1): connecting
[2018-12-26 22:32:54] DEBUG socket(1): ✓ connected
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":1,"method":"Target.setDiscoverTargets","params":{"discover":true}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"231c9529-1486-4254-be86-4d1ae0878540","type":"browser","title":"","url":"","attached":false}}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"DB2B5C3E6A760C4F705280AC10A61DBF","type":"page","title":"","url":"about:blank","attached":false,"browserContextId":"7B78E93131F1E4984B43E5677B645745"}}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"b57e2158-65b6-4c1d-acbf-bfddf97d034f","type":"browser","title":"","url":"","attached":true}}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":1,"result":{}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetCreated
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetCreated
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetCreated
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":2,"method":"Target.createTarget","params":{"url":"about:blank"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","type":"page","title":"","url":"","attached":false,"browserContextId":"7B78E93131F1E4984B43E5677B645745"}}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetCreated
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":2,"result":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"DB2B5C3E6A760C4F705280AC10A61DBF","type":"page","title":"about:blank","url":"about:blank","attached":false,"browserContextId":"7B78E93131F1E4984B43E5677B645745"}}}
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":3,"method":"Target.attachToTarget","params":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","type":"page","title":"","url":"about:blank","attached":true,"browserContextId":"7B78E93131F1E4984B43E5677B645745"}}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.attachedToTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","targetInfo":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","type":"page","title":"","url":"about:blank","attached":true,"browserContextId":"7B78E93131F1E4984B43E5677B645745"},"waitingForDebugger":false}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":3,"result":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.attachedToTarget
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":5,"method":"Target.sendMessageToTarget","params":{"message":"{"id":4,"method":"Page.getFrameTree","params":[]}","sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":5,"result":{}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","type":"page","title":"about:blank","url":"about:blank","attached":true,"browserContextId":"7B78E93131F1E4984B43E5677B645745"}}}
[2018-12-26 22:32:54] DEBUG connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"id":4,"result":{"frameTree":{"frame":{"id":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","url":"about:blank","securityOrigin":"://","mimeType":"text/html"}}}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":7,"method":"Target.sendMessageToTarget","params":{"message":"{"id":6,"method":"Page.enable","params":[]}","sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":7,"result":{}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"id":6,"result":{}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":9,"method":"Target.sendMessageToTarget","params":{"message":"{"id":8,"method":"Network.enable","params":[]}","sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":9,"result":{}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"id":8,"result":{}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":11,"method":"Target.sendMessageToTarget","params":{"message":"{"id":10,"method":"Runtime.enable","params":[]}","sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":11,"result":{}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Runtime.executionContextCreated","params":{"context":{"id":1,"origin":"://","name":"","auxData":{"isDefault":true,"type":"default","frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG session(C989FE6F0EA1D398F72EA8F014FC73F7): ⇶ dispatching method:Runtime.executionContextCreated
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"id":10,"result":{}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:32:54] DEBUG socket(1): → sending data:{"id":13,"method":"Target.sendMessageToTarget","params":{"message":"{"id":12,"method":"Page.setLifecycleEventsEnabled","params":{"enabled":true}}","sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7"}}
[2018-12-26 22:32:54] DEBUG socket(1): ← receiving data:{"id":13,"result":{}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"commit","timestamp":7008.348604}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"DOMContentLoaded","timestamp":7008.349236}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"load","timestamp":7008.349361}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"networkAlmostIdle","timestamp":7008.349759}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"networkIdle","timestamp":7008.349759}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"id":12,"result":{}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"networkAlmostIdle","timestamp":7008.349759}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"C989FE6F0EA1D398F72EA8F014FC73F7","message":"{"method":"Page.lifecycleEvent","params":{"frameId":"CF5F317D588BA3DFE0EE1263FEC5DEAA","loaderId":"C1B1AA7DADCC400D31F6A673C3F784E9","name":"networkIdle","timestamp":7008.349759}}","targetId":"CF5F317D588BA3DFE0EE1263FEC5DEAA"}}
[2018-12-26 22:33:01] DEBUG session(C989FE6F0EA1D398F72EA8F014FC73F7): ⇶ dispatching method:Page.lifecycleEvent
Fatal error: Uncaught HeadlessChromium\Exception\OperationTimedOut: Operation timed out (3sec) in /var/www/html/vendor/chrome-php/chrome/src/Utils.php on line 65
Call Stack:
# | Time | Memory | Function | Location |
---|---|---|---|---|
1 | 0.0003 | 377168 | {main}( ) | .../index.php:0 |
2 | 0.1578 | 1724192 | HeadlessChromium\Browser\ProcessAwareBrowser->createPage( ) | .../index.php:278 |
3 | 0.1881 | 1879624 | HeadlessChromium\Communication\Session->sendMessageSync( ) | .../Browser.php:149 |
4 | 0.1883 | 1880560 | HeadlessChromium\Communication\SessionResponseReader->waitForResponse( ) | .../Session.php:79 |
5 | 0.1883 | 1881136 | HeadlessChromium\Utils::tryWithTimeout( ) | .../ResponseReader.php:103 |
I'm trying to access the websocket for a page. I've got the browser, created the page, and started setting up a websocket URI, but every time I try to do anything it just says the connection is closed, and I suspect it is not connecting to the websocket at all.
$webSocketUri = 'ws://127.0.0.1:9222/devtools/browser';
$connection = new Connection($webSocketUri);
$connection->connect();
Log::info(dd($connection));
HeadlessChromium\Communication\Connection {#645
#strict: true
#delay: null
-lastMessageSentTime: null
#wsClient: HeadlessChromium\Communication\Socket\Wrench {#700
#client: Wrench\Client {#746
#uri: "ws://127.0.0.1:9222/devtools/browser/"
#origin: "http://127.0.0.1"
#socket: Wrench\Socket\ClientSocket {#748
#scheme: "tcp"
#host: "127.0.0.1"
#port: 9222
#socket: false
#context: null
#connected: false
#name: null
#options: array:5 [
"protocol" => Wrench\Protocol\Rfc6455Protocol {#749}
"timeout_socket" => 5
"timeout_connect" => 2
"ssl_verify_peer" => false
"ssl_allow_self_signed" => true
]
#protocol: Wrench\Protocol\Rfc6455Protocol {#749}
}
#headers: []
#connected: false
#payloadHandler: Wrench\Payload\PayloadHandler {#750
#callback: array:2 [
0 => Wrench\Client {#746}
1 => "onData"
]
#payload: null
#options: array:4 [
"protocol" => Wrench\Protocol\Rfc6455Protocol {#747}
"socket_class" => "Wrench\Socket\ClientSocket"
"on_data_callback" => null
"socket_options" => []
]
#protocol: Wrench\Protocol\Rfc6455Protocol {#747}
}
#received: []
#options: array:4 [
"protocol" => Wrench\Protocol\Rfc6455Protocol {#747}
"socket_class" => "Wrench\Socket\ClientSocket"
"on_data_callback" => null
"socket_options" => []
]
#protocol: Wrench\Protocol\Rfc6455Protocol {#747}
}
#socketId: 2
#logger: Psr\Log\NullLogger {#690}
}
#responseBuffer: []
#sendSyncDefaultTimeout: 3000
#sessions: []
#listeners: []
#onceListeners: []
#logger: Psr\Log\NullLogger {#690}
}
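The dump above shows #connected: false and a URI that ends at /devtools/browser/ with no identifier after it. A possible way to obtain the full URI, sketched here as an assumption about the setup: when Chrome is started with a fixed --remote-debugging-port, it serves a small JSON document at /json/version whose webSocketDebuggerUrl field holds the complete browser websocket URL. The function names below are hypothetical helpers, not part of chrome-php.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pull webSocketDebuggerUrl out of the JSON document
 * that Chrome serves at http://<host>:<port>/json/version.
 */
function extractWebSocketUrl(string $versionJson): ?string
{
    $data = json_decode($versionJson, true);
    if (!is_array($data)) {
        return null;
    }
    $url = $data['webSocketDebuggerUrl'] ?? null;

    return is_string($url) ? $url : null;
}

/**
 * Hypothetical helper: fetch /json/version from a running Chrome and return
 * the websocket URL, or null if Chrome is not reachable on that port.
 */
function discoverWebSocketUrl(string $host = '127.0.0.1', int $port = 9222): ?string
{
    $json = @file_get_contents("http://{$host}:{$port}/json/version");

    return $json === false ? null : extractWebSocketUrl($json);
}

// Possible usage: $webSocketUri = discoverWebSocketUrl() ?? exit("Chrome not reachable\n");
```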
How can I do basic auth with headless Chrome?
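A minimal sketch of one way this could work, under assumptions: the DevTools protocol has a Network.setExtraHTTPHeaders command, and an Authorization header sent through it would carry the credentials on every request. The helpers below are hypothetical and only build the header and the command payload; how the payload is dispatched depends on the library version in use and is not shown.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: build the value of an HTTP Basic Authorization header. */
function basicAuthHeaderValue(string $user, string $password): string
{
    return 'Basic ' . base64_encode($user . ':' . $password);
}

/**
 * Hypothetical helper: build the DevTools command (method + params) that
 * would add the Authorization header to all requests of a page.
 */
function buildExtraHeadersCommand(string $user, string $password): array
{
    return [
        'method' => 'Network.setExtraHTTPHeaders',
        'params' => [
            'headers' => ['Authorization' => basicAuthHeaderValue($user, $password)],
        ],
    ];
}
```

Note that a header set this way is sent to every host the page contacts, so it is best limited to pages whose resources are trusted.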
Hi,
Is it possible to inject jQuery to get DOM element values more easily, i.e. to use jQuery selectors?
Something like this:
$page->addPreScript(file_get_contents ('jquery-latest.min.js'));
$page->navigate('https://www.somepage.com/feed.html')->waitForNavigation();
$value = $page->evaluate('jQuery("#pagelet_group_mall")')->getReturnValue();
Fatal error: Uncaught HeadlessChromium\Exception\JavascriptException: Error during javascript evaluation: ReferenceError: jQuery is not defined
at :1:1 in F:\web\chrome\headless-chromium-php\src\PageUtils\PageEvaluation.php:78
Would it be possible to send a post message to a page?
The situation:
First I need to log in to a page
Then go to the download page
Fill in some information
Press a button
The button triggers some jQuery on the page that posts JSON to another page, which downloads the file.
I manage to log in, go to the download page and press the button, but any idea how to fetch the file from the jQuery JSON call that is made when the button is pressed?
I have a requirement to get all the information about a specific page. For example, I want to know how many total requests are required to load a page, how many total requests are static files, etc. Is it possible?
Hi.
I tried to take a full-size screenshot of a page with this library, but noticed that the Page.getLayoutMetrics feature is not implemented.
So I was hoping a little help would be welcome. #43.
I'm not sure where the best place for communication is, so I've opened another ticket for this. I've been working on implementing DOM.getFlattenedDocument so that I can get the contents. However, this line and the socket returns are a bit awkward.
An example response looks like:
[2017-12-06 23:10:43] DEBUG socket: |=> sending data:{"id":6,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":5,\"method\":\"DOM.enable\",\"params\":[]}","sessionId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1"}}
[2017-12-06 23:10:43] DEBUG socket: <=| receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1","message":"{\"id\":3,\"result\":{\"frameId\":\"26824.1\"}}","targetId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b"}}
[2017-12-06 23:10:43] DEBUG socket: <=| receiving data:{"id":6,"result":{}}
[2017-12-06 23:10:43] DEBUG socket: |=> sending data:{"id":8,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":7,\"method\":\"DOM.getFlattenedDocument\",\"params\":{\"depth\":-1}}","sessionId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1"}}
[2017-12-06 23:10:44] DEBUG socket: <=| receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1","message":"{\"id\":5,\"result\":{}}","targetId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b"}}
[2017-12-06 23:10:44] DEBUG socket: <=| receiving data:{"id":8,"result":{}}
[2017-12-06 23:10:44] DEBUG socket: <=| receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1","message":"{\"id\":7,\"result\":{\"nodes\":[{\"nodeId\":2,\"parentId\":1,\"backendNodeId\":16,\"nodeType\":10,\"nodeName\":\"html\",\"localName\":\"\",\"nodeValue\":\"\",\"publicId\":\"\",\"systemId\":\"\"},{\"nodeId\":5,\"parentId\":4,\"backendNodeId\":19,\"nodeType\":1,\"nodeName\":\"META\",\"localName\":\"meta\",\"nodeValue\":\"\",\"childNodeCount\":0,\"children\":[],\"attributes\":[\"content\",\"/images/branding/googleg/1x/googleg_standard_color_128dp.png\",\"itemp .... truncated ....
The problem here is the response I get back from the request; I guess it fulfilled the requirements of checkForResponse:
object(HeadlessChromium\Communication\Response)#103 (2) {
["message":protected]=>
object(HeadlessChromium\Communication\Message)#98 (3) {
["id":protected]=>
int(8)
["method":protected]=>
string(26) "Target.sendMessageToTarget"
["params":protected]=>
array(2) {
["message"]=>
string(66) "{"id":7,"method":"DOM.getFlattenedDocument","params":{"depth":-1}}"
["sessionId"]=>
string(38) "21dee360-7e84-45dd-b3ba-1573ab9f4e1b:1"
}
}
["data":protected]=>
array(2) {
["id"]=>
int(8)
["result"]=>
array(0) {
}
}
}
I'm not sure if there are cases where you'd want to keep multiple responses for the same request (or if any requests even do that), or whether to filter out responses to requests that don't get immediate responses.
I'm more than happy to take direction on this and implement some methods that manipulate the DOM, maybe in a separate class that can be called with a Page object.
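To illustrate the two levels of ids visible in the log above: the outer request (id 8) is acknowledged right away with an empty result, while the actual reply to the inner request (id 7) arrives later, wrapped in a Target.receivedMessageFromTarget event. The following is a minimal sketch of matching on the inner id; matchInnerResponse is a hypothetical helper and not how the library is confirmed to work.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: given one raw socket frame, return the decoded inner
 * message if the frame is a Target.receivedMessageFromTarget envelope that
 * carries the reply to $innerId; otherwise return null.
 */
function matchInnerResponse(string $rawFrame, int $innerId): ?array
{
    $outer = json_decode($rawFrame, true);
    if (!is_array($outer) || ($outer['method'] ?? null) !== 'Target.receivedMessageFromTarget') {
        return null; // e.g. the {"id":8,"result":{}} acknowledgement
    }

    $message = $outer['params']['message'] ?? null;
    if (!is_string($message)) {
        return null;
    }

    // The inner message is itself a JSON document encoded as a string.
    $inner = json_decode($message, true);
    if (!is_array($inner) || ($inner['id'] ?? null) !== $innerId) {
        return null; // an event, or the reply to a different inner request
    }

    return $inner;
}
```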
Thanks!
Hi guys,
I've got Chrome installed on my local environment, which is OS X running PHP 7.2. Chrome is installed with an alias to the following:
alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"
The Chrome version is "Google Chrome 66.0.3359.117",
but I keep getting this stack trace:
[2018-04-25 15:59:14] local.ERROR: Cannot get chrome version, make sure you provided the correct chrome binaries using (chrome). sh: chrome: command not found {"exception":"[object] (RuntimeException(code: 0): Cannot get chrome version, make sure you provided the correct chrome binaries using (chrome). sh: chrome: command not found at /Users/dansmacbook/projects/project-scrapper/vendor/chrome-php/chrome/src/BrowserFactory.php:78)
code execution is this
$browser = $browserFactory->createBrowser([
'headless' => true,
'connectionDelay' => 0.8,
'debugLogger' => 'php://stdout'
]);
$page = $browser->createPage();
$response = $page->navigate($url)->waitForNavigation();
$browser->close();
dd($response);
any thoughts on how to resolve this?
The first time I run this via a JavaScript POST, HeadlessChromium saves out a new directory instead of an image (with image_name.jpg as the directory name). But when my JavaScript continues to call the script, it saves out an image on each subsequent call.
What would cause HeadlessChromium to save the first image as a directory, but all of the others as JPGs?
My JavaScript seems fine: it sends the same type of data each time and the script doesn't change between calls, yet I keep getting a different result on the first call only.
Anyone know what's up here?
<?php
require 'vendor/autoload.php';
use HeadlessChromium\BrowserFactory;
$delay = 15; // seconds to wait before capturing
$bannerName = $_POST['bannerName'];
$htmlFile = $_POST['htmlFile'];
$sessionId = $_POST['sessionId'];
// headless-chromium-php
$browserFactory = new BrowserFactory('/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome');
$browser = $browserFactory->createBrowser([
    'headless' => true, // run in headless mode
    'connectionDelay' => 0, // no delay between instructions sent to chrome
    'debugLogger' => 'php://stdout', // will enable verbose mode
    'windowSize' => [900, 650],
    'startupTimeout' => 50
]);
$page = $browser->createPage();
$page->navigate('http://localhost:8888/phpBackend/'.$htmlFile)->waitForNavigation('networkIdle', 10000);
$pageTitle = $bannerName; //$page->evaluate('document.title')->getReturnValue();
sleep($delay);
$screenshot = $page->screenshot([
    'format' => 'jpeg', // default to 'png' - possible values: 'png', 'jpeg',
    'quality' => 80 // only if format is 'jpeg' - default 100
])->saveToFile('./backups/'.$sessionId."/".$bannerName.".jpg");
$browser->close();
$responsed = array("bannerName" => $bannerName, "htmlFile" => $htmlFile);
echo json_encode($responsed);
?>
Error logs:
[13-Aug-2018 23:24:09 Europe/Berlin] PHP Warning: fopen(./backups/C30465E8-748F-70B0-B570-79D9E867FB3B/en_300x250_banner.jpg): failed to open stream: Is a directory in /Applications/MAMP/htdocs/phpBackend/vendor/chrome-php/chrome/src/PageUtils/PageScreenshot.php on line 96
[13-Aug-2018 23:24:09 Europe/Berlin] PHP Warning: stream_filter_append() expects parameter 1 to be resource, boolean given in /Applications/MAMP/htdocs/phpBackend/vendor/chrome-php/chrome/src/PageUtils/PageScreenshot.php on line 97
[13-Aug-2018 23:24:09 Europe/Berlin] PHP Warning: fwrite() expects parameter 1 to be resource, boolean given in /Applications/MAMP/htdocs/phpBackend/vendor/chrome-php/chrome/src/PageUtils/PageScreenshot.php on line 98
[13-Aug-2018 23:24:09 Europe/Berlin] PHP Warning: fclose() expects parameter 1 to be resource, boolean given in /Applications/MAMP/htdocs/phpBackend/vendor/chrome-php/chrome/src/PageUtils/PageScreenshot.php on line 99
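The "Is a directory" warning suggests that something had already created a directory at the target file path before fopen() ran. What creates it is not visible in the script above, so the following is only a defensive sketch: it creates the parent directory (never the file path itself) and refuses to continue if the target is a directory. prepareOutputPath is a hypothetical helper; basename() is used here as a simple guard for the POSTed values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: make sure the per-session directory exists and return
 * the file path to write to. Throws if the directory cannot be created or if
 * the target path is already occupied by a directory.
 */
function prepareOutputPath(string $baseDir, string $sessionId, string $bannerName, string $ext = 'jpg'): string
{
    // basename() strips directory components from untrusted input.
    $dir = rtrim($baseDir, '/') . '/' . basename($sessionId);
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }

    $file = $dir . '/' . basename($bannerName) . '.' . $ext;
    if (is_dir($file)) {
        throw new RuntimeException("Target path is a directory: $file");
    }

    return $file;
}

// Possible usage with the script above:
// ->saveToFile(prepareOutputPath('./backups', $sessionId, $bannerName));
```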
I use:
$cookies = $page->evaluate ('document.cookie')->getReturnValue();
However, this does not return all the cookies; the secure cookies are missing.
Is there a different way to get the cookies that Headless chrome receives?
Hi guys,
I have Apache with PHP 7 installed on an Ubuntu VPS.
I installed the library into Apache's html directory and created a PHP page.
On the page I have: require 'vendor/autoload.php';
followed by the example code from the installation page.
I get the error:
Fatal error: Uncaught RuntimeException: Chrome process stopped before startup completed in /var/www/html/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php:326
Stack trace:
#0 /var/www/html/vendor/chrome-php/chrome/src/Utils.php(51): HeadlessChromium\Browser\BrowserProcess->HeadlessChromium\Browser\{closure}(Object(Symfony\Component\Process\Process))
#1 /var/www/html/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(361): HeadlessChromium\Utils::tryWithTimeout(30000000, Object(Generator))
#2 /var/www/html/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(124): HeadlessChromium\Browser\BrowserProcess->waitForStartup(Object(Symfony\Component\Process\Process), 30000000)
#3 /var/www/html/vendor/chrome-php/chrome/src/BrowserFactory.php(59): HeadlessChromium\Browser\BrowserProcess->start('chrome', Array)
#4 /var/www/html/index.php(16): HeadlessChromium\BrowserFactory->createBrowser()
#5 {main}
thrown in /var/www/html/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php on line 326
Any idea what to do?
Hi there,
Would it be possible to let chromium act as different browser by setting the user agent?
Good day all,
The following was installed;
root@server003:/var/www/html_test# composer require chrome-php/chrome
sudo apt-get install chromium-browser
The following works perfectly:
echo '<pre>';
require_once('vendor/autoload.php');
use HeadlessChromium\BrowserFactory;
$factory = new BrowserFactory('chromium-browser --ignore-certificate-errors');
$browser = $factory->createBrowser([
'debugLogger' => 'php://stdout'
]);
$page = $browser->createPage();
$page->navigate('https://www.google.nl/')->waitForNavigation();
$value = $page->evaluate('document.querySelector("input[name=\"btnK\"]").value')->getReturnValue();
var_dump($value);
But then when I run the following:
echo '<pre>';
require_once('vendor/autoload.php');
use HeadlessChromium\BrowserFactory;
$factory = new BrowserFactory('chromium-browser --ignore-certificate-errors');
$browser = $factory->createBrowser([
'debugLogger' => 'php://stdout'
]);
$page = $browser->createPage();
$page->navigate('https://ip-address:8443/login')->waitForNavigation();
$evaluation = $page->evaluate(
'(() => {
document.querySelector("#username").value = "***********";
document.querySelector("#password").value = "***********";
document.querySelector("form[name=\"login\"]").submit();
})()'
);
$evaluation->waitForPageReload();
$value = $page->evaluate('document.querySelector(".duration .ng-binding").innerHTML')->getReturnValue();
var_dump($value);
The following error is generated:
Fatal error: Uncaught HeadlessChromium\Exception\NavigationExpired: The page has navigated to an other page and this navigation expired in /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php:165
Stack trace:
#0 /var/www/html_test/vendor/chrome-php/chrome/src/Utils.php(51): HeadlessChromium\PageUtils\PageNavigation->navigationComplete('load')
#1 /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php(109): HeadlessChromium\Utils::tryWithTimeout(30000000, Object(Generator))
#2 /var/www/html_test/tst.php(41): HeadlessChromium\PageUtils\PageNavigation->waitForNavigation()
#3 {main}
thrown in /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php on line 165
Looking through the closed issues, I modified the createBrowser() call like this:
$browser = $factory->createBrowser([
'headless' => false, // disable headless mode
'connectionDelay' => 0.8, // add 0.8 second of delay between each instruction sent to chrome,
'debugLogger' => 'php://stdout'
]);
Now the following is shown:
Fatal error: Uncaught HeadlessChromium\Exception\NavigationExpired: The page has navigated to an other page and this navigation expired in /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php:165
Stack trace:
#0 /var/www/html_test/vendor/chrome-php/chrome/src/Utils.php(51): HeadlessChromium\PageUtils\PageNavigation->navigationComplete('load')
#1 /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php(109): HeadlessChromium\Utils::tryWithTimeout(30000000, Object(Generator))
#2 /var/www/html_test/tst.php(41): HeadlessChromium\PageUtils\PageNavigation->waitForNavigation()
#3 {main}
thrown in /var/www/html_test/vendor/chrome-php/chrome/src/PageUtils/PageNavigation.php on line 165
Does anyone have any ideas?
Thank you very much.
Hi
I am trying to install this library using Composer. I have already installed the latest Chrome. When I run the command composer require chrome-php/chrome
I get the following error:
Your requirements could not be resolved to an installable set of packages.
Problem 1
- Can only install one of: evenement/evenement[v3.0.1, v2.0.0].
- Can only install one of: evenement/evenement[v3.0.1, v2.0.0].
- Can only install one of: evenement/evenement[v3.0.1, v2.0.0].
- chrome-php/chrome v0.6.0 requires evenement/evenement ^3.0.1 -> satisfiable by evenement/evenement[v3.0.1].
- Installation request for chrome-php/chrome ^0.6.0 -> satisfiable by chrome-php/chrome[v0.6.0].
- Installation request for evenement/evenement (locked at v2.0.0) -> satisfiable by evenement/evenement[v2.0.0].
Installation failed, reverting ./composer.json to its original content.
Hi,
Would it be possible to set the width (and height, or height = full page, ...) for the screenshot() function?
best regards
stefan
Hello there 👋
Thank you in advance for this cool package. I'm using it to access information from sites which do not offer APIs, and it is a very helpful tool.
One of the things I need is to be able to set the timeout when calling \HeadlessChromium\Page::navigate.
Currently there is no parameter for this, so \HeadlessChromium\Communication\Connection::$sendSyncDefaultTimeout is used, which can't be set and is hardcoded to 3 seconds. When downloading files from some pages, the download might take longer than 3 seconds (because the file is big or needs to be generated by the server first) and $page->navigate($downloadUrl); will throw an exception.
I will provide a non-breaking pull request shortly, and I'd be glad if you could merge it into your awesome lib 🙂
Is there a way to keep the browser running in the background as a service, so that it does not have to be launched every time?
When do you plan to add support for the following features?
$evaluation = $page->evaluate(
'(() => {
document.body.innerHTML += "<form id=\""dynForm\"" action=\""http://example.com/\"" method=\""post\""></form>";
document.querySelector("#dynForm").submit();
})()'
);
Is this the incorrect way to add it to the page?
document.body.innerHTML += "<
Using the chrome console it works.
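One thing that stands out in the snippet above is the quoting: inside a single-quoted PHP string, \" reaches JavaScript unchanged, so each \"" appears to produce an escaped quote followed by a bare quote that ends the JavaScript string early. A way to sidestep manual escaping, sketched here with hypothetical helper names, is to let json_encode() produce the JavaScript string literals, since a JSON string as emitted by json_encode() is also a valid JavaScript string literal.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: turn a PHP string into a quoted JavaScript string literal. */
function jsStringLiteral(string $value): string
{
    return json_encode($value, JSON_THROW_ON_ERROR);
}

/**
 * Hypothetical helper: build the script that appends a form to the page and
 * submits it, with all quoting handled by json_encode().
 */
function buildFormSubmitScript(string $formHtml, string $selector): string
{
    return '(() => { document.body.innerHTML += ' . jsStringLiteral($formHtml) . '; '
        . 'document.querySelector(' . jsStringLiteral($selector) . ').submit(); })()';
}

// Possible usage (assumes $page exists):
// $formHtml = '<form id="dynForm" action="http://example.com/" method="post"></form>';
// $evaluation = $page->evaluate(buildFormSubmitScript($formHtml, '#dynForm'));
```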
Hello,
With the help of gsouf I now have the following script:
$content = $page
->evaluate(
'(() => {
document.querySelector("#username").value = "username";
document.querySelector("#password").value = "password";
document.querySelector("#loginform").submit();
})()'
)
->getReturnValue();
However, when doing this the return value is empty.
I am getting the following exception when trying to load a page with network errors
Fatal error: Uncaught TypeError: Return value of HeadlessChromium\Communication\ResponseReader::waitForResponse() must be an instance of HeadlessChromium\Communication\Response, none returned in /var/www/html/application/vendor/chrome-php/chrome/src/Communication/ResponseReader.php:105
Stack trace:
#0 /var/www/html/application/vendor/chrome-php/chrome/src/Communication/Connection.php(228): HeadlessChromium\Communication\ResponseReader->waitForResponse(3000)
#1 /var/www/html/application/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(197): HeadlessChromium\Communication\Connection->sendMessageSync(Object(HeadlessChromium\Communication\Message))
#2 [internal function]: HeadlessChromium\Browser\BrowserProcess->kill()
#3 {main}
thrown in /var/www/html/application/vendor/chrome-php/chrome/src/Communication/ResponseReader.php on line 105
Real browser console (errors are due to CORS policy):
Access to font at 'https://xxxx/assets/fonts/open-sans-v15-vietnamese_latin_greek_cyrillic-700.ttf' from origin 'null' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Is there a way to ignore those errors?
I need to get the response status code. How can I do that?
$page->navigate('some - url')->waitForNavigation();
$content = $page->evaluate('document.documentElement.innerHTML')->getReturnValue();
Not all of the page's JavaScript has finished executing by the time getReturnValue is called.
Is there a way to wait until it's done?
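One possible approach, sketched here in plain PHP, is to poll for a condition with a timeout instead of relying on a navigation event. The helper name waitUntil and the window.appReady flag mentioned in the comments are hypothetical and not part of the library; the stand-in closure below only simulates a page that becomes ready on the third poll.

```php
<?php
/**
 * Hypothetical helper (not part of chrome-php): calls $check repeatedly
 * until it returns a truthy value or the timeout expires.
 */
function waitUntil(callable $check, int $timeoutMs = 10000, int $intervalMs = 100): bool
{
    $deadline = microtime(true) + $timeoutMs / 1000;
    do {
        if ($check()) {
            return true;
        }
        // Sleep between polls so the loop does not spin at full speed.
        usleep($intervalMs * 1000);
    } while (microtime(true) < $deadline);

    return false;
}

// Stand-in check for demonstration: becomes true on the third poll.
// In practice the closure would wrap something like
// $page->evaluate('window.appReady === true')->getReturnValue(),
// where window.appReady is a flag the page itself would have to set.
$polls = 0;
$ready = waitUntil(function () use (&$polls) {
    return ++$polls >= 3;
}, 2000, 10);
```

Once the helper returns true, the innerHTML can be read as in the snippet above; if it returns false, the page never reached the expected state within the timeout.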
I'm trying to test your package and am getting the following error when trying to run your example...
Parse error: syntax error, unexpected '?', expecting variable (T_VARIABLE) in /var/www/html/scripts/browser/vendor/symfony/process/Process.php on line 140
The error seems to be triggered by this line...
$browserFactory = new BrowserFactory();
I've tracked the error down to...
$process = new Process($processString);
but on checking $processString just before the call it has content, so now I'm lost...
chrome --remote-debugging-port=0 --disable-background-networking --disable-background-timer-throttling --disable-client-side-phishing-detection --disable-default-apps --disable-extensions --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-translate --metrics-recording-only --no-first-run --safebrowsing-disable-auto-update --enable-automation --password-store=basic --use-mock-keychain --headless --disable-gpu --hide-scrollbars --mute-audio --user-data-dir=/tmp/chromium-php-ndLwqS
I've installed google-chrome-stable but I wonder if it's not finding it for some reason?
I've tried for hours to fix it with no luck. Does anybody have any ideas? I'm a newb when it comes to composer so I'm more than happy to believe I've done something wrong :)
UPDATE: I noticed I had two versions of PHP installed, 7.0 and 7.2, and Apache was using 7.0. I removed 7.2 and updated the one Apache uses to 7.2, so hopefully I only have one version now.
I managed to move past the last error but now there's a new one...
Fatal error: Uncaught RuntimeException: Chrome process stopped before startup completed. Additional info: sh: chrome: command not found in /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php:353
Stack trace:
#0 /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/Utils.php(51): HeadlessChromium\Browser\BrowserProcess->HeadlessChromium\Browser\{closure}(Object(Symfony\Component\Process\Process))
#1 /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(388): HeadlessChromium\Utils::tryWithTimeout(30000000, Object(Generator))
#2 /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(127): HeadlessChromium\Browser\BrowserProcess->waitForStartup(Object(Symfony\Component\Process\Process), 30000000)
#3 /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/BrowserFactory.php(79): HeadlessChromium\Browser\BrowserProcess->start('chrome', Array)
#4 /var/www/html/scripts/browser/index.php(14): HeadlessChromium\BrowserFactory->c in /var/www/html/scripts/browser/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php on line 353
I'm sure it's the path not being set up correctly. I'll investigate further.
Well, I've finally fixed it... sort of, and as expected, it was my fault :(
The path to google-chrome isn't found so I added it manually just to test and it worked. Then I got an error that 'foo' couldn't be created, which I guess is a permissions error. I removed that and just had the script echo the page title and it worked as expected.
Not sure if it's worth leaving this here just in case somebody else has the same issues but I'll leave it up to the thread owner.
Thanks for a great piece of kit. It's going to make my job a LOT easier :)
Lux
ex:
$browserFactory->add_argument(['--disable-gpu','--no-sandbox','--blink-settings=imagesEnabled=false']);
It's more flexible than fixed settings.
Thank you.
Hi! That looks great! I was looking for something like that
Is there any way to bundle it with Chrome to run on Linux? I'd like to create such a portable package
Thanks
This only happens when using userData. I need to keep the environment instead of having an entirely fresh instance created on every load.
[2018-09-29 00:03:49] DEBUG socket(1): ← receiving data:{"error":{"code":-32602,"message":"No target with given id found"},"id":22}
Which eventually breaks the Session constructor because the sessionId is not returned.
PHP Fatal error: Uncaught TypeError: Argument 2 passed to HeadlessChromium\Communication\Session::__construct() must be of the type string, null given, called in .../vendor/chrome-php/chrome/src/Communication/Connection.php on line 240 and defined in .../vendor/chrome-php/chrome/src/Communication/Session.php:42
Current workaround:
src/Communication/Connection.php
commented out the return type of createSession so I can:
$sessionId = $response['result']['sessionId'] ?? false;
if(!$sessionId)
return;
src/Browser.php
Not creating a target if no session is returned
if(!$session)
return;
Not sure if it's the best approach as I just started to fiddle around with this library...
I will update once I track everything down.
Great work by the way!
So, I've installed Chromium with "sudo apt-get install chromium-browser".
What I've got working so far is reading the value of the "Search" button on https://www.google.nl/:
require_once('vendor/autoload.php');
use HeadlessChromium\BrowserFactory;
// open browser
$factory = new BrowserFactory('chromium-browser');
//$browser = $factory->createBrowser();
$browser = $factory->createBrowser([
'debugLogger' => 'php://stdout'
]);
// navigate to a page with a form
$page = $browser->createPage();
$page->navigate('https://www.google.nl/')->waitForNavigation();
// get value in the new page
$value = $page->evaluate('document.querySelector("input[name=\"btnK\"]").value')->getReturnValue();
var_dump($value);
The problem I currently have is with a vendor web GUI that we run internally.
There are multiple servers running the same web GUI, each with a self-signed certificate.
The servers are load-balanced behind a virtual IP, so the certificate is not only self-signed, it also contains the wrong server name and server IP address.
Of course, when using Google Chrome on my desktop, I get a certificate error.
How do I ignore the certificate errors with this script? There are some statistics on that web GUI I'd like to grab.
I don't understand how to set headers, for example Accept-Language.
How can I fetch an image from a site and save it (not as a screenshot) via headless Chromium?
Is this library ready enough to screenshot DOM objects? I have a project that requires full-page screenshots as well as screenshots of specific DOM objects. I have been using PhantomJS for this, but it's just too slow. If this is not supported yet, is it possible to use something like page.evaluate with getBoundingClientRect() to return the object's position and then crop the screenshot?
Hi there, I am trying to get a page's whole HTML. Is there an easy way to do this? When I print $page, it gives me more than I want.
$evaluation = $page->evaluate(
'(() => {
element1.setAttribute("name", 'I need to have a quote " in here');
The ' quote does not work there (it ends the PHP string), so I end up with:
element1.setAttribute("name", "I need to have a quote " in here");
But of course that doesn't work.
Any tips or advice?
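A minimal sketch of one way around the quoting problem, assuming the value is available as a PHP string: json_encode() emits a double-quoted, fully escaped literal that is also valid JavaScript source, so it can be concatenated into the script without hand-written escaping. The variable names are illustrative, and element1 is assumed to exist in the page as in the snippet above.

```php
<?php
// Value containing both quote styles.
$name = 'I need to have a quote " and a \' in here';

// json_encode() escapes the double quote and wraps the value in double
// quotes, producing a literal that JavaScript can parse as-is.
$jsName = json_encode($name);

// Build the script; "element1" is assumed to be defined in the page.
$script = '(() => { element1.setAttribute("name", ' . $jsName . '); })()';

echo $script, PHP_EOL;
```

The resulting $script string can then be passed to $page->evaluate() in the usual way. The same technique works for any PHP value that has to be embedded in an evaluated script, such as a username or password.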
Is there any way to track redirects? For example, I load domain.tld, which redirects to domain.com, which in turn redirects to domain.com/en/. I need to know all the URLs the user passes through to reach the final destination.
Is there a way to destroy a tab?
$page = $browser->createPage($uri); // I need to destroy this tab.
Is that possible?
Thanks 👍.
Running a scan across the full site eventually fires off Killed: 9
I tried passing a logger via Log::class, but I'm getting nothing in the log. Is this the correct way to set it up, or is it implemented differently?
(I'm using the built-in Log class from Laravel, which implements the PSR LoggerInterface.)
$browser = $browserFactory->createBrowser([
'headless' => true, // keep headless mode enabled
'debugLogger' => Log::class // will enable verbose mode
]);
Hi,
For the past few days I have had a problem with a form that I submit. I now get the error:
Error Executing Test. The operation has timed out
The strange thing is that it was working before!
$evaluation = $page->evaluate(
'(() => {
document.querySelector("#username").value = "'.USERNAME.'";
document.querySelector("#password").value = "'.PASSWORD.'";
document.querySelector("#submitButton").click();
})()'
);
// wait for the page to be reloaded
$evaluation->waitForPageReload();
Any tips or ideas?
When setting the browser userAgent, the string is not quoted, thus spaces break the command.
In file src/Browser/BrowserProcess.php at line 322
$args[] = '--user-agent=' . $options['userAgent'];
Current workaround: wrapping the userAgent option value in double quotes, or using $page->setUserAgent.
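A possible fix, sketched here outside the library, is to shell-quote the whole flag with PHP's built-in escapeshellarg() before it is added to the argument list. This is only an illustration of the idea and not the actual patch; the $options array below mimics the one referenced in BrowserProcess.php.

```php
<?php
// Options as they would be passed to createBrowser(); the user agent
// contains spaces and parentheses, which the shell would otherwise interpret.
$options = [
    'userAgent' => 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko',
];

// Quote the whole flag so that it reaches Chrome as a single argument.
$args = [];
$args[] = escapeshellarg('--user-agent=' . $options['userAgent']);

echo implode(' ', $args), PHP_EOL;
```

Note that escapeshellarg() picks the quoting style for the current platform, so the exact output differs between Linux and Windows.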
Hi
Would it be possible to add support for keyboard events, or the ability to trigger events that are "trusted"? Keyboard events (or other kinds of events) triggered by JavaScript can be detected as "not trusted" and rejected by the website.
I need to add a delay before the screenshot is taken.
I want the page to load, and then have headless-chromium-php take a screenshot after 15 seconds (I have an animation running and I want to capture its end frame).
How do I do this?
The package is published as chrome/chrome on Packagist. Please use a proper vendor name next time instead of the project name itself.
$browser = $browserFactory->createBrowser(['userAgent' => 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko']);
Will return:
Fatal error: Uncaught RuntimeException: Chrome process stopped before startup completed. Additional info: sh: 1: Syntax error: "(" unexpected in /opt/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php:355
Stack trace:
#0 /opt/vendor/chrome-php/chrome/src/Utils.php(51): HeadlessChromium\Browser\BrowserProcess->HeadlessChromium\Browser\{closure}(Object(Symfony\Component\Process\Process))
#1 /opt/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(390): HeadlessChromium\Utils::tryWithTimeout(30000000, Object(Generator))
#2 /opt/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php(125): HeadlessChromium\Browser\BrowserProcess->waitForStartup(Object(Symfony\Component\Process\Process), 30000000)
#3 /opt/vendor/chrome-php/chrome/src/BrowserFactory.php(80): HeadlessChromium\Browser\BrowserProcess->start('chromium-browse...', Array)
#4 /var/www/html/domains/x/public_html/x/index.php(57): HeadlessChromium\BrowserFactory->createBrowser(Array)
#5 {main}
thrown in /opt/vendor/chrome-php/chrome/src/Browser/BrowserProcess.php on line 355
I've been trying to use this library. I've defined "chromium-browser" as the headless browser in the factory, like so:
$browserFactory = new BrowserFactory('chromium-browser');
// starts headless chrome
$browser = $browserFactory->createBrowser();
try {
// creates a new page and navigate to an url
$page = $browser->createPage($payload->uri);
$page->navigate($payload->uri);
} catch(NoResponseAvailable $exception) {
dd($exception);
}
$browser->close();
However, I'm getting OperationTimedOut in waitForStartup, it seems like the regex does match and does get back a DevTools ws URL. A process does get started (and not removed interestingly) - I'm unsure how to debug this further. Any advice would be greatly appreciated.
Thanks!
I read that you have this on your TODO list.
I am currently using Chrome's command line to generate PDFs, but it is very lacking in options compared to the Puppeteer version. Since I don't have Node.js in my current environment, I am stuck with it :(
My use case is that I generate the HTML, feed it to Chrome, and grab the resulting file.
Since this library supports sending direct commands to Chrome, has anyone successfully generated and saved PDFs?
Thank you for your time