NAME:

Snoopy - the PHP net client v1.2.4

SYNOPSIS:

include "Snoopy.class.php";
$snoopy = new Snoopy;

$snoopy->fetchtext("http://www.php.net/");
print $snoopy->results;

$snoopy->fetchlinks("http://www.phpbuilder.com/");
print $snoopy->results;

$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

$submit_vars["q"] = "amiga";
$submit_vars["submit"] = "Search!";
$submit_vars["searchhost"] = "Altavista";
	
$snoopy->submit($submit_url,$submit_vars);
print $snoopy->results;

$snoopy->maxframes=5;
$snoopy->fetch("http://www.ispi.net/");
echo "<PRE>\n";
echo htmlentities($snoopy->results[0]); 
echo htmlentities($snoopy->results[1]); 
echo htmlentities($snoopy->results[2]); 
echo "</PRE>\n";

$snoopy->fetchform("http://www.altavista.com");
print $snoopy->results;

DESCRIPTION:

What is Snoopy?

Snoopy is a PHP class that simulates a web browser. It automates tasks such as retrieving web page content and posting forms.

Some of Snoopy's features:

  • easily fetch the contents of a web page
  • easily fetch the text from a web page (strip html tags)
  • easily fetch the links from a web page
  • supports proxy hosts
  • supports basic user/pass authentication
  • supports setting user_agent, referer, cookies and header content
  • supports browser redirects, and controlled depth of redirects
  • expands fetched links to fully qualified URLs (default)
  • easily submit form data and retrieve the results
  • supports following html frames (added v0.92)
  • supports passing cookies on redirects (added v0.92)

REQUIREMENTS:

Snoopy requires PHP with PCRE (Perl Compatible Regular Expressions) support, which is available in PHP 3.0.9 and later. Read timeout support requires PHP 4 Beta 4 or later. Snoopy was developed and tested with PHP 3.0.12.
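The requirements above can be checked from a script before Snoopy is loaded. The following is a minimal sketch: snoopy_env_report() is a hypothetical helper, not part of Snoopy, and using stream_set_timeout() as the indicator for read timeout support is an assumption.

```php
<?php
// Hypothetical helper, not part of Snoopy: reports whether the running PHP
// build offers what the requirements above ask for.
function snoopy_env_report()
{
    return array(
        // PCRE provides preg_match() and the other preg_* functions.
        'pcre'         => function_exists('preg_match'),
        // Assumption: stream_set_timeout() stands in for read timeout support.
        'read_timeout' => function_exists('stream_set_timeout'),
        'php_version'  => PHP_VERSION,
    );
}

foreach (snoopy_env_report() as $name => $value) {
    // Booleans are printed as yes/no, the version string as-is.
    echo $name . ': ' . (is_bool($value) ? ($value ? 'yes' : 'no') : $value) . "\n";
}
```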

CLASS METHODS:

fetch($URI)

This is the method used for fetching the contents of a web page. $URI is the fully qualified URL of the page to fetch. The results of the fetch are stored in $this->results. If you are fetching frames, then $this->results contains each frame fetched in an array.

fetchtext($URI)

This behaves exactly like fetch() except that it only returns the text from the page, stripping out html tags and other irrelevant data.
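To give a rough idea of what "text only" means here, the sketch below reduces an HTML string to plain text. html_to_text() is a hypothetical helper written for this illustration; Snoopy's own stripping rules may differ.

```php
<?php
// Hypothetical helper for illustration; Snoopy's own stripping logic may differ.
function html_to_text($html)
{
    // Drop script and style blocks entirely, including their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Replace every remaining tag with a space so adjacent words stay apart.
    $text = preg_replace('/<[^>]+>/', ' ', $html);
    // Decode entities such as &amp; back into characters.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo html_to_text('<h1>News</h1><script>var x = 1;</script><p>Tom &amp; Jerry</p>') . "\n";
```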

fetchform($URI)

This behaves exactly like fetch() except that it only returns the form elements from the page, stripping out all other html tags and irrelevant data.

fetchlinks($URI)

This behaves exactly like fetch() except that it only returns the links from the page. By default, relative links are converted to their fully qualified URL form.
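The conversion of relative links can be pictured with the sketch below. expand_link() is a hypothetical helper that covers only the common cases (absolute, protocol-relative, root-relative and same-directory links). It does not resolve "../" segments or fragment-only links, and it is not Snoopy's own code.

```php
<?php
// Hypothetical helper for illustration only; not a full RFC 3986 resolver
// and not Snoopy's own code.
function expand_link($link, $base)
{
    // Already fully qualified (has a scheme such as http:): leave untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return $link;
    }
    $parts = parse_url($base);
    // Protocol-relative link: borrow only the scheme from the base.
    if (substr($link, 0, 2) === '//') {
        return $parts['scheme'] . ':' . $link;
    }
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Root-relative link: attach it directly to scheme and host.
    if (substr($link, 0, 1) === '/') {
        return $root . $link;
    }
    // Otherwise resolve against the directory part of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

$base = 'http://www.example.com/docs/index.html';
foreach (array('faq.html', '/about', 'http://www.php.net/') as $link) {
    echo expand_link($link, $base) . "\n";
}
```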

submit($URI,$formvars)

This submits a form to the specified $URI. $formvars is an array of the form variables to pass.
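As an illustration of how an array of form variables maps onto a request body, the sketch below encodes the variables from the SYNOPSIS with PHP's http_build_query(). It assumes a POST with the application/x-www-form-urlencoded content type; how Snoopy builds the body internally is not described here.

```php
<?php
// The same variables as in the SYNOPSIS.
$submit_vars = array(
    'q'          => 'amiga',
    'submit'     => 'Search!',
    'searchhost' => 'Altavista',
);

// http_build_query() URL-encodes each key and value and joins them with "&".
$body = http_build_query($submit_vars, '', '&');

// Assumption: the form is posted as application/x-www-form-urlencoded.
echo "Content-Type: application/x-www-form-urlencoded\n";
echo "Content-Length: " . strlen($body) . "\n\n";
echo $body . "\n";
```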

submittext($URI,$formvars)

This behaves exactly like submit() except that it only returns the text from the page, stripping out html tags and other irrelevant data.

submitlinks($URI,$formvars)

This behaves exactly like submit() except that it only returns the links from the page. By default, relative links are converted to their fully qualified URL form.

CLASS VARIABLES: (default value in parentheses)

	$host			the host to connect to
	$port			the port to connect to
	$proxy_host		the proxy host to use, if any
	$proxy_port		the proxy port to use, if any
	$agent			the user agent to masquerade as (Snoopy v0.1)
	$referer		referer information to pass, if any
	$cookies		cookies to pass if any
	$rawheaders		other header info to pass, if any
	$maxredirs		maximum redirects to allow. 0=none allowed. (5)
	$offsiteok		whether or not to allow redirects off-site. (true)
	$expandlinks	whether or not to expand links to fully qualified URLs (true)
	$user			authentication username, if any
	$pass			authentication password, if any
	$accept			http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
	$error			where errors are sent, if any
	$response_code	response code returned from server
	$headers		headers returned from server
	$maxlength		max return data length
	$read_timeout	timeout on read operations (requires PHP 4 Beta 4+)
					set to 0 to disable timeouts
	$timed_out		true if a read operation timed out (requires PHP 4 Beta 4+)
	$maxframes		number of frames we will follow
	$status			http status of fetch
	$temp_dir		temp directory that the webserver can write to. (/tmp)
	$curl_path		system path to cURL binary, set to false if none
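To show how several of these variables relate to what goes over the wire, the sketch below turns agent, referer, cookie and authentication values into HTTP request header lines. build_request_headers() is a hypothetical helper, URL-encoding the cookie values is an assumption, and none of this is Snoopy's own code.

```php
<?php
// Hypothetical helper, for illustration only; not Snoopy's own code.
function build_request_headers($agent, $referer, $cookies, $user, $pass)
{
    $lines = array('User-Agent: ' . $agent);
    if ($referer !== '') {
        $lines[] = 'Referer: ' . $referer;
    }
    if (count($cookies) > 0) {
        $pairs = array();
        foreach ($cookies as $name => $value) {
            // Assumption: cookie values are URL-encoded before sending.
            $pairs[] = $name . '=' . urlencode($value);
        }
        // All cookies travel in one header, separated by "; ".
        $lines[] = 'Cookie: ' . implode('; ', $pairs);
    }
    if ($user !== '') {
        // HTTP basic authentication: base64 of "user:pass".
        $lines[] = 'Authorization: Basic ' . base64_encode($user . ':' . $pass);
    }
    return $lines;
}

$headers = build_request_headers(
    'Snoopy v0.1',
    'http://www.example.com/',
    array('SessionID' => '238472834723489', 'favoriteColor' => 'RED'),
    'joe',
    'bloe'
);
echo implode("\r\n", $headers) . "\r\n";
```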

EXAMPLES:

Fetch a web page and display the return headers and the contents of the page (html-escaped):

	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$snoopy->user = "joe";
	$snoopy->pass = "bloe";
	
	if($snoopy->fetch("http://www.slashdot.org/"))
	{
		echo "response code: ".$snoopy->response_code."<br>\n";
		while(list($key,$val) = each($snoopy->headers))
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";

Submit a form and print out the result headers and html-escaped page:

	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
	
	$submit_vars["q"] = "amiga";
	$submit_vars["submit"] = "Search!";
	$submit_vars["searchhost"] = "Altavista";

		
	if($snoopy->submit($submit_url,$submit_vars))
	{
		while(list($key,$val) = each($snoopy->headers))
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";

Showing functionality of all the variables:

	include "Snoopy.class.php";
	$snoopy = new Snoopy;

	$snoopy->proxy_host = "my.proxy.host";
	$snoopy->proxy_port = "8080";
	
	$snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
	$snoopy->referer = "http://www.microsnot.com/";
	
	$snoopy->cookies["SessionID"] = "238472834723489";
	$snoopy->cookies["favoriteColor"] = "RED";
	
	$snoopy->rawheaders["Pragma"] = "no-cache";
	
	$snoopy->maxredirs = 2;
	$snoopy->offsiteok = false;
	$snoopy->expandlinks = false;
	
	$snoopy->user = "joe";
	$snoopy->pass = "bloe";
	
	if($snoopy->fetchtext("http://www.phpbuilder.com"))
	{
		while(list($key,$val) = each($snoopy->headers))
			echo $key.": ".$val."<br>\n";
		echo "<p>\n";
		
		echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";

Fetch framed content and display the results:

	include "Snoopy.class.php";
	$snoopy = new Snoopy;
	
	$snoopy->maxframes = 5;
	
	if($snoopy->fetch("http://www.ispi.net/"))
	{
		echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
		echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
		echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
	}
	else
		echo "error fetching document: ".$snoopy->error."\n";
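The example above assumes at least three documents in the results array. Because the results hold a single string for an ordinary page and an array when frames are followed, a small wrapper can handle both shapes without hard-coded indexes. render_results() is a hypothetical helper, and $results stands in for $snoopy->results.

```php
<?php
// Hypothetical helper; $results stands in for $snoopy->results.
function render_results($results)
{
    // Treat a single page like a one-element list so both cases share a loop.
    $pages = is_array($results) ? $results : array($results);
    $out = '';
    foreach ($pages as $i => $page) {
        $out .= "<!-- document $i -->\n";
        $out .= '<PRE>' . htmlspecialchars($page) . "</PRE>\n";
    }
    return $out;
}

// A single page, then a frameset with one frame.
echo render_results('<b>single page</b>');
echo render_results(array('<frameset>', '<p>frame one</p>'));
```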

COPYRIGHT:

Copyright(c) 1999,2000 ispi. All rights reserved. This software is released under the GNU General Public License. Please read the disclaimer at the top of the Snoopy.class.php file.

THANKS:

Special Thanks to:
