poorbillionaire / sitereview
Bluecoat SiteReview Checker (CLI)
When I try to run the code, it returns the following error no matter what input I give it:
File "C:\Users\USER\Desktop\sitereview-master\sitereview.py", line 49
print "\n{0}\n{1}\n{0}\n".format(border, "Blue Coat Site Review")
^
SyntaxError: invalid syntax
Am I doing something wrong, or is there an error in the code?
PS: I have Python 3.6.2
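For context, the `SyntaxError` above is what Python 3 reports when it meets the Python 2 `print` statement, so the script as posted appears to target Python 2. A minimal sketch of the same line written as a Python 3 function call follows; the `border` value is an assumption made for illustration, since the script defines its own.

```python
# Python 3: print is a function, so its argument goes in parentheses.
border = "=" * 30  # assumed value for illustration only

banner = "\n{0}\n{1}\n{0}\n".format(border, "Blue Coat Site Review")
print(banner)
```

Other Python 2 constructs in the script may need similar changes; this only covers the line named in the error.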
Is it possible to feed this a list of domains from a text file, maybe one per line, and have it write the results to a file?
E.g.:
URL: domain1.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets
URL: http://domain2.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets
URL: http://domain3.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets
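One way the requested batch mode could look is sketched below. It reads one URL per line and writes one result line per URL in the format shown above. The `lookup` function and the `ratedate`/`category` keys are hypothetical stand-ins, not the script's actual API; a real version would call whatever query function the script exposes.

```python
import io

def lookup(url):
    """Hypothetical stand-in for the script's real Site Review query."""
    return {"ratedate": "> 7 days", "category": "Malicious Sources/Malnets"}

def format_result(url, result):
    """Build one output line in the format requested in the issue."""
    return "URL: {0} | Last Time Rated/Reviewed: {1} | Category: {2}".format(
        url, result["ratedate"], result["category"])

def process(infile, outfile, lookup_fn=lookup):
    """Read one URL per line from infile; write one result line per URL."""
    count = 0
    for line in infile:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        outfile.write(format_result(url, lookup_fn(url)) + "\n")
        count += 1
    return count

# Demo with in-memory files; real use would pass open() file handles.
src = io.StringIO("domain1.com\n\nhttp://domain2.com\n")
dst = io.StringIO()
processed = process(src, dst)
print(dst.getvalue(), end="")
```

Passing `lookup_fn` as a parameter keeps the file handling separate from the network call, so the loop can be tested without contacting the service.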
Hi there,
Since yesterday, the script no longer works and generates the error below.
It looks like the JSON structure might have changed:
Traceback (most recent call last):
File "sitereview.py", line 61, in <module>
main(args.url)
File "sitereview.py", line 43, in main
s.check_response(response)
File "sitereview.py", line 28, in check_response
root = ET.fromstring(self.req.content)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1311, in XML
parser.feed(text)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
self._raiseerror(v)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 10, column 10
This site now uses anti-spider measures. I think the only way to crawl it is with Selenium and a web driver, and even then there is still a per-IP limit on requests within a given period.
When running the script, I get this error:
Traceback (most recent call last):
File "sitereview.py", line 65, in <module>
main(args.url)
File "sitereview.py", line 47, in main
response = s.sitereview(url)
File "sitereview.py", line 28, in sitereview
return json.loads(self.req.content.decode("UTF-8"))
File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
After altering the script to print the raw response, it looks like something is wrong with the endpoint:
<html><head><title>Apache Tomcat/7.0.52 (Ubuntu) - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 403 - Security violation.</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>Security violation.</u></p><p><b>description</b> <u>Access to the specified resource has been forbidden.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.52 (Ubuntu)</h3></body></html>
Am I using the script wrong, or is this an error with Site Review?
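The raw response above is an HTML 403 page, which is why `json.loads` fails with "No JSON object could be decoded". A sketch of a guard that checks the status code before decoding and reports what the server actually sent is shown here. `decode_json_response` is a hypothetical helper, not part of the script; it takes the status code and body bytes that a `requests` response would provide.

```python
import json

def decode_json_response(status_code, content):
    """Return the parsed JSON body, or raise a descriptive RuntimeError."""
    text = content.decode("UTF-8", errors="replace")
    if status_code != 200:
        # Report the HTTP error instead of trying to parse an error page.
        raise RuntimeError("HTTP {0}: {1}".format(status_code, text[:80]))
    try:
        return json.loads(text)
    except ValueError:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError.
        raise RuntimeError("Response is not JSON: {0}".format(text[:80]))
```

With the script's response object this would be called as `decode_json_response(self.req.status_code, self.req.content)`, assuming the response is stored on `self.req` as the traceback suggests.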
Add the ability to submit a category correction by supplying a URL and the updated category/subcategory
https://sitereview.bluecoat.com/sitereview.jsp#/?search=http%3A%2F%2Fwww.example.com
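The `search` value in the link above is the target URL in percent-encoded form. A small sketch that builds the same link with the standard library follows. It only reproduces the search link shown here; how a category correction is actually submitted is not covered by this thread.

```python
from urllib.parse import quote

base = "https://sitereview.bluecoat.com/sitereview.jsp#/?search="
target = "http://www.example.com"

# safe="" makes quote() encode ":" and "/" as well.
link = base + quote(target, safe="")
print(link)
```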
Hey PoorBillionaire,
I was just playing with this and noticed that the if statement in the check_response function calls req.status_code. It needs to be self.req.status_code.
Just letting you know,
Thanks for the script!
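The fix described above can be sketched as follows. The class body is a hypothetical reconstruction based only on the names in this report (`check_response`, `self.req`, `status_code`), and `FakeResponse` is a stand-in so the example runs without a network call.

```python
class FakeResponse:
    """Stand-in for a requests.Response; only status_code is modeled."""
    def __init__(self, status_code):
        self.status_code = status_code

class SiteReview:
    def __init__(self, req):
        self.req = req  # the HTTP response is stored on the instance

    def check_response(self, response):
        # A bare `req.status_code` raises NameError here, because `req`
        # is an instance attribute and not a local or global name.
        if self.req.status_code != 200:
            raise RuntimeError(
                "HTTP {0} error".format(self.req.status_code))
        return response
```

For example, `SiteReview(FakeResponse(200)).check_response({"ok": True})` returns the dictionary unchanged, while a 403 response raises `RuntimeError`.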
Update readme to reflect Symantec ownership and to be more accurate with the new API
I get this response, which says that my usage violates the terms of service:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Site Review Acceptable Use Information</title>
<script type="text/javascript" src="/analytics.js"></script>
<link type="text/css" rel="stylesheet" media="all" href="/css/legacy.css" />
<noscript>
<meta HTTP-EQUIV="REFRESH" content="0; url=noJavascript.jsp">
</noscript>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<style id="antiClickjack">body{display:none !important;}</style>
<style>
section.site-review-content p {
margin-bottom: 1em;
color: #454545;
}
section.site-review-content h1 {
margin-bottom: 1em;
}
section.site-review-content hr {
border-color: black;
}
section.important-note {
margin-left: 4em;
margin-right: 4em;
margin-bottom: 2em;
background-color: white;
border: solid 2px #fdbb30;
padding: 0 1em;
}
section.important-note h3 {
color: black;
font-weight: bold;
margin-top: 1em;
}
</style>
</head>
<body class="sym-theme">
<div class="header-background" style="padding-bottom: 1em">
<div class="container header_container">
<div class="header_master">
<div class="app-logo"></div>
</div>
</div>
</div>
<section class="site-review-content b-content">
<div class="center_section container">
<h1>Site Review Acceptable Use Information</h1>
<p>
It appears you are using Site Review in an automated fashion, which violates our <a href="https://www.symantec.com/about/legal/blue-coat-legal-archive/website-terms-of-use">Terms
of Use</a> and can result in loss of access to the service.
</p>
<p>
Please contact your Symantec representative for other options.
</p>
</div>
</section>
<div class="footer-background">
<div class="footer_container container">
<div class="footer_master">
<div class="copyright">
Copyright © 1995-<span>2019</span> Symantec Corporation
</div>
</div>
</div>
</div>
<script type="text/javascript">
if(self == top) {
var antiClickjack = document.getElementById("antiClickjack");
antiClickjack.parentNode.removeChild(antiClickjack);
} else {
top.location = self.location;
}
</script>
</body>
</html>
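Because this notice arrives as an ordinary HTML page, a script that expects JSON will fail with a decode error rather than a clear message. A sketch of a check for this page, keyed on the `<title>` text shown above, is given here. `is_acceptable_use_page` is a hypothetical helper, and the title wording is taken from this one response, so it may change.

```python
import re

def is_acceptable_use_page(html):
    """Return True if the HTML title matches the acceptable-use notice."""
    match = re.search(r"<title>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    return bool(match) and "Acceptable Use" in match.group(1)
```

A caller could stop and surface the notice to the user when this returns `True`, instead of retrying.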
Quick question: I need to query the API for about 100,000 domains. Do you know how long a delay is recommended between consecutive queries? I tried 5-10 seconds, but that is probably not sufficient, as I seem to have been blocked.
Hi,
I am getting a connection timeout while using this script. Can you help?
$ python sitereview.py http://brins.biz/
Traceback (most recent call last):
File "sitereview.py", line 60, in <module>
main(args.url)
File "sitereview.py", line 42, in main
response = s.sitereview(url)
File "sitereview.py", line 28, in sitereview
return json.loads(self.req.content.decode("UTF-8"))
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
sitereview.py https://www.google.com
Traceback (most recent call last):
File "/usr/local/bin/sitereview.py", line 4, in <module>
__import__('pkg_resources').run_script('sitereview==2.0', 'sitereview.py')
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 739, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1501, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 61, in <module>
File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 43, in main
File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 26, in check_response
NameError: global name 'req' is not defined