Comments (5)
Hello, I think I have found a possible cause of the problem.
First, some context:
I was trying to compare two web pages served by NGINX, with the web servers and SiteDiff running in Docker containers.
Both sites had the default content, except for one small change made so that I could review the differences between the two.
SiteDiff detected that there were changes, but it only reported a binary difference, and like everyone else I want to see the actual differences.
In my case, neither NGINX server (container) specified a charset, and defining it in the index.html file did not work.
The solution
The solution in my case was to define the charset in the NGINX configuration file, /etc/nginx/nginx.conf, inside the http block:
. . .
http {
    charset utf-8;
    . . .
}
After the change, you only have to reload NGINX:
$ /etc/init.d/nginx reload
So be careful with the configuration of your web service.
from sitediff.
It's hard for me to verify, but I'd start by looking at the headers returned by your HTTP servers. It might be that you can resolve this by setting an appropriate header.
I haven't looked at this issue personally, but let's keep this open for now pending further testing.
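One way to do the header check suggested above is to see whether the Content-Type header actually declares a charset. The following is a minimal Python sketch using only the standard library; the function names `charset_from_content_type` and `fetch_charset` are hypothetical helpers made up for this illustration and are not part of SiteDiff.

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the charset declared in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset parameter lowercased,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


def fetch_charset(url: str) -> Optional[str]:
    """Fetch a URL and return the charset its server declares, if any."""
    # Assumption: the URL is reachable from where this script runs.
    with urlopen(url) as response:
        return response.headers.get_content_charset()
```

If `fetch_charset` returns `None` for both of your servers, the responses do not declare an encoding in their headers, which matches the situation described in the first comment.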
I've seen this one before. I think this happens when the library that reads a URL as HTML cannot determine the encoding for the page. That's when the Result class (if I remember correctly) treats the content as binary and compares hashes instead of actual content.
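To illustrate the behavior described above, here is a rough Python sketch of that kind of fallback. It is not SiteDiff's actual code (SiteDiff is written in Ruby); `compare_bodies` is a hypothetical function, and the use of SHA-256 and a unified diff are assumptions made only for the example.

```python
import difflib
import hashlib
from typing import List, Optional, Tuple, Union


def compare_bodies(
    before: bytes, after: bytes, charset: Optional[str]
) -> Tuple[str, Union[bool, List[str]]]:
    """Compare two response bodies, falling back to hashes without a charset."""
    if charset is None:
        # Unknown encoding: treat the content as binary and only report
        # whether the two hashes match, with no line-level detail.
        same = hashlib.sha256(before).digest() == hashlib.sha256(after).digest()
        return ("binary", same)
    # Known encoding: decode both bodies and produce a readable text diff.
    before_lines = before.decode(charset).splitlines()
    after_lines = after.decode(charset).splitlines()
    diff = list(difflib.unified_diff(before_lines, after_lines, lineterm=""))
    return ("text", diff)
```

With a charset available the caller gets the changed lines; without one, it only learns that the two bodies differ, which is the "binary" result reported in this issue.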
I get this with the default/initial (example?) HTML files, but not with my own site's HTML files.
Ubuntu 18.04, installed via Docker (1.0.0 and latest).
If SiteDiff can't determine the character encoding of the files, it will revert to a "Binary" encoding. This is something we wish to improve.