
Comments (8)

s0md3v avatar s0md3v commented on May 18, 2024

Hi @noraj ,

Thanks for reporting the issue, can you please check if this PR fixes it?

Photon should now store the redirecting URLs in redirects.txt in the following format:

https://example.com/redirect_from==>https://example.com/redirect_to

from photon.

s0md3v avatar s0md3v commented on May 18, 2024

@noraj ???


noraj avatar noraj commented on May 18, 2024

@s0md3v Yes, I'm answering. I'm just writing a long post and I need to check what I say before asserting it.

I cloned a fresh copy, ran git checkout redirect, then ran python photon.py --url http://x.X.x.x/ --level 1 --only-url, but I get exactly the same result as before, with no https://example.com/redirect_from==>https://example.com/redirect_to entries.

I think this is because when http://x.X.x.x/ is hit the response code is 200, and with --level 1 the other links are scraped but not requested, so we never enter the if code[0] == '3': branch.

Photon/photon.py

Lines 219 to 222 in 0a5de25

    if code != '404':
        if code[0] == '3':
            redirects.add(url + ':' + response.url)
        return response.text

So we are forced to use python photon.py --url http://x.X.x.x/ --level 2 --only-url, but then instead of the 103 internal URLs from the root page I get more than 700 URLs from all the sub-pages, and the scan takes much longer (103 remote pages requested instead of just one).

That is why I talked about a redirect switch: an option that makes Photon request the collected internal URLs to see whether each one answers with a page or with a redirection.
So what I mean is: keep the current behavior and add a new option, --whatevername, that treats the scraped internal URLs as potential redirections and requests them, so that the redirection target is stored in addition to the raw internal URL.

Also, using level 2, I got about 30 URLs in failed.txt, but all of them are valid. For example:
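
A minimal sketch of what such an option could do, not Photon's actual code: each collected URL is requested once without following redirects, and only the redirect target is recorded. The names verify_urls, fetch and fetch_with_requests are hypothetical, and the fetcher is passed in so the logic can be tried without network access.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical fetcher type: takes a URL and returns
# (status code, value of the Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def verify_urls(urls: Iterable[str], fetch: Fetcher) -> Dict[str, str]:
    """Request each collected URL once, without crawling its content.

    Returns a mapping of redirecting URL -> redirect target.
    """
    redirects = {}
    for url in urls:
        status, location = fetch(url)
        # Any 3xx answer with a Location header counts as a redirection.
        if 300 <= status < 400 and location:
            redirects[url] = location
    return redirects


def fetch_with_requests(url: str) -> Tuple[int, Optional[str]]:
    """One possible fetcher, assuming the requests library is installed."""
    import requests  # imported lazily so the sketch loads without it

    # allow_redirects=False keeps the 3xx answer instead of following it.
    response = requests.head(url, allow_redirects=False, timeout=10)
    return response.status_code, response.headers.get('Location')
```

Calling verify_urls(internal_urls, fetch_with_requests) would then cost one request per collected URL, without scraping the sub-pages as --level 2 does. Some servers answer HEAD differently from GET, so a real implementation might need GET instead.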

$ curl -vvv http://x.x.x.x/\?s\=_____ba8da76e357a______
*   Trying x.x.x.x...
* TCP_NODELAY set
* Connected to x.x.x.x (x.x.x.x) port 80 (#0)
> GET /?s=_____ba8da76e357a______ HTTP/1.1
> Host: x.x.x.x
> User-Agent: curl/7.61.1
> Accept: */*
> 
< HTTP/1.1 303 See Other
< Date: Tue, 23 Oct 2018 18:47:37 GMT
< Server: localhost
< Content-Type: text/html
< Location: https://googleprojectzero.blogspot.com/xxxxxxxxxxx.html
< Content-Length: 0
< 
* Connection #0 to host x.x.x.x left intact

So I don't know why they are marked as failed.

But even with level 2, no redirection values are stored. I even checked with grep -ri '==>' ./.


noraj avatar noraj commented on May 18, 2024

PS: maybe check that the Python requests library handles 303 redirects.


s0md3v avatar s0md3v commented on May 18, 2024

Hi @noraj ,

This is to let you know that the issue has been acknowledged and I am working on it.
I will add a new switch, --verify, which will solve the redirection and 404 issues by verifying all the URLs added on each level before crawling further.

Thanks for the verbose explanation of the issue, it really helped.

PS: Would it be possible for you to provide the website you are testing against?
You can DM me on Twitter.


0xInfection avatar 0xInfection commented on May 18, 2024

I guess adding the parameter allow_redirects=False at L239 and doing a relevant check will fix this.


s0md3v avatar s0md3v commented on May 18, 2024

@0xInfection We want to follow redirects.
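
For reference, following redirects and recording them are compatible in requests: when a GET follows redirects, response.status_code is the status of the final page, and the intermediate 3xx responses are kept in response.history. A minimal sketch, where redirect_lines is a hypothetical helper and the SimpleNamespace objects only stand in for real requests.Response objects:

```python
from types import SimpleNamespace


def redirect_lines(response):
    """Build 'from==>to' lines for every hop that was followed.

    Works on any object with .url and .history, which is the shape of
    requests.Response after requests.get(url) has followed redirects.
    """
    hops = [r.url for r in response.history] + [response.url]
    return [src + '==>' + dst for src, dst in zip(hops, hops[1:])]


# Stand-in for a response that was redirected once.
redirected = SimpleNamespace(
    url='https://example.com/redirect_to',
    history=[SimpleNamespace(url='https://example.com/redirect_from')],
)
# Stand-in for a response that was not redirected at all.
direct = SimpleNamespace(url='https://example.com/', history=[])

print(redirect_lines(redirected))
print(redirect_lines(direct))
```

With this shape, a check on the first digit of the final status code would not see the 3xx answers, while an empty history means no redirect happened.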


s0md3v avatar s0md3v commented on May 18, 2024

Don't worry guys, I will fix it once I have free time.
