
urlhunter's Introduction

				o  	  Utku Sen's
				 \_/\o   
				( Oo)                    \|/
				(_=-)  .===O-  ~~U~R~L~~ -O-
				/   \_/U'        hunter  /|\
				||  |_/
				\\  |    utkusen.com
				{K ||	twitter.com/utkusen

urlhunter is a recon tool that searches URLs exposed via shortener services such as bit.ly and goo.gl. The project is written in Go.

How?

A group named URLTeam (kudos to them) brute-forces URL shortener services and publishes the matched results on a daily basis. urlhunter downloads their collections and lets you analyze them.

Installation

From Binary

You can download the pre-built binaries from the releases page and run them. For example:

tar xzvf urlhunter_0.1.0_Linux_amd64.tar.gz

./urlhunter --help

From Source

  1. Install Go on your system

  2. Run: go install github.com/utkusen/urlhunter@latest

Note for Windows users: urlhunter uses XZ Utils, which is pre-installed on Linux and macOS systems. On Windows, you need to download it from https://tukaani.org/xz/

Usage

urlhunter requires 3 parameters to run: -keywords, -date and -o.

For example: urlhunter -keywords keywords.txt -date 2020-11-20 -o out.txt

--keywords

Specify the text file that contains the keywords to search for in URLs, one entry per line. There are three ways to write an entry:

Single Keyword: urlhunter will search the given keyword as a substring. For example:

The keyword acme.com will match both https://acme.com/blabla and https://another.com/?referrer=acme.com

Multiple Keywords: urlhunter will search the given keywords with AND logic, which means a URL must include all of the provided keywords. Keywords must be separated with the , character. For example:

acme.com,admin will match https://acme.com/secret/adminpanel but won't match https://acme.com/somethingelse

Regex Values: urlhunter will search for the given regular expression. In the keyword file, a line that contains a regular expression must start with the string regex. The format is: regex REGEXFORMULA. For example:

regex 1\d{10} will match https://example.com/index.php?id=12938454312 but won't match https://example.com/index.php?id=abc223
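The three keyword forms above can be sketched as a single matching function. This is a minimal illustration under stated assumptions, not urlhunter's actual code: the name matchKeywordLine is hypothetical, and Go's regexp package is assumed as the regex engine.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchKeywordLine reports whether url satisfies one line of a keywords file.
// Hypothetical helper for illustration; not taken from urlhunter's source.
func matchKeywordLine(line, url string) (bool, error) {
	line = strings.TrimSpace(line)
	if line == "" {
		return false, nil
	}
	// Regex form: "regex REGEXFORMULA".
	if strings.HasPrefix(line, "regex ") {
		re, err := regexp.Compile(strings.TrimPrefix(line, "regex "))
		if err != nil {
			return false, err
		}
		return re.MatchString(url), nil
	}
	// Single or multiple keywords: every comma-separated part
	// must appear in the URL as a substring (AND logic).
	for _, kw := range strings.Split(line, ",") {
		if !strings.Contains(url, kw) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	urls := []string{
		"https://acme.com/secret/adminpanel",
		"https://acme.com/somethingelse",
		"https://example.com/index.php?id=12938454312",
	}
	for _, line := range []string{"acme.com,admin", `regex 1\d{10}`} {
		for _, u := range urls {
			ok, err := matchKeywordLine(line, u)
			if err != nil {
				fmt.Println("bad keyword line:", err)
				continue
			}
			fmt.Printf("%q on %s: %v\n", line, u, ok)
		}
	}
}
```

A single keyword is simply the one-element case of the comma-separated form, so it needs no separate branch.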

--date

urlhunter downloads the archive files of the given date(s). You have three different ways to specify the date:

Latest: urlhunter will download the latest archive. -date latest

Single Date: urlhunter will download the archive of the given date. Date format is YYYY-MM-DD.

For example: -date 2020-11-20

Date Range: urlhunter will download all the archives between given start and end dates.

For example: -date 2020-11-10:2020-11-20
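The three date forms can be illustrated with a small parser. This is a sketch under assumptions: expandDateSpec is a hypothetical name, the range is treated as inclusive on both ends, and "latest" is passed through unchanged because resolving it requires asking the archive which collection is newest.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// dateLayout is Go's reference-time spelling of YYYY-MM-DD.
const dateLayout = "2006-01-02"

// expandDateSpec turns a -date value into the list of days to search.
// Hypothetical helper for illustration; not taken from urlhunter's source.
func expandDateSpec(spec string) ([]string, error) {
	if spec == "latest" {
		return []string{"latest"}, nil
	}
	// A range is written START:END; a single date has no colon.
	startStr, endStr, isRange := strings.Cut(spec, ":")
	start, err := time.Parse(dateLayout, startStr)
	if err != nil {
		return nil, fmt.Errorf("invalid date %q: %w", startStr, err)
	}
	end := start
	if isRange {
		end, err = time.Parse(dateLayout, endStr)
		if err != nil {
			return nil, fmt.Errorf("invalid date %q: %w", endStr, err)
		}
	}
	if end.Before(start) {
		return nil, fmt.Errorf("end date %s is before start date %s", endStr, startStr)
	}
	// Walk day by day, assuming both endpoints are included.
	var days []string
	for d := start; !d.After(end); d = d.AddDate(0, 0, 1) {
		days = append(days, d.Format(dateLayout))
	}
	return days, nil
}

func main() {
	days, err := expandDateSpec("2020-11-10:2020-11-20")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(days), "days, from", days[0], "to", days[len(days)-1])
}
```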

--output

You can specify the output file with the -o parameter. For example: -o out.txt

Demonstration Video

Watch the video

The Speed Problem

Archive.org throttles download speed, so downloading an archive takes longer than usual. As a workaround, you can download the archives via torrent and put them under the archive/ folder, which is located in the same directory as the urlhunter binary. The directory tree will look like:

|-urlhunter
|---urlhunter(binary)
|---archive
|-----urlteam_2020-11-20-11-17-04
|-----urlteam_2020-11-17-11-17-04

Example Use Cases

urlhunter might be useful for cyber intelligence and bug bounty purposes. For example:

The keywords docs.google.com/a/acme.com and drive.google.com/a/acme.com allow you to find public Google Docs and Drive share links of the Acme company.

The keyword acme.com,password_reset_token may allow you to find working password reset tokens for acme.com.

The keyword trello.com allows you to find public Trello addresses.

Thanks

Special thanks to Samet Bekmezci (@sametbekmezci), who gave me the idea for this tool.

Donation

Loved the project? You can buy me a coffee

urlhunter's People

Contributors

itsignacioportal, rzhade3, utkusen

urlhunter's Issues

`panic: invalid line` when parsing one of the urls

Steps to reproduce:

echo "google" >key1.txt; urlhunter --keywords key1.txt --date 2023-03-15:2023-03-15 --output a.txt

Output:


	o  	 Utku Sen's
	\_/\o
	( Oo)                    \|/
	(_=-)  .===O-  ~~U~R~L~~ -O-
	/   \_/U'        hunter  /|\
	||  |_/
	\\  |    utkusen.com
	{K ||	twitter.com/utkusen

 
Search starting for: 2023-03-15
[+]: urlteam_2023-03-15-20-17-01 already exists locally. Skipping download..
[+]: Searching: "google" in urlteam_2023-03-15-20-17-01/ed-gr
[+]: Searching: "google" in urlteam_2023-03-15-20-17-01/goo-gl
panic: invalid line: yRjCQD|http://www.priceminister.com/offer/buy/853650105/google-chromecast-audio-noir.html?t=180110&ptnrid=pt%7C89084411603%7Cc%7C53388079763%7C853650105&gclid=CK_R94KCvswCFZadGwod1bQLdA#sort=0&bbaid=1919751435&filter=20&xtatc=PUB-%5Bggp%5D-%5BHifi%5D-%5Baccessoire-audio-video%5D-%5B853650105%5D-%5Boccasion%5D-%5BTop_Occasion%5D&t=&ptnrid=s16SUVAfu_dc|pcrid|53388079763|pkw||pmt|&ja1=tsid:67590|cid:285246443|agid:14445716363|tid:pla-89084411603|crid:53388079763|nw:g|rnd:3013699068986104353|dvc:c|adp:1o2

goroutine 1 [running]:
main.searchFile({0xc0003786c0, 0x36}, {0xc000599dc9, 0x6}, {0x7ffe85ef8071, 0x5})
	github.com/utkusen/urlhunter/main.go:276 +0x70e
main.getArchive({0xc000436000, 0x1cec1, 0x20000}, {0xc00013e010, 0xa}, {0x7ffe85ef8042, 0x8}, {0x7ffe85ef8071, 0x5})
	github.com/utkusen/urlhunter/main.go:222 +0x5e9
main.main()
	github.com/utkusen/urlhunter/main.go:114 +0x6c5

Install via Go Get fails

go/src/golang.org/x/term/term_unix_linux.go:9:7: ioctlReadTermios redeclared in this block
        previous declaration at go/src/golang.org/x/term/term_unix_aix.go:9:26
go/src/golang.org/x/term/term_unix_linux.go:10:7: ioctlWriteTermios redeclared in this block
        previous declaration at go/src/golang.org/x/term/term_unix_aix.go:10:27

I'm using Ubuntu.

Using a multiline keywords file breaks the output

When keywords.txt has a single word as its content:

code
C:\Users\REDACTED\Desktop>main.exe -k C:\Users\REDACTED\Documents\GitHub\urlhunter\keywords.txt -d 2022-01-01:2022-01-04 -o test.txt -a C:\Users\REDACTED\Documents\GitHub\urlhunter\archives

        o         Utku Sen's
         \_/\o
        ( Oo)                    \|/
        (_=-)  .===O-  ~~U~R~L~~ -O-
        /   \_/U'        hunter  /|\
        ||  |_/
        \\  |    utkusen.com
        {K ||   twitter.com/utkusen


Search starting for: 2022-01-01
[+]: Couldn't find an archive with that date.
Search starting for: 2022-01-02
[+]: urlteam_2022-01-02-11-17-02 already exists locally. Skipping download..
[+]: Searching: "code" in C:\Users\REDACTED\Documents\GitHub\urlhunter\archives\urlteam_2022-01-02-11-17-02\goo-gl\______.txt
^C

When keywords.txt has multiple lines as its content:

code
auth
token
C:\Users\REDACTED\Desktop>main.exe -k C:\Users\REDACTED\Documents\GitHub\urlhunter\keywords.txt -d 2022-01-01:2022-01-04 -o test.txt -a C:\Users\REDACTED\Documents\GitHub\urlhunter\archives

        o         Utku Sen's
         \_/\o
        ( Oo)                    \|/
        (_=-)  .===O-  ~~U~R~L~~ -O-
        /   \_/U'        hunter  /|\
        ||  |_/
        \\  |    utkusen.com
        {K ||   twitter.com/utkusen


Search starting for: 2022-01-01
[+]: Couldn't find an archive with that date.
Search starting for: 2022-01-02
[+]: urlteam_2022-01-02-11-17-02 already exists locally. Skipping download..
" in C:\Users\REDACTED\Documents\GitHub\urlhunter\archives\urlteam_2022-01-02-11-17-02\goo-gl\______.txt
^C

urlteam_2020-12-28-03-17-02 Archive already exists!

I downloaded this repo.
I downloaded xz-5.2.5-windows.zip and installed the files to c:\Windows\System32.
I get a good message from: > xz --help
The first run downloaded the archives to archives\urlteam_2020-12-28-03-17-02
But then I get errors when running: go run main.go -date latest -keywords keywords.txt -o out.txt

Search starting for: latest
urlteam_2020-12-28-03-17-02 Archive already exists!
panic: runtime error: index out of range [1] with length 1

goroutine 1 [running]:
main.searchFile(0xc0001a5380, 0x37, 0xc0001adbc0, 0x14, 0xc000010118, 0x7)
        ../urlhunter/main.go:201 +0xc5c
main.getArchive(0xc000300000, 0x172e4, 0x1fe00, 0xc0000100a8, 0x6, 0xc0000100f0, 0xc, 0xc000010118, 0x7)
        ../urlhunter/main.go:194 +0xb2f
main.main()
        ../urlhunter/main.go:94 +0x387
exit status 2

Return shortlink referring to a given longlink

Currently, we only search for and return the long link for a given resource. However, the archives also contain the shortlink within the file. The file is in the Beacon Link Dump format:

b9YiMs|https://www.google.com

It'd be helpful to also return the shortlink which refers to the longlink. This could be implemented by scanning the archive to find the entire line that contains a resource instead of just the specific string that was matched.
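A minimal sketch of that idea, assuming each data line has the shape SHORTCODE|LONGURL as in the example above. The names parseBeaconLine and searchLines are hypothetical. Splitting only on the first | keeps long URLs that themselves contain | characters intact.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBeaconLine splits "SHORTCODE|LONGURL" at the first "|" only.
// ok is false when the line has no separator or an empty side.
// Hypothetical helper for illustration; not taken from urlhunter's source.
func parseBeaconLine(line string) (shortCode, longURL string, ok bool) {
	s, l, found := strings.Cut(line, "|")
	if !found || s == "" || l == "" {
		return "", "", false
	}
	return s, l, true
}

// searchLines returns "SHORTCODE LONGURL" for every line whose long URL
// contains keyword; malformed lines are skipped rather than treated as fatal.
func searchLines(lines []string, keyword string) []string {
	var hits []string
	for _, line := range lines {
		shortCode, longURL, ok := parseBeaconLine(line)
		if !ok {
			continue
		}
		if strings.Contains(longURL, keyword) {
			hits = append(hits, shortCode+" "+longURL)
		}
	}
	return hits
}

func main() {
	lines := []string{
		"b9YiMs|https://www.google.com",
		"aaaaaa|https://example.com/?x=a|b|c", // illustrative line with extra pipes
		"not a beacon line",
	}
	for _, hit := range searchLines(lines, "google") {
		fmt.Println(hit)
	}
}
```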

Containerize URLHunter

Would you be open to containerizing this service, as well as possibly publishing it to DockerHub (or GitHub Container Registry)? It would really help to automate running this service.

I did see that someone submitted this PR: #2, which you closed.

XZ executable file not found

Dear Creator,
Love this tool, and can't wait to see its full capabilities. I'm faced with the following errors even when copying the files to the path folders
