Archived URLs
Some websites are limited in functionality or rather small, so you need to find more parameters, hidden pages or directories.
This is why archived URLs are a good thing and can be a goldmine in certain situations!
You have probably used archive.org before.
Use this URL and change www.example.com to the target domain, keeping the trailing /*:
https://web.archive.org/cdx/search/cdx?url=www.example.com/*&collapse=urlkey&fl=original
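If you would rather pull the list from a terminal than from the browser, the same query can be wrapped in a small shell helper. This is only a sketch: the function names cdx_url and fetch_archived_urls and the output file name are made up for illustration, and it assumes curl is installed.

```shell
# Build the CDX query URL for a target host.
# Uses the same parameters as the URL above; $1 is the host, e.g. www.example.com.
cdx_url() {
  printf 'https://web.archive.org/cdx/search/cdx?url=%s/*&collapse=urlkey&fl=original\n' "$1"
}

# Download every archived URL for the host ($1) into a file ($2).
# Needs network access; -s silences the progress meter, -o writes to the file.
fetch_archived_urls() {
  curl -s "$(cdx_url "$1")" -o "$2"
}
```

For example, `fetch_archived_urls www.example.com archived_urls.txt` should leave one archived URL per line in archived_urls.txt, ready to be cleaned up.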
Copy and paste all the URLs into Sublime or another text editor of your choice. I tend to use the following regexes to remove the archived URLs that aren't too interesting.
- Remove all URLs with file extensions that aren't of interest: search with CTRL + F (regex mode enabled), select all matches and remove them
.*(\.jpg|\.jpeg|\.png|\.gif|\.woff|\.woff2|\.css|\.svg|\.ttf|\.eot|\.otf|\.ico|\.scss|\.webp).*
- Go through the URLs with an explicit HTTP port :80, as these are usually very old. They are often not of interest because the endpoints no longer exist, but depending on the app they may still be live
.*(:80).*
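The two clean-up steps above can also be done on the command line instead of in the editor. The sketch below uses grep with the same patterns. The function names are hypothetical, and it assumes the URLs arrive one per line on standard input.

```shell
# Drop URLs that contain one of the uninteresting file extensions
# (same extension list as the editor regex above).
drop_static_assets() {
  grep -Ev '(\.jpg|\.jpeg|\.png|\.gif|\.woff|\.woff2|\.css|\.svg|\.ttf|\.eot|\.otf|\.ico|\.scss|\.webp)'
}

# Keep only the URLs that contain :80, so they can be reviewed separately.
# Like the editor regex, this also matches ports that merely start with 80 (e.g. :8080).
only_port_80() {
  grep -E ':80'
}

# Drop the URLs that contain :80.
drop_port_80() {
  grep -Ev ':80'
}
```

For example, `drop_static_assets < archived_urls.txt | drop_port_80 > interesting.txt` would keep the likely interesting URLs, while `only_port_80 < archived_urls.txt` lists the old ones for a quick manual look (both file names are placeholders).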
Subdomain Enumeration
Replace example.com with the target domain:
curl -s "https://web.archive.org/cdx/search/cdx?url=*.example.com/*&output=text&fl=original&collapse=urlkey" | cut -d'/' -f3 | sort -u
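The host extraction in that pipeline can be tried on a local list as well. The sketch below is a slight variant, not the command above: it additionally strips an explicit port, so www.example.com:80 and www.example.com collapse into one entry. The name extract_hosts is hypothetical, and it assumes every input URL includes a scheme such as http://.

```shell
# Reduce a list of URLs (one per line, scheme included) to unique hostnames.
# Field 3 of a '/'-split URL is the host[:port] part; the second cut drops the port.
extract_hosts() {
  cut -d'/' -f3 | cut -d':' -f1 | sort -u
}
```

It can take the place of the `cut -d'/' -f3 | sort -u` part of the pipeline, or run against a saved list with `extract_hosts < archived_urls.txt` (placeholder file name).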