URL validation bypasses are the root cause of numerous vulnerabilities,
including many instances of SSRF, CORS misconfiguration, and open
redirection. These work by using ambiguous URLs to trigger URL parsing
discrepancies and bypass validation. However, many of these techniques are
poorly documented and overlooked as a result.
To address this, we wanted to create a cheat sheet that consolidates all
known payloads, saving you the time and effort of searching and gathering
information from across the Internet. Today, we’re excited to introduce a
new tool designed to solve this problem: the URL Validation Bypass Cheat
Sheet.
We hope you find it useful! This is a frequently updated repository of all
known techniques, allowing you to quickly generate a wordlist that meets
your needs.
How to get started
The URL Validation Bypass Cheat Sheet is a brand new interactive web
application that automatically adjusts its settings based on your context.
Currently, there are three contexts available:
- A fully qualified absolute URL – useful when the URL is used in a request
  query parameter, for example. All payloads are designed to be Burp Suite
  Intruder friendly, so you don’t have to worry about the correct encoding.
- Only hostname – direct input of the domain, such as in the Host header
  value.
- CORS Origin – where the hostname is intended to be used in a valid
  browser Origin header.
Initially, the cheat sheet provides six types of payload wordlists. The
advanced settings allow you to select a specific wordlist or use all of them
simultaneously. Here’s a brief overview of the most important ones:
- Domain Allow List Bypass: Designed for domain confusion attacks. You can
  customize the testing domains by entering the allowed and attacker
  domains accordingly.
- Fake Relative URLs: This includes browser-valid absolute URLs that might
  be incorrectly validated by client-side code.
- Loopback Address: This wordlist includes various representations of IPv4
  and IPv6 loopback addresses, and their normalizations.
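To make the idea of domain confusion concrete, here is a minimal Python sketch. It is not taken from the cheat sheet: the domains `allowed.example` and `attacker.example`, the two validator functions, and the three candidate URLs are all hypothetical, chosen only to show how a substring check can accept URLs whose real host belongs to the attacker.

```python
from urllib.parse import urlsplit

# Hypothetical domains used purely for illustration.
ALLOWED = "allowed.example"
ATTACKER = "attacker.example"


def naive_is_allowed(url: str) -> bool:
    """Flawed check: accepts any URL that merely contains the allowed domain."""
    return ALLOWED in url


def strict_is_allowed(url: str) -> bool:
    """Stricter check: compares the parsed hostname exactly."""
    return urlsplit(url).hostname == ALLOWED


# Illustrative domain confusion candidates (assumed, not the real wordlist).
candidates = [
    f"https://{ALLOWED}.{ATTACKER}/",  # allowed domain used as a subdomain
    f"https://{ATTACKER}/{ALLOWED}",   # allowed domain placed in the path
    f"https://{ALLOWED}@{ATTACKER}/",  # allowed domain placed in the userinfo
]

for url in candidates:
    print(url, naive_is_allowed(url), strict_is_allowed(url))
```

Each candidate passes the naive check and fails the strict one. Even the strict check is only as reliable as the parser behind it, which is why parsing discrepancies between the validator and the component that finally fetches the URL matter.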
Encodings
The URL Validation Bypass Cheat Sheet supports several types of string
encoding:
- Intruder’s Percent Encoding: This option encodes a payload string by
  replacing certain characters with one to four escape sequences that
  represent the UTF-8 encoding of the character. It excludes Burp Suite
  Intruder’s default characters and is enabled by default, making it easily
  compatible with Burp Suite.
- Everything: This option percent-encodes all characters except
  alphanumeric ones.
- Special Chars: This option encodes everything except the following
  characters:
  ["!","$","'","\"","(",")","*",",","-",".","/","\\",":",";","[","]","^","_","{","}","|","~"]
- Unicode Escape: This option represents a payload string as a
  six-character escape sequence \uXXXX, except for the following
  characters: ['"','\\','\b','\f','\n','\r','\t'] and those in the range
  [0x0020 - 0x007f].
Note: Unencoded strings should be used with caution, as Unicode values may
not be transmitted correctly.
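As a rough illustration of two of these options, the sketch below re-implements "Everything" and "Unicode Escape" in Python. The function names are hypothetical and this is not the cheat sheet's own code. It assumes uppercase hex digits, and the Unicode Escape function models only the printable-range exception [0x0020 - 0x007f], not the individually listed characters.

```python
def encode_everything(payload: str) -> str:
    """Percent-encode every character except ASCII letters and digits."""
    out = []
    for ch in payload:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # One %XX escape per UTF-8 byte, so one to four escapes per character.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


def unicode_escape(payload: str) -> str:
    """Replace characters outside 0x20-0x7F with \\uXXXX escapes."""
    out = []
    for ch in payload:
        if 0x20 <= ord(ch) <= 0x7F:
            out.append(ch)
            continue
        # Work on UTF-16 code units so characters beyond the BMP
        # become a surrogate pair of two \uXXXX escapes.
        data = ch.encode("utf-16-be")
        for i in range(0, len(data), 2):
            unit = int.from_bytes(data[i:i + 2], "big")
            out.append(f"\\u{unit:04X}")
    return "".join(out)


print(encode_everything("a/b"))       # a%2Fb
print(unicode_escape("a\u3002b"))     # a\u3002b (as literal text)
```

Iterating over UTF-8 bytes in the first function matches the "one to four escape sequences" wording used for percent-encoding above.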
Advanced settings
IPv4 address representation
When working with web applications, encoding IP addresses into different
formats can be crucial for testing, validation, and security purposes. The
cheat sheet accepts a standard IPv4 address as the attacker IP input and
returns an array of encoded representations, including octal, hexadecimal,
binary, and decimal formats. It also converts the IPv4 address into its
IPv4-mapped IPv6 address format.
Encoding Details:
- Octal: Each segment of the IP address is converted to an octal number
  and padded to 4 digits. For example, the loopback IP address 127.0.0.1
  would be represented as 0177.0000.0000.0001
- Hexadecimal: Each segment is converted to a hexadecimal number, prefixed
  with 0x, and padded to 2 digits. The same loopback IP address would be
  0x7F.0x00.0x00.0x01
- Binary: Each segment is converted to an 8-bit binary number. The example
  IP address would be 01111111.00000000.00000000.00000001
- Partial Decimal: Combines the third and fourth parts of the IP address
  into a single decimal number: 127.0.1
- DWORD Notation: The entire IP address is converted into an unsigned
  32-bit integer: 2130706433
- DWORD Notation with overflow: 2^32 * 10 is added to the result of the
  previous conversion: 2130706433 + 42949672960 = 45080379393
- IPv6 Mapped Address: Converts the IPv4 segments into hexadecimal and
  formats them into a standard IPv4-mapped IPv6 address. The loopback IP
  address can be represented as [::FFFF:7F00:0001] or ::FFFF:127.0.0.1
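The conversions listed above can be sketched in a few lines of Python. This is an illustrative re-implementation rather than the cheat sheet's own code; the function name `ipv4_representations` and the dictionary keys are hypothetical.

```python
import ipaddress


def ipv4_representations(ip: str) -> dict:
    """Return alternative spellings of a dotted-quad IPv4 address."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on invalid input
    a, b, c, d = addr.packed           # the four octets as integers
    octets = (a, b, c, d)
    dword = int(addr)                  # unsigned 32-bit integer form
    return {
        "octal": ".".join(f"{o:04o}" for o in octets),
        "hexadecimal": ".".join(f"0x{o:02X}" for o in octets),
        "binary": ".".join(f"{o:08b}" for o in octets),
        # Third and fourth octets merged into one 16-bit decimal number.
        "partial_decimal": f"{a}.{b}.{c * 256 + d}",
        "dword": str(dword),
        # Adding a multiple of 2^32 leaves the low 32 bits unchanged.
        "dword_overflow": str(dword + 10 * 2**32),
        "ipv6_mapped_hex": f"[::FFFF:{a:02X}{b:02X}:{c:02X}{d:02X}]",
        "ipv6_mapped_dotted": f"::FFFF:{ip}",
    }


for name, value in ipv4_representations("127.0.0.1").items():
    print(f"{name}: {value}")
```

For 127.0.0.1 this reproduces the values given in the list. Whether a target accepts a given form depends on its address parser, so each one is a candidate to test rather than a guaranteed bypass.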
Normalization
The wordlists include numerous payloads that exploit Unicode string
normalization. For instance, certain characters normalize to an empty
string. These techniques can be used to bypass Web Application Firewalls
(WAFs).
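A small Python sketch shows why normalization matters for filtering. It does not reproduce the cheat sheet's payloads: the blocked hostname, the payload, and the assumption that the backend applies NFKC normalization after the filter has run are all illustrative.

```python
import unicodedata

# Hypothetical hostname that a filter is trying to block.
BLOCKED = "example.com"

# U+24D4 (circled small e) and U+FF0E (fullwidth full stop) are
# compatibility characters that NFKC folds to "e" and ".".
payload = "\u24d4xample\uff0ecom"

# A filter that inspects the raw string does not see the blocked hostname.
seen_by_filter = BLOCKED in payload

# A backend that normalizes afterwards (assumed here to use NFKC) does.
normalized = unicodedata.normalize("NFKC", payload)
seen_by_backend = BLOCKED in normalized

print(seen_by_filter, seen_by_backend, normalized)
```

The gap between what the filter inspects and what the backend ultimately processes is what these payloads probe; which normalization form a given target applies, if any, has to be determined by testing.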
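Another example of an allowed domain bypass occurs when a validation regular
expression permits multiline strings. For instance, if the regex
^allowed_domain$ is used in multiline mode, a payload that places
allowed_domain on its own line, next to a line containing the attacker's
domain, can bypass the validation.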
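The multiline behavior can be reproduced with Python's `re` module. This is a minimal sketch under stated assumptions: the domains are hypothetical, and `re.MULTILINE` stands in for whatever makes `^` and `$` match at line boundaries in the target's regex engine.

```python
import re

# Hypothetical domains used purely for illustration.
ALLOWED = "allowed.example"
ATTACKER = "attacker.example"

# In multiline mode, ^ and $ match at every line boundary,
# not only at the start and end of the whole string.
loose = re.compile(rf"^{re.escape(ALLOWED)}$", re.MULTILINE)

# \A and \Z anchor to the start and end of the entire string.
strict = re.compile(rf"\A{re.escape(ALLOWED)}\Z")

# The attacker's domain on one line, the allowed domain on the next.
payload = f"{ATTACKER}\n{ALLOWED}"

print(bool(loose.search(payload)))   # True: validation is bypassed
print(bool(strict.search(payload)))  # False: the whole string must match
```

Anchoring to the whole string, for example with `\A` and `\Z` or `re.fullmatch`, closes this particular gap in Python. Other regex engines spell these anchors differently.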
Credits
This cheat sheet wouldn’t be possible without the web security community who
share their research. Big thanks to: Gareth Heyes, James Kettle, Jann Horn,
Liv Matan, Takeshi Terada, Orange Tsai, and Nicolas Grégoire.
We have published all payloads on our GitHub account at
https://github.com/PortSwigger/url-cheatsheet-data, so you can contribute to
this cheat sheet by creating a new issue, or by updating the JSON files and
submitting a pull request.
We look forward to your interesting discoveries using our new URL Validation
Bypass Cheat Sheet!