October 31, 2008


URL Obfuscation

On August 26, our CTO Nathan Day wrote a post on the InnerLayer blog about nameservers. His straightforward explanation of nameservers and their operations got me thinking about how NOT straightforward the whole operation is.

The way Nathan explained it, you type in “theinnerlayer.softlayer.com” and it is translated to an IP address, which is then contacted, and the page is returned to you. However, if you already know the IP address, you can use that instead of the domain name and skip the nameserver entirely. For instance, entering the blog's IP address in dotted-decimal form will take you directly to the InnerLayer blog, bypassing the name server. But that's not all! Not only will the dotted-decimal representation of the IP work in a URL, but the dword representation (the same four bytes read as a single 32-bit number) will as well! Try http://1122268947. That will also get you to the InnerLayer.
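The dword form is simply the four octets of the address packed into one 32-bit number. The following is a minimal sketch of that arithmetic, assuming a 64-bit PHP build; the function names are illustrative and are not part of the script used to generate this article's links.

```php
<?php
// Convert a dotted-decimal IPv4 address to its dword (32-bit integer) form.
function dottedToDword(string $ip): int
{
    [$a, $b, $c, $d] = array_map('intval', explode('.', $ip));
    return ($a << 24) | ($b << 16) | ($c << 8) | $d;
}

// Convert a dword back to dotted-decimal form, one octet at a time.
function dwordToDotted(int $dword): string
{
    return implode('.', [
        ($dword >> 24) & 0xFF,
        ($dword >> 16) & 0xFF,
        ($dword >> 8) & 0xFF,
        $dword & 0xFF,
    ]);
}

echo dwordToDotted(1122268947), "\n";      // 66.228.119.19
echo dottedToDword('66.228.119.19'), "\n"; // 1122268947
```

PHP's built-in ip2long() and long2ip() perform the same conversions; the hand-written versions are shown only to make the byte layout visible.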

Now that we've gotten the domain out of the way, what about the bits before and after? Before the domain, between the protocol (http://) and the domain itself, there is an optional authentication part. You can specify a username and password for logging into secured sites right in the URL; http://user:pass@site.com is the standard format for such logins. If the website you're visiting doesn't require authentication, most browsers simply ignore this part. Now that you know about these methods of obfuscating domain names in URLs, you can probably see how http://www.bankofamerica.com%20login@1122268947 actually leads to the InnerLayer. This is a common tactic used by spammers and phishers to disguise their URLs. Newer browsers defend against it: Firefox 3 prompts you when you click on one of these obfuscated links to ask whether "site.com" is really the site you wish to visit, while IE7 simply won't load the page at all if there's an unexpected authentication string. This is a fairly new feature, and it's a good way to protect users against this sort of attack. You can put anything you want into the authentication portion of the URL, as long as it avoids reserved URL characters such as the colon, "at" sign, or forward slash (a single colon is allowed only as the separator between username and password). For our case, let's use "4NDIw:U4ODYwMCAxMjE5" as our fake authentication data, just to be confusing.
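To see which part of such a URL is the real destination, you can let PHP's parse_url() split it. This is a small sketch; the helper name realHost is made up for illustration.

```php
<?php
// Return the host a URL really points at, ignoring any "user:pass@" prefix.
// realHost is an illustrative helper name, not an existing PHP function.
function realHost(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null; // parse_url could not find a host
    }
    return $parts['host'];
}

$url = 'http://www.bankofamerica.com%20login@1122268947';
$parts = parse_url($url);

// Everything before the "@" is authentication data, not the destination.
echo 'user: ', $parts['user'], "\n"; // www.bankofamerica.com%20login
echo 'host: ', realHost($url), "\n"; // 1122268947
```

The text that looks like a bank's domain ends up in the username field, and the only host in the URL is the dword after the "at" sign.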

Now that we've added stuff to the beginning of the URL, what about the filename at the end? Nathan's post could easily be accessed using http://4NDIw:U4ODYwMCAxMjE5@1122268947/2008/do-you-know-where-your-nameserver-is/. However, there's still all that easy-to-read nonsense at the end. That will never do. Have you ever seen a URL with a space in it? The space is encoded as %20. That's the hexadecimal representation of the ASCII code 32, a space. The percent sign indicates that the following 2 digits are to be interpreted as a hex code for a real character. This is how you keep URLs from breaking on spaces: you replace each space with an encoded sequence that can't split the URL. However, did you know it works for ALL characters and not just spaces? We can change every character but the forward slashes in any URL to their hex equivalents. Nathan's article link then becomes: http://4NDIw:U4ODYwMCAxMjE5@1122268947/%32%30%30%38/%64%6f%2d%79%6f
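The encoding step can be sketched in a few lines of PHP. The function name hexEncodePath and the shortened sample path are illustrative assumptions; rawurldecode() is PHP's standard decoder for percent-encoded text.

```php
<?php
// Percent-encode every character of a path except the forward slashes.
function hexEncodePath(string $path): string
{
    $out = '';
    for ($i = 0, $len = strlen($path); $i < $len; $i++) {
        $char = $path[$i];
        // %02x always yields two hex digits, as the encoding requires.
        $out .= ($char === '/') ? '/' : sprintf('%%%02x', ord($char));
    }
    return $out;
}

echo hexEncodePath('/2008/do-you'), "\n";
// /%32%30%30%38/%64%6f%2d%79%6f%75

// Decoding reverses the process and recovers the readable path.
echo rawurldecode('/%32%30%30%38/%64%6f%2d%79%6f%75'), "\n";
// /2008/do-you
```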

But wait, there's more! Let's go back to the domain name, shall we? Most browsers will handle overflow in the dword representation of the domain just fine. What that means is that we can continually add 4294967296 (2^32) to the domain portion of our obfuscated URL and still continue to get the results we want. Our URL is now: http://4NDIw:U4ODYwMCAxMjE5
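Undoing this padding is a matter of taking the number modulo 2^32. The sketch below assumes a 64-bit PHP build so the padded values fit in a native integer; normalizeDword is an illustrative name.

```php
<?php
// Reduce an overflowed dword host back into the 32-bit range.
// Assumes a 64-bit PHP build, where these values are native integers.
function normalizeDword(int $value): int
{
    return $value % 4294967296; // 4294967296 is 2^32
}

$original = 1122268947;
$padded = $original + 4294967296;

echo $padded, "\n";                           // 5417236243
echo normalizeDword($padded), "\n";           // 1122268947
echo long2ip(normalizeDword($padded)), "\n";  // 66.228.119.19
```

Adding 2^32 any number of times leaves the remainder unchanged, which is why the padded number still identifies the same address in browsers that tolerate the overflow.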

As a final trick, you don't have to obfuscate every letter. A fixed pattern of %xx%xx%xx over and over again will get boring. Mix it up. I only converted 70% of my URL to hex, resulting in this gem: http://4NDIw:U4ODYwMCAxMjE5@5417236243//%3200%38/%64%6f%2d%79%6f%75

As you can see, this is quite a bit more confusing than the original URL, which was http://theinnerlayer.softlayer.com/2008/do-you-know-where-your-nameserver-is/.

This information can be useful to any systems administrator who is dealing with an elusive, abusive user. Being able to translate a crazy URL to its human-readable equivalent can greatly assist both the SoftLayer abuse department and any other group attempting to track down spammers, scammers, or just plain old sneaky users.
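A rough sketch of such a translation in PHP follows. It assumes a 64-bit PHP build and a host written as a decimal dword; deobfuscateUrl is a hypothetical helper, and the sample URL uses a shortened path for illustration.

```php
<?php
// Hypothetical helper: rewrite an obfuscated URL in readable form.
function deobfuscateUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // nothing we can interpret; leave it untouched
    }

    $host = $parts['host'];
    // Treat an all-digit host as a dword, possibly padded with multiples of 2^32.
    // Digit strings too long for a native integer are not handled by this sketch.
    if (ctype_digit($host)) {
        $host = long2ip((int) $host % 4294967296);
    }

    $scheme = $parts['scheme'] ?? 'http';
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    $path = isset($parts['path']) ? rawurldecode($parts['path']) : '';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    // The "user:pass@" part is dropped on purpose: here it is only decoration.
    return "$scheme://$host$port$path$query";
}

echo deobfuscateUrl('http://4NDIw:U4ODYwMCAxMjE5@5417236243/%32%30%30%38/%64%6f%2d%79%6f%75/'), "\n";
// http://66.228.119.19/2008/do-you/
```

This recovers the IP address rather than the original hostname. Mapping the address back to a name would take a reverse DNS lookup, for example with PHP's gethostbyaddr(), which is not guaranteed to return the name the site is published under.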

As a final note: Please don't use this knowledge for evil. As mentioned before, the new versions of both Firefox and Internet Explorer are no longer fooled by the fake authentication string trick, and the rest of the obfuscation should really only be used to fool web spiders. Personally, I used this method in combination with JavaScript to obfuscate links and email addresses so that I wouldn't get spammed.

[Editor's note: We at SoftLayer use our powers for good and so should you. Thankfully half of these kinds of links won't open in the latest versions of Outlook and Safari. -K]

The following PHP code was used to generate the links in this article.

//the URL we're attempting to obfuscate
$url = "http://theinnerlayer.softlayer.com/2008/do-you-know-where-your-nameserver-is/";

$urlData = parse_url($url);
$path = $urlData['path'];
$startingIP = gethostbyname($urlData['host']);

//get the long representation:
$long = ip2long($startingIP);

//add 4294967296 to the long for further obfuscation:
$long += 4294967296;

//add random authentication characters to the beginning of the string:
$auth = substr(base64_encode(microtime()), rand(5,10), rand(5,15)) . ":" . substr(base64_encode(microtime()), rand(5,10), rand(5,15));

//obfuscate the rest of the URL
$len = strlen($path);
$obfuscatedLocation = "";

for ( $p = 0; $p < $len; $p++ ) {
    //check for slashes
    //also, 3 in 10 characters make it through plain for further confusion
    //(the threshold of 7 is assumed from the comment above)
    if ( $path[$p] == '/' || rand(0,10) > 7 ) {
        $obfuscatedLocation .= $path[$p];
        continue;
    }

    //made it here, obfuscate this character:
    $obfuscatedLocation .= '%' . dechex(ord($path[$p]));
}

echo "http://$auth@$long$obfuscatedLocation";


