Wondering how those strange-looking URLs work?


It is quite possible that at some time you have received spam email.  Usually you can simply look at the subject line and decide the proper course of action.  To get around this, the spammers modified their approach by providing a vague and often work-related subject line in hopes that you will actually read the contents of the email.  Fortunately for us, the URL they want you to browse to gives their intentions away.  Refusing to believe that you are simply not interested in their products or services, some spammers have resorted to a clever new approach, which involves enticing you with a strange-looking URL.  For example, let's use

http://3486010312

You may be saying to yourself "Hey, I am Internet savvy, this must be that new smart browsing feature that those browser companies must have added."  If this were the case, then this link should resolve to http://www.3486010312.com.  If you tried that expanded link, you will have noticed that it resulted in the ol' 404 error or an unresolved DNS name.  I know this will never happen here, but let's continue this scenario and assume that you accidentally browsed to the strange URL.  There would be a high probability that your hair would be standing on end about now, since these links are usually obfuscated for a reason.  From a technical standpoint, you may be wondering how this was done.  Although the www/.com version of the link returned an error, the link worked successfully when you used just the numbers.  

Don't be alarmed, there is a simple explanation for this strange behavior.  It revolves around the way TCP/IP represents addresses.  The familiar form xxx.xxx.xxx.xxx is just a readable way of writing four bytes; rather than handle the address as a string, it is much more efficient to treat it as a single 32-bit number.  This is similar to the techniques we used when we were developing IO-intensive programs and wanted to pass more than one piece of information back from a subroutine call.  Through bit masking, we could actually store a number of different integer values in a single long.  The TCP/IP trick can be unraveled by switching back and forth between HEX and DEC bases.  Here is how it is done.  (Note: if you are on a PC, this can be done quite easily with the built-in calculator, since it has a Hex and Dec converter.)

1) Convert the number from Decimal to Hex (set the type to Dec, enter the digits 3486010312, and then select the Hex toggle).  This results in the base 16 representation of the base 10 number: CFC84BC8.

2) Now simply split the result into sets of two digits: CF - C8 - 4B - C8.  (Each pair is one byte.  Pulling the bytes out of the number is done with bit shifting and bit masking, and it can be done quite easily programmatically.)

3) The fact that there are 4 separate numbers here and 4 numbers in a TCP/IP address is not a coincidence.  All you have to do now is convert each of the 4 hex numbers back to base 10, which results in: 207.200.75.200  

4) It should be obvious now: simply use http://207.200.75.200 to browse to the same site.  Feel free to browse there; this example used a well-known and highly regarded web-related site.  
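
For readers who would rather not use the calculator, the four steps above can be sketched in a few lines of Python.  This is only an illustration of the shift-and-mask idea, not how any particular browser does it; the function name decimal_to_dotted_quad is made up for this example.

```python
def decimal_to_dotted_quad(value: int) -> str:
    """Split a 32-bit number into four bytes, most significant byte first.

    Hypothetical helper for illustration only.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    # Shift the byte we want down to the low 8 bits, then mask off the rest.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


print(hex(3486010312))                     # 0xcfc84bc8
print(decimal_to_dotted_quad(3486010312))  # 207.200.75.200
```

The shift amounts 24, 16, 8 and 0 correspond to the hex pairs CF, C8, 4B and C8 from step 2, and the mask 0xFF keeps only one byte at a time.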

That is all there is to it!  By the way, the reverse process works as well.  Now, why do spammers go to this much trouble?  Generally it is because firewalls and child protection software block whole domains that are deemed inappropriate for viewing.  Since there isn't a true domain name in the URL, it can often avoid detection.
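
The reverse process mentioned above can be sketched the same way: start with the dotted address and pack the four numbers back into one.  Again, this is a minimal illustration under the assumption of a plain four-part decimal address; dotted_quad_to_decimal is a name invented for this example, and the last line cross-checks the result against Python's standard ipaddress module.

```python
import ipaddress


def dotted_quad_to_decimal(address: str) -> int:
    """Pack four dot-separated numbers (0-255) into a single 32-bit number.

    Hypothetical helper for illustration only.
    """
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each number must be between 0 and 255")
        # Make room for the next byte, then drop it into the low 8 bits.
        value = (value << 8) | octet
    return value


print(dotted_quad_to_decimal("207.200.75.200"))      # 3486010312
print(int(ipaddress.IPv4Address("207.200.75.200")))  # 3486010312
```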

See you next technote.  Till then, watch your step, the dog's been out.