How to code a URL shortener?
I want to create a URL shortener service where you can write a long URL into an input field and the service shortens the URL to " http://www.example.org/abcdef
".
Edit: Due to the ongoing interest in this topic, I've published an efficient solution to GitHub, with implementations for JavaScript, PHP, Python and Java. Add your solutions if you like :)
Instead of " abcdef
" there can be any other string with six characters containing az, AZ and 0-9
. That makes 56~57 billion possible strings.
My approach:
I have a database table with three columns:
I would then insert the long URL into the table. Then I would select the auto-increment value for " id
" and build a hash of it. This hash should then be inserted as " short
". But what sort of hash should I build? Hash algorithms like MD5 create too long strings. I don't use these algorithms, I think. A self-built algorithm will work, too.
My idea:
For " http://www.google.de/
" I get the auto-increment id 239472
. Then I do the following steps:
short = '';
if divisible by 2, add "a"+the result to short
if divisible by 3, add "b"+the result to short
... until I have divisors for a-z and A-Z.
That could be repeated until the number isn't divisible any more. Do you think this is a good approach? Do you have a better idea?
I would continue your "convert number to string" approach. However you will realize that your proposed algorithm fails if your ID is a prime and greater than 52.
Theoretical background
You need a Bijective Function f. This is necessary so that you can find a inverse function g('abc') = 123 for your f(123) = 'abc' function. This means:
How to convert the ID to a shortened URL
[a-zA-Z0-9]
. It contains 62 letters. Take an auto-generated, unique numerical key (the auto-incremented id
of a MySQL table for example).
For this example I will use 12510 (125 with a base of 10).
Now you have to convert 12510 to X62 (base 62).
12510 = 2×621 + 1×620 = [2,1]
This requires use of integer division and modulo. A pseudo-code example:
digits = []
while num > 0
remainder = modulo(num, 62)
digits.push(remainder)
num = divide(num, 62)
digits = digits.reverse
Now map the indices 2 and 1 to your alphabet. This is how your mapping (with an array for example) could look like:
0 → a
1 → b
...
25 → z
...
52 → 0
61 → 9
With 2 → c and 1 → b you will receive cb62 as the shortened URL.
http://shor.ty/cb
How to resolve a shortened URL to the initial ID
The reverse is even easier. You just do a reverse lookup in your alphabet.
e9a62 will be resolved to "4th, 61st, and 0th letter in alphabet".
e9a62 = [4,61,0]
= 4×622 + 61×621 + 0×620 = 1915810
Now find your database-record with WHERE id = 19158
and do the redirect.
Some implementations (provided by commenters)
Why would you want to use a hash?
You can just use a simple translation of your auto-increment value to an alphanumeric value. You can do that easily by using some base conversion. Say you character space (AZ,az,0-9 etc') has 40 characters, convert the id to a base-40 number and use the characters are the digits.
public class UrlShortener {
private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
private static final int BASE = ALPHABET.length();
public static String encode(int num) {
StringBuilder sb = new StringBuilder();
while ( num > 0 ) {
sb.append( ALPHABET.charAt( num % BASE ) );
num /= BASE;
}
return sb.reverse().toString();
}
public static int decode(String str) {
int num = 0;
for ( int i = 0; i < str.length(); i++ )
num = num * BASE + ALPHABET.indexOf(str.charAt(i));
return num;
}
}
链接地址: http://www.djcxy.com/p/12600.html
上一篇: 用一次乘法提取位
下一篇: 如何编码URL缩短器?