[ACCEPTED]-How do I create unique IDs, like YouTube?-database
Kevin van Zonneveld has written an excellent article including a PHP function to do exactly this. His approach is the best I've found while researching this topic.
His function is quite clever. It uses a fixed $index variable so problematic characters can be removed (vowels for instance, or to avoid O and 0 confusion). It also has an option to obfuscate ids so that they are not easily guessable.
Try this: http://php.net/manual/en/function.uniqid.php
uniqid — Generate a unique ID...
Gets a prefixed unique identifier based on the current time in microseconds.
Caution This function does not generate cryptographically secure values, and should not be used for cryptographic purposes. If you need a cryptographically secure value, consider using random_int(), random_bytes(), or openssl_random_pseudo_bytes() instead.
Warning This function does not guarantee uniqueness of return value. Since most systems adjust system clock by NTP or like, system time is changed constantly. Therefore, it is possible that this function does not return unique ID for the process/thread. Use more_entropy to increase likelihood of uniqueness...
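A minimal sketch of the options named in the manual excerpt above, assuming PHP 7 or later so that random_bytes() is available. The helper name secureId() is made up for illustration. Random IDs can still collide in principle, so a unique index on the column remains a sensible safety net.

```php
<?php
// uniqid() is time-based: 13 characters by default, 23 with more_entropy.
$short = uniqid();
$long  = uniqid('', true);

// Hypothetical helper: an ID built from cryptographically secure random bytes.
// Each random byte becomes two hexadecimal characters.
function secureId(int $bytes = 8): string
{
    return bin2hex(random_bytes($bytes));
}

echo $short, "\n", $long, "\n", secureId(), "\n";
```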
base62 or base64 encode your primary key's value, then store it in another field.
example: base62 for primary key 12443 = 3eH
It saves some space, which is why I'm sure YouTube is using it.
Doing a base62 (A-Za-z0-9) encode on your PK or unique identifier will prevent the overhead of having to check to see if the key already exists :)
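A minimal sketch of such an encoder, not a definitive implementation. The alphabet order 0-9, a-z, A-Z is an assumption; it is the order that reproduces the 12443 = 3eH example. The function names are made up, and the code assumes non-negative IDs and valid input strings.

```php
<?php
// Assumed alphabet order: digits, then lowercase, then uppercase.
const BASE62 = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function base62Encode(int $id): string
{
    $out = '';
    do {
        $out = BASE62[$id % 62] . $out; // prepend the next digit
        $id = intdiv($id, 62);
    } while ($id > 0);
    return $out;
}

function base62Decode(string $code): int
{
    $id = 0;
    foreach (str_split($code) as $char) {
        // Position of the character in the alphabet is its digit value.
        $id = $id * 62 + strpos(BASE62, $char);
    }
    return $id;
}

echo base62Encode(12443), "\n"; // 3eH
echo base62Decode('3eH'), "\n"; // 12443
```

Because the encoding is reversible, the short form does not strictly need its own column; it can be recomputed from the primary key whenever it is needed.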
I had a similar issue - I had primary id's in the database, but I did not want to expose them to the user - it would've been much better to show some sort of a hash instead. So, I wrote hashids.
Documentation: http://www.hashids.org/php/
Source: https://github.com/ivanakimov/hashids.php
Hashes created with this class are unique and decryptable. You can provide a custom salt value, so others cannot decrypt your hashes (not that it's a big problem, but still a "good-to-have").
To encrypt a number you would do this:
require('lib/Hashids/Hashids.php');
$hashids = new Hashids\Hashids('this is my salt');
$hash = $hashids->encrypt(123);
Your $hash would now be: YDx
You can also set minimum hash length as the second parameter to the constructor so your hashes can be longer. Or if you have a complex clustered system you could even encrypt several numbers into one hash:
$hash = $hashids->encrypt(2, 456); /* aXupK */
(for example, if you have a user in cluster 2 and an object with primary id 456). Decryption works the same way:
$numbers = $hashids->decrypt('aXupK');
$numbers would then be: [2, 456].
The good thing about this is you don't even have to store these hashes in the database. You could get the hash from the url once a request comes in and decrypt it on the fly - and then pull by primary id's from the database (which is obviously an advantage in speed).
Same with output - you could encrypt the id's on the way out, and display the hash to the user.
EDIT:
- Changed urls to include both doc website and code source
- Changed example code to adjust to the main lib updates (current PHP lib version is 0.3.0 - thanks to all the open-source community for improving the lib)
Auto-incrementing can easily be crawled. These cannot be predicted, and therefore cannot be sequentially crawled.
I suggest going with a double-url format (similar to the SO URLs):
yoursite.com/video_idkey/url_friendly_video_title
If you required both the id and the title in the url, you could then use simple numbers like 0001, 0002, 0003, etc.
Generating these keys can be really simple. You could use the uniqid() function in PHP to generate 13 chars, or 23 with more entropy.
If you want short URLs and predictability is not a concern, you can convert the auto-incrementing ID to a higher base.
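As a rough sketch of the double-url idea combined with a higher base: the code below uses PHP's base_convert() to render the ID in base 36 and builds the id/title path from it. The helper names slugify(), videoPath() and idFromPath() are made up for illustration. Note that base_convert() may lose precision on very large numbers, so this assumes ordinary auto-increment values.

```php
<?php
// Hypothetical helper: turn "My First Video!" into "my_first_video".
function slugify(string $title): string
{
    $slug = preg_replace('/[^a-z0-9]+/', '_', strtolower($title));
    return trim($slug, '_');
}

// Hypothetical helper: build the "idkey/title" path from a numeric ID.
// base_convert() renders the ID in base 36 (digits 0-9 and a-z).
function videoPath(int $id, string $title): string
{
    return base_convert((string) $id, 10, 36) . '/' . slugify($title);
}

// Only the first path segment is needed to find the row again.
function idFromPath(string $path): int
{
    $key = explode('/', $path)[0];
    return (int) base_convert($key, 36, 10);
}

echo videoPath(12443, 'My First Video!'), "\n"; // 9ln/my_first_video
echo idFromPath('9ln/my_first_video'), "\n";    // 12443
```

Since the lookup only uses the key segment, the title part can change later without breaking old links.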
Here is a small function that generates a random key each time it is called. The chance of it repeating an ID is very small, though not zero.
function uniqueKey($limit = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $randstring = '';
    for ($i = 0; $i < $limit; $i++) {
        // rand() includes both bounds, so the upper bound must be the last valid index
        $randstring .= $characters[rand(0, strlen($characters) - 1)];
    }
    return $randstring;
}
source: generate random unique IDs like YouTube or TinyURL in PHP
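One way to do it is to use a hash function with a unique input every time.
example (you've tagged the question with php, therefore):
$uniqueID = null;
do {
    // isUnique() is a placeholder for your own check against existing IDs;
    // microtime() makes the hashed input differ on every pass
    $uniqueID = sha1($fileName . microtime());
} while (!isUnique($uniqueID));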
Consider using something like:
$id = base64_encode(md5(uniqid(),true));
uniqid will get you a unique identifier. MD5 will diffuse it, giving you a 128 bit result. Base 64 encoding that will give you 6 bits per character in an identifier suitable for use on the web, weighing in at 24 characters (including two trailing = padding characters) and hard to guess by casual inspection, although uniqid() is time-based, so this is not cryptographically secure. If you want to be even more paranoid, upgrade from md5 to sha1 or higher.
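A small sketch of what that expression produces. Standard Base64 output can contain +, / and =, which need escaping in URLs, so the hypothetical helper urlSafe() below swaps them out; the helper is an illustration, not part of the original suggestion.

```php
<?php
// The expression from the answer: 16 raw MD5 bytes, Base64-encoded.
$id = base64_encode(md5(uniqid(), true));

// Hypothetical helper: swap "+" and "/" for "-" and "_" and drop the "="
// padding so the value can sit in a URL path without escaping.
function urlSafe(string $base64): string
{
    return rtrim(strtr($base64, '+/', '-_'), '=');
}

echo strlen($id), "\n";          // 24
echo strlen(urlSafe($id)), "\n"; // 22
```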
There should be a library for PHP to generate these IDs. If not, it's not difficult to implement it.
The advantage is that later you won't have name conflicts when you try to reorganize or merge different server resources. With numeric ids you would have to change some of them to resolve conflicts, and that will result in URL changes leading to an SEO hit.
So much of this depends on what you need to do. How 'unique' is unique? Are you serving up the unique ID's, and do they mean something in your DB? If so, a sequential # might be ok.
On the other hand, if you use sequential #'s someone could systematically steal your content by iterating through the numbers.
There are filesystem commands that will generate unique file names - you could use those.
Or GUID's.
Results of hash functions like SHA-1 or MD5 and GUIDs tend to become very long, which is probably something you don't want. (You've specifically mentioned YouTube as an example: Their identifiers stay relatively short even with the bazillion videos they are hosting.)
This is why you might want to look into converting your numeric IDs, which you are using behind the scenes, into another base when putting them into URLs. Flickr e.g. uses Base58 for their canonical short URLs. Details about this are available here: http://www.flickr.com/groups/api/discuss/72157616713786392/. If you are looking for a generic solution, have a look at the PEAR package Math_Basex.
Please note that even in another base, the IDs can still be predicted from outside of your application.
I don't have a formula, but we do this on a project that I'm on (I can't share it). We basically generate one character at a time and append it to the string.
Once we have a completed string, we check it against the database. If there is no other, we go with it. If it is a duplicate, we start the process over. Not very complicated.
The advantage is, I guess, that of a GUID.
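The generate-then-check loop described above could look roughly like this. It is a sketch under assumptions, not the project's actual code: randomKey() and newUniqueKey() are made-up names, and the $exists callback stands in for the real database lookup. In a real application a UNIQUE index on the column is still needed, because two requests can pass the check at the same moment.

```php
<?php
// Hypothetical helper: build a random string one character at a time.
function randomKey(int $length = 10): string
{
    $chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $key = '';
    for ($i = 0; $i < $length; $i++) {
        $key .= $chars[random_int(0, strlen($chars) - 1)];
    }
    return $key;
}

// $exists stands in for the database lookup; it returns true when the
// key is already taken, which makes the loop start over.
function newUniqueKey(callable $exists, int $length = 10): string
{
    do {
        $key = randomKey($length);
    } while ($exists($key));
    return $key;
}

// Usage with an in-memory array instead of a real table.
$taken = ['abc' => true];
$key = newUniqueKey(function (string $k) use (&$taken) {
    return isset($taken[$k]);
});
$taken[$key] = true;
```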
This is NOT PHP, but it can be converted to PHP. As it's JavaScript it runs client-side, so it does not slow down the server, and the generated id can be sent along with whatever you post to your PHP.
Here is a way to create unique ids, limited to 9 007 199 254 740 992 unique ids. It always returns 9 characters, where iE2XnNGpF is 9 007 199 254 740 992.
You can encode a long Number and then decode the 9-character generated String, and it returns the number.
Basically this function uses the base-62 index, Math.log() and Math.pow() to get the right index based on the number. I would explain more about the function, but I found it some time ago and can't find the site anymore, and it took me a very long time to understand how it works. Anyway, I rewrote the function from scratch, and this one is 2-3 times faster than the one I found. I looped through 10 million numbers checking that each number survives the encode/decode round trip; it took 33 seconds with this one and 90 seconds with the other one.
var UID={
  ix:'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',
  enc:function(N){
    N<=9007199254740992||(alert('OMG no more uid\'s'));
    var M=Math,F=M.floor,L=M.log,P=M.pow,r='',I=UID.ix,l=I.length,i;
    for(i=F(L(N)/L(l));i>=0;i--){
      r+=I.substr((F(N/P(l,i))%l),1)
    };
    return UID.rev(new Array(10-r.length).join('a')+r)
  },
  dec:function(S){
    var S=UID.rev(S),r=0,i,l=S.length,I=UID.ix,j=I.length,P=Math.pow;
    for(i=0;i<=(l-1);i++){r+=I.indexOf(S.substr(i,1))*P(j,(l-1-i))};
    return r
  },
  rev:function(a){return a.split('').reverse().join('')}
};
As I wanted a 9 character string, I also appended a's to the generated string, which are 0's.
To encode a number you need to pass a Number and not a string.
var uniqueId=UID.enc(9007199254740992);
To decode the Number again you need to pass the 9-character generated String:
var id=UID.dec(uniqueId);
Here are some numbers:
console.log(UID.enc(9007199254740992))// about 9 quadrillion (9 million billion)
console.log(UID.enc(1)) //baaaaaaaa
console.log(UID.enc(10)) //kaaaaaaaa
console.log(UID.enc(100)) //Cbaaaaaaa
console.log(UID.enc(1000)) //iqaaaaaaa
console.log(UID.enc(10000)) //sBcaaaaaa
console.log(UID.enc(100000)) //Ua0aaaaaa
console.log(UID.enc(1000000)) //cjmeaaaaa
console.log(UID.enc(10000000)) //u2XFaaaaa
console.log(UID.enc(100000000)) //o9ALgaaaa
console.log(UID.enc(1000000000)) //qGTFfbaaa
console.log(UID.enc(10000000000)) //AOYKUkaaa
console.log(UID.enc(100000000000)) //OjO9jLbaa
console.log(UID.enc(1000000000000)) //eAfM7Braa
console.log(UID.enc(10000000000000)) //EOTK1dQca
console.log(UID.enc(100000000000000)) //2ka938y2a
As you can see there are a lot of a's and you don't want that... so just start with a high number.
Let's say your DB id is 1: just add 100000000000000 so that you have 100000000000001, and your unique id looks like YouTube's id: 3ka938y2a
I don't think it's easy to use up the other 8907199254740992 unique ids.