How to create GUID / UUID?
How to create GUID / UUID?
Question
I'm trying to create globally-unique identifiers in JavaScript. I'm not sure what routines are available on all browsers, how "random" and seeded the built-in random number generator is, etc.
The GUID / UUID should be at least 32 characters and should stay in the ASCII range to avoid trouble when passing them around.
Accepted Answer
UUIDs (Universally Unique IDentifier), also known as GUIDs (Globally Unique IDentifier), according to RFC 4122, are identifiers designed to provide certain uniqueness guarantees.
While it is possible to implement an RFC-compliant UUIDs in a few lines of JS (E.g. see @broofa's answer, below) there are several common pitfalls:
- Invalid id format (UUIDs must be of the form "
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
", where x is one of [0-9, a-f] M is one of [1-5], and N is [8, 9, a, or b] - Use of a low-quality source of randomness (such as
Math.random
)
Thus, developers writing code for production environments are encouraged to use a rigorous, well-maintained implementation such as the uuid module.
Popular Answer
For an RFC4122 version 4 compliant solution, this one-liner(ish) solution is the most compact I could come up with:
function uuidv4() {
return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
var r = Math.random() * 16 | 0, v = c == 'x' ? r : (r & 0x3 | 0x8);
return v.toString(16);
});
}
console.log(uuidv4());
Update, 2015-06-02: Be aware that UUID uniqueness relies heavily on the underlying random number generator (RNG). The solution above uses Math.random()
for brevity, however Math.random()
is not guaranteed to be a high-quality RNG. See Adam Hyland's excellent writeup on Math.random() for details. For a more robust solution, consider using the uuid module, which uses higher quality RNG APIs.
Update, 2015-08-26: As a side-note, this gist describes how to determine how many IDs can be generated before reaching a certain probability of collision. For example, with 3.26x1015 version 4 RFC4122 UUIDs you have a 1-in-a-million chance of collision.
Update, 2017-06-28: A good article from Chrome developers discussing the state of Math.random PRNG quality in Chrome, Firefox, and Safari. tl;dr - As of late-2015 it's "pretty good", but not cryptographic quality. To address that issue, here's an updated version of the above solution that uses ES6, the crypto
API, and a bit of JS wizardry I can't take credit for:
function uuidv4() {
return ([1e7]+-1e3+-4e3+-8e3+-1e11).replace(/[018]/g, c =>
(c ^ crypto.getRandomValues(new Uint8Array(1))[0] & 15 >> c / 4).toString(16)
);
}
console.log(uuidv4());
Update, 2020-01-06: There is a proposal in the works for a standard uuid
module as part of the JS language
Read more... Read less...
I really like how clean Broofa's answer is, but it's unfortunate that poor implementations of Math.random
leave the chance for collision.
Here's a similar RFC4122 version 4 compliant solution that solves that issue by offsetting the first 13 hex numbers by a hex portion of the timestamp, and once depleted offsets by a hex portion of the microseconds since pageload. That way, even if Math.random
is on the same seed, both clients would have to generate the UUID the exact same number of microseconds since pageload (if high-perfomance time is supported) AND at the exact same millisecond (or 10,000+ years later) to get the same UUID:
function generateUUID() { // Public Domain/MIT
var d = new Date().getTime();//Timestamp
var d2 = (performance && performance.now && (performance.now()*1000)) || 0;//Time in microseconds since page-load or 0 if unsupported
return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
var r = Math.random() * 16;//random number between 0 and 16
if(d > 0){//Use timestamp until depleted
r = (d + r)%16 | 0;
d = Math.floor(d/16);
} else {//Use microseconds since page-load if supported
r = (d2 + r)%16 | 0;
d2 = Math.floor(d2/16);
}
return (c === 'x' ? r : (r & 0x3 | 0x8)).toString(16);
});
}
console.log(generateUUID())
broofa's answer is pretty slick, indeed - impressively clever, really... rfc4122 compliant, somewhat readable, and compact. Awesome!
But if you're looking at that regular expression, those many replace()
callbacks, toString()
's and Math.random()
function calls (where he's only using 4 bits of the result and wasting the rest), you may start to wonder about performance. Indeed, joelpt even decided to toss out RFC for generic GUID speed with generateQuickGUID
.
But, can we get speed and RFC compliance? I say, YES! Can we maintain readability? Well... Not really, but it's easy if you follow along.
But first, my results, compared to broofa, guid
(the accepted answer), and the non-rfc-compliant generateQuickGuid
:
Desktop Android
broofa: 1617ms 12869ms
e1: 636ms 5778ms
e2: 606ms 4754ms
e3: 364ms 3003ms
e4: 329ms 2015ms
e5: 147ms 1156ms
e6: 146ms 1035ms
e7: 105ms 726ms
guid: 962ms 10762ms
generateQuickGuid: 292ms 2961ms
- Note: 500k iterations, results will vary by browser/cpu.
So by my 6th iteration of optimizations, I beat the most popular answer by over 12X, the accepted answer by over 9X, and the fast-non-compliant answer by 2-3X. And I'm still rfc4122 compliant.
Interested in how? I've put the full source on http://jsfiddle.net/jcward/7hyaC/3/ and on http://jsperf.com/uuid-generator-opt/4
For an explanation, let's start with broofa's code:
function broofa() {
return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
var r = Math.random()*16|0, v = c == 'x' ? r : (r&0x3|0x8);
return v.toString(16);
});
}
console.log(broofa())
So it replaces x
with any random hex digit, y
with random data (except forcing the top 2 bits to 10
per the RFC spec), and the regex doesn't match the -
or 4
characters, so he doesn't have to deal with them. Very, very slick.
The first thing to know is that function calls are expensive, as are regular expressions (though he only uses 1, it has 32 callbacks, one for each match, and in each of the 32 callbacks it calls Math.random() and v.toString(16)).
The first step toward performance is to eliminate the RegEx and its callback functions and use a simple loop instead. This means we have to deal with the -
and 4
characters whereas broofa did not. Also, note that we can use String Array indexing to keep his slick String template architecture:
function e1() {
var u='',i=0;
while(i++<36) {
var c='xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'[i-1],r=Math.random()*16|0,v=c=='x'?r:(r&0x3|0x8);
u+=(c=='-'||c=='4')?c:v.toString(16)
}
return u;
}
console.log(e1())
Basically, the same inner logic, except we check for -
or 4
, and using a while loop (instead of replace()
callbacks) gets us an almost 3X improvement!
The next step is a small one on the desktop but makes a decent difference on mobile. Let's make fewer Math.random() calls and utilize all those random bits instead of throwing 87% of them away with a random buffer that gets shifted out each iteration. Let's also move that template definition out of the loop, just in case it helps:
function e2() {
var u='',m='xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx',i=0,rb=Math.random()*0xffffffff|0;
while(i++<36) {
var c=m[i-1],r=rb&0xf,v=c=='x'?r:(r&0x3|0x8);
u+=(c=='-'||c=='4')?c:v.toString(16);rb=i%8==0?Math.random()*0xffffffff|0:rb>>4
}
return u
}
console.log(e2())
This saves us 10-30% depending on platform. Not bad. But the next big step gets rid of the toString function calls altogether with an optimization classic - the look-up table. A simple 16-element lookup table will perform the job of toString(16) in much less time:
function e3() {
var h='0123456789abcdef';
var k='xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx';
/* same as e4() below */
}
function e4() {
var h=['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'];
var k=['x','x','x','x','x','x','x','x','-','x','x','x','x','-','4','x','x','x','-','y','x','x','x','-','x','x','x','x','x','x','x','x','x','x','x','x'];
var u='',i=0,rb=Math.random()*0xffffffff|0;
while(i++<36) {
var c=k[i-1],r=rb&0xf,v=c=='x'?r:(r&0x3|0x8);
u+=(c=='-'||c=='4')?c:h[v];rb=i%8==0?Math.random()*0xffffffff|0:rb>>4
}
return u
}
console.log(e4())
The next optimization is another classic. Since we're only handling 4-bits of output in each loop iteration, let's cut the number of loops in half and process 8-bits each iteration. This is tricky since we still have to handle the RFC compliant bit positions, but it's not too hard. We then have to make a larger lookup table (16x16, or 256) to store 0x00 - 0xff, and we build it only once, outside the e5() function.
var lut = []; for (var i=0; i<256; i++) { lut[i] = (i<16?'0':'')+(i).toString(16); }
function e5() {
var k=['x','x','x','x','-','x','x','-','4','x','-','y','x','-','x','x','x','x','x','x'];
var u='',i=0,rb=Math.random()*0xffffffff|0;
while(i++<20) {
var c=k[i-1],r=rb&0xff,v=c=='x'?r:(c=='y'?(r&0x3f|0x80):(r&0xf|0x40));
u+=(c=='-')?c:lut[v];rb=i%4==0?Math.random()*0xffffffff|0:rb>>8
}
return u
}
console.log(e5())
I tried an e6() that processes 16-bits at a time, still using the 256-element LUT, and it showed the diminishing returns of optimization. Though it had fewer iterations, the inner logic was complicated by the increased processing, and it performed the same on desktop, and only ~10% faster on mobile.
The final optimization technique to apply - unroll the loop. Since we're looping a fixed number of times, we can technically write this all out by hand. I tried this once with a single random variable r that I kept re-assigning, and performance tanked. But with four variables assigned random data up front, then using the lookup table, and applying the proper RFC bits, this version smokes them all:
var lut = []; for (var i=0; i<256; i++) { lut[i] = (i<16?'0':'')+(i).toString(16); }
function e7()
{
var d0 = Math.random()*0xffffffff|0;
var d1 = Math.random()*0xffffffff|0;
var d2 = Math.random()*0xffffffff|0;
var d3 = Math.random()*0xffffffff|0;
return lut[d0&0xff]+lut[d0>>8&0xff]+lut[d0>>16&0xff]+lut[d0>>24&0xff]+'-'+
lut[d1&0xff]+lut[d1>>8&0xff]+'-'+lut[d1>>16&0x0f|0x40]+lut[d1>>24&0xff]+'-'+
lut[d2&0x3f|0x80]+lut[d2>>8&0xff]+'-'+lut[d2>>16&0xff]+lut[d2>>24&0xff]+
lut[d3&0xff]+lut[d3>>8&0xff]+lut[d3>>16&0xff]+lut[d3>>24&0xff];
}
console.log(e7())
Modualized: http://jcward.com/UUID.js - UUID.generate()
The funny thing is, generating 16 bytes of random data is the easy part. The whole trick is expressing it in String format with RFC compliance, and it's most tightly accomplished with 16 bytes of random data, an unrolled loop and lookup table.
I hope my logic is correct -- it's very easy to make a mistake in this kind of tedious bit-work. But the outputs look good to me. I hope you enjoyed this mad ride through code optimization!
Be advised: my primary goal was to show and teach potential optimization strategies. Other answers cover important topics such as collisions and truly random numbers, which are important for generating good UUIDs.
Here's some code based on RFC 4122, section 4.4 (Algorithms for Creating a UUID from Truly Random or Pseudo-Random Number).
function createUUID() {
// http://www.ietf.org/rfc/rfc4122.txt
var s = [];
var hexDigits = "0123456789abcdef";
for (var i = 0; i < 36; i++) {
s[i] = hexDigits.substr(Math.floor(Math.random() * 0x10), 1);
}
s[14] = "4"; // bits 12-15 of the time_hi_and_version field to 0010
s[19] = hexDigits.substr((s[19] & 0x3) | 0x8, 1); // bits 6-7 of the clock_seq_hi_and_reserved to 01
s[8] = s[13] = s[18] = s[23] = "-";
var uuid = s.join("");
return uuid;
}
let uniqueId = Math.random().toString(36).substring(2) + Date.now().toString(36);
document.getElementById("unique").innerHTML =
Math.random().toString(36).substring(2) + (new Date()).getTime().toString(36);
<div id="unique">
</div>
If ID's are generated more than 1 millisecond apart, they are 100% unique.
If two ID's are generated at shorter intervals, and assuming that the random method is truly random, this would generate ID's that are 99.99999999999999% likely to be globally unique (collision in 1 of 10^15)
You can increase this number by adding more digits, but to generate 100% unique ID's you will need to use a global counter.
if you need RFC compatibility, this formatting will pass as a valid version 4 GUID:
let u = Date.now().toString(16) + Math.random().toString(16) + '0'.repeat(16);
let guid = [u.substr(0,8), u.substr(8,4), '4000-8' + u.substr(13,3), u.substr(16,12)].join('-');
let u = Date.now().toString(16)+Math.random().toString(16)+'0'.repeat(16);
let guid = [u.substr(0,8), u.substr(8,4), '4000-8' + u.substr(13,3), u.substr(16,12)].join('-');
document.getElementById("unique").innerHTML = guid;
<div id="unique">
</div>
Edit: The above code follow the intention, but not the letter of the RFC. Among other discrepancies it's a few random digits short. (Add more random digits if you need it) The upside is that this it's really fast :) You can test validity of your GUID here
Fastest GUID like string generator method in the format XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
. This does not generate standard-compliant GUID.
Ten million executions of this implementation take just 32.5 seconds, which is the fastest I've ever seen in a browser (the only solution without loops/iterations).
The function is as simple as:
/**
* Generates a GUID string.
* @returns {string} The generated GUID.
* @example af8a8416-6e18-a307-bd9c-f2c947bbb3aa
* @author Slavik Meltser.
* @link http://slavik.meltser.info/?p=142
*/
function guid() {
function _p8(s) {
var p = (Math.random().toString(16)+"000000000").substr(2,8);
return s ? "-" + p.substr(0,4) + "-" + p.substr(4,4) : p ;
}
return _p8() + _p8(true) + _p8(true) + _p8();
}
To test the performance, you can run this code:
console.time('t');
for (var i = 0; i < 10000000; i++) {
guid();
};
console.timeEnd('t');
I'm sure most of you will understand what I did there, but maybe there is at least one person that will need an explanation:
The algorithm:
- The
Math.random()
function returns a decimal number between 0 and 1 with 16 digits after the decimal fraction point (for example0.4363923368509859
). - Then we take this number and convert
it to a string with base 16 (from the example above we'll get
0.6fb7687f
).
Math.random().toString(16)
. - Then we cut off the
0.
prefix (0.6fb7687f
=>6fb7687f
) and get a string with eight hexadecimal characters long.
(Math.random().toString(16).substr(2,8)
. - Sometimes the
Math.random()
function will return shorter number (for example0.4363
), due to zeros at the end (from the example above, actually the number is0.4363000000000000
). That's why I'm appending to this string"000000000"
(a string with nine zeros) and then cutting it off withsubstr()
function to make it nine characters exactly (filling zeros to the right). - The reason for adding exactly nine zeros is because of the worse case scenario, which is when the
Math.random()
function will return exactly 0 or 1 (probability of 1/10^16 for each one of them). That's why we needed to add nine zeros to it ("0"+"000000000"
or"1"+"000000000"
), and then cutting it off from the second index (3rd character) with a length of eight characters. For the rest of the cases, the addition of zeros will not harm the result because it is cutting it off anyway.
Math.random().toString(16)+"000000000").substr(2,8)
.
The assembly:
- The GUID is in the following format
XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
. - I divided the GUID into 4 pieces, each piece divided into 2 types (or formats):
XXXXXXXX
and-XXXX-XXXX
. - Now I'm building the GUID using these 2 types to assemble the GUID with call 4 pieces, as follows:
XXXXXXXX
-XXXX-XXXX
-XXXX-XXXX
XXXXXXXX
. - To differ between these two types, I added a flag parameter to a pair creator function
_p8(s)
, thes
parameter tells the function whether to add dashes or not. - Eventually we build the GUID with the following chaining:
_p8() + _p8(true) + _p8(true) + _p8()
, and return it.
Enjoy! :-)