DocumentDB auto generated ID: GUID or UUID? Which variant?
TL;DR: Are the IDs that are auto-generated by DocumentDB supposed to be GUIDs or UUIDs, and is there actually a difference? If they are UUIDs, then which variant/version of UUID?
Background: Some of the DocumentDB client libraries will auto-generate an ID for you if you do not provide one. I have seen it mentioned in the Azure blog and in several related questions that the generated IDs are GUIDs. I know there is some discussion over whether GUIDs are UUIDs, with many people saying that they are.
The problem: However, I have noticed that some of the IDs that DocumentDB auto-generates do not follow the UUID RFC, which allows only the digits 1-5 in the "version" nibble (the V in xxxxxxxx-xxxx-Vxxx-xxxx-xxxxxxxxxxxx). DocumentDB generates IDs with any hex digit in that nibble, for example d981befd-d19b-ee48-35bd-c1b507d3ec4f, whose version nibble is the first e of ee48.
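The check described above can be sketched in a few lines of Python. This is only an illustration of how to read the version nibble from an ID string; the helper names version_nibble and has_rfc4122_version are hypothetical and not part of any DocumentDB library.

```python
def version_nibble(doc_id: str) -> str:
    """Return the first hex digit of the third hyphen-separated group."""
    groups = doc_id.split("-")
    # Expect the usual 8-4-4-4-12 textual layout.
    if [len(g) for g in groups] != [8, 4, 4, 4, 12]:
        raise ValueError(f"not in 8-4-4-4-12 form: {doc_id!r}")
    return groups[2][0].lower()


def has_rfc4122_version(doc_id: str) -> bool:
    """True if the version nibble is one of the RFC 4122 versions 1-5."""
    return version_nibble(doc_id) in "12345"


sample = "d981befd-d19b-ee48-35bd-c1b507d3ec4f"
print(version_nibble(sample))        # e
print(has_rfc4122_version(sample))   # False
```

Running the same check over a batch of document IDs would show at a glance which clients produce RFC-conformant version nibbles and which do not.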
It is possible that this depends on which client is used to create the documents. In our DocumentDB database, we have documents with the third grouping dde5, 627a, fe95, and so on. These documents were stored from within a stored procedure by calling Collection.createDocument() with the options {'disableAutomaticIdGeneration': false}. Other documents that I create through the third-party DocumentDB Studio application always have 4xxx in the third grouping, which is a valid UUID version. However, documents that I create through the Azure portal have non-standard third groupings like b359.
Question: Are the auto-generated DocumentDB IDs supposed to be GUIDs or UUIDs, and is there actually a difference? If UUIDs, then which variant?
Poking around in the source code on GitHub, I found that the various client and server side libraries use several different methods for creating what they're calling a GUID (in some libraries) or a UUID (in other libraries).
The Node.js client, JavaScript client, and server-side library manufacture what they call a GUID by concatenating a series of random hex digits and hyphens. Note that these IDs are random, but they do not comply with the rules for creating RFC 4122 version 4 UUIDs, which fix the version nibble to 4 and reserve the top bits of the fourth group for the variant.
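To make the difference concrete, here is a rough Python sketch of the two approaches. It is not the code from any of the DocumentDB libraries: naive_guid only imitates the "concatenate random hex digits and hyphens" idea, and rfc4122_v4 shows what would additionally be needed for a conformant version 4 UUID. Both function names are hypothetical.

```python
import secrets

HEX = "0123456789abcdef"


def naive_guid() -> str:
    """Every digit is random, so the version and variant nibbles are too."""
    return "-".join(
        "".join(secrets.choice(HEX) for _ in range(n))
        for n in (8, 4, 4, 4, 12)
    )


def rfc4122_v4() -> str:
    """Random digits, with the version and variant nibbles pinned."""
    digits = [secrets.choice(HEX) for _ in range(32)]
    digits[12] = "4"                     # version nibble: always 4
    digits[16] = secrets.choice("89ab")  # variant nibble: binary 10xx
    s = "".join(digits)
    return f"{s[:8]}-{s[8:12]}-{s[12:16]}-{s[16:20]}-{s[20:]}"


print(naive_guid())   # third group may start with any hex digit
print(rfc4122_v4())   # third group always starts with 4
```

With the naive approach, only about one ID in sixteen happens to have 4 in the version position, which matches the mix of third groupings such as dde5, 627a, and fe95 seen above.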
The Python client and Java client call their respective standard library methods to generate a random (version 4) UUID.
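For comparison, this is roughly what generating an ID with Python's standard library looks like. I am assuming the Python client does something equivalent to uuid.uuid4(); the wrapper name new_document_id is hypothetical.

```python
import uuid


def new_document_id() -> str:
    """Return a random (version 4) UUID in its canonical string form."""
    return str(uuid.uuid4())


doc_id = new_document_id()
parsed = uuid.UUID(doc_id)

print(doc_id)
print(parsed.version)                  # 4
print(parsed.variant == uuid.RFC_4122) # True
```

IDs produced this way always have 4 as the first digit of the third group, which is consistent with the 4xxx groupings I see from some clients.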
The .NET client is available via NuGet, but the source code is not yet published.
Summary: