Tags

, , , ,

Database Table Row Keys

SAITO has grown over time – it contains 390 Windows forms, over 150,000 lines of code and the executable (EXE) is approaching 15 megabytes in size. There are well over 100 tables in its database. Historically, the key to a row in a table in a database has been strongly preferred to be a unique value. That has meant database architects have either chosen some natural combination of columns, like sensor ID and timestamp in our case, or used a synthetic value such as an automatically assigned integer. This latter is usually fast, simple to understand, there are plenty of integers, and the contents of the key more or less reflect the order that an associated row was inserted into the table. A second synthetic value is nicknamed a UUID, which is an abbreviation for Universally Unique Identifier. Microsoft’s implementation of these are GUIDs where the G stands for ‘globally’. GUID is a 128-bit value consisting of one group of 8 hexadecimal digits, followed by three groups of 4 hexadecimal digits each, followed by one group of 12 hexadecimal digits. Here’s an example: 6B29FC40-CA47-1067-B31D-00DD010662DA.

The good thing about GUIDs is that they are easy to generate: one might code something like lszMyGUID = GUID.NewGUID.ToString to load a GUID into a string variable called lszMyGUID.

Millions of programming years ago (in the 1990s) disks were getting progressively larger and the MBR scheme for partitioning a disk imposed limits on disk size and on database size. A planet-wide standard called UEFI = Unified Extensible Firmware Interface was agreed to, and Intel developed a new scheme today known as GPT = GUID Partition Table that removed these limits.

It turns out that auto-numbering can have all sorts of subtle problems in a heavily multi-threaded environment. So we like auto-numbering, but only for some tables. Typically, these would be tables where rows are added by humans or at least are added at relatively low velocities.

In our SAITO application tables where we expect intense input-output activity either have natural column combinations or (second choice) GUIDs for the primary key. Among the challenges with GUIDs are they are bulky, they don’t (on purpose) have any meaning and they scatter the data all over the disk. The hub on which the data wheel of SAITO spins would be the daily sensor measurements. These need to be collected quickly and accurately at nearly the speed that they are generated and then eventually archived.

Advertisements