Two Presentations

One for the Embedded Systems Conference in Santa Clara in December and the other for the ARM Conference in the autumn. Slide decks are on the Silver Wolf Wushu web site in the Presentations section. The Embedded Systems Conference presentation was titled “Wind Sweeps Fallen Blossoms” – a reference to a movement in New Frame Road #2 and a movement in Single Saber.

Obtain the Optane™ (part 18 – and last!)

We give three scenarios: inserts of table rows with natural keys (two integers plus a datetime); inserts of table rows with GUIDs as the keys; and archiving, which can be thought of as a SQL SELECT of many rows, writing the data to a text file, deleting the rows, and then inserting corresponding rows in a table stored on a disk. We show results using a Microsoft SQL Server 2014 database.

Timings are in seconds (3600 = 1 hour) for approximately 16 million rows.

Insert natural keys    1 thread    16 threads    32 threads
HDD                        7689          3479          2174
SSD                        3632          1543           969
Optane                     1316           583           373

Insert GUID keys       1 thread    16 threads    32 threads
HDD                        9842          7342          5391
SSD                        4744          2501          1496
Optane                     2263          1580          1051

Archive                1 thread    16 threads    32 threads
HDD                       14117          7397          5015
SSD                        6376          4540          3096
Optane                     2972          1833          1201

We would summarize as follows: the Optane was roughly FOUR to SIX times faster than a very good hard disk and, depending on the scenario, between about ONE AND A HALF and ALMOST THREE times faster than a modern solid state drive.
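For readers who want to check the ratios themselves, here is a minimal Python sketch that recomputes them from the tables above. The dictionary layout and the function name speedups are our own choices for illustration; only the timings come from the measurements.

```python
# Wall-clock seconds copied from the three tables above (about 16 million rows).
# Each list holds the timings for 1, 16 and 32 threads, in that order.
TIMINGS = {
    "Insert natural keys": {
        "HDD": [7689, 3479, 2174],
        "SSD": [3632, 1543, 969],
        "Optane": [1316, 583, 373],
    },
    "Insert GUID keys": {
        "HDD": [9842, 7342, 5391],
        "SSD": [4744, 2501, 1496],
        "Optane": [2263, 1580, 1051],
    },
    "Archive": {
        "HDD": [14117, 7397, 5015],
        "SSD": [6376, 4540, 3096],
        "Optane": [2972, 1833, 1201],
    },
}


def speedups(scenario, baseline):
    """How many times faster Optane was than `baseline`, per thread count."""
    rows = TIMINGS[scenario]
    return [round(b / o, 2) for b, o in zip(rows[baseline], rows["Optane"])]


for scenario in TIMINGS:
    for baseline in ("HDD", "SSD"):
        print(f"{scenario:20s} Optane vs {baseline}: {speedups(scenario, baseline)}")
```

The hard disk ratios all fall between four and six. The solid state ratios vary more by scenario and are smallest for GUID keys at high thread counts.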

 
Everyone should obtain an Optane.

 
We note that GUIDs should not be a designer’s first choice for a key and that 32 threads are very likely to be an over-subscription for many machines, so the measures using 16 threads are probably going to be more commonly encountered in practice.

Obtain the Optane™ (part 17)

ThreeCPUBoxes.jpg

One time-honored solution is to upgrade the processor. This can get rather complicated if motherboards, memory, fans and power supplies are in play. The Extreme Core I7 (rightmost) has 10 cores (20 threads with HyperThreading) and quad channel DDR4 memory with a huge 25 megabytes of cache. And, as advertised, the new Intel Turbo Boost Max Technology 3.0 does provide more than 10% improvement in single-thread work.

But we really had an input-output problem: the disk was the limiting factor.

HDDSSDOptane

There was strident insistence that the Intel Optane (above right) would only work on Windows 10 64-bit systems AND only with Intel’s seventh (7th) generation Kaby Lake processors. We were told only the H270, Q270, Z270, B250 and Q250 chipsets would work. So we were forced to use an excellent CPU.

Obtain the Optane™ (part 17)

HyperThreading

What we generally encounter for database work, as opposed to purely computational work, is that without Intel’s HyperThreading (blue bars on the chart above) wall clock time for a quad core processor reaches its minimum at about 10 threads. With HyperThreading (red bars) the minimum is 15% to 20% lower (here a significant 60 seconds or so) and there are gains to be realized from scheduling 6 to 8 additional threads.

ThreadAverageSpeeds

As the number of threads increases it is reasonable to expect a slight linear increase in the time needed to process a fixed number of transactions.

It seemed to us that with a reasonable CPU, a decent hard disk and some aggressive thread management, we could sort out the incoming devices and capture the sensor measurements. Then Professor Peter Wayne and others at Harvard Medical School e-wrote to tell us it was important to measure even during sitting and standing. The ideal in the WuJi style of meditation we teach is to be motionless. What they had found was that sway (how far and how quickly one’s head moved from an ideal position) was a powerful indicator.

So now we had much less time to do a great deal more work.

Obtain the Optane™ (part 16)

EARLY RESULTS

Running on an inexpensive quad core laptop (1 GHz processing speed) we obtained the following results (averages) when loading 231,440 sensor measurements:

  1. 712 seconds – debug mode; database on the C drive
  2. 622 seconds – release mode; database on the C drive
  3. 682 seconds – debug mode; database on an external drive (USB connection)
  4. 590 seconds – release mode; database on an external drive

The laptop was largely cleared of any other applications, so the primary contenders for CPU and other resources were Windows 10 and the AVG anti-virus software. Debug is about 15% slower than release, but neither one on either disk is fast enough. We need processing to keep pace with the class.

Running on a very expensive desktop (quad core; an Intel Core i7 with 3.4 GHz processing speed) we obtained:

  1. 343 seconds – debug mode; database on the C drive
  2. 257 seconds – release mode; database on the C drive

Note the increase in velocity, and that release mode took about 25% less time than debug mode (put the other way, debug was about a third slower than release).
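Percentages like these depend on which run is taken as the baseline, so here is a small Python sketch of both calculations using the timings listed above. The function names and the RUNS dictionary are ours, chosen only for illustration.

```python
# (debug seconds, release seconds) taken from the two lists above.
RUNS = {
    "laptop, C drive": (712, 622),
    "laptop, external drive": (682, 590),
    "desktop, C drive": (343, 257),
}


def percent_slower(debug_s, release_s):
    """Extra time debug needs, as a percentage of the release time."""
    return (debug_s - release_s) / release_s * 100


def percent_less_time(debug_s, release_s):
    """Time saved by release, as a percentage of the debug time."""
    return (debug_s - release_s) / debug_s * 100


for name, (debug_s, release_s) in RUNS.items():
    print(
        f"{name}: debug is {percent_slower(debug_s, release_s):.1f}% slower; "
        f"release takes {percent_less_time(debug_s, release_s):.1f}% less time"
    )
```

On the laptop the two ways of counting differ by only a couple of points. On the desktop the gap is wider, which is why it matters which figure is being quoted.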

For purposes of discussion our SAITO application software in these scenarios has two database tables of interest: one, which we will call Table N, has a natural primary key; the other table, Table G, uses a GUID as the primary key. The measures below are seconds of wall-clock time for 320,000 rows. The figures are an average of five runs – the individual runs did not vary much as we (by intent) “only” had anti-virus and the operating system running.

We needed to be well under 240 seconds so multi-threading was needed.

Description         1 thread insert   1 thread select   16 thread insert   16 thread select
Table N disk (*)                514               292                306                213
Table G disk                    398               366                212                236
Table N Optane                  146                96                 81                 63
Table G Optane                  116               157                 55                 77

* = we measured on a Western Digital Passport Ultra (external), a Maxtor Personal Storage 3100 drive (external), a Seagate SRD0NF2 drive (external), two internal hard disk drives, and an Intel 520 Solid State Drive. The figures above are for a 5400 RPM 500 gigabyte internal drive. Our full statistics have 1, 2, 4, 8, and 16 threads for each of the tables and the six drives.

Obtain the Optane™ (part 15)

Database Table Row Keys

SAITO has grown over time – it contains 390 Windows forms, over 150,000 lines of code and the executable (EXE) is approaching 15 megabytes in size. There are well over 100 tables in its database.

Historically, the key to a row in a table in a database has been strongly preferred to be a unique value. That has meant database architects have either chosen some natural combination of columns, like sensor ID and timestamp in our case, or used a synthetic value such as an automatically assigned integer. The latter is usually fast and simple to understand, there are plenty of integers, and the key values more or less reflect the order in which the associated rows were inserted into the table. A second kind of synthetic value is the UUID, which is an abbreviation for Universally Unique Identifier. Microsoft’s implementation of these is the GUID, where the G stands for ‘globally’. A GUID is a 128-bit value written as one group of 8 hexadecimal digits, followed by three groups of 4 hexadecimal digits each, followed by one group of 12 hexadecimal digits. Here’s an example: 6B29FC40-CA47-1067-B31D-00DD010662DA.

The good thing about GUIDs is that they are easy to generate: one might code something like lszMyGUID = GUID.NewGUID.ToString to load a GUID into a string variable called lszMyGUID.
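The same idea can be sketched outside .NET. The Python fragment below is only an analogue, not SAITO code: it uses the standard library's uuid module to create a random identifier and to check the 8-4-4-4-12 layout described above. The variable names are ours.

```python
import uuid

# Generate a random (version 4) UUID, the rough equivalent of a new GUID.
new_guid = uuid.uuid4()
guid_text = str(new_guid).upper()

# The text form is five groups of hexadecimal digits separated by hyphens.
group_lengths = [len(group) for group in guid_text.split("-")]
print(guid_text, group_lengths)

# The example GUID from the text parses as a 128-bit (16-byte) value.
example = uuid.UUID("6B29FC40-CA47-1067-B31D-00DD010662DA")
print(len(example.bytes) * 8, "bits")
```

Note that 32 hexadecimal digits at 4 bits each account for the full 128 bits. The hyphens are only there for human readers.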

Millions of programming years ago (in the 1990s) disks were getting progressively larger and the MBR scheme for partitioning a disk imposed limits on disk size and on database size. A planet-wide standard called UEFI = Unified Extensible Firmware Interface was agreed to, and Intel developed a new scheme today known as GPT = GUID Partition Table that removed these limits.

It turns out that auto-numbering can have all sorts of subtle problems in a heavily multi-threaded environment. So we like auto-numbering, but only for some tables. Typically, these would be tables where rows are added by humans or at least are added at relatively low velocities.

In our SAITO application, tables where we expect intense input-output activity have either natural column combinations or (second choice) GUIDs for the primary key. Among the challenges with GUIDs are that they are bulky, they don’t (on purpose) have any meaning and they scatter the data all over the disk. The hub on which the data wheel of SAITO spins would be the daily sensor measurements. These need to be collected quickly and accurately at nearly the speed that they are generated and then eventually archived.
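To give a feel for the scattering, here is a toy Python sketch, not a model of SQL Server. It keeps keys in a sorted list as a stand-in for an ordered index and counts how often a new key lands at the very end. Everything in it is an assumption made for illustration: the function name append_fraction, the 1,000 keys, the seeded generator used so the run is repeatable, and Python's own ordering of UUIDs, which differs from the database's.

```python
import bisect
import random
import uuid


def append_fraction(keys):
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    appended = 0
    for key in keys:
        position = bisect.bisect_left(index, key)
        if position == len(index):
            appended += 1  # new key is larger than everything stored so far
        index.insert(position, key)
    return appended / len(keys)


N = 1000
rng = random.Random(0)  # seeded so the sketch is repeatable

# Auto-numbered keys arrive in increasing order.
sequential_keys = list(range(1, N + 1))
# Random 128-bit values dressed up as version 4 UUIDs (hypothetical stand-ins).
guid_keys = [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(N)]

print("sequential:", append_fraction(sequential_keys))
print("random GUIDs:", append_fraction(guid_keys))
```

Sequential keys always go on the end. Random keys almost always land somewhere in the middle, which on a real disk-backed index tends to mean touching pages all over the place.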

Obtain the Optane™ (part 14)

Most aides carry devices like an iPad and some students also have smart phones. The SAITO software has to sort out what a device does and who it belongs to. Class starts with a formal bow and salute, followed by five minutes of sitting meditation and then five minutes of standing meditation. Then come several minutes of centuries-old Chen family warm-up exercises, so we had thought we had a comfortable amount of time until the first Tai Chi Chuan set to perform this identification process. That was until Professor Peter Wayne and others at Harvard Medical School pointed out it was useful to measure movements during sitting and standing. We’ll see what the upcoming Internet of Things Conference and the Sensors Expo (both in San Jose, California, in May and June, respectively) showcase in terms of hardware, but we are leaning toward pressure sensors embedded in chair seats and personal foot mats.

The shortest and simplest (and, therefore, the first taught) of the Chen Family style sets is known by the precise but not especially imaginative name of 18 Movements. Once they learn this set, students would perform it twice per class forever. The students can see a canonical video of Grandmaster Chen Zhenglei, who choreographed 18 Movements, either projected on large mirrors or on smart glasses. 16 students times 20 sensors times several times per second gets to be a lot of measurements to store in a database very quickly: well over 100,000 sustained database inserts per minute. And we have to extract the raw sensor data from the Internet of Things hub where it is stored.
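The arithmetic is easy to sketch in Python. The student and sensor counts come from the paragraph above. The sampling rates are assumptions, since "several times per second" is not pinned down here, and the function name is ours.

```python
STUDENTS = 16
SENSORS_PER_STUDENT = 20


def inserts_per_minute(samples_per_second):
    """Rows per minute if every sensor reports at the given rate."""
    return STUDENTS * SENSORS_PER_STUDENT * samples_per_second * 60


# Hypothetical sampling rates, for illustration only.
for rate in (2, 4, 6, 8):
    print(f"{rate} samples/second -> {inserts_per_minute(rate):,} inserts/minute")
```

Under these assumptions the 100,000 mark is passed once each sensor reports about six times per second.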

IntelIOTGateway

Obtain the Optane™ (part 12)

A Typical Class

On an average day we hold three two-hour classes, each with 16 students, most of whom have autism spectrum disabilities. That means they often have expressive language disabilities (cannot speak), behavioral issues and may have medical challenges like seizures, tachycardia (heart rate suddenly triples) or overheating. Before class starts the teacher places a tub file for each student on each table and checks that necessary clothing and objects are available. Things start when a bus or van arrives and we get a head count of students and their aides from the driver. We use biometrics to check everyone in – we currently use multiple fingertip readers to keep the bottleneck to a minimum.

USBBBoth

The SAITO software sends emails to designated parents, schools or other third parties indicating the student did (or did not) arrive.

If we are expecting a guest viewer or teacher there will have been a poster of him or her in view on the way to the practice area. Students have added the habit of touching a portrait of Grandmaster Chen Zhenglei. The significance remains elusive.

PhilCeciliaCZLPoster

Four days a week students dress informally – that usually means a school t-shirt and traditional black pants. Once a week or so students dress in semi-formal black cotton uniforms (leftmost of the images below) for film that will be sent to outside reviewers. If we have a guest, or there is a dress rehearsal or an exhibition, then everyone dresses in full formal silks (center and rightmost of the images below).

ThreeOutfits

Obtain the Optane™ (part 11)

11. Return to the Performance Monitor window. When ready to begin logging ReadyBoost activity, just click the green Play icon.

Perfmon07

12. After the test interval, save the logged data to a file.

Perfmon08

13. Click the Stop icon.

14. Select Performance Monitor in the navigation pane.

15. Click the View Log Data icon.

16. When the Performance Monitor Properties dialog box appears, click the Add button.

17. Locate and select your log file, as shown previously.

Perfmon09

In our case, for a typical slice of time ReadyBoost was largely useless, as we expected. The configuration was a server that was intended to run SAITO, and the ReadyBoost drive had Windows, our anti-malware software and the SAITO executable. We prefer to load Windows as infrequently as possible, which means the anti-virus software gets loaded infrequently as well. Similarly, SAITO tends to be kept running, so there would be little use in optimizing program loads.

Our database was on a hard disk.

That was going to change.