Thursday, June 11, 2020

Extensible Database Tester in Python, with Permutations

Here is some Python code that allows you to generate all combinations of options by choosing one item from an arbitrary number of different lists.  Not only that, but each option is actually a data structure that also includes lots of metadata about the ramifications of the option, in addition to the option value itself.

Let's say you're looking to run performance testing on different types of tables and underlying data arrangements in Amazon Redshift Spectrum.  You can devise a template SQL query that is configurable in multiple ways.  For instance, you could choose between multiple tables or views from which to query data (say, if comparing Redshift to Redshift Spectrum performance), and you may also want to compare the performance of count(1) vs. count(*) to see whether one scans more data or runs faster on Redshift vs. Spectrum.  Thus, you already have two lists of options, and need to come up with the four combinations of these options.

The Groundwork

Basically, to lay some ground rules, each option contains both the value to be placed into the query, as well as a tag that can be appended to a string in order to designate the specific query that was run when looking back at query performance later.  A single list of choices, with expandability for more lists, looks like this:

config.query_options = {
    "table_choice": [
        {"value": "my_redshift_table", "tag": "rst"},
        {"value": "my_spectrum_table", "tag": "spt"}
    ]
}

One can continue adding more lists in that same parent, the configuration option dictionary, as a sibling of table_choice.  It's interesting to have a tag field for each option because in Redshift and Spectrum, there are a series of tables that may exist under the pg_catalog schema.  One of these tables is called stl_query, which actually lists queries run on the Redshift cluster and how long they took to run.  (This assumes you have such logging enabled for the cluster.)  As such, if you want to refer back to this data later, you can concatenate all the tags from the selected options and SELECT it as a column in order to refer to the specific query later.  This could be implemented by means of:

SELECT 'tag-cat1a-cat2b' as tag, other_columns_I_actually_care_about FROM table ETC...

And this could be a mechanism of indicating that option A was picked for category 1, and option B was picked for category 2.  Now, you could search the stl_query field for metrics on the original query, but then you would have to synthesize the whole query once again to search for how it ran given the specific set of options.  But with the tag, now you can easily search for the performance of a query with a given set of options:

SELECT * from pg_catalog.stl_query WHERE querytxt ilike '%tag-cat1a-cat2b%';

While we're at it, let's have a look at what the body of configuration templates looks like.  These templates are what all the options above get stuffed into.

config.queries = {
    "aggregation-query": {
        "params": ["count_type", "table_choice"],
        "query": (lambda params : f"""
            select count({params['count_type']['value']})
            from {params['table_choice']['value']} tbl
            where tbl.evt_yr = '2020'
            and tbl.evt_mo = '6'
            and tbl.evt_day = '11';""")
    }
}

Above, we have defined a single SQL query for which both configuration lists, count_type and table_choice, are relevant.  (As you will see later, we can pick and choose which lists we want, in case we don't want all combinations.)  The query itself is a triple-quoted f-string whose placeholders are filled in from the selected options.  The use of a lambda function allows us to pass in any combination relevant to the query, no matter how many parameters it has (i.e. it doesn't matter whether options were chosen from all the lists or not).  This allows for code reuse by not forcing a fixed number of parameters, as you would have if you passed parameters directly into a string formatter such as %s or .format().  We can even pass in options with "extras", i.e. more than just value and tag, in case some options have additional ramifications.
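As a quick illustration, here is a minimal, standalone sketch of how such a lambda template gets rendered; the option values below are hypothetical stand-ins for entries chosen from the lists above:

```python
# A standalone sketch of the lambda-template idea; the option values
# here are hypothetical, not from a real Redshift cluster.
query_template = lambda params: f"""
    select count({params['count_type']['value']})
    from {params['table_choice']['value']} tbl
    where tbl.evt_yr = '2020';"""

params = {
    "count_type": {"value": "1", "tag": "c1"},
    "table_choice": {"value": "my_redshift_table", "tag": "rst"},
}
print(query_template(params))
```

No matter how many keys params carries, the same lambda signature works; options the template doesn't reference are simply ignored.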

Methods to the Madness of Concocting Combinations

List comprehensions alone can produce all the combinations.  However, that approach is inflexible: the number of lists must be fixed, it does not expand easily into other data structures like dictionaries, and even where it does work, it quickly grows complex and hard to read.

We can rely on the itertools library to help us with making the permutations:

    params = [[{desired_key: option} for option in config.query_options[desired_key]] for desired_key in desired_keys]
    param_combos = list(itertools.product(*params))

The first line creates, for each desired key, a list of dictionary objects that each look like the following (using the example configuration above): {"table_choice": {"value": "my_redshift_table", "tag": "rst"}}

The desired_keys list is something belonging to each configuration template, and designates exactly what parameter lists are important to it.  That is to say, if you were to have multiple groups of choices in the structure at the top of this post, but one group doesn't matter to a particular template (e.g. query), then just omit it from this list.  Anyway, the dictionary object created by the above code snippet is nested in an array of similar dictionary objects for each entry in the table_choice list, and then multiple such arrays exist in a parent array for each subsequent list of configurations, if specified.

If you imagine this boiled down into just literals, you would have [[a, b, c], [1, 2]].  The characters a, b, c, 1, and 2 represent five different dictionaries.  They are grouped into lists based on which list of options they come from.  Each of these dictionaries looks like this:

{"desired_key": {"tag": "cat1a", "value": "its_value"}}

The dictionaries a, b, and c are derived from the first group of options (e.g. the table_choice), thus the desired_key field would actually say table_choice for these items.  The dictionaries 1 and 2 are derived from the second list of options (e.g. the count_type), thus the desired_key field would actually say count_type for these items.

The second line is where the magic happens in creating the iterations.  The product function takes the items inside each sub-list and pairs them with the items inside the other sub-lists to form all possible combinations of one item from each list (strictly speaking, this is a Cartesian product rather than a set of permutations, since order within a combination doesn't matter).  This function produces a list of tuples, where each tuple contains one item from each list.  Again, with just literals (from the array in the previous paragraph), you would get [(a, 1), (a, 2), (b, 1), (b, 2), (c, 1), (c, 2)].  However, our objects are a lot more complex than that, so you can imagine substituting the dictionary structure in the previous paragraph for each letter and number in the above list of tuples.  It gets pretty lengthy to write out!
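The literal example above can be run directly; this small sketch just demonstrates itertools.product on plain values:

```python
import itertools

# Two option lists with three and two entries produce 3 * 2 = 6 combinations.
params = [['a', 'b', 'c'], [1, 2]]
param_combos = list(itertools.product(*params))
print(param_combos)
# [('a', 1), ('a', 2), ('b', 1), ('b', 2), ('c', 1), ('c', 2)]
```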

Now that we have made all the combinations, we have to consider how to use them to fill in template placeholders in our long string values.  First, let's consider how to make our tag concatenation, i.e. the tag-cat1a-cat2b from above.  To do this, note that each combination made by product is a tuple in a list.  Consider each tuple in the list as the variable param_set, and then use the join function over a list comprehension as such:

suffix = '-'.join([i[list(i)[0]]['tag'] for i in param_set])

Because each dictionary item in the param_set tuple contains one key, and just one key, but it is unknown (it could be either table_choice or count_type in this example), we can abstract that away by transforming the dictionary into a list and asking for its first element.  This is what list(i)[0] does.  Once we have that, we can fetch the tag item inside each dictionary i in the tuple, i of course representing the one selected option from each list making up the particular combination param_set.  Then, the join function puts them all together in one long string.
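To make the suffix trick concrete, here is a small sketch with a single hypothetical param_set (the count_type tag "c1" is made up for illustration):

```python
# One combination produced by itertools.product, as a tuple of
# single-key dictionaries; the tags here are illustrative.
param_set = (
    {"table_choice": {"value": "my_redshift_table", "tag": "rst"}},
    {"count_type": {"value": "1", "tag": "c1"}},
)

# list(i)[0] extracts each dictionary's single (unknown) key.
suffix = '-'.join([i[list(i)[0]]['tag'] for i in param_set])
print(suffix)  # rst-c1
```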

The next step is to actually fill in the configuration template with the desired values.  Unfortunately, it won't be easy to do this with arbitrary keys cooped up inside tuples, so let's extract the dictionary object from each tuple and merge them together.  We can do this by creating an empty dict, and then iterating over each tuple to update the new dict with the dicts inside all of the tuples.

params = {}
for i in param_set:
    params.update(i)
initial_query = config.queries[key]['query'](params)

The final line calls a lambda function that exists inside a dictionary object listing each parameter template.  For convenience, here is the configuration template again:

config.queries = {
    "aggregation-query": {
        "params": ["count_type", "table_choice"],
        "query": (lambda params : f"""
            select count({params['count_type']['value']})
            from {params['table_choice']['value']} tbl
            where tbl.evt_yr = '2020'
            and tbl.evt_mo = '6'
            and tbl.evt_day = '11';""")
    }
}

In this case, the key is aggregation-query, the query is the lambda function, and we pass in params as an argument to that function, where params contains the selected option from each option list for this given combination.

The Whole Enchilada

This is what the entire code looks like.  If you're looking for an example or sample in this tutorial, look no further.

def make_params(desired_keys):
    params = [[{desired_key: option} for option in config.query_options[desired_key]] for desired_key in desired_keys]
    param_combos = list(itertools.product(*params))
    return param_combos

for key in config.queries.keys():
    print("Testing key " + key)
    if config.queries[key]['params'] == []:
        param_combos = [()]
    else:
        param_combos = make_params(config.queries[key]['params'])

    for param_set in param_combos:
        suffix = '-'.join([i[list(i)[0]]['tag'] for i in param_set])
        name = key + "-" + suffix if suffix != "" else key
        params = {}
        for i in param_set:
            params.update(i)
        initial_query = config.queries[key]['query'](params)

        identity_string = f"select '{name}' as name, "
        query = initial_query.replace("select ", identity_string, 1)

Accommodating Templates that Need No Options

"What if the parameter template doesn't have any parameters," you ask?  First off, to illustrate what that would look like, the configuration template for a query with no desired options would look like this:

    "no-params-needed": {
        "params": [],
        "query": (lambda params : f"""select 1;""")

In this case, before the for loop runs that finds each param_set within your param_combos, you need to set param_combos to a list containing a single empty tuple, so the loop sees exactly one param_set that happens to be empty.  That empty param_set constructs an empty suffix, and passes an empty dictionary into a lambda function that ignores its argument entirely.
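A sketch of that special case, assuming the same suffix and merge logic as the main loop:

```python
# With an empty params list, a single empty tuple represents the one
# (empty) combination, so the loop body still runs exactly once.
param_combos = [()]
for param_set in param_combos:
    suffix = '-'.join([i[list(i)[0]]['tag'] for i in param_set])
    params = {}
    for i in param_set:
        params.update(i)
    query = (lambda params: """select 1;""")(params)
print(repr(suffix), params, query)
```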

What about additional parameters in your options?

Let's say that some of your Redshift Spectrum tables are based on Spark dataframes that were partitioned in various ways.  Some tables have daily partitions, whereas others are monthly.  As such, the daily tables have event_day as a partition column, whereas the monthly tables don't.  To select on day in a monthly table, you will need to parse a timestamp column instead.  Here is what that looks like:

Daily partitioned data: WHERE event_day = '10'
Monthly partitioned data: WHERE DATE_PART('day', tbl.event_timestamp) = 10

To account for the different WHERE clauses you must use in your queries, note these aren't individual options themselves, but they are in fact tied to the selection of a particular option.  As such, you need to specify these extra parameters with the option for which table you are going to select from, knowing whether it's backed by daily or monthly data.

You can define dictionaries for each partition type, as such:

extras_daily = {
    "event_day": "event_day = '10'"
}

extras_monthly = {
    "event_day": "DATE_PART('day', tbl.event_timestamp) = 10"
}

Note that if you're trying to select on a range of dates, you can add another key to both of these dictionaries to write the query to look at a range over event_day or event_timestamp, depending on what is required by the partition type.

In any event, you incorporate this into your options dictionaries as such:

config.query_options = {
    "table_choice": [
       {"value": "daily_table", "tag": "dy_tbl", "extras": extras_daily},
       {"value": "monthly_table", "tag": "mo_tbl", "extras": extras_monthly}
    ]
}

And into your configuration template and lambda function as such:

config.queries = {
    "aggregation-query": {
        "params": ["count_type", "table_choice"],
        "query": (lambda params : f"""
            select count({params['count_type']['value']})
            from {params['table_choice']['value']} tbl
            where tbl.evt_yr = '2020'
            and tbl.evt_mo = '6'
            and {params['table_choice']['extras']['event_day']};""")
    }
}
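Putting the extras together, here is an end-to-end sketch of rendering the template for the daily table option (the values are illustrative):

```python
# Illustrative rendering of the template for the daily-partitioned option.
extras_daily = {"event_day": "event_day = '10'"}
params = {
    "count_type": {"value": "*", "tag": "cs"},
    "table_choice": {"value": "daily_table", "tag": "dy_tbl",
                     "extras": extras_daily},
}

render = lambda p: f"""
    select count({p['count_type']['value']})
    from {p['table_choice']['value']} tbl
    where tbl.evt_yr = '2020'
    and {p['table_choice']['extras']['event_day']};"""

query = render(params)
print(query)
```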

And voila!  Now you have a highly flexible, customizable way to create combinations of different options from different lists into a configuration template.

Thursday, October 3, 2019

Touring with Turo - The Pros and Cons

Driving to more remote yet scenic places as a side trip to a big city can reveal places and ways of life unknown to folks who don’t get out of the downtown bubble.  Not everything a city dweller enjoys is in the city, so part of the experience of a culture is to get to know where folks go to take a break.  For instance, when visiting Washington, DC, why not hit Virginia Beach too?  Maybe next time you’re in Boston, think of taking a couple hours to go to Cape Cod.  Maybe if you’re in New York City, take a visit to Long Island (though better plan that one in advance).  But if you find yourself in Detroit, it might be better to go to Windsor just across the Canadian border and stay put. :-P  As such, for once, I will regale you with experiences not from software I'm writing, but that I'm using as a consumer.

Background - What brought me here?

I was in quite a conundrum during a recent trip, visiting a city where several famous Ivy League schools were all about to start at once, and the vast majority of the students would be moving in.  I arrived in the city late Tuesday, had a conference Wednesday, and then had Thursday and Friday free.  One of these days was to be allocated for taking a tour of somewhere I could easily get to by car.  As such, I had to pick if I wanted Thursday or Friday to use the car rental.

The benefits I saw to picking Thursday included:
  • Possibly higher availability of rental cars — that is, closer to the weekend could mean higher demand
  • No rush to get back to the airport for the flight home Friday night

However, Friday saw the following benefits:
  • I could leave my luggage in the car.  Now, since my luggage simply consisted of a backpack, it wasn't too crucial, other than wearing it all day would get me all hot and sweaty, so it would be nice to lock it in the car while walking around.
  • Turn the car in at the airport right before my flight, without having to make a special trip to the rental car place which is probably at the airport anyway
Then there’s the whole modern-day dilemma of whether to rent from a traditional rental car company or to go with a car-sharing service like Turo.  I thought I’d try something different because a lot of the standard rental car places seemed to be in areas not exactly in the direction I wanted to travel, and the ones with the least taxes would force me to have to drive through the city (and all its traffic) in order to go the desired direction.  Plus, why not try something completely novel?

By the time Thursday came around, I'd pretty much decided to rent the car that day rather than wait until Friday.  I checked Turo and availability was already vanishing rapidly.  I saw something that was available later in the afternoon, which gave me some time to visit a museum I had prioritized before picking up the car.  Well, on my way to get the car, the cab driver (yes, a real live cabbie was just chilling out in front!) started chatting and saying the part of town where I was going to pick up the Turo rental isn't somewhere I'd want to be waiting on a ride for very long.  Uh-oh…

Fine Points of Comparison

Having gone through the afternoon and night with the Turo rental, here are things to consider if you want to give Turo a try:
  • Good for knowing exactly what vehicle you will get.  Most rental car companies only give you a vague idea as to the size of the vehicle, such as “mid-sized” (which most folks would consider compact), or even “compact” (which might as well be the Tata Nano, the $2,500 car produced in India).  Of course, they might specify a vehicle that fits into their categories, but those specific vehicles never seem to be available by the time you get there.
  • Flexible scheduling.  Book the car to be available exactly when you need it for how long you need it.  Want to return it at 11:00 PM or later?  Maybe the owner will facilitate that for you.
  • Book a car in cities or areas you’re familiar with, otherwise you might get freaked out if you’re returning late at night and it’s in a neighborhood that people claim is sketchy.  Of course, this is perhaps what I get for picking the cheapest car by a long shot in the city that day, so your mileage may vary depending on your price sensitivity.
  • The prices didn’t really seem that much better than a rental car company.  However, I was feeling this way after mostly having seen Porsches made available for $150 or $200/day, and I don’t really know how much a regular rental car company would ask for a Porsche.  That’s expensive even relative to what I normally get from a typical rental car company, but even for the cheaper vehicles, it seemed like Expedia was showing conventional deals for maybe $10-$30 more per day than what Turo was offering.
On the other hand, it’s worth considering these points about a rental car company before you steer all the way into the Turo camp:
  • Better guarantees about the quality of vehicle you’re getting.  It’s not up to one individual to maintain the car or take care of check engine lights; if a traditional rental has a problem, the company should back it with a guarantee or replacement from their ample-sized lot.  Better yet, such problems should be unlikely in the first place if the staff rigorously inspects and repairs vehicles in a timely fashion in between rentals at company service bays.
  • Often, rental car companies have nice grounds.  Even if they might be in a sketchy area outside, I’ve never felt threatened or unsafe on their property because they are well-lit, often gated, and with ample room to park your vehicle as you return it (which may be problematic if you’re looking to return the vehicle late at night in a densely-populated area).
  • However, you would need to make sure the facility is open 24 hours if you’re planning to return it late.  Otherwise you might be charged an extra day or have to go out of your way to drop it off properly before you can go on.

The Bottom Line

Given my own personal preference, a rental car company would be the way I’d go if renting a car in an unfamiliar city and not planning to return it until late.

Thursday, September 5, 2019

Unit Test Database Mocking in Golang is Killing Me!

For some reason, writing a particular set of unit tests in Golang proved to be an extreme hassle.  In this case, I was trying to add a column to a table and then write a unit test to verify the new functionality.  No matter how much I looked around for where to correctly specify how to build this table with the new column, the Cassandra database kept complaining that the column didn't exist.  Imagine how frustrating it is to specify a database schema that seems to be ignored by your unit tests.


Our system consists of a Cassandra database plus a Go backend server.  The Go unit tests require Cassandra to be running locally in a Docker container, but do not seem to actually utilize the existing databases, opting to make their own that is out of reach of anything I can see from TablePlus.  The Go code itself utilizes these objects through some level of indirection via a manager, a harness, and a couple of modules dealing in the actual queries.

The Fix

In the test harness, I was running several commands following the format

column1 boolean,
column2 int

Well, it turns out that despite my best efforts to define how the table schema was supposed to be set up, the test engine was not actually running the setup every time.  In fact, it was relying on some old cached copy of the tables living somewhere.  Upon changing the command to drop and recreate the table,

DROP TABLE table;
CREATE TABLE table ( … );

I was able to see the tests perform correctly on my local machine.  Oddly, in looking for errors upon creation, it would still complain of timeouts on occasion, but overall the functionality appeared correct, and the unit test was at least passing.  Interestingly, the QA environment, from which we automatically run these tests via a CircleCI pipeline, did not need the whole DROP TABLE fix; it worked properly upon simply specifying the correct column configuration in the test harness.

One caveat was that I had to split out the DROP and CREATE statements into separate calls to Session.Query().  When trying to separate them with a semicolon, Go complained it could not execute the statement.

Ultimately, this provided hours of frustration for me in trying to figure out why the unit test would not work locally.  I was afraid the same situation would apply to QA, but fortunately it didn’t.

Thursday, August 1, 2019

John Osborne, Retro Pinball Designer, in Lodi

Back in May, I got to hang out in California for a couple weeks to attend a couple conferences: Google I/O and IoT World.  I did a couple panels at IoT World, and one of them is already summarized on this previous blog post; the other one is coming later.  However, as a bonus during my time in California, I also got to attend the Golden State Pinball Festival in Lodi.  Here, they had John Osborne, a pinball designer who worked at Gottlieb from 1972-1984.  The following are expansions of notes I took during his Ask Me Anything presentation.

A Bit of Background

John Osborne started his professional journey by studying electromechanical (EM) engineering at Fresno State.  This, of course, elicited cheers from the local crowd.  After college, he started working at Gottlieb in 1972 at the age of 21.  Besides Gottlieb, he was also interested in working at Chicago Coin, which he said was sketchy, because the hiring manager's last name was always changing; as such, he never applied.  He was also interested in working at Williams, but rather than talking to corporate recruiting, he was told to talk to a distributor.  That also seemed weird, so he didn't bother.  There were some stories about being flown from Fresno to Chicago to meet the team and do interviews and such, but those are better told in person!  Anyway, the first thing John did outside of design was to work on what was meant to be the definitive manual describing all you would ever need to know about EM games.  However, it never got published, as it was still unfinished when solid-state machines came out.  It doesn't seem like John kept any drafts or notes of this manual, sadly.

The Process of Design

Concepts would originate from both customers & engineers, but names & themes would usually not come from Gottlieb. The game Blue Note was something John came up with all by himself, but themes like poker & pool would always sell.

The first stage of design at Gottlieb involved the hand sample, where you assemble the game yourself.  This includes drilling holes on the playfield and other important spots, and running wire by hand.  I can testify to this being a large hassle from having done Wylie 1-Flip, which still had far fewer components and thus less wire to run.  Nevertheless, after the hand sample comes the engineering sample: this involves drafting a formal layout (with schematics & cables) to make tooling, and using nail board to run cables in a more organized layout.  At this point, you basically have the real game, except for screen printing and artwork.  Lots of games would then be played on this machine where many metrics and percentages would be calculated, including score, how much of the game's objectives were completed, how far in any sequences you got, etc. Wayne Neyens, the head of engineering at Gottlieb, wanted people to test games who weren't too skilled at pinball; the average player was ideal for simulating what would actually happen in the field.  However, people testing the games would get yelled at for sitting while playing.

Given 3-ball vs. 5-ball play modes (why anyone would set an EM to 3-ball play is beyond me), the replay scores should be comparable given the amount of play, so logic might raise the necessary scores when moving into 5-ball play.  However, I'm not sure any EM schematic I've seen actually employs this logic.  In any event, to calculate the scoring for replays, the Gottlieb testers would employ a tally sheet that lists all possible scores for the game, rounded to the nearest 1,000.  By tallying your rounded-off score, it effectively makes a histogram of scores achieved on the game.  The designers would then set the recommended first replay value to be the median of the tallied scores.  The second replay would be set to 14,000 points above that, then the third replay is another 8,000 points above the second replay value.  Before the first arrow was placed (onto the median value), 50 games needed to be played.  This tends to result in 30% replays in games in the field.  These tallies did not include specials, which were rare (2-3% of games).
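Boiled down to arithmetic, the replay-setting procedure reads like this sketch, with made-up tallied scores standing in for the 50 required test games:

```python
import statistics

# Hypothetical tally of rounded-off scores from test plays.
tallied_scores = [34000, 41000, 45000, 52000, 58000]

first_replay = statistics.median(tallied_scores)  # the median tallied score
second_replay = first_replay + 14000              # 14,000 above the first
third_replay = second_replay + 8000               # another 8,000 above that
print(first_replay, second_replay, third_replay)  # 45000 59000 67000
```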

Two game samples would be played with real money to make sure every last mechanic would work in the game prior to real production.  Portale or Lanielle, being the two better distributors of Gottlieb games, would typically get the sample games, thus the engineering samples might have wound up in the wild and into someone's private collection nowadays.

The most interesting thing to me that John Osborne said was that good games like Spirit of 76 or Card Whiz, which became popular, would keep the shop busy and the engineering team at leisure with little to do. On the other hand, whenever the design team was cranking out dogs, it kept engineering busy trying to satisfy unhappy distributors & a bored machine shop waiting for the next big hit to yield many orders.

The Oddball Add-a-Ball Games, and Italian games in general

New York and Wisconsin were big add-a-ball markets due to laws and the stigma against gambling.  Even "shoot again" features didn't satiate these laws.  Some add-a-ball games have an extra ball penalty upon tilt. Add-a-balls generally award 2.5 balls per game.

Italian add-a-ball games offer no replays at all.  The Italian legislature took a unique approach to protecting the currency: for pinball machines, it required manufacturers not to step up the ball count unit, because buying 5 balls and getting 6 would devalue the Italian lira.  Another oddity about the Italian games is the "Light box advance unit" (LBAU), featuring a "card" rather than an apron, which says "Buttons" rather than "Flipper buttons".  These games (such as Team One, exported to Italy as Kicker) increment a "Wow" feature that lights up lights and then takes each Wow off when you lose the ball, rather than adjusting the ball count unit.  Yet even the LBAU didn't work in some Italian cities; they wanted a novelty feature, and this entailed setting the "Wow" feature to score a ton of points and only reset about half the sequence.

Transformers in Italian EM games run at 230 volts and 50 Hertz, yet feature 6 primary taps (including 170V, 190V, and 210V for people who live far from the power distribution center). Incidentally, solenoids & flippers run hotter at 50Hz.

John might be the only representative of Gottlieb at this point :-P

This is a neat device that was fashioned for a hockey game.  As opposed to the action of foosball, this mechanism would allow the player figure on the field to spin more naturally and control a puck by rotating left or right.

A bunch of memorabilia, including a rare Q*bert drink coaster

Some EM Tips & Tricks

As John is one of the few EM designers still around, attendees were anxious to hear about some maintenance tips and tricks that have been lost to time.

All 1, 2, and 4-player light boxes are the same, except for Centigrade 37.  As such, as Gottlieb designers built an EM game, all that was necessary was for it to be compatible with the standard light boxes.

White lube goes onto any mechanical parts relating to discs.  Black lube should be applied to gears. Use just a dot of light oil between plastic & metal parts, like the metal/plastic interaction in a score reel or even a shooter rod.  The step spindle on a decagon unit would get some white lube. Parts catalogs from the 70s would mention recommended procedures.

The V relay was a neat innovation, since this relay subtracts if you press the replay button only, not if you're trying to coin in another player into your game. The price of 1 game for 25 cents, 3 for 50 cents was a cool mechanical innovation.

One interesting glitch that was expensive to operators was a Chicago Coin video game where you'd pull a shooter to start the video game. You could cause the lights on the game to flicker by messing with this shooter and/or other buttons, and the electromechanical noise through the lines would actually add credits to the game.  Gottlieb Totem had some kind of a weird trick to add either 68 or 86 credits when inserting quarters and performing some sort of interaction that might be described elsewhere on the Internet.

Developing for Solid State Machines

The development machine used to write all the game firmware was the Rockwell PPS/4.  If I recall, the language of choice was Fortran, and all the engineers on staff learned how to program, even if their background was originally electromechanical engineering with relays and solenoids.  (It's easy to think of solenoids as Boolean logic anyway; with solid state, now they have access to larger data structures and traditional math.)  However, John's account of life at Gottlieb was that after their sale to Columbia Pictures in 1977, it always felt like the company was on its last legs and about to close.  As such, there weren't too many engineers at Gottlieb that had to learn Fortran!

One device they used to test game code was called a "Romulator." It spoofed a PROM, allowing its user to enter machine code and plug it into your game.  However, its battery life was terrible.  Once you had the game code the way you wanted it, you had to run as fast as you could to the one PROM burner in the building, which was 3 offices away.  And if someone stopped you in the hall to chat... well, there goes your game!

Relating to Haunted House

According to IPDB, Haunted House was the last game John designed at Gottlieb.  As tends to be par for the course (for me anyway) when asking Gottlieb engineers about their games, he was disappointed with the outcome, complaining that the "design committee" had really taken his game and made it unrecognizable.  (On the other hand, John Trudeau lamented about the build quality of the Gold Wings game.)

If John had his way to modify the game, he would hide the trap door, and show the ball action as it happens down by the flippers rather than hiding it.  The game program got way too complicated; he'd rather see a simpler rule set, but advised me that folks tend to value items when retained in their original state rather than being "hacked" or modified in some fashion.  I had mentioned two things to him: one was to use Hall effect sensors to track the ball and only activate the correct set of flippers with one flipper button (rather than having to remember to press an alternate set of flipper buttons when it reaches a particular level), and also to add multiball to the game (and apparently this hack already exists).

Thursday, May 30, 2019

My IoT World 2019 Panels: Recap

I was graciously invited to give two panel discussions at the IoT World conference that happened last week in Santa Clara, CA.  Since the panels were not recorded, here are my thoughts and jots from before and during the Wednesday 5/15/2019 panel, entitled Wrangling IoT Data for Machine Learning.  (Actually, I'm going into even more detail than I had time for at the panel.)  Although the conference organizers approached me about speaking on behalf of my former employer on topics that, honestly, I had been given just a few weeks to investigate and could only report back with failures even now, I managed to convince them that I was fluent in other things that were more generic -- unrelated to the job I knew I was about to quit.

(Note: My thoughts and jots for the Thursday 5/16 panel are coming later.)

Business Calculations

The first question we were tasked with answering in this panel related to the business calculations that must be made before taking on a project in Machine Learning; also, how one might calculate return on investment, and what use cases make sense or not.

Hello [Company], Tell Us About Yourself

Before deciding whether to build, buy, or partner (the three ways in which one takes on any technical project), analyzing your staff's competencies needs to be top of mind.  If you don't already have staff competent in data science, IoT, or the skills you need to finish the project, then in order to be good at hiring, you need to ensure your corporate culture, rewards, mission, vision, virtues, and especially the task at hand are going to appeal to potential recruits.  You could have devoted employees who care about the outcome, want to see it through, and work together to build a well-architected solution with good continuity.  With the solution's architecture well-understood by the team as they build it, their "institutional memory" allows them to add features quickly, or at least know where they would best fit.  Or, you could hire folks who only stay on a short-term basis, with different developers spending lots of time wrapping their heads around the code and then refactoring it to fit the way they think, which takes time away from actually writing any useful new business logic.  The end result may be brittle and not well-suited for reuse.  Certainly it is healthy to add people with differing viewpoints to the team, but a small team's membership should not change completely, or else it will kill the project's momentum.  (Trust me, I've lived this.)

If you're not ready to augment your staff or address these hiring concerns, it's OK.  An IoT project is complex to develop because at this time, there is no easy "in-a-box" solution; many components still need to be integrated, such as sensor chips, boards, firmware, communication, maybe a gateway, a data analytics and aggregation engine, and the cloud.  In fact, there are plenty of valuable and trustworthy solutions providers you can choose from, and you can meet a lot of them on the IoT World vendor floor.  By buying a product that complements your company's skill set, you can deliver a more well-rounded product.  And a good service provider will have a variety of partners of their own: with a robust knowledge of the landscape, you will be more likely to find something that truly suits your needs.  Now, if you are starting off with zero expertise in IoT or machine learning, there are vendors who will sell you complete turn-key solutions, but they are not likely to be cheap, because each domain involved with IoT requires distinct expertise, and currently the integration of these domains is fraught with tedium (though there are groups looking to abstract away some of the distinctions and make this easier).

Finally, if you are clever, you may find a way in which your solution or some part of it may in fact be a value add to a solutions provider, thus giving you even more intimate access to their own intellectual property, revenue streams, or ecosystem of partners.  In this case, you are truly becoming a partner, establishing your position on the channel ecosystem, and not just being another client.

It's All About the Benjamins

Particular to the data, there is a cost involved to aggregate, store, and analyze it.  Where is it being put -- the cloud right away?  Physical storage on a gateway?  If so, what kind of storage are you buying, and what is the data retention policy for it?  If the devices are doing a common task, how do you aggregate it for analysis, especially if you are trying to train a machine learning model without the cloud?  And if you are using the cloud, what is your upload schedule if you are choosing to batch upload the data?  It had better not be at peak times, or at least not impact the system trying to run analysis too.

One big piece of food for thought is: does your data retention policy conflict with training your machine learning algorithm?  This is important from a business perspective because your data may not be around long enough, for various reasons, to make a useful model.  Or, on the flip side, your model may be learning from so much information that it might pick up contradictory signals from changing underlying conditions, such as a bull market turning into a bear market.  (However, this case can be rectified in several ways, such as feeding in additional uncorrelated attributes for each example, or picking a different model better suited to accounting for time series data.)
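As a toy illustration of the retention-versus-training tension just described, a quick sanity check might look like the following.  This is only a sketch; the function name, the safety margin, and all the numbers are hypothetical.

```python
# Hypothetical sanity check: does the data retention policy leave enough
# history to assemble the training window the model needs?
def retention_conflicts(retention_days, training_window_days, safety_margin_days=7):
    """Return True if data would expire before the model can use all of it."""
    return retention_days < training_window_days + safety_margin_days

# A 30-day retention policy cannot feed a model that needs 90 days of history.
print(retention_conflicts(retention_days=30, training_window_days=90))   # True
print(retention_conflicts(retention_days=365, training_window_days=90))  # False
```

The same check could run the other way, flagging retention windows so long that the model trains on stale, contradictory market conditions.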

A final monetary consideration, and far from the least, is to examine your existing infrastructure.  Are sensors even deployed where you need them?  There could be a substantial cost of going into secure or dangerous areas.  For instance, in the oil & gas industry, there are specially designated safety zones called Class I, Division 1, where anything that could cause a spark would blow up a facility, causing major damage and loss of life.  Personnel and equipment must be thoroughly vetted so as to avoid potentially deadly situations.  Or, better yet, is there a way to monitor the infrastructure remotely or from afar, thus avoiding requiring access to such sensitive areas?  Using remote video or sound monitoring may remedy the need for intrusive monitoring, but the remote system put in place needs to be at least as reliable as the risk you assume by going into such sensitive areas in the first place.

Figuring the Return On Investment

Briefly, I want to touch on some points to keep in mind when considering the ROI on an IoT project.  Hopefully these will mostly already be obvious to you.  They break down into three categories: tangible impacts, intangible impacts, and monetization.  We should not fail to consider a project just because we cannot figure out how to quantitatively measure its impact.

First, the tangible impacts: a successful IoT project (particularly in an industrial realm) will reduce downtime by employing predictive maintenance analysis or warning before issues get out of hand.  This increases productivity, reduces RMAs/defects in products, and could reduce job site accidents as well.  In this case, it is a lot easier to measure operational efficiency.

The things that may be harder to account for include the safety mindset that might be brought about by a well-implemented IoT tool that users find helpful or essential to doing their job, rather than obtrusive or threatening their job by telling on them when they mess up.  One baseline could be comparing safety accidents year over year, but this number cannot be taken at face value; it must be compared to other productivity numbers, and even then it might never account for other side effects of having a better safety mindset, such as improved job satisfaction, which could lead to a better home life for users of the IoT tool.

Finally, one unexpected way the product could pay off could be monetization.  By making part of it generic and selling it as a service, you might build a user base who themselves are freed up to focus on their skill sets.  Maybe you have built up a data warehouse that others might find useful, made some digital twin models of items others use, or are performing some kind of transformation on recorded data in order to derive insight.  In any event, this gives your product legs; in case the main premise of it fails or does not pay off, then at least some of the work is still valuable.

Where AI Makes Sense

I have gotten into discussions about this with people who think AI and machine learning are the answer to everything.  To me, machine learning is more than just filling out a business rule table, such as "at 6:30 I always turn down the thermostat to 73, so always make sure it's 73 by then".  In short, machine learning is most fun and applicable when the target state of a problem changes.  For instance, you're a bank trying to decide whether or not to give someone credit, but the underlying credit market changes over the course of a few years, thus affecting the risk of taking on new business.  Problems like these get the best bang for the buck out of machine learning models because the model can be updated constantly on new data.  One way to find out when to trigger model retraining (if you're using a supervised approach, such as decision trees or neural networks) is to use an unsupervised approach such as K-means clustering: look for larger groups of outliers becoming inputs to your model, and then check whether your original model is still performing well or has failed to generalize to changes in the underlying conditions.
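The K-means-based trigger described above can be sketched in a few lines.  In a real system the centroids would come from a K-means fit on historical inputs; here the centroids, thresholds, and sample points are all invented for illustration.

```python
import math

# Hypothetical centroids, e.g. from a prior K-means fit on historical inputs.
CENTROIDS = [(0.0, 0.0), (10.0, 10.0)]
OUTLIER_DISTANCE = 3.0   # assumed threshold, e.g. derived from training-set spread
OUTLIER_FRACTION = 0.2   # retrain if more than 20% of recent inputs are outliers

def nearest_centroid_distance(point):
    """Distance from a point to its closest known cluster center."""
    return min(math.dist(point, c) for c in CENTROIDS)

def should_retrain(recent_inputs):
    """Flag retraining when too many recent inputs sit far from every cluster."""
    outliers = sum(1 for p in recent_inputs
                   if nearest_centroid_distance(p) > OUTLIER_DISTANCE)
    return outliers / len(recent_inputs) > OUTLIER_FRACTION

# Mostly familiar points: no retraining needed.
print(should_retrain([(0.5, 0.2), (9.8, 10.1), (0.1, -0.3)]))    # False
# A group of novel points far from both centroids: time to retrain.
print(should_retrain([(50.0, 50.0), (48.0, 51.0), (0.2, 0.1)]))  # True
```

Once the flag fires, the supervised model would be retrained (and perhaps given new output classes) on a window of data that includes the novel inputs.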

Other types of interesting problems for AI & ML are those involving image or audio data, for which researchers have tried for decades using classical mathematical approaches but for which basic neural networks showed dramatic improvements in accuracy over the classical approaches.  Neural networks are simply better at learning which features really matter to the outcome.  They will build up the appropriate filter, whether it represents some intrinsic property of a sound wave or some portion of a picture.

The most creative uses of AI and ML will enable previously impossible interactions.  Think about something crazy like teaching a speech recognition engine on custom words for your specific application and embedding it into some tiny device, or possibly using a smartphone camera to take pictures of something to learn its size.

Run Machine Learning Where Again? - Cloud, Edge, Gateway

The apps I build for clients usually revolve around these three characteristics:

  • Clients are typically highly price sensitive
  • Latency is a non-issue
  • Sensors send data every ~5 minutes unless conditions deteriorate
With this in mind, I am looking to reduce the bill of materials cost as much as possible, so I make the edge as dumb as it can get.  The analytics go into the cloud.  And even if you're a believer in data being processed on the edge, you're probably not going to get away without the cloud somewhere in your project anyway.  A robust cloud provider will offer solutions for not just data aggregation/analysis, but also:
  • Device firmware updates over-the-air
  • Data visualization tools
  • Digital twins
Plus, advanced machine learning training only takes place in the cloud, on large clusters of GPUs or TPUs, due to the scale of data and number of epochs required to train a useful image or NLP model to a reasonable degree of accuracy.  Thus, you might as well put data into the cloud anyway (if not streaming, then in batch) unless you plan to:
  • Generate test data manually or using other means
  • Run a GPU cluster along with your edge to do retraining
However, with advances in transfer learning, and with cheaper hardware coming out like the Intel Movidius, NVIDIA Jetson, and Google Coral, edge training will become more of a reality.

Friction-Free Cloud

As I am most familiar with Google's product offerings, I'll note that Firebase allows for running models locally with no cloud connection.  The cloud can serve an old model until training is finished.  If you wish to run your models on the edge, you will need to get clever about exactly when to deploy the new model: either in "blue/green" fashion (a flash cut to the new model all at once) or using "canary" deployments (where a small percentage of inputs are classified with the new model for starters).
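The canary rollout described above boils down to a small routing decision in front of the two models.  This is a minimal sketch, not any particular framework's API; the model stubs and the 5% default are made up.

```python
import random

# Stand-ins for the currently deployed model and its freshly trained replacement.
def old_model(x):
    return "old"

def new_model(x):
    return "new"

def canary_predict(x, canary_fraction=0.05, rng=random):
    """Route roughly canary_fraction of inputs to the new model; the rest
    stay on the old model until the canary proves itself."""
    model = new_model if rng.random() < canary_fraction else old_model
    return model(x)

# canary_fraction=1.0 is effectively the "blue/green" flash cut;
# canary_fraction=0.0 keeps all traffic on the old model.
print(canary_predict(42, canary_fraction=1.0))  # new
print(canary_predict(42, canary_fraction=0.0))  # old
```

In practice you would also log which model served each prediction, so the canary's accuracy can be compared against the incumbent before promoting it.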

Furthermore, given that we are unlikely to get rid of the cloud in IoT projects anytime soon, a big opportunity is to make tools whose user experience is the same from the cloud to the edge device in order to improve continuity and reduce frustration.

Picking an AI/ML Platform

The third question in the panel related to picking a machine learning service provider.  My general thoughts on this revolve around considering the providers who have built products useful to specific industries.  On the vendor floor, there were small companies with solutions catering to manufacturing, supply chain, chemicals, utilities, transportation, oil & gas, and more.  Larger companies have consulting arms to build projects for multiple different industries.  In either case, whoever you choose can hopefully bring domain-specific knowledge about your industry to solve your machine learning problem, and can save time by already having common digital twins in a repository or common KPIs for your assets or employees.  The hope here is that with a vendor targeting a specific industry, they will have already accumulated domain knowledge so they won't need so much "getting up to speed" about the general problem your company faces, but can jump right into solving higher-order creative problems.

However, if these services are built on top of a cloud provider that decides to crawl into the space of the specialized provider you choose to work with, it could make that provider obsolete.  For instance, if Google decides to get into a particular business where there are already players, it will offer a similar service but for free.  As such, pick a service provider positioned for growth, with staying power due to a niche or protected IP.  Or, actually pick multiple technologies or providers of different sizes to protect against one going extinct.  For instance, maybe different types of wireless radios might be useful in your application.  But imagine if you'd put all your eggs in the WiMAX basket in the early 2010s; you wouldn't have much of a solution now.  As such, it is helpful to find tools and technologies that are at least interoperable with partners, even if the use case is specific.

Other Considerations In Passing

Besides what was addressed above in the panel, there were some remarks prepared in case we had additional time (but it seems we ran out).

Tune In To the Frequency of Retraining

Models over time will likely need to adapt to changing inputs.  A good machine learning model should be able to generalize to novel input -- that is, make correct predictions on data that hasn't been seen before.  However, there are a few indicators that it might be time to retrain or enhance the model.
  • More misses or false positives.  In data science parlance, a confusion matrix is the breakdown of how many items of a given class were labeled as each class.  The diagonal of this matrix holds the correct answers (i.e. class 1 -> 1, 2 -> 2, and so on).  Thus, if numbers outside the diagonal start getting high, it is a bad sign for the model's accuracy.
  • Changing underlying conditions.  As described earlier, one could imagine this as a bull market turning into a bear market.
However, there could be multiple paths to monitor the need for retraining, or even mitigate it.
  • Consider a push/pull relationship between supervised and unsupervised models, as described above.  If outliers are becoming more common in unsupervised models, consider making sure your supervised models are cognizant of these examples by running more training.  Perhaps new classes of objects need to be introduced into your supervised models.
  • Maybe the wrong model is at play.  There could be a fundamental problem where, for example, a linear regression is in play where a logistic regression should really be used.
  • Perhaps the business KPIs actually need to be re-evaluated.  Are the outcomes produced by the data in the right ballpark for what we even want to know about, or are we coming up with the wrong business metric altogether?
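The confusion-matrix indicator above is simple to monitor: track the fraction of examples that land off the diagonal and watch for it creeping upward.  The class counts below are invented for illustration.

```python
# Toy confusion matrix: rows are true classes, columns are predicted classes.
confusion = [
    [50,  2,  1],   # true class 0
    [ 4, 45,  3],   # true class 1
    [ 2,  6, 40],   # true class 2
]

def off_diagonal_rate(matrix):
    """Fraction of examples falling outside the diagonal, i.e. misclassified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

rate = off_diagonal_rate(confusion)
print(round(rate, 3))  # 0.118 -- if this creeps up over time, consider retraining
```

Comparing this rate across evaluation windows (say, week over week) is one cheap way to notice the "changing underlying conditions" case without any extra tooling.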

In the quest for real-time analysis of your model, it should be analyzed whether or not such a task is attainable, or even required.  Factors that could drive whether to do it could include:
  • Is it mission-critical?
  • How many objects need to be analyzed in real-time?  Too many objects will increase demand on the processor.
  • Is analysis cheap enough at the edge to conduct with modern silicon?
I’ve usually advocated against using deep learning when there are simpler mathematical models requiring less compute, even if they take more feature engineering up front.  However, it’s probably not long until we have silicon so cheap that we can run and even train such advanced models with relative ease.  And the good news is that the more powerful the analysis engine (i.e. operating on 2D video rather than 1D sensor data), the more analyses we can draw from the same data, requiring fewer hardware updates and instead relying on simpler updates to firmware and software.

One particular question to the panel involved how humans educate machines.  Currently, we rely on annotations on data to make it obvious where we should be drawing from.  This can be something as simple as putting a piece of data into an appropriate database column.  However, unstructured data like server logs is becoming ever more important for deriving insights.

But maybe on the flip side of this is: when do machines begin to educate each other, and educate humans as well?  The most obvious play on this regards decision support.  If humans can become educated by an AI tool in an unobtrusive way to, say, be safer on the job, then this is one way we can make an impact on ourselves through machines.  Another good way is to gain insight into decisions being made for regulatory purposes.  As certain institutions are audited to ensure no discrimination or advantages are being given to certain parties, a machine learning model needs to be auditable and able to educate its human interpreters about its behavior.  And education doesn't have to be hard; even kids in middle school are fully capable of playing with Google's machine learning tools to build amazing products, unbounded by years of skepticism formed by bad engineering experiences.

However, the more dubious problem is when machines train other machines.  While this could be a good thing in some applications, like hardening security, right now you can see generative adversarial networks (GANs) being used to create deep fakes.  Now it is possible to spoof someone's voice, or even generate fake videos of events that never happened, all to push an agenda or confuse people in trial courts.


Obviously, this is a lot more than can be said in 40 minutes, and frankly more than I even intended to write.  However, it is a complex field right now and all good food for thought, and hopefully by airing out some of these thoughts, it will help simplify and demystify the use of AI in IoT so we can converge on a more universal standard and set of best practices.