Thursday, June 22, 2017

My Tensorflow Project Isn't Saving the World

Among all the hype around the latest and greatest technologies, so much publicity is devoted to how they are being used in grand schemes to cure cancer, reduce energy waste, conserve water, solve poverty, and so forth.  While all these things are wonderful for humanity, there has to be someone left in the background who helps all the do-gooders unwind when it's time to take a break!

The TL/DR Version: Get To the Point!

Use clever arguments when loading up your Docker container so you don't have to shut it down and restart it when you want to mount external directories from the host filesystem or expose the port for the Tensorboard server.  There is also nvidia-docker available if you want to use your CUDA cores.

sudo nvidia-docker run -it -p 6006:6006 -v ~/Pictures/video-game-training/:/video-game-training bash

Use the --output_user_root option in your Bazel builds so you can save it to that external directory on the host you provided earlier.  This way, when you have to shut down your Docker instance, your Bazel build will still be there (though you will have to recreate some symlinks in the Bazel project directory).

bazel --output_user_root=/video-game-training/bazel-build build tensorflow/examples/image_retraining:retrain

Don't forget to store your image category directories within a "training image root" directory at the same level as the bazel-build directory, or else the training step might treat Bazel's own build output as training data.

Also, if you export the trained model to somewhere outside /tmp and then iterate on it, don't forget to pass the location of the correct model to the classification step.  Otherwise, you might classify with the wrong model, which could lead to confusion and frustration.

Use my fork of the Imker repo (maybe someday I'll make a pull request to put it in the mainstream code) if you want to download only a portion of the images in a particular category from any Wiki site such as Wikimedia Commons.  This could be built upon so you can segregate training and test data.

Just Use the Devel Docker Image; CUDA Optional

Setting aside my original plans for TensorFlow, it struck me one night to build a classifier that could recognize different game cartridges for the Nintendo Entertainment System (NES).  I had a lot of pre-work to embark on because it had been a long time since my system had been updated with the latest supporting packages.  However, it all ended up being for naught; I found the "virtualenv" approach for installing TensorFlow to be so fraught with tedium that I ended up going for the simple Docker approach.  This is the TensorFlow installation approach I've been recommending since November, and it still seems to be worth sticking to.

I have a pretty old nVidia graphics card (a GeForce 650 Ti) in my (mostly even older) desktop running Linux (and Windows at times, mostly during tax season).  It still supports nVidia Compute Capability 3.0 which is just barely enough to run the capabilities I need to perform machine learning, play with the Blockchain, and so forth.  To make Tensorflow performant inside Docker, a special add-on called nvidia-docker allows access to your CUDA cores from inside your Docker container, so I can still get blazing fast performance from my own hardware without needing to install everything in my primary environment (which is evidently too jacked up to support the Tensorflow installation).  Docker is great for providing a uniform, trouble-free experience when running apps anyway because it provides an isolated environment not subject to your system's specific configuration.  However, the version of Docker originally on my system was so old that the required libraries for nvidia-docker were not present; luckily, the upgrade path was simple thanks to their clear instructions.

In fact, thanks in part to my earlier pre-work and the many good Internet guides on this topic, getting TensorFlow working on my desktop in this manner went smoothly, apart from some early trial and error and, of course, the usual long wait times for compilations to finish.  As I've often said, just use Docker.

Once you have Docker and nvidia-docker installed, here is the best way to run the Tensorflow image.  Note that if you don't have the image already, Docker will automatically download it:

sudo nvidia-docker run -it -p 6006:6006 -v ~/Pictures/video-game-training/:/video-game-training bash

Let's break this down:

  • sudo: there's a way to avoid running Docker with sudo, but doing so gives up any semblance of auditability or traceability for when users go beyond their expected behaviors and start to get mischievous.
  • nvidia-docker is the binary that supports Docker instances accessing CUDA cores.
  • run tells Docker to launch the specified image in its own isolated environment, with its own filesystem and process tree.
  • -it (or -i -t) does two things.  First, -i runs the container in interactive mode, leaving stdin (standard input) open even if nothing is attached.  Second, -t allocates a pseudo-TTY so the user can actually send input to the container.
  • -p 6006:6006 exposes the Tensorboard port inside the container to the host.  When you start the server, you can access it through localhost:6006 on a browser on your host machine.  Tensorboard is a great way to visualize what is going on inside your training algorithm from the model construction and details standpoint, plus illustrate simple representations of how the data exists in the classification space (as simple as you can make it in as few dimensions as we humans can easily perceive).
  • The -v option allows you to specify or mount a directory (not an entire filesystem; there's a different way to do that) from your native filesystem to include into your Docker container as it runs.  In this case, I wanted to expose the video-game-training directory from my user account's Pictures folder onto my Docker instance as /video-game-training so that the algorithm would have access to all my training data.
  • Next comes the Docker image name (the TensorFlow devel image).
  • bash is the command to run on the Docker image once it starts.  You can run any executable you want, but it is easiest to run a terminal instance.

First Crack At Building a Classifier: Aligning Pictures And Commands

For object classifiers, good training data comes from as many images as you can get of the subject material.  To support this, I took videos of various NES game cartridges while moving the camera around so as to film it from various angles.  Depending on the lighting, the sun or lights would also reflect back into the camera and cause slight imperfections in the label.  I labored for quite a while in the hot Texas sun taking videos of these games with different backgrounds behind the cartridges so that the classifier would learn how to focus on what is important.

Once my environment was all set up and ready to go, I ran this Tensorflow example pretty much verbatim.  It took approximately 24 minutes to run the first step which sets up the Bazel build to run the training task.  However, as my Docker instance did not have any training data loaded into it, I had to exit out of it in order to add the file mount as described above.  Unfortunately, upon logging back into my Docker container, all this pre-work had been wiped out as a result of it all being built in some temporary .cache directory under the root home.  And, to add insult to injury, running that Bazel setup command the second time took more than twice as long -- clocking in at just short of 50 minutes!

Lesson Learned

One easy way to avoid losing your entire Bazel build when Docker refreshes the file system from scratch is to point Bazel's --output_user_root option at the external directory from the host that you mounted inside Docker.  Note that this is a startup option, so it goes before the build command rather than after it.  In my case, this meant specifying the following setting for my build:

bazel --output_user_root=/video-game-training/bazel-build build tensorflow/examples/image_retraining:retrain

Continuing With Trying To Break Bazel And My Docker Instance

Now, this meant I had to put my training examples one level deeper in this directory, or else the next step would possibly try to train on whatever output is in the Bazel build directory itself.  After running the Bazel build, I exited my Docker instance to see what would happen.  When I reopened it, I found that the symlinks in the /tensorflow folder had been changed to point to /root/.cache/bazel, which did not exist (and never existed because I made the build in another folder).  It took just a bit of manual tedium to point the symlinks back to the right place, but upon doing so, the bazel-bin "retrain" command specified in the Google example to actually perform training worked without a hitch.  With everything in place, this command took less than 15 minutes to perform 4,000 training steps utilizing my approximately 800 pictures each of the MegaMan and MegaMan 2 cartridges.  The exact syntax looks like this:

bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir /video-game-training/pictures

The output of this step produces two files in the /tmp/ directory: output_graph.pb and output_labels.txt (also /tmp/retrain_logs/ is important if you want to look at your TensorBoard at any point).  I moved these files into a model/ directory inside the directory exposed to Docker from my host system.
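Since /tmp inside the container goes away along with everything else, that move can be sketched as a small shell function.  This is only an illustration: the name save_model and its two arguments are hypothetical and not part of the TensorFlow example; the file names are the ones the retrain step wrote to /tmp above.

```shell
# Hypothetical helper (not part of the TensorFlow example): copy the retrain
# outputs from a source directory (normally /tmp) into a directory that lives
# on the volume mounted from the host.
save_model() {
  src_dir=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  # Both files are needed later by the classification step.
  for f in output_graph.pb output_labels.txt; do
    if [ ! -f "$src_dir/$f" ]; then
      echo "missing $src_dir/$f" >&2
      return 1
    fi
    cp "$src_dir/$f" "$dest_dir/" || return 1
  done
  # The TensorBoard logs are optional; copy them only if they exist.
  if [ -d "$src_dir/retrain_logs" ]; then
    cp -r "$src_dir/retrain_logs" "$dest_dir/" || return 1
  fi
}
```

With the paths used in this post, the call would be: save_model /tmp /video-game-training/model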

As for classification, I utilized the same strategy, using the --output_user_root option on the bazel build "label_image" step (obviously ignoring the conjoined bazel-bin step for the time being, thus stopping short of image classification).  This Bazel build took about 20 minutes:

bazel --output_user_root=/video-game-training/bazel-build build tensorflow/examples/label_image:label_image

Once this step was complete, I exited and re-entered Docker once again, and my symlinks had been similarly screwed up.  Upon restoring them (like last time), I found a picture of the MegaMan 2 cartridge from out on the Internet, and ran it through the classifier in this manner:

bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/video-game-training/model/output_graph.pb \
--labels=/video-game-training/model/output_labels.txt \
--output_layer=final_result \
--image=/video-game-training/megaman2-ex-01.jpg \
--input_layer=Mul

And voila, a reproducible classification each time, without having to leave my Docker instance open, simply by reconstructing those symlinks!  (That part could easily be scripted in a batch file, in fact.)

Note: Without that last line in the classification command, you will probably stumble into an error saying "Running model failed: Not found: FeedInputs: unable to find feed output input".   As it turns out, Google's example command is a little bit deficient, but fortunately some forum posts succinctly clarified the issue and offered the solution.
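As for restoring the symlinks each time the container is re-entered, here is a minimal sketch of how that could be scripted.  It assumes the only damage is that the bazel-* symlinks in the project directory point under the wrong prefix while the rest of each path is unchanged.  The function name repoint_bazel_links is hypothetical, and the prefixes are passed in rather than hard-coded because the exact directory layout may differ between Bazel versions.

```shell
# Hypothetical helper: re-point every bazel-* symlink in a project directory
# from one path prefix to another, leaving the rest of each target intact.
repoint_bazel_links() {
  project_dir=$1
  old_prefix=$2
  new_prefix=$3
  for link in "$project_dir"/bazel-*; do
    # Skip anything that is not a symlink (including an unmatched glob).
    [ -L "$link" ] || continue
    target=$(readlink "$link")
    case "$target" in
      "$old_prefix"*)
        # Keep the tail of the old path and swap only the prefix.
        rest=${target#"$old_prefix"}
        ln -sfn "$new_prefix$rest" "$link"
        ;;
    esac
  done
}
```

A call might look like: repoint_bazel_links /tensorflow /root/.cache/bazel /video-game-training/bazel-build.  Check the actual targets with ls -l first, though; the old prefix may need to include a further path component (such as a per-user directory) for the swapped paths to line up.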

Because the Whole World Isn't Video Game Artwork

My training data consisted of only pictures of the label up-close, and mostly ignored the rest of the cartridge.  However, my first classification picture was in fact of the entire cartridge.  I was astounded at the results, because even considering this difference, the algorithm was 96% certain that my picture of the MegaMan 2 cartridge was in fact MegaMan 2; the 4% remainder was its (very weak) confidence that it was the original MegaMan cartridge.  Now, having spent most of my professional career up until now as a tester, I immediately wanted to see how it would perform on junk input.  I fed it an old picture of one of my pinball machines (Gold Wings, of no relation whatsoever to MegaMan), but the algorithm was 86% confident that what I just showed it was in fact MegaMan, and only 14% confident that it was MegaMan 2.  This was amusing to me, because I suppose in the algorithm's limited worldview of only having been trained on examples of MegaMan or MegaMan 2, it was in no position to say with any authority that anything was in fact neither!
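The pinball result makes sense given how the final layer reports confidence.  Assuming the retrained final_result layer is a softmax over the K trained classes (my understanding of the retraining example, not something verified here), and writing z_i for the raw score of class i (notation introduced just for this illustration):

```latex
p_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad \sum_{i=1}^{K} p_i = 1
```

With K = 2, the two confidences must add up to 100% no matter what the input looks like, which is exactly what the pinball machine got: 86% + 14% = 100%.  There is simply no slot in the output for "neither".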

Wikimedia Commons appealed to me as a good location to get quality public-domain photos to use as "negative" training examples (though I suppose I could have used private images with rights held by the authors, and since their data is buried deep within a machine learning model, you would never be the wiser!).  The only downside is their site offers only 200 photos at a time for a given category, and it would be a huge waste to sit there, expand each one, and manually click Save.  Fortunately, Wikimedia Commons supports API calls that will allow you to download all the media for a given category.  Better yet, there is already a Java program called Imker that offers a CLI and GUI wrapper around the API calls.

The only problem with Imker is their current UI only offers you the ability to download every single file within a given category, not to break it up into just a fraction of randomly-selected images.  Nevertheless, Imker is open source, so I forked the Git repo and began hacking away at the Java code so that I could download just 10,000 of the 272,812 images currently in the "PD-user" category on Wikimedia Commons.  After sorting out a lingering issue, and waiting a few hours (thanks in large part to my crude rate limiter), I have 10,000 images from A-Z, not to mention A-Z in other languages, consisting of roughly 75% JPEGs, 18% PNGs, 5% SVGs, 1% GIFs (even animated), and some TIFFs thrown in for good measure.  Not only that, but the images consist of things like maps, diagrams of all sorts of things in many different languages, road signs, cars, street scenes, landmarks, molecular diagrams, and all sorts of other random stuff only a small percentage of the population could possibly care about. :-P

The beautiful part about using the pre-trained, robust Inception model is that you don't have to worry about scaling your input data to a particular size.  I was able to use these images just exactly as they came, and I only had trouble with two images that apparently contained bad data and failed to download properly (had Imker not stopped due to some exceptions regarding unhealthy API responses, this might have been avoided).  Apparently, it even dealt with all these file formats adeptly too.

Important Note: What stumped me for a while was that my model still showed only "megaman1" and "megaman2" after I had trained "not-games".  The cause was that I was passing an old copy of the model to the classification command.  Make sure you set the correct path to your model!
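One cheap guard against this mistake is to check the labels file that sits next to the model before classifying.  The sketch below assumes output_labels.txt holds one label per line; the function name labels_include is hypothetical.  Look inside your own labels file first, since the labels may not be spelled exactly like the directory names.

```shell
# Hypothetical helper: succeed only if every expected label appears as a
# whole line in the given labels file.
labels_include() {
  labels_file=$1
  shift
  for label in "$@"; do
    # -x matches the whole line, -F treats the label as a literal string.
    if ! grep -qxF -- "$label" "$labels_file"; then
      echo "label '$label' not found in $labels_file" >&2
      return 1
    fi
  done
}
```

Running labels_include /video-game-training/model/output_labels.txt followed by the labels you expect, right before the label_image command, would flag a stale model copy before it causes any confusion.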

In any event, the Tensorflow model retrained to distinguish between Mega Man 1, Mega Man 2, and "Not a game" performed successfully in my two trials thus far.

             | Trained on MM1 or MM2        | Trained on MM1, MM2, or Not a game
Confidence   | Mega Man 2 | Pinball machine | Mega Man 2 | Pinball machine
Mega Man 1   | 3.9%       | 86%             | 4.0%       | 9.3%
Mega Man 2   | 96.1%      | 14%             | 56.8%      | 1.6%
Not a Game   | N/A        | N/A             | 39.2%      | 89.1%

Thursday, June 1, 2017

Pre-Google I/O Entertainment: Old Electronics Stores and Computer Resellers!

The opportunity Google gave me to attend Google I/O, their annual conference, two weeks ago required me to travel to the Bay Area in California in order to attend in person.  Also known as Silicon Valley, it is an area steeped in computer history, featuring (of course) the Computer History Museum, not to mention large offices or global headquarters for many current and long-gone tech behemoths, plus all the tiny startups making millions off various Internet and mobile technologies.  As someone who has been using computers their entire life (well over 25 years now), I am enthusiastic about the way forward but do not want to forget about the winding, bumpy way that has gotten us to this point.

As I seek to bolster my collection of retro-tech, it is fascinating to ponder what all these devices would have cost brand new.  There's no way my family could have afforded more than one or two of these things back in the day, but as technology marches on and leaves so much of itself in the dust, follow along with me as I walk through some of the few remaining stores and shops dedicated to the Hardware Era of Silicon Valley.

Definitely not where Google I/O was.

However, pretty much right across the street from this Yahoo! building was the first stop on my tour after picking up my Google I/O badge: WeirdStuff Warehouse.  Never having been in such a computer surplus/resale store before, I was filled with just about as much wonderment as I was upon walking into my first neighborhood computer store back in 2000 (let's just say it was much better than most neighborhood computer stores, and certainly a different experience from the big box retailers).  Upon walking in, you are greeted with about four aisles of tested working stuff of all kinds, including computers & parts, video equipment, and other assorted electronics.  There are several counters and associates waiting to offer help in this area.  That might not sound like much, but wait; it gets more interesting.

Behind this "Open to the public" sign (actually right where I was standing when I took this picture), at the far corner of the first room from the entrance, is a whole plethora of aisles in their "As-is" section devoted to old software, I/O cards of all kinds, computer peripherals, cables, test equipment, server racks, typewriters, old telephony equipment, hard drives, floppy drives, CD/DVD drives, tape drives of all types, and even the obscure media that goes with these tape drives.

Some of the aisles in the "As-is" section of WeirdStuff Warehouse.

It is difficult to convey through pictures just how much there is to look at here because from the camera's perspective, it all disappears into the vanishing point so quickly, and so many of the bins are very small.  But after about three or four hours perusing WeirdStuff trying to pick up as many SCSI components as I could muster, I had one of their associates search for some interesting stuff out of the back (namely more SCSI drives).  As it turns out, most folks say that they don't necessarily have everything out on display or listed on their website; generally the stuff listed on their website isn't out in the aisles available to be browsed in person.  Also, one of the guys from the Vintage Computer Forums says he's got a standing order with WeirdStuff where they'll let him know if they get anything on his wish list.  What a neat service that could be, but I'd hate (love?) to see how much stuff he's ended up with over the years!  Anyway, once I was through, I hailed another Lyft ride that whisked me on to Anchor Electronics.

Because when I think Anchor, or Electronics, I think "wire-frame dirigible..." ?!?

Anchor is in a small building right across the street from the southeast corner of the NVIDIA offices.  Walking into Anchor for the first time, it really felt like more of a typical electronic component store (i.e. more like a well-stocked Radio Shack) than WeirdStuff.  Everything in Anchor was some sort of component or tool neatly organized on their shelves.  I didn't really have a lot of time to browse around, having only about 25 minutes there before they closed, but I also wasn't really in need of components either.  They do happen to have various protoboards for everything from ISA to Arduino shields, and a small smattering of Atari 8-bit parts, of interest to vintage computer folks, but I'm really more interested in Atari ST (Sixteen/Thirty-two) systems; they don't carry those types of parts.  I did manage to get into a discussion with another fellow in the store and their main technical support guy Orville by helping brainstorm solutions for some short-distance presence detection type of application, and Orville was interested that his store was one of my primary sights to see in the Valley.  However, it was closing time, and I needed to leave.

Scenes from inside Anchor Electronics, a relatively small but dense store, including the one telling me it's time to go!

Now I was a bit split.  Do I take another Lyft down into San Jose to Excess Electronics and contend with tons of Silicon Valley traffic, or do I just take what's convenient?  Ultimately, Excess will just have to wait until next time, as Orville ended up driving me to my next destination just a few minutes after close; this would be HSC Electronics.  Along the short 1.5-mile journey there, Orville pointed out all sorts of buildings along the street that house famous names now but held other large names 10, 20, or 30 years ago that have since gone extinct -- most notably, the Qualcomm building on Kifer Road real close to HSC that formerly housed 3Com.

HSC is a really big electronic component store that, as much as I love Tanner's in Dallas, makes Tanner's look like an itty-bitty Radio Shack in comparison to a great big Fry's store.  The fellows at HSC were also very cordial, and Orville was also buddies with them (and possibly performing a bit of reconnaissance on the competition... you never know!).  Once I told them about my interest in old computers such as the Amiga and Atari, they brought out some oddities for me to see: the KIM-1 (1976's version of a Raspberry Pi) and another old-time single-board computer whose name I can't remember now but which was based on the Motorola 6800 series processors, if I recall correctly.  And then they showed me a real claim to fame for their store:

Gee, some no-name hack from San Francisco bought an oscilloscope from them.  Who gives? :-P

No, seriously, look closely at that picture above.  If you're not impressed that somewhere, someone was keeping records at HSC for years and years and remembered that kid when he got famous, then I don't know what to tell you.  However, they said the same thing to me (more or less "remember us when you're famous"), though they don't have my name scribbled on a nice large "Name" field like they would have done if I walked into that store back in the '60s too.  Chances are they might have it on some way less interesting credit card log somewhere, but who's to know.

Nevertheless, I spent quite a while browsing this store too, firstly in sheer awe that the arrangement resembles Tanner's so much (but with aisles twice as high, and many more of them) and secondly trying to jog my memory for stuff I could possibly need.

The test bench section and the Self-Serve wire area.  They only ask a couple reasonable things of the test area: let them know ahead of time if you're testing anything with vacuum tubes, and don't leave hot leads lying around.

Aisles upon aisles of stuff, including one barely wider than my shoulders.  Also, mind you, I was homeless most of the day, having checked out of my hotel early that morning and not able to check into the rental house until that evening; as such, I was having to carry all my bags, toiletries, clothes, and my purchases with me at all times up and down the aisles.

Putting All This In Perspective

First off, without the assistance of Raymond, a local buddy of mine who runs and travels out to these stores relatively often (and who also contributed to this relevant thread on the Vintage Computer Forums), I probably would have ended up in some lame stores that wouldn't pay such homage to retro technology and would only be looking to resell last year's Cisco servers, or else some general vintage store that once upon a time had a computer section but now only mostly sells hipster clothes and occasionally gets someone's old laptop once in a great while.  Nevertheless, here is how all these things fit together physically:

It should be noted that Raymond highly recommended St. John's Bar & Grill as a good place to have a burger, especially if you need to ship some of your larger hauls via the FedEx in the same complex.

Also Intriguing To the Music Nerd

For those of you who happen to be band nerds too, it should be noted that the Santa Clara Vanguard, an extremely competitive and highly-ranked drum and bugle corps and an original member of DCI (Drum Corps International), has their headquarters little more than half a mile north of Anchor Electronics, just on the other side of the NVIDIA offices.  Had I ever been good enough to make that corps when I was in school, it would have been awfully intriguing for me to explore the tech scene whenever (if ever?) I had a break from practice, though in actuality I can't imagine the members really spending much time in that office compared to out on the rehearsal field or traveling anyway.