Studio711.com – Ben Martens


Disney+ Review

As the parent of a six-year-old and a lover of both Star Wars and the Avengers, I found signing up for Disney+ to be a no-brainer. I jumped on an early deal and pre-paid for 3 years of the service, so we’re on board whether it’s good or not. The only hiccup I’ve had so far came on launch day when the app was overloaded, but since then the reliability has been great.

As I sat on the couch last Friday watching Jungle Book with the family, I kept wondering what my 10-year-old self would have thought if he could have seen me watching Jungle Book on a 10-foot screen in my own house, with no tapes or discs in a player, all controlled from my phone.

I’ve seen some comments about various movies that aren’t on the service, but it’s a treat to scroll through the list of what IS on the service. You know how you scroll through Netflix or Hulu and you’ve never heard of most of it? Not so with Disney+. It’s hit after hit after hit. No more frustrating “Disney vault”. It’s all there at your fingertips.

Over the past few years, I’ve felt the Disney brand rising in my estimation. They’re becoming synonymous with a high-quality but sometimes pricey product. Thankfully Disney+ only gets the first part of that. The cost is $6.99/month with cheaper options if you pay ahead. That’s crazy low when you compare it to other services.

Disney+ gets two thumbs up from me!

Strata 2019 San Francisco

My company was nice enough to send me down to San Francisco last week to attend the Strata Data Conference. If there’s a bigger conference in my field of data engineering/science/analysis, I don’t know what it is.

I attended a big data conference four years ago, but going to Strata was a huge step up both in terms of the quality of the event planning and the quality of the talks. I came away with a stronger vision of what I want our team at work to accomplish and how we can have a bigger impact on our business group.

I skipped all the social events surrounding the conference, but I filled both days with every talk I could cram into my schedule. A couple were total duds, but there were a lot of great ones from Netflix, Lyft, Uber, Intuit and others.

Aside from the conference itself, it was strange to be traveling alone. I did spend one evening in a movie theater watching Captain Marvel, but otherwise I mostly hung out in my room. I felt guilty about temporarily forcing Tyla into single-parent mode and leaving my team at work short-handed, so I spent a lot of my free time working on the laptop and trying to make good use of the trip.

My hotel was right next to Moscone West where the conference was held, and that was fantastic. I was able to get from my room to a talk in about 5 minutes. That let me hustle back to the room even during our ~45-minute breaks to get away from the crowds and recharge a bit. It’s surprising how tiring it is to sit on your rear end and listen to talks all day. I felt like my brain was very full!

It was a great trip, and while it’s not something that I need to do every year, I hope I can go back in 3-4 years. Thank you Tyla for holding down the fort while I took this trip!

Patent Application

Azure Data Explorer has made a dramatic impact on my career. It has inspired a whole new breed of data engineering and it feels like a wide open playground for ideas and innovation. There were so many new ideas and patterns floating around in my head that I decided to attempt the patent process (through work) for one of them. I’ve never been through it before and it was interesting to see all the different levels of scrutiny and checks that go into it before you even sit down with a lawyer to start drafting the application.

I’m thrilled to announce that I’ve completed all of that work and my patent application has been submitted! Unfortunately… I’ve been advised not to share the details of it yet. After about 18 months, the US Patent Office will publish the application. At that point it will be public information on their site but it will still take another 2-3 years from that point for them to review it and either approve it or ask for some more information.

So I guess the point of this post is to say that I’m really excited about applying for my first patent. Even if it doesn’t get approved, it’s neat to see how the process works, and it has me thinking about whether other ideas are patentable too.

Analyzing Water Data in Azure Data Explorer

One of my favorite systems at work officially launched a couple weeks ago as Azure Data Explorer (internally called Kusto). I’ve been doing some blogging for their team on their Tech Community site. You can see all my posts on my profile page. This post will use Azure Data Explorer too but I thought it fit better on this blog.

A year or two ago, our local water company replaced all of the meters with digital, cellular meters. I immediately asked if that meant we’d get access to more data and they said it was coming in the future. The future is now! If you happen to live in Woodinville, you can get connected with these instructions.

The site is nice and lets you see charts, but by now you probably know that I love collecting data about random things, so I immediately tried to figure out how to download the raw data. The only download directly supported from their site is the bi-monthly usage from the bills, but from the charts, I could see that hourly data was available somewhere. A little spelunking in the Chrome dev tools revealed the right REST endpoint to call to get a big JSON array full of the water usage for every hour in the last ~11 months.
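
Here’s a rough sketch of that download step in Python. To be clear, the URL, parameters, and JSON field names below are placeholders for illustration; the real endpoint is whatever shows up in your own dev tools session.

import json
import requests

# Placeholder endpoint and parameters; the real ones come from watching
# the network tab while the usage chart loads.
ENDPOINT = "https://example-water-utility.com/api/usage"

resp = requests.get(
    ENDPOINT,
    params={"start": "2018-04-01", "end": "2019-03-01", "interval": "hour"},
    timeout=30,
)
resp.raise_for_status()
hourly = resp.json()  # a big JSON array, one element per hour

# Save it to a file so it can be ingested into Azure Data Explorer
with open("waterusage.json", "w") as f:
    json.dump(hourly, f)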

I pulled that into Azure Data Explorer and started querying to see what I could learn. This first chart shows the median water usage by three-hour chunks of the day. Tyla and I usually both shower in the morning, so it makes sense that 6-9am has the heaviest usage.

WaterUsage
| summarize 
    sum(Gallons)
    by Hour=bin(hourofday(Timestamp), 3), bin(Timestamp, 1d)
| summarize percentile(sum_Gallons, 50) by Hour
| render columnchart  with (title = 'Median Water Usage by 3 Hour Bin', legend = hidden)

I feel like there’s probably a better way to write the next query, but this works. It’s the cumulative usage throughout each month. The four lines at the top of the chart are the summer months when I’m using the irrigation in the yard. The lines that drop off at the end are the months that don’t have a full 31 days; I ran the x-axis from 1 to 31 for every month, so shorter months run out of data, but it still conveys the general idea. It’s interesting how similar all the non-watering months are.

union
(
    WaterUsage
    | summarize Gallons=sum(Gallons) by bin(Timestamp, 1d)
    | extend Month=monthofyear(Timestamp), Day = dayofmonth(Timestamp)
),
(
    // Original data had some missing rows
    datatable(Timestamp:datetime, Gallons:long, Month:long, Day:long)
    [
        datetime(2018-11-26T00:00:00.0000000Z), 0, 11, 26, 
        datetime(2018-11-27T00:00:00.0000000Z), 0, 11, 27, 
    ]
)
| order by Timestamp asc
| serialize MonthlyWater=row_cumsum(Gallons, Month != prev(Month))
| project Month, Day, MonthlyWater
| make-series sum(MonthlyWater) on Day from 1 to 32 step 1 by Month
| render linechart with  (ycolumns = sum_MonthlyWater, series = Day, Month, legend=hidden, title='Cumulative Gallons By Month')

The data is in 10-gallon increments so it’s not super precise, but it’s a LOT better than the two-month resolution I had previously. I’m excited to play around with this data and see if we can start decreasing our usage.

Along these same lines, I heard that the local power company is starting to install power meters with Zigbee connectivity so there’s a chance that I’ll be able to start getting more insight into my power consumption in a similar fashion…

Security Questions

Many sites still use “security” questions to help you recover your account. When you first create an account, they ask you things like “What was the name of your first pet?” and “What color was your first car?” Even if you’re doing well and using a long, random, unique password for that site, you probably just destroyed your security by answering those questions. I’m pretty sure I could answer most of those questions for some of my friends. This is a common route for hackers too, especially with all the information available on social media sites.

Pro-tip: you can lie. It’s ok. I already use LastPass to create and store random, unique, strong passwords for every single account, so I just generate more random characters for these security questions. LastPass has a notes field for every account you store, so I drop the questions and answers right into that field in case I ever need them to recover an account later.

Yes, I changed these after taking the screenshot.
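
If you don’t use a password manager with a generator built in, it’s easy to roll random answers yourself. Here’s a minimal Python sketch of the idea; LastPass’s generator is what I actually use.

import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def fake_answer(length=20):
    # Build a random "answer" with a cryptographically secure RNG
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

for question in ("What was the name of your first pet?",
                 "What color was your first car?"):
    print(question, "->", fake_answer())
# Store each question/answer pair in the account's notes field.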

Truck Stats

Last year, Tyla got me an OBDII data logger (Automatic) for my birthday and, of course, I ended up writing an app to download my trip data so I could analyze it. I still get those analysis reports twice per day and they continue to be interesting. For example, I don’t know why, but the last two weeks have had some of the worst traffic on my way home from work in the last year. Now that I have over a year of data, there’s enough to calculate some semi-interesting stats on my drives in our 2016 F150 3.5L Ecoboost:

    • The average trip to work takes me 26.3 minutes.
    • The average trip home takes me 33.9 minutes.
    • It feels like if I leave work a couple minutes early, I’ll avoid the worst of the traffic. Here’s my average commute time based on when I leave. (The x-axis is in 24-hour time, so 17 is 5pm.) The y-axis is my average commute home in minutes. It does look like if I leave about 10 minutes before 5, my commute is generally 5-10 minutes faster. (There’s a sketch of this calculation just after this list.)
    • My most fuel efficient trip was a 43.7mpg drive along the 3.5 mile route from my house to Home Depot. Not bad for a 5000 pound truck! (A lot of it is downhill and I like to see how little gas I can use on that route…)
      • Best fuel mileage for a trip over 10 miles: Church to Totem Lake AutoZone 28.0mpg
      • Best fuel mileage for a trip over 50 miles: Crystal Mountain to our house 24.7mpg
    • My worst gas mileage is going from work to the butcher. It’s a short trip, and when it’s really cold, my truck spends the whole time idling at stop lights and trying to warm up. I’ve gotten 3.5mpg on that route a couple times!
    • Of the days that I drive the truck, I spend an average of 69.3 minutes driving.
    • The most driving in one day was 366 minutes. That was May 25, 2018 when we drove down to Ocean Park for Memorial Day.
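
For anyone curious how the commute chart works, here’s a rough equivalent sketched in pandas instead of the Kusto I actually used. The CSV layout and column names are assumptions for illustration.

import pandas as pd

# Hypothetical export of the trip log: one row per trip with start/end
# timestamps and a label for the route.
trips = pd.read_csv("trips.csv", parse_dates=["started_at", "ended_at"])
home = trips[trips["route"] == "work_to_home"]

minutes = (home["ended_at"] - home["started_at"]).dt.total_seconds() / 60
depart_hour = home["started_at"].dt.hour  # 24-hour clock, so 17 = 5pm

# Average commute length bucketed by departure hour
print(minutes.groupby(depart_hour).mean().round(1))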

I love having all this data! I could do this all night, but I should probably cut it off here and go to bed. By the way, all of these charts and stats were created with the public preview of Azure Data Explorer. We’ve been using that product internally for a couple years, and it makes stuff like the stats above ridiculously fast and easy. If you’re at all involved in data engineering or data analysis, you need to get familiar with Azure Data Explorer!

8-Bit Ben

The site Fiverr.com (pronounced “five-er”) has been around for quite a while, but I just recently used it for the first time. The idea is that you can pay someone $5 (or something similarly cheap) to do a small digital task for you. You can peruse it yourself to see all the various offerings, but I wanted a pixel art picture of myself.

I’m quite happy with the 8-bit avatar that was drawn for me by user arveyyudi. He even allowed for a minor change to my hair color after he sent me the first copy. I’ve set this as my avatar on a few sites and even in our Outlook directory at work.

Standing Desk Monitor

We have nice standing desks at work. They have electric motors with memory settings so it’s quick and easy to switch between standing up or sitting down. I believe that it’s significantly healthier to stand up at least part of the day, but I find myself being lazy and sitting for most of the day. I also know that it’s relatively easy to motivate myself by measuring whatever I’m trying to improve. Time for a project!

To measure whether I’m standing or sitting, I decided to use a distance sensor that either sits on top of the desk and looks at the floor, or sits on the floor and looks up. I’m sure there are cheaper ways to do this, but I ordered a SparkFun BlackBoard, Distance Sensor Breakout, and a Qwiic cable to connect them. There was no soldering required; I plugged it all in and I was good to go. I laser-cut a wooden box to hold all the components.

I wrote a simple program for the Arduino-compatible BlackBoard that sends a measurement when it receives a keystroke, and then I wrote a program that runs on the computer to periodically request measurements (via USB) and upload them to a database in the cloud. I put a website on top of the data and voila!
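
For the curious, the host side of that conversation is simple. Here’s a minimal Python sketch using pyserial; the port name, baud rate, threshold, and the assumption that the firmware replies to any byte with a distance in millimeters are all illustrative.

import time
import serial  # pyserial

PORT = "COM3"        # whatever port the BlackBoard shows up on
STANDING_MM = 900    # guess at a sit/stand threshold; tune for your desk

with serial.Serial(PORT, 9600, timeout=2) as board:
    while True:
        board.write(b"m")  # any keystroke triggers a measurement
        line = board.readline().decode().strip()
        if line:
            distance_mm = int(line)
            state = "standing" if distance_mm > STANDING_MM else "sitting"
            print(distance_mm, "mm ->", state)
            # the real app also uploads (timestamp, distance) to the cloud
        time.sleep(60)  # poll once a minute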

A friend at work heard about the idea and wanted to compete with me so now we are both running these devices. You can track our progress at http://standupweb.azurewebsites.net/

Cloud Backup

As I mentioned about a year ago, CrashPlan is closing shop for home users and focusing on the small business market. My contract with them is up in a few months so I did some research to pick a new cloud backup provider. I had about 4TB stored on CrashPlan so a key feature for me was unlimited backup size. I settled on Backblaze. They’ve been around for quite a while and have a feature set that meets my needs and a price that doesn’t break the bank.

So now begins the arduous journey of uploading 4TB of data over my Comcast connection. Comcast limits me to 1TB per month with pretty heavy penalties for going over. I normally use 300-400GB/month, which leaves roughly 600-700GB of headroom, so it’s going to take six months or more to upload all of my data again.

Comcast provides a web page to view your usage, but I wanted something a little easier to monitor. My router keeps track of my usage and it’s roughly the same as what Comcast says so I wrote an app that grabs the usage numbers from my router every hour and stores them in a database. Now I can quickly check my usage, predict where I’m going to end up, etc. That gives me the info I need to turn my backup on and off to use up as much of that 1TB as possible without going over.
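
The prediction part is just a linear projection. Here’s a minimal Python sketch, assuming the hourly readings for the current month are already in a list; the router-scraping half is too device-specific to show generically.

from datetime import datetime
import calendar

def project_month_end(samples):
    """samples: list of (datetime, cumulative_gb) readings for this month."""
    now = datetime.now()
    days_in_month = calendar.monthrange(now.year, now.month)[1]
    used_gb = samples[-1][1] - samples[0][1]
    elapsed_s = (samples[-1][0] - samples[0][0]).total_seconds()
    # Scale usage so far up to a full month's worth of seconds
    return used_gb * (days_in_month * 86400) / elapsed_s

# e.g. if project_month_end(readings) creeps past 1000 GB,
# pause the Backblaze upload for the rest of the month.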

I’ve got about 1TB uploaded and I’m happy so far. Their software is ridiculously easy to use, and they have a phone app for accessing random files on the go. It’s a good final step in the 3-2-1 backup strategy: keep 3 copies of your data, on 2 different types of media, with 1 copy stored offsite.

Air Quality

The smoke that I wrote about last week has thankfully mostly cleared out. When it was really bad, I spent quite a bit of time checking various websites to see how bad the smoke was, and I was frustrated that I couldn’t get a quick answer easily from my phone. I was also interested in getting some long-term trend charts. So, being a geek, I wrote a program that pulls down the data from the closest air quality sensor (between Kenmore and Lynnwood) and stores it. I also made a simple web page that’s optimized for cell phone displays. Pin it to your home screen and you have a quick and easy way to check the air quality (assuming you live near me): http://localairquality.azurewebsites.net
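
The collector itself doesn’t need to be fancy. Here’s a minimal Python sketch; the URL and JSON field names are placeholders since the real feed is whatever the monitoring agency’s site exposes.

import sqlite3
import requests

# Hypothetical URL for the sensor between Kenmore and Lynnwood
SENSOR_URL = "https://example.com/api/sensors/kenmore-lynnwood/hourly"

readings = requests.get(SENSOR_URL, timeout=30).json()

with sqlite3.connect("airquality.db") as db:
    db.execute("CREATE TABLE IF NOT EXISTS aqi (ts TEXT PRIMARY KEY, value REAL)")
    db.executemany(
        "INSERT OR REPLACE INTO aqi VALUES (?, ?)",
        [(r["timestamp"], r["value"]) for r in readings],  # assumed field names
    )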

The reading is usually 2-3 hours behind the current time, but the levels don’t usually change too much in a couple hours.

I also discovered a fantastic blog that writes about the current smoke situation and the smoke forecast. Bookmark this one too: http://wasmoke.blogspot.com