Studio711.com – Ben Martens

Geek

Analyzing Water Data in Azure Data Explorer

One of my favorite systems at work officially launched a couple weeks ago as Azure Data Explorer (internally called Kusto). I’ve been doing some blogging for their team on their Tech Community site. You can see all my posts on my profile page. This post will use Azure Data Explorer too but I thought it fit better on this blog.

A year or two ago, our local water company replaced all of the meters with digital, cellular meters. I immediately asked if that meant we’d get access to more data and they said it was coming in the future. The future is now! If you happen to live in Woodinville, you can get connected with these instructions.

The site is nice and lets you see charts, but by now you probably know that I love collecting data about random things so I immediately tried to figure out how to download the raw data. The only download directly supported from their site is the bi-monthly usage from the bills, but from the charts, I could see that hourly data was available somewhere. A little spelunking in the Chrome dev tools revealed the right REST endpoint to call to get a big JSON array full of the water usage for every hour in the last ~11 months.

I pulled that into Azure Data Explorer and started querying to see what I could learn. This first chart shows the median water usage by three hour chunks of the day. Tyla and I usually both shower in the morning so it makes sense that 6-9am has the heaviest usage.

WaterUsage
// Total gallons per day within each 3-hour bin of the day
| summarize
    sum(Gallons)
    by Hour=bin(hourofday(Timestamp), 3), bin(Timestamp, 1d)
// Median of those daily totals for each bin
| summarize percentile(sum_Gallons, 50) by Hour
| render columnchart with (title = 'Median Water Usage by 3 Hour Bin', legend = hidden)
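For anyone without a cluster handy, here is a rough Python sketch of the same two-step aggregation: total the gallons per day within each 3-hour bin, then take the median of those daily totals. The function name and the sample rows are made up for illustration, and `statistics.median` averages the two middle values for an even count, so it can differ slightly from Kusto's `percentile()`.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_usage_by_bin(readings, bin_hours=3):
    """readings: iterable of (timestamp, gallons) hourly rows.

    Returns {bin_start_hour: median of the per-day totals for that bin}.
    """
    # Step 1: total gallons per (day, hour bin), like the first summarize.
    totals = defaultdict(int)
    for ts, gallons in readings:
        hour_bin = ts.hour - ts.hour % bin_hours
        totals[(ts.date(), hour_bin)] += gallons

    # Step 2: gather the daily totals for each bin and take the median.
    per_bin = defaultdict(list)
    for (_, hour_bin), total in totals.items():
        per_bin[hour_bin].append(total)
    return {b: median(vals) for b, vals in sorted(per_bin.items())}


# Made-up sample rows, not real meter data.
sample = [
    (datetime(2019, 1, 1, 6), 10), (datetime(2019, 1, 1, 7), 20),
    (datetime(2019, 1, 2, 6), 50), (datetime(2019, 1, 3, 8), 40),
    (datetime(2019, 1, 1, 13), 10),
]
print(median_usage_by_bin(sample))  # {6: 40, 12: 10}
```

As in the query, a day with no rows at all for a bin is simply left out rather than counted as zero.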

I feel like there’s probably a better way to write the next query, but this works. It’s the cumulative usage throughout each month. The four lines at the top of the chart are the summer months when I’m using the irrigation in the yard. The lines that drop off at the end of the month are because I ran the x axis all the way from 1 to 31 for every month, so the shorter months don’t have data for every day, but it still conveys the general idea. It’s interesting how similar all the non-watering months are.

union
(
    WaterUsage
    | summarize Gallons=sum(Gallons) by bin(Timestamp, 1d)
    | extend Month=monthofyear(Timestamp), Day = dayofmonth(Timestamp)
),
(
    // Original data had some missing rows
    datatable(Timestamp:datetime, Gallons:long, Month:long, Day:long)
    [
        datetime(2018-11-26T00:00:00.0000000Z), 0, 11, 26, 
        datetime(2018-11-27T00:00:00.0000000Z), 0, 11, 27, 
    ]
)
| order by Timestamp asc
| serialize MonthlyWater=row_cumsum(Gallons, Month != prev(Month))
| project Month, Day, MonthlyWater
| make-series sum(MonthlyWater) on Day from 1 to 32 step 1 by Month
| render linechart with  (ycolumns = sum_MonthlyWater, series = Day, Month, legend=hidden, title='Cumulative Gallons By Month')
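The core trick in that query is a running total that resets whenever the month changes. A minimal Python sketch of just that step, assuming one row per day, looks like this. The function name and sample numbers are hypothetical, and it groups by year and month rather than month number alone, which is equivalent for less than a year of data.

```python
from datetime import date
from itertools import accumulate, groupby


def cumulative_by_month(daily):
    """daily: list of (date, gallons), one row per day.

    Returns {(year, month): [(day_of_month, running_total), ...]}.
    """
    rows = sorted(daily)  # same role as "order by Timestamp asc"
    result = {}
    for key, group in groupby(rows, key=lambda r: (r[0].year, r[0].month)):
        group = list(group)
        # The running total restarts with every new month group.
        running = accumulate(gallons for _, gallons in group)
        result[key] = [(d.day, total) for (d, _), total in zip(group, running)]
    return result


# Made-up daily totals, not real meter data.
sample = [
    (date(2018, 11, 29), 100), (date(2018, 11, 30), 50),
    (date(2018, 12, 1), 70), (date(2018, 12, 2), 30),
]
print(cumulative_by_month(sample))
```

Each month's list can then be plotted as its own line against the day of the month.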

The data is in 10 gallon increments so it’s not super precise but it’s a LOT better than the two month resolution I had previously. I’m excited to play around with this data and see if we can start decreasing our usage.

Along these same lines, I heard that the local power company is starting to install power meters with Zigbee connectivity so there’s a chance that I’ll be able to start getting more insight into my power consumption in a similar fashion…

Security Questions

Many sites still use “security” questions to help you retrieve your account. When you first create an account, they ask you things like “What was the name of your first pet?” and “What color was your first car?” Even if you’re doing well and using a long, random, unique password for that site, you probably just destroyed your security by answering those questions. I’m pretty sure I could answer most of those questions for some of my friends. This is a common route for hackers too, especially with all the information available on social media sites.

Pro-tip: you can lie. It’s ok. I already use LastPass to create and store random, unique, strong passwords for every single account so I just generate more random characters for these security questions. In LastPass, there’s a notes field for every account that you store so I drop the questions and answers right in that notes field so I have them for later if I need to retrieve my account via the security questions.
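If your password manager doesn't have a handy generator, a few lines of Python will do the same job. This is only a sketch: the 24-character length and the letters-plus-digits alphabet are arbitrary choices, and the printed pairs are what you would paste into the notes field.

```python
import secrets
import string

# Letters and digits only, since some sites reject symbols in answers.
ALPHABET = string.ascii_letters + string.digits


def random_answer(length=24):
    """Return a random string to use instead of a truthful answer."""
    # secrets draws from the OS's cryptographically secure generator.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


questions = [
    "What was the name of your first pet?",
    "What color was your first car?",
]
note = {question: random_answer() for question in questions}
for question, answer in note.items():
    print(f"{question} -> {answer}")
```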

Yes, I changed these after taking the screenshot.

Truck Stats

Last year, Tyla got me an OBDII data logger (Automatic) for my birthday and, of course, I ended up writing an app to download my trip data so I could analyze it. I still get those analysis reports twice per day and they continue to be interesting. For example, I don’t know why, but the last two weeks have had some of the worst traffic on my way home from work in the last year. Now that I have over a year of data, there’s enough to calculate some semi-interesting stats on my drives in our 2016 F150 3.5L Ecoboost:

    • The average trip to work takes me 26.3 minutes.
    • The average trip home takes me 33.9 minutes.
    • It feels like if I leave work a couple minutes early, I’ll avoid the worst of the traffic. Here’s my average commute time based on when I leave. (The x-axis is in 24 hour time so 17 is 5pm.) The y-axis is my average commute home in minutes. It does look like if I leave about 10 minutes before 5 my commute is generally 5-10 minutes faster.
    • My most fuel efficient trip was a 43.7mpg drive along the 3.5 mile route from my house to Home Depot. Not bad for a 5000 pound truck! (A lot of it is downhill and I like to see how little gas I can use on that route…)
      • Best fuel mileage for a trip over 10 miles: Church to Totem Lake AutoZone, 28.0mpg
      • Best fuel mileage for a trip over 50 miles: Crystal Mountain to our house, 24.7mpg
    • My worst gas mileage is going from work to the butcher. It’s a short trip and when it’s really cold, my truck spends the whole time idling at stop lights and trying to warm up. I’ve gotten 3.5mpg on that route a couple times!
    • Of the days that I drive the truck, I spend an average of 69.3 minutes driving.
    • The most driving in one day was 366 minutes. That was May 25, 2018 when we drove down to Ocean Park for Memorial Day.
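The commute-by-departure-time stat boils down to bucketing each trip by when it started and averaging the durations in each bucket. Here is a small Python sketch of that idea. The 10-minute bucket size, the function name, and the sample trips are all made up for illustration and are not taken from the real trip log.

```python
from collections import defaultdict
from datetime import datetime


def average_minutes_by_departure(trips, bucket_minutes=10):
    """trips: iterable of (departure_datetime, duration_minutes)."""
    buckets = defaultdict(list)
    for departed, minutes in trips:
        minute_of_day = departed.hour * 60 + departed.minute
        # Round the departure time down to the start of its bucket.
        start = minute_of_day - minute_of_day % bucket_minutes
        buckets[start].append(minutes)
    return {
        f"{start // 60:02d}:{start % 60:02d}": sum(vals) / len(vals)
        for start, vals in sorted(buckets.items())
    }


# Made-up trips home, not real data.
sample = [
    (datetime(2018, 10, 1, 16, 52), 28.0),
    (datetime(2018, 10, 2, 16, 55), 30.0),
    (datetime(2018, 10, 3, 17, 4), 36.0),
    (datetime(2018, 10, 4, 17, 8), 38.0),
]
print(average_minutes_by_departure(sample))  # {'16:50': 29.0, '17:00': 37.0}
```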

I love having all this data! I could do this all night but I should probably cut it off here and go to bed. By the way, all of these charts and stats were created with the public preview of Azure Data Explorer. We’ve been using that product internally for a couple years and it makes stuff like the stats above ridiculously fast and easy. If you’re at all involved in data engineering or data analysis, you need to get familiar with Azure Data Explorer!

8-Bit Ben

The site Fiverr.com (pronounced “five-er”) has been around for quite a while, but I just recently used it for the first time. The idea is that you can pay someone $5 (or something very cheap) to do a small digital task for you. You can peruse it yourself to see all the various offerings, but I wanted a pixel art picture of myself.

I’m quite happy with the 8-bit avatar that was drawn for me by user arveyyudi. He even allowed for a minor change to my hair color after he sent me the first copy. I’ve set this as my avatar on a few sites and even in our Outlook directory at work.

Standing Desk Monitor

We have nice standing desks at work. They have electric motors with memory settings so it’s quick and easy to switch between standing up or sitting down. I believe that it’s significantly healthier to stand up at least part of the day, but I find myself being lazy and sitting for most of the day. I also know that it’s relatively easy to motivate myself by measuring whatever I’m trying to improve. Time for a project!

To measure whether I’m standing or sitting, I decided to use a distance sensor that either sits on top of the desk and looks at the floor, or sits on the floor and looks up. I’m sure there are cheaper ways to do this, but I ordered a SparkFun BlackBoard, Distance Sensor Breakout, and a Qwiic cable to connect them. There was no soldering required. I plugged it all in and I was good to go. I laser cut a wood box to hold all the components.

I wrote a simple program for the Arduino-compatible BlackBoard that sends a measurement when it receives a keystroke, and then I wrote a program that runs on the computer to periodically request measurements (via USB) and upload them to a database in the cloud. I put a website on top of the database and voila!
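Turning raw distance readings into a "percent of the day standing" number can be as simple as a threshold. The sketch below assumes the sensor sits on the desk looking at the floor and that readings arrive at even intervals. The 900 mm cutoff and the sample values are hypothetical and would need tuning for a real desk.

```python
def is_standing(distance_mm, threshold_mm=900):
    """A desk-to-floor distance at or above the threshold counts as standing."""
    return distance_mm >= threshold_mm


def percent_standing(readings_mm, threshold_mm=900):
    """Share of evenly spaced readings taken while the desk was raised."""
    if not readings_mm:
        return 0.0
    standing = sum(1 for d in readings_mm if is_standing(d, threshold_mm))
    return 100.0 * standing / len(readings_mm)


# Made-up readings in millimeters: around 725 sitting, around 1150 standing.
sample = [720, 730, 1150, 1160, 725, 1155, 1150, 728]
print(percent_standing(sample))  # 50.0
```

If the sensor sits on the floor looking up instead, the same comparison works with the threshold moved to match.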

A friend at work heard about the idea and wanted to compete with me so now we are both running these devices. You can track our progress at http://standupweb.azurewebsites.net/

Cloud Backup

As I mentioned about a year ago, CrashPlan is closing shop for home users and focusing on the small business market. My contract with them is up in a few months so I did some research to pick a new cloud backup provider. I had about 4TB stored on CrashPlan so a key feature for me was unlimited backup size. I settled on Backblaze. They’ve been around for quite a while and have a feature set that meets my needs and a price that doesn’t break the bank.

So now begins the arduous journey of uploading 4TB of data over my Comcast connection. Comcast limits me to 1TB per month with pretty heavy penalties for going over. I normally use 300-400GB/month so it’s going to take quite a while to upload my data again.
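The back-of-the-envelope math is worth spelling out. This sketch treats 1TB as 1000GB and assumes every bit of leftover headroom goes to the backup, which is optimistic.

```python
import math


def months_to_upload(backup_gb, cap_gb, normal_use_gb):
    """Whole months needed if all spare cap goes to the backup."""
    headroom = cap_gb - normal_use_gb
    if headroom <= 0:
        raise ValueError("no spare bandwidth under the cap")
    return math.ceil(backup_gb / headroom)


print(months_to_upload(4000, 1000, 400))  # 7 months in a heavier-usage month pattern
print(months_to_upload(4000, 1000, 300))  # 6 months with lighter normal usage
```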

Comcast provides a web page to view your usage, but I wanted something a little easier to monitor. My router keeps track of my usage and it’s roughly the same as what Comcast says so I wrote an app that grabs the usage numbers from my router every hour and stores them in a database. Now I can quickly check my usage, predict where I’m going to end up, etc. That gives me the info I need to turn my backup on and off to use up as much of that 1TB as possible without going over.
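The "predict where I'm going to end up" part can be a straight-line projection from usage so far. The Python sketch below is one simple way to do it and is not the actual app. It assumes the usage counter resets on the first of the calendar month, and the 50GB safety margin is a made-up number.

```python
import calendar
from datetime import datetime


def project_month_end(used_gb, now):
    """Linearly extrapolate usage so far to the end of the month."""
    days_in_month = calendar.monthrange(now.year, now.month)[1]
    month_start = datetime(now.year, now.month, 1)
    elapsed_days = (now - month_start).total_seconds() / 86400
    if elapsed_days <= 0:
        return float(used_gb)
    return used_gb / elapsed_days * days_in_month


def backup_allowed(used_gb, now, cap_gb=1000, margin_gb=50):
    """True while the projection stays under the cap minus a safety margin."""
    return project_month_end(used_gb, now) <= cap_gb - margin_gb


now = datetime(2018, 9, 11)  # ten full days into a 30-day month
print(project_month_end(300, now))  # 900.0
print(backup_allowed(300, now))     # True
print(backup_allowed(400, now))     # False
```

A straight line is crude because the backup itself is the biggest variable, but it is enough to decide whether to pause the upload for a few days.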

I’ve got about 1TB uploaded and I’m happy so far. Their software is ridiculously easy to use and they have a phone app for accessing random files on the go. It’s a good final step in the 3-2-1 backup strategy, which means that you should keep 3 copies of your data on 2 different types of storage, with 1 copy stored remotely.

Air Quality

The smoke that I wrote about last week has thankfully mostly cleared out. When it was really bad, I spent quite a bit of time checking around various websites to see how bad the smoke was, and I was frustrated that I couldn’t get a quick answer easily from my phone. I was also interested in getting some long term trend charts. So, being a geek, I wrote a program that pulls down the data from the closest air quality sensor (between Kenmore and Lynnwood) and stores it. I also made a simple web page that’s optimized for cell phone displays. Pin it to your home screen and you have a quick and easy way to check the air quality (assuming you live near me.) http://localairquality.azurewebsites.net

The reading is usually 2-3 hours behind the current time, but the levels don’t usually change too much in a couple hours.

I also discovered a fantastic blog that writes about the current smoke situation and the smoke forecast. Bookmark this one too: http://wasmoke.blogspot.com

Windows 10 Multiple Desktops

Multiple desktops have been around in operating systems for a very long time, but they came to Windows 10 as an easy-to-use feature. I’ve come to really enjoy them and thought I would share how I use the feature in my daily work because I have found that most people don’t know about it.

First off, what are multiple desktops? If you’re reading on your computer right now, you might have a collection of windows open. That’s a single desktop. Now imagine if you could switch to a new desktop and have a completely different set of windows open while still making it easy to get back to the old desktop. That’s multiple desktops.

There are plenty of tutorials online showing how to set it up and move windows between desktops so I’ll skip that part. The key thing for me is that multiple desktops help me context switch and focus at work. Desktop 1 is for email, IM, Spotify and other communication/peripheral stuff. Then I have a desktop for each task that I’m working on. Since I try to keep multi-tasking to a minimum, this means that ideally I only have one other desktop. Working on this other desktop helps me to stay focused on that activity and not get distracted by email, etc. If someone comes to ask me a question, I can flip to a new desktop, open windows to answer their question and then quickly jump back into the work I was doing.

It’s not a perfect solution though. Some apps don’t play nicely with multiple desktops. OneNote is probably the worst offender in my daily workflow. If I already have OneNote open on Desktop 1 and then I try to open it on Desktop 2, it flips me back to Desktop 1 and opens a second copy. Then I have to drag the window to Desktop 2. It’s annoying but not a deal-breaker.

It’s an advanced feature that takes a while to get used to, but consider giving it a try for a week or two to see if it fits your workflow.

P.S. One usage tip: To quickly flip back and forth once you have multiple desktops going, hold down CTRL+WINDOWS and press the left and right arrows.

Amazon Wishlists

Lots of people in my family use Amazon wish lists to share gift ideas. Amazon recently made it a bit more difficult to add stuff to wish lists from other sites so I thought I’d write up a quick guide on how to do it safely.

  1. First, make sure you’re using Chrome. That’s probably a good idea in general.
  2. Add the Amazon Assistant extension.
  3. You should now have an Amazon button in your Chrome tool bar. Click that and log in.
  4. Once you’re logged in, there’s an “Add to List” tab inside that Amazon menu and you can add your current page to your wish list.

But here’s the catch… when you add this as an extension, Amazon gets to see ALL of the sites that you are visiting and since you’re already logged in, they are building up quite a profile about you. If you go into the settings (click the Amazon extension and then click the little gear in the top left), you can turn off everything in “Customize Content” and “Product Compare”.

That’s PROBABLY enough to stop them from tracking you, but personally, I just leave the extension disabled until I want to use it. To enable/disable the extension:

  • Click the Chrome menu button in the top right (three dots). Then click More Tools > Extensions
  • Toggle the blue slider for the Amazon Assistant extension.

I rarely add things to my list so it’s not too much of a hassle and I feel better not having them spy on me.

#deletefacebook

Facebook stock is down 21% on this news story about how Cambridge Analytica was able to use Facebook data to gather information about 50 million users. As usual, there is a lot of spin related to this story and it finally got confusing enough that I looked into it. I think what pushed me over the edge was hearing that Elon Musk had deleted the Tesla and SpaceX Facebook pages.

The “shady practices” that Cambridge Analytica used to gather its data are nothing new. If a user logs into your application, the Facebook Graph API not only lets you collect data on that user, but on all their friends as well. This has been pretty well known by API users and marketers in general. The Verge has a good article that explains it in more detail.

One of the reasons this is getting so much press NOW instead of many years ago is that this specific instance is related to the Trump election. How sweet is a news story where you can combine privacy, BIG DATA, and a reason why dumb people were fooled into voting for Trump? The news outlets can make a lot of money off that combination.

What should normal users do about this?

  1. Go to Facebook, click Settings > Apps. First, delete all the apps that you don’t use regularly. Then click Apps Others Use and uncheck everything. That will stop sharing of your data with companies because your friends logged into something with their Facebook credentials.
  2. Don’t use your Facebook credentials to log into a website or an app. Always create a unique login for that specific application using your email address. And if you can’t create a login with your email address, then it’s probably shady anyway. The main reason all those dumb quizzes exist on Facebook is so that they can access your profile data (and your friends’ profile data).
  3. Remove personal information from your Facebook profile. e.g. Is it really that important to have your birthday on your profile?
  4. If you want to go further, you can remove and hide some of your old activity. This is a pain, but I’ve documented it before.

So in summary, there’s nothing new about this news story that I can see, but people are finally realizing some of what has been going on. Unfortunately this is just the tip of the iceberg. Big data is here to stay and information about you is more valuable than you realize. A lot of it is really hard to control unless you’re willing to go full-tinfoil-hat, but it’s not too hard to take a few basic steps in that direction.