A personal history of tracking people on websites

This is me trying to figure out why.

In the earlier days of the web—that I remember—we all used counter scripts. I don’t know where they came from, but every place that hosted sites seemed to also have a cgi-bin directory. If you were lucky, you could find some sort of count.cgi script written in Perl that would just work. If you reloaded the page over and over again, the number would increase.

Of course, this wasn’t really reliable. If you didn’t understand Perl, and I didn’t know anybody who did at the time, it could be hard to make things work. Those cgi-bin directories differed in what they allowed and, sooner or later, someone was going to exploit a vulnerability in your copy/pasted code.

Where did I even find that code!

A bit later. I don’t remember anything about signing up for this, but there was a service called “Webdigits” or “Web Counter” that offered a hosted tracker. This was the first time I outsourced the tracking of people to someone else.

A page of mine from January 1998 had the following:

<LAYER ID=counter TOP=975 LEFT=10>
<font size = 4>
<a href = http://www.digits.com>Web Counter</a> says you are the 
<IMG SRC="http://counter.digits.com/wc/-d/8/-c/8/-z/oldemail@isp.com" 
    ALIGN=middle WIDTH=60 HEIGHT=20 BORDER=0 HSPACE=4>
person to access this page.
</font>
</LAYER>

Retro HTML!

I have a feeling this was similar to the self-hosted script in that you could reload the page over and over again and the counter would just go up. Still nothing granular, but you (and everyone else) could see how many visits had accumulated.

The next tracker I remember was the very popular and now defunct Site Meter. This moved the web from tracking via an image to tracking via JavaScript:

<!--WEBBOT bot="HTMLMarkup" startspan ALT="Site Meter" -->
<script type="text/javascript" language="JavaScript">
    var site="sm5accountname"
</script>
<script type="text/javascript" language="JavaScript1.2" 
    src="http://sm5.sitemeter.com/js/counter.js?site=sm5accountname">
</script>

Much more information about people was suddenly available—where they were from, how much time they spent on pages, and what sites led to their visit.

Here’s the amazing part: I have at least 27 receipts in my email over a 2.5 year period at $6.95/month. I paid for a service to track analytics!

At some point in 2007, Site Meter started injecting another script on behalf of a shady-looking third-party company and the blogosphere was up in arms. (Oh, we were still so young…)

Luckily (ha!), I got my invite to Google Analytics in March of 2006, so I was already using multiple trackers on my site when the Site Meter debacle happened.

It was during this period that Google Analytics really started to take over the analytics world. The interface was nicer, the data was probably more accurate, people were (supposedly) making a ton of money gaming AdWords, and this was your way in! I can see the spread in my email archive as I started adding analytics to any site I was involved with, convinced that it was good to know who was doing what and where on your slice of the web.

And it was all a lot of fun. There was a thrill to watching the numbers climb throughout the day if you wrote a post that a couple other bloggers picked up on or hit the first page of search results for a recent event.

The next few years (2007-2012) seem to have been relatively uneventful for me in analytics. I played with Piwik for a few months on a custom domain, but for some reason it didn’t stick. And in 2011 or 2012 I started using WordPress.com stats via the Jetpack plugin in addition to Google Analytics. These were nice in that straightforward stats were available without effort in the WordPress admin.

And now.

Over the last few years my thinking has shifted.

When I first set up the global analytics dashboard at WSU, I was pretty excited. It achieved that goal of having a fun display to watch numbers on throughout the day as different events happened. But when you start digging through the available data that Google provides, it gets a little creepy. Visitor profiles are only non-identifying until you look closely enough. It doesn’t take long to guess what department a person might be from. If you stared at the data for a few days, you could probably tell someone on campus that you knew what grad program they were interested in.

All the while, there was no investment in the actual use of analytics. The web team tracked them, the data was available, but often nobody looked at it, let alone tried to use it for anything purposeful.

At some point I was on a call with a company that was walking through their “simple” “retargeting” “conversion” “pixel”—I loathe the way that word is used—that they wanted to put on all university pages to track the success of a marketing campaign. We were easily able to talk things down to a single landing page, but during that conversation the guy bragged about how creepy they could get in tracking people on the web.

If you ever start digging into the mechanics of that “pixel”, things feel even uglier. The marketing company sells you a pixel that they source from a third party, which in turn sources its platform from yet another. The opportunity for someone to misuse or mishandle the data just keeps growing the more you look at that pixel.

Anyhow.

That conversation really planted a seed—why does a university need to track people? Why does anyone need to track people?

I’m still on board with tracking visits. I like to see how many people view content. If they choose to pass the info on, I like to see what referred them to the site.

I don’t think data about individual people actually matters for the overwhelming majority of the web.

So I made a few adjustments based on this.

  • I removed Google Analytics. (also from VVV)
  • I disabled Jetpack, which removed the WordPress.com tracker. I tried to disable just the tracker via a filter first, but it kept going (a rough sketch of that filter follows this list).
  • I set up a self-hosted installation of Matomo, an open source analytics platform (formerly Piwik), and anonymized IP addresses down to two octets.
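Roughly, that filter attempt looked like this. A sketch from memory, using Jetpack’s jetpack_active_modules filter to drop the stats module; either way, the tracker kept loading for me:

<?php
// Try to deactivate Jetpack's stats module (and its WordPress.com
// tracker) by filtering it out of the list of active module slugs.
add_filter( 'jetpack_active_modules', function ( $modules ) {
    return array_diff( $modules, array( 'stats' ) );
} );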

I’m still working out how I use Matomo. As with many things, I did this on a weekend morning and haven’t looped back around to it yet. I hope to find an accurate enough mix of useful data and privacy. I’ll work on turning that into a framework and policy that I can use on any site I manage.
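The anonymization piece, at least, boils down to a single setting. A minimal sketch of the relevant block in Matomo’s config/config.ini.php, assuming ip_address_mask_length is still the setting behind the privacy screen (it was in the Piwik days):

; config/config.ini.php
[Tracker]
; Zero out the last two octets of each visitor IP (x.x.0.0)
ip_address_mask_length = 2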

And I’m still working through how I feel about everything. When building the web with others as a partner, I don’t want to be dismissive towards current or future practices around analytics without having something constructive to offer. I do want to make sure that we keep people as part of the conversation rather than pixels and users.

And of course, I’m very open to conversation. I’m still a beginner in being mindful about this, even if I’ve been tracking people for over 20 years! 😂

Hints for when I configure OpenVPN and Tunnelblick the next time

I've gone through the process of configuring OpenVPN and Tunnelblick at least twice before and I never seem to get it right on the first or second try. This time I'll document a few of the pain points that I experienced even while following the excellent DigitalOcean guide to configuring OpenVPN on CentOS 6.

  1. Follow the "Initial OpenVPN Configuration" section from the DO document.
  2. When generating keys and certificates in the next section, note that the easy-rsa files are in /usr/share/easy-rsa/, not /usr/share/openvpn/easy-rsa/.
  3. Instead of running ./build-key client, be descriptive with something like ./build-key jeremy-home so that you don't get annoyed later that you have a config named "client".
  4. The DO docs don't mention configuring a TLS-Auth key, even though the OpenVPN configuration now has it by default. Generate one with openvpn --genkey --secret /etc/openvpn/ta.key before attempting to start the openvpn service.
  5. You'll need a few more lines in client.ovpn to match the server config (consolidated in the sketch after this list). These worked for me last time, but check the OpenVPN logs for other errors when you try to connect.
    • tls-auth ta.key 1 (the server side uses direction 0) to enable TLS-Auth.
    • cipher AES-256-CBC to fix "'cipher' is used inconsistently" errors.
    • keysize 256 to fix "'keysize' is used inconsistently" errors.
    • tun-mtu 1500 to set the MTU, though I'm not sure this is really necessary.
    • Remove comp-lzo from the client if it's configured. Leaving it in appears to cause an "IP packet with unknown IP version=15 seen" error.
  6. Be sure to copy the contents of ta.key into a new <tls-auth> section at the end of client.ovpn so that the client has the same static TLS-Auth key as the server.
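
Pulling steps 5 and 6 together, the client-side additions to client.ovpn end up looking roughly like this. Treat it as a sketch rather than a verified config: the key contents are elided, and when the key is inlined in a <tls-auth> block, key-direction 1 is the usual way to mark the client side (the tls-auth ta.key 1 form applies when the key stays in a separate file).

# Additions to client.ovpn, matching a server that uses "tls-auth ta.key 0"
key-direction 1        # client side of TLS-Auth (the server uses direction 0)
cipher AES-256-CBC     # match the server to avoid 'cipher' inconsistency errors
keysize 256            # match the server to avoid 'keysize' inconsistency errors
tun-mtu 1500           # possibly unnecessary, but matches the server MTU
;comp-lzo              # left disabled on the client (see the note above)
<tls-auth>
-----BEGIN OpenVPN Static key V1-----
(paste the contents of /etc/openvpn/ta.key here)
-----END OpenVPN Static key V1-----
</tls-auth>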

Throughout all this, remember that after you drag and drop a configuration file into Tunnelblick, it gets copied somewhere else and needs to be reloaded manually every time you make a change to the client.ovpn file you're working with.

Things are now working with OpenVPN 2.4.4, easy-rsa 2.2.2, and Tunnelblick 3.7.4a.

What’s the right USB-C to HDMI adapter for a Dell XPS 13″ 9350 running Ubuntu 16.10?

When I first got my Dell XPS 13″ 9350 (Developer Edition), I needed an adapter so that I could power an HDMI display via the laptop’s USB-C port.

After poking around for a few minutes, it seemed like the DA200 USB-C to HDMI/VGA/Ethernet/USB 3.0 adapter was the right choice. It works with plenty of Dell XPS laptops and says things like “USB-C to HDMI” in the name of the adapter.

I was wrong!

I have no idea why I was wrong, but thanks to the answer on this Ask Ubuntu post, I learned that the DA200 is not exactly what it claims to be. The only way this adapter actually works with a 55″ TV or similar is at 800×600 resolution. Definitely not what you expect when connecting over HDMI.
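
If you want to see exactly what an adapter is offering under Ubuntu, xrandr lists each connected output along with the modes the attached display advertises (a generic check, not something from the Ask Ubuntu answer):

$ xrandr --query
# Each connected output (eDP-1 for the laptop panel, DP-1/HDMI-1 or
# similar for the adapter) is listed with its available modes. If the
# list tops out at 800x600, you're likely looking at the DA200 problem;
# a working adapter should include the TV's native 1920x1080 mode.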

After reading that post, I purchased the Apple USB-C Digital AV Multiport Adapter. It has only a 2 star rating on the Apple site, but it worked immediately and powered the 55″ TV as expected over HDMI at 1920×1080.

Hopefully this post helps make it more obvious via Google that the DA200 is the wrong adapter and that the Apple USB-C Digital AV Multiport Adapter works great!