A personal history of tracking people on websites

This is me trying to figure out why.

In the earlier days of the web—that I remember—we all used counter scripts. I don’t know where they came from, but every place that hosted sites seemed to also have a cgi-bin directory. If you were lucky, you could find some sort of count.cgi script written in Perl that would just work. If you reloaded the page over and over again, the number would increase.

Of course, this wasn’t really reliable. If you didn’t understand Perl, and I didn’t know anybody who did at the time, it could be hard to make things work. Those cgi-bin directories differed in how permissive they were and, sooner or later, someone was going to exploit a vulnerability in your copy/pasted code.

Where did I even find that code!

A bit later. I don’t remember anything about signing up for this, but there was a service called “Webdigits” or “Web Counter” that offered a hosted tracker. This is the first time that I outsourced the tracking of people to someone else.

A page of mine from January 1998 had the following:

<LAYER ID=counter TOP=975 LEFT=10>
<font size = 4>
<a href = http://www.digits.com>Web Counter</a> says you are the 
<IMG SRC="http://counter.digits.com/wc/-d/8/-c/8/-z/oldemail@isp.com" 
    ALIGN=middle WIDTH=60 HEIGHT=20 BORDER=0 HSPACE=4>
person to access this page.
</font>
</LAYER>

Retro HTML!

I have a feeling this was similar to the self-hosted script in that you could reload the page over and over again and the counter would just go up. Still nothing granular, but you (and everyone else) could see how many visits had accumulated.

The next tracker I remember was the very popular and now defunct Site Meter. This moved the web from tracking via an image to tracking via JavaScript:

<!--WEBBOT bot="HTMLMarkup" startspan ALT="Site Meter" -->
<script type="text/javascript" language="JavaScript">
    var site="sm5accountname"
</script>
<script type="text/javascript" language="JavaScript1.2" 
    src="http://sm5.sitemeter.com/js/counter.js?site=sm5accountname">
</script>

Much more information about people was suddenly available—where they were from, how much time they spent on pages, and what sites led to their visit.

Here’s the amazing part: I have at least 27 receipts in my email over a 2.5-year period at $6.95/month. I paid for a service to track analytics!

At some point in 2007 Site Meter started injecting another script on behalf of a shady-looking 3rd party company and the Blogosphere was up in arms. (Oh, we were still so young…)

Luckily (ha!), I got my invite to Google Analytics in March of 2006, so I was already using multiple trackers on my site when the Site Meter debacle happened.

It was during this period that Google Analytics really started to take over the analytics world. The interface was nicer, the data was probably more accurate, people were (supposedly) making a ton of money gaming AdWords, and this was your way in! I can see the spread in my email archive as I started adding analytics to any site I was involved with, convinced that it was good to know who was doing what and where on your slice of the web.

And it was all a lot of fun. There was a thrill to watching the numbers climb throughout the day if you wrote a post that a couple other bloggers picked up on or hit the first page of search results for a recent event.

The next few years (2007-2012) seem to have been relatively uneventful for me in analytics. I played with Piwik for a few months on a custom domain, but for some reason it didn’t stick. And in 2011 or 2012 I started using WordPress.com stats via the Jetpack plugin in addition to Google Analytics. These were nice in that straightforward stats were available without effort in the WordPress admin.

And now.

Over the last few years my thinking has shifted.

When I first set up the global analytics dashboard at WSU, I was pretty excited. It achieved that goal of having a fun display to watch numbers on throughout the day as different events happened. But when you start digging through the available data that Google provides, it gets a little creepy. Visitor profiles are only non-identifying until you stare closely enough. It doesn’t take long to take a guess at what department a person might be from. If you stared at the data for a few days, you could probably tell someone on campus that you knew what grad program they were interested in.

All the while, there was no investment in the actual use of analytics. The web team tracked them, the data was available, but often nobody looked at it, let alone tried to use it for anything purposeful.

At some point I was on a call with a company that was walking through their “simple” “retargeting” “conversion” “pixel”—I loathe the way that word is used—that they wanted to put on all university pages to track the success of a marketing campaign. We were easily able to talk things down to a single landing page, but during that conversation the guy bragged about how creepy they could get in tracking people on the web.

If you ever start digging into the mechanics of that “pixel”, things feel even uglier. This marketing company sells you the pixel that they source from this 3rd party that sources their platform from another. The opportunity for someone to misuse or mishandle the data just keeps growing the more you look at that pixel.

Anyhow.

That conversation really planted a seed—why does a university need to track people? Why does anyone need to track people?

I’m still on board with tracking visits. I like to see how many people view content. If they choose to pass the info on, I like to see what referred them to the site.

I don’t think data about individual people actually matters for the overwhelming majority of the web.

So I made a few adjustments based on this.

  • I removed Google Analytics (also from VVV).
  • I disabled Jetpack, which removed the WordPress.com tracker. I tried to disable via filter first, but it kept going.
  • I set up a self-hosted installation of Matomo, an open source analytics platform (formerly Piwik), and I’ve anonymized IP addresses down to two octets.
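
To make "down to two octets" concrete, here is a minimal sketch in Python of what that kind of masking does to an IPv4 address. This is only an illustration and not Matomo's own code; the function name `anonymize_ipv4` and its `keep_octets` parameter are made up for this example.

```python
import ipaddress


def anonymize_ipv4(address: str, keep_octets: int = 2) -> str:
    """Zero out every octet after the first `keep_octets` of an IPv4 address.

    Hypothetical helper for illustration only; it is not part of Matomo.
    """
    if not 0 <= keep_octets <= 4:
        raise ValueError("keep_octets must be between 0 and 4")
    prefix_length = keep_octets * 8
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.IPv4Network(f"{address}/{prefix_length}", strict=False)
    return str(network.network_address)


# 192.0.2.123 is a documentation-only address (TEST-NET-1).
print(anonymize_ipv4("192.0.2.123"))  # prints 192.0.0.0
```

With two octets kept, every visitor in the same /16 block collapses to one stored address, which is roughly the trade I am after: enough to see broad patterns, not enough to point at one machine.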

I’m still working out how I use Matomo. As with many things, I did this on a weekend morning and haven’t looped back around to it yet. I hope to find an accurate enough mix of useful data and privacy. I’ll work on turning that into a framework and policy that I can use on any site I manage.

And I’m still working through how I feel about everything. When building the web with others as a partner, I don’t want to be dismissive towards current or future practices around analytics without having something constructive to offer. I do want to make sure that we keep people as part of the conversation rather than pixels and users.

And of course, I’m very open to conversation. I’m still a beginner in being mindful about this, even if I’ve been tracking people for over 20 years! 😂

Tracking Your Heart Rate Via Webcam

I remember being fascinated by the Eulerian Video Magnification work when some of the videos were being spread around, so I was excited to see the Webcam Pulse Detector project pop up on Quantified Self as I was scrolling through some missed feeds this morning.

It didn’t seem too difficult to set up for somebody with some Linux familiarity and I set off to make it happen on my laptop.

The entire process took a couple hours. Some of that was due to missteps in installing OpenCV or not using sudo in the right place. The rest was due to the unavoidable—some packages just take a long time to install.

Seeing it finally work is really, really cool. Using my forehead, the app seemed to consistently track my heart rate at around 54-57bpm. At the same time I measured my pulse at my wrist as 60bpm. I’ll need to track the consistency over time and with non-resting heart rates as well, but that seems like an acceptable variance so far. Pretty cool stuff.
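
For a rough sense of that variance, a quick Python sketch of the arithmetic, treating the wrist measurement as the reference. The helper name `percent_difference` is just for this example, and a single manual reading is only an assumed baseline, not ground truth.

```python
def percent_difference(measured: float, reference: float) -> float:
    """Absolute difference between two readings, as a percentage of the reference."""
    return abs(measured - reference) * 100.0 / reference


wrist_bpm = 60  # manual pulse reading, used here as the reference
for webcam_bpm in (54, 57):  # low and high ends of what the app reported
    off_by = percent_difference(webcam_bpm, wrist_bpm)
    print(f"{webcam_bpm} bpm vs {wrist_bpm} bpm: {off_by:.1f}% off")
```

So the webcam readings sat somewhere between 5% and 10% below the wrist count.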

If you want to give it a go and you’re running OS X 10.8.3 on your machine, I’m embedding a gist with the commands I had to use to make this work along with some comments inline.

# Starting with...
# OS X 10.8.3
# python 2.7.2 // python --version
# c++ 4.0      // c++ --version
# g++ 4.2      // g++ --version
#
# Update/Install XCode command line utils
# c++ 4.2      // c++ --version
#
# Doing all my work in ~/Development
cd ~/Development

# Update brew packages
brew update

# Install Python Package Index
sudo easy_install pip

# Install NumPy - was already installed for me
sudo pip install numpy

# Verify that NumPy is available to Python
python -c "import numpy"
# Would see an ImportError if not successful

# SciPy requires a Fortran compiler, available via brew
brew install gfortran

# Install SciPy
sudo pip install scipy

# Verify that SciPy is available to Python
python -c "import scipy"
# Would see an ImportError if not successful

# Install matplotlib
sudo pip install matplotlib

# OpenCV 2.4.5
# 
# Download tar file for Mac/Linux from http://opencv.org/downloads.html
# to ~/Development
tar -xvf opencv-2.4.5.tar.gz
cd opencv-2.4.5
mkdir release
cd release
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON -D BUILD_EXAMPLES=ON ..
make
make install

# OpenCV was still not available to Python until I added
# it to the PYTHONPATH variable
PYTHONPATH="/usr/local/python-2.7.2"
export PYTHONPATH

# Verify that cv2 is available to Python
python -c "import cv2"
# Would see an ImportError if not successful

# OpenMDAO 0.6.0
# 
# Download go-openmdao.py from http://openmdao.org/downloads-2/recent/
# to ~/Development
python go-openmdao.py

# Clone Webcam Pulse Detector repo
git clone https://github.com/thearn/webcam-pulse-detector.git

# Launch OpenMDAO terminal
. openmdao-0.6.0/bin/activate

# Launch Webcam Pulse Detector
cd webcam-pulse-detector
python get_pulse.py
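
The separate "verify" steps above could also be rolled into one small Python check. This is a sketch, not part of the original gist: the module list simply mirrors the packages installed above, and `find_missing` is a name invented for the example.

```python
import importlib

# Modules installed in the steps above; adjust to taste.
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib", "cv2"]


def find_missing(modules):
    """Return the names from `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All modules imported successfully.")
```

Running it after each install step shows at a glance which packages Python still cannot see, which would have saved me a few rounds of opening the interpreter by hand.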

There were also plenty of resources that proved invaluable in actually finding the right answers for installing some of these software packages: