Public Universities and Open Source

The following is a contextual document to go with my lightning talk at WordCamp San Francisco, 2014. You can download the slides, but they are fairly contextless. :)

In 1904, Kenyon Butterfield introduced what he called the Social Phase of Agricultural Education. He believed agricultural colleges should be “the inspiration, the guide, the stimulator of all possible endeavors to improve farm and farmer.”

He saw public universities as having 3 functions:

  • Organ of Research
  • Educator of Students
  • Distributor of Information to those who cannot come to college

His work led to the Smith-Lever Act of 1914, which established the concept of extension services. These extension services are embedded in public land grant universities and act as a conduit between research and community. This establishes the distribution of information, something Butterfield actually refers to as the democratization of truth.

Today, these extension services are part of the Cooperative State Research Education and Extension Service. Its ultimate goal is to “Advance agriculture, the environment, human health and well-being, and communities.”

The core mission of the WordPress project is to democratize publishing through open source, GPL software. It’s at this intersection of ideas between WordPress and public land grant universities where I get the most excited.

We also have a long way to go.

If you look at the criteria for an associate professor in a public land grant university on track to tenure, you’ll notice an alignment with Butterfield’s vision: Teaching, Research, and Service – three things a professor must focus on to become tenured faculty.

If you talk to an associate professor, you may hear other criteria – Research. Funding of research, publication of the research, citations of your published research in the research of others…

While service, or the distribution of information, is important, it’s often the money or prestige coming into the university that can trump all.

This actually isn’t that far off from other public non-land grant universities. A professor on track to tenure there is likely focused on Teaching and Research, without the specific instruction to distribute information to their community.

With that in mind, I’ve been focused on two questions lately:

  • How do we encourage sharing of work with the community?
  • How do we make it easier to share work when one is ready to do so?

WordPress makes the second question pretty easy. We can set up a site within minutes with all the functionality needed to start publishing in an interface that actually makes it pleasant to create content.

The first question is the hard one. It involves a change in mindset.

If I talk about my research as I’m doing it, what will stop a competing researcher in another university or company from leapfrogging my work and publishing before I do? Do I lose the right to my intellectual property, possibly prevented from working on something because of a patent I should have gotten?

My favorite response so far is “Who Cares?” What is the absolute worst that will happen from somebody stealing your research?

I’m not trying to trivialize the work somebody is doing. I really do think answering this question can help answer a larger question.

What can we do to make it easier to share work while also protecting the work that someone is doing?

I think if we look to the ethos of open source, we can find the answers we’re looking for.

The Free Software Foundation defines the four freedoms of free and open source software as such:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Some of this freedom is already accounted for with the rise of open access publishing. Many universities are creating policies where final research works must only be published in open access journals or, when published elsewhere, be part of an open access repository housed by the University.

However, if we really are distributing information to the community, should we not share the beginnings of our work as well, or at least the process?

Yochai Benkler, who has written a lot of great material on open source and communities, wrote that “the most precious of all public domains” is “our knowledge of the world that surrounds us”.

He also says:

“Information, knowledge, and culture are central to human freedom and human development.” – Wealth of Networks

I like those sentiments, that state of mind.

And I would like it if you gave thought to how we can, in the process of democratizing publishing, also find ways to help public universities democratize the truth.

SSL remains fairly terrifying

Moxie Marlinspike’s presentation on SSL Stripping, while 5 years old, is both fascinating and terrifying. I’m not sure I’ll ever turn my secure VPN off again. At the same time, I’m not sure if it really does me any good.

The 55 minutes of his talk are very much worth it. Some moments from the video:

“when looking for security vulnerabilities … it’s good to start with places where developers probably don’t really know what they’re doing but feel really good about the solutions they’ve come up with.”

“A padlock, who’d have thought … it doesn’t inspire security.”

“[EV Certs]: Now we’re supposed to pay extra for the Certificate Authorities to do the thing they were supposed to do to begin with.”

And the most important to remember, which is also the least assuring:

“Lots of times the security of HTTPS comes down to the security of HTTP, and HTTP is not secure.”

Major props to Zack, who prodded me to watch this many times before I finally ran into it again today.

CC by 2.0 licensed photograph by minhocos

Amazon’s petition for exemption to fly drones commercially

Amazon filed a petition for exemption with the FAA last week so that they could fly prototype drones outdoors as part of research and development for their future Prime Air offering. It’s a quick read, with a couple fun points:

Because Amazon is a commercial enterprise we have been limited to conducting R&D flights indoors or in other countries.


We will effectively operate our own private model airplane field, but with additional safeguards that go far beyond those that FAA has long-held provide a sufficient level of safety for public model airplane fields – and only with sUAS.

It’s pretty amazing to think that Amazon would have made this much progress—eight or nine generations—without flying anything outside. One of the items listed in their request was the mention that their drones flew up to 50mph with 5lb payloads. How big is this facility that they’re testing in?

Or is this a lie that many working on serious commercial efforts with drones right now are telling?


Finding the source of research news

A researcher at Washington State University had a role in some interesting news that came out yesterday. We published a great writeup: “Major study documents benefits of organic farming”. Newcastle University published a release: “New study finds significant differences between organic and non-organic food”. Large news organizations such as The Guardian and The New York Times provided a good digestion of the results.

Alas, a comment on Hacker News summed up my feelings on many of these:

There is no link to the paper or a preprint of the article.

Often when I read news like this, I want to dive in and at least skim the published research. But this is where our various content management systems break down the most.

Even though this paper is licensed under the very open Creative Commons CC BY 3.0, which allows me to share and even build on the material as long as I provide proper attribution, finding it is a horrible process.

  • At WSU News, our article included contact information for the researcher but no direct link to the paper or even the primary university’s release.
  • In Newcastle’s release, a linked page actually includes the full text of the paper, but its name doesn’t match the title and the experience is somewhat confusing. This is much better than many, as the paper is at least accessible.
  • In the New York Times, the article links to the abstract at the National Center for Biotechnology Information. This abstract has a link for the full text at Cambridge Journals Online, where it isn’t actually published yet but will be on July 15th.
  • The Guardian provides a way, but uses the words “several academic websites” to link to two different places. One is the same NCBI abstract the NYT links to. The other is at Research Gate, which actually has a link to the full paper but includes a really confusing order form in the first two pages of the PDF so that you aren’t actually sure what you’re looking at.

The best page I’ve found yet is actually another at WSU. Chuck Benbrook, the researcher involved with the study, published an article that links to a full page of resources, including the full text of the paper and supplemental data.

I guess the most discouraging part of this is the wide open license on the paper. It gets much harder to track things down when a paper is published in a paid journal. I’m lucky in that I work for a university. If I want access to a paper, there is likely a way. The process can be confusing though, especially if you aren’t used to the required jumping around.

To be clear, this is not a gripe on anyone writing the articles. It is a gripe on those of us creating the systems that manage this content.

The part I’m going to push for at WSU is a way to attach source data to these articles in a clear way. Every time an article is written about a piece of research, that research should have a clear space on the page—in the same spot every time—that provides instructions or direct access to a document.

And with that.

Higher antioxidant and lower cadmium concentrations and lower incidence of pesticide residues in organically grown crops: a systematic literature review and meta-analyses. Baranski, M., D. Srednicka-Tober, N. Volakakis, C. Seal, R. Sanderson, G. B. Stewart, C. Benbrook, B. Biavati, E. Markellou, C. Giotis, J. Gromadzka-Ostrowska, E. Rembiałkowska, K. Skwarło-Son, R. Tahvonen, D. Janovska, U. Niggli, P. Nicot and C. Leifert.

Links for PFS, DH, DHE, and ECDHE and SSL in general

So many acronyms.

I have many tabs open right now that I’m about to close and I’m not great at bookmarks. Here are some of the things I’ve been reading while trying to figure out PFS in SSL.

And I just bought this book: Bulletproof SSL and TLS


Managing the Environment

This is a brief companion post to the talk I gave yesterday for Web Conference at Penn State 2014. The tagline for the conference was “the future friendly web”, and the talk covered how this web can be created with rapid, incremental improvements with defined workflows around version control, provisioning, deployment, and testing.

It’s a bit interesting that the talk links on the conference site don’t have dates in the URL. Let’s see if that web remains future friendly. :)

Thanks to everyone who attended the talk. Hopefully there were some good takeaways. Please reach out if you have any questions about any of this.

Slides: Managing the Environment – Web Conference 2014

Useful links from the presentation:


Figuring out how to serve many SSL certificates, part 2.

I’ve been pretty happy over the last couple days with our A+ score at SSL Labs. I almost got discouraged this morning when it was discovered that LinkedIn wasn’t able to pull in the data from our HTTPS links properly when sharing articles.

Their bot, `LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)`, uses an end-of-life HTTP client that happens to also be Java-based. One of our warnings in the handshake simulation area was that clients using Java Runtime Environment 6u45 did not support 2048-bit DH parameters, something that we were using. I’m not entirely sure if LinkedIn has their JRE updated to 6u45, but I’m guessing that anything below that has the same issue.

I generated a new 1024-bit dhparams file to solve the immediate issue and reloaded nginx without changing any other configs. LinkedIn can now ingest our HTTPS links and we still have an A+ score. :)

Figuring out how to serve many SSL certificates, part 1.

In the process of figuring out how to configure SSL certificates for hundreds (maybe thousands) of domains in a single nginx configuration without a wildcard certificate, I decided it would be cool to use `server_name` as a variable in the nginx configuration:

`ssl_certificate /etc/nginx/ssl/$server_name.crt;`

Unfortunately, per this aptly named request on Server Fault—nginx use $server_name on ssl_certificate path—that’s not allowed.

Nginx docs explain it more:

Variables are evaluated in the run-time during the processing of each request, so they are rather costly compared to plain static configuration.

So with that, I’m going to have to generate a bunch of `server {}` blocks that point to the correct certificate and key files before including a common config. I can’t find any examples of this yet, so I’m still wondering if there’s a better way.
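As a rough idea of what generating those blocks could look like, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: the `/etc/nginx/ssl` directory layout (one `.crt` and one `.key` per domain), the name of the common include file, and the helper function names are all hypothetical, not what is actually deployed.

```python
from pathlib import Path

# Hypothetical template. The paths and the common include file name are
# assumptions for illustration, not the real production layout.
SERVER_BLOCK = """\
server {{
    listen 443 ssl;
    server_name {domain};

    ssl_certificate     {ssl_dir}/{domain}.crt;
    ssl_certificate_key {ssl_dir}/{domain}.key;

    include {common_conf};
}}
"""


def render_server_block(domain, ssl_dir="/etc/nginx/ssl",
                        common_conf="/etc/nginx/wsuwp-common.conf"):
    """Render one static server {} block for a single domain."""
    return SERVER_BLOCK.format(domain=domain, ssl_dir=ssl_dir,
                               common_conf=common_conf)


def render_config(domains, **kwargs):
    """Render one block per unique domain, sorted for stable output."""
    return "\n".join(render_server_block(d, **kwargs)
                     for d in sorted(set(domains)))


def domains_with_certificates(ssl_dir):
    """List domains that have both <domain>.crt and <domain>.key in ssl_dir."""
    return sorted(crt.stem for crt in Path(ssl_dir).glob("*.crt")
                  if crt.with_suffix(".key").exists())


if __name__ == "__main__":
    # Example domains are placeholders.
    print(render_config(["news.example.edu", "labs.example.edu"]))
```

The output would be written to a single file that the main nginx configuration includes, so every certificate path stays static and nginx never has to evaluate a variable for it.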

I’m about to create a repository named something like WSUWP-P2-Common that contains all common P2 related plugins and/or themes for use throughout the WSUWP ecosystem.

Its purpose will be more of a built package rather than a development area. Development will still occur in individual repositories. When releases are pushed in those repositories, they can be deployed to the central package repository as well.

I feel like I’m reinventing the wheel though and that if I understood Composer enough, I could use that. But then part of me doesn’t care if I’m reinventing the wheel because it will just work with our current deploy process without much effort.

I also wonder if this is better off as a private repository. I guess if we run into a plugin that isn’t GPL compatible for some reason, we can create a separate repository for WSUWP-P2-Common-Private, but I’m hoping that isn’t the case.

If this works as a model, we’ll likely have other “package” repos in the near future for other sites.