Open edX at Scale using Kubernetes
Last week, Morgan and I presented at DevOpsDays Boston, a conference that brings development and operations together. Professionals from organizations big and small gathered to discuss the latest tools and processes for improving and streamlining the operation of large-scale web and mobile applications. Companies that attended or sponsored included BMC Software, Chef, Puppet Labs, AppNeta, Rackspace, Constant Contact, and Athena Health.
I’ve attended a few of these conferences, which are superbly run by James Meickle, but this was the first time I’ve given a talk. Our topic was “Scaling Open edX with Kubernetes”, and we hesitated to submit it in the first place because it’s still very much a work in progress.
But we felt that our limited knowledge of Kubernetes was still more than most people attending the conference had, and sometimes a “newbie” perspective on a new technology is more refreshing than an expert talk that might go over most people’s heads. (Scroll down to the bottom of this post to see the video recording of our talk and the slides.)
Our primary goal was to provide a gentle introduction to what Kubernetes is and why you might want to use it, supported by a specific example to provide context. In our case, the example we cite in the talk is Open edX, a complex but highly scalable open source software application for authoring and delivering online courses.
Open edX is the same software that powers edX.org, a site that offers free, high-quality courses (MOOCs) from top universities all around the world. It’s similar to Coursera or Khan Academy, except that the code is 100% open source, so you can build your own white-labeled Open edX-powered site or [shameless plug] pay us to do it [/shameless plug].
Since our initial multi-container Open edX explorations almost a year ago at the Open edX 2014 conference hackathon, we’ve been scouring the Docker ecosystem to investigate solutions suitable for building a highly scalable and resilient Open edX hosting infrastructure.
After looking at various tools, including Mesosphere, Consul, Fleet, and Docker Swarm, we settled on Kubernetes as the backbone of our next-generation Open edX hosting platform.
Kubernetes is an open source project of Google, and is the foundation of their Google Container Engine service.
While Kubernetes is still a fairly new project (the 1.0 version was only declared production-ready in July), Google is a leader in containerization and was at the forefront of this technology long before Docker popularized it.
Containerization is a hot topic right now, and for good reason: containers make more efficient use of resources by allowing you to squeeze more services onto each VM. Pantheon, one of the leading WordPress and Drupal hosting companies, uses containers to power all of their customers’ sites (in fact, this blog is running on a container provided by Pantheon).
Pantheon has written some great blog posts on the subject of why containers:
- 250,000 Reasons Why Containers are Legit
- A Non-Technical Perspective on Containers
- Why we built Pantheon with Containers instead of Virtual Machines
- Why We Chose Container-Based Infrastructure for Pantheon (video interview with CTO David Strauss)
Gondor, the platform-as-a-service for Python developers, recently switched over to containers and a Kubernetes-powered hosting infrastructure.
We’ll be writing more in this blog about our progress with Open edX on Kubernetes, and we encourage you to sign up to receive updates about our Open edX SaaS offering by email.
Until next time, here is a video of our talk at DevOpsDays Boston (the first few minutes got clipped off), and the slide deck if you just want to skim through the presentation quickly.