Case Study: 4 Stages of Engineering Growth at Pinterest
In the early days of Pinterest, the company was faced with the same question that faces any system designer: how does one scale a system for efficiency, speed, and concurrency? The team’s answer to this question is detailed in a lovely 2013 article on High Scalability. Of particular interest to product managers (I am one of those, after all) are the 4 stages of growth. These stages provide a fantastic model by which any digital product can be scaled.
March 2010 – The Age of Finding Yourself
When Pinterest was soft-launched, the team was small and had few resources, but it also had a small user base. This was an excellent time to solidify the product requirements and iterate rapidly on ideas. During this stage, Pinterest didn’t take its final form, but the team developed a solid prototype on which to build.
At launch, the tech stack looked like this (these tech breakdowns are sourced directly from the article cited earlier):
- 2 founders
- 1 engineer
- Rackspace
- 1 small web engine
- 1 small MySQL DB
10 months later in January 2011, it looked like this:
- Amazon EC2 + S3 + CloudFront
- 1 NGinX, 4 Web Engines (for redundancy, not really for load)
- 1 MySQL DB + 1 Read Slave (in case master goes down)
- 1 Task Queue + 2 Task Processors
- 1 MongoDB (for counters)
- 2 Engineers
As the system design shows, the architecture barely grew during 2010. The team was still running on a single MySQL database (not counting Mongo) behind a single NGinX front end. Rather than scaling, Pinterest’s team spent this period sketching the final vision of their product.
Sept 2011 – The Age of Experimentation
The focus of this stage is all in the name. Experimentation. During this time, Pinterest’s explosive growth took off. Every month, their user base was doubling. Such demand necessitated modifications and additions to the tech stack. The best solutions to each problem were not clear, so the team relied upon experimentation. Various technologies were integrated with varying success. Much as a product manager implements A/B testing to see what works, Pinterest tried various approaches to the same problem to see what worked.
As a result, the tech stack got very complicated, very quickly:
- Amazon EC2 + S3 + CloudFront
- 2 NGinX, 16 Web Engines + 2 API Engines
- 5 Functionally sharded MySQL DB + 9 read slaves
- 4 Cassandra Nodes
- 15 Membase Nodes (3 separate clusters)
- 8 Memcache Nodes
- 10 Redis Nodes
- 3 Task Routers + 4 Task Processors
- 4 Elastic Search Nodes
- 3 Mongo Clusters
- 3 Engineers
Note FIVE different databases! The purpose of throwing five different databases into the stack is not to scale all five. It’s to identify which database works the best and run with that one. During such meteoric growth, it was probably a difficult decision to perform these experiments rather than just scale the MySQL they started with. However, achieving optimal performance is not just ideal, it’s vital when hyperscaling.
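For readers unfamiliar with the "functionally sharded" MySQL entry in the list above: functional sharding generally means splitting data by feature area, so that each group of tables lives on its own database server. Here is a minimal sketch of the idea in Python. The table names and database names are hypothetical; neither this post nor its source list says which tables Pinterest placed on which of the five databases.

```python
# Hypothetical mapping of feature areas to dedicated MySQL databases.
# These names are illustrative only, not Pinterest's actual schema.
FUNCTIONAL_SHARDS = {
    "users": "mysql-users",
    "pins": "mysql-pins",
    "boards": "mysql-boards",
    "follows": "mysql-follows",
    "comments": "mysql-comments",
}


def database_for(table: str) -> str:
    """Return the database that owns a table under functional sharding."""
    try:
        return FUNCTIONAL_SHARDS[table]
    except KeyError:
        # Every table must be assigned to exactly one database up front.
        raise ValueError(f"no shard assigned for table {table!r}") from None
```

A call such as `database_for("pins")` tells the application which server to query. The trade-off is that queries spanning two feature areas can no longer be joined inside a single database.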
January 2012 – The Age of Maturity
In the Age of Maturity, Pinterest began to whittle down their convoluted stack to shape what would become their efficient, refined, and “mature” model. The team settled on MySQL for a primary database. The main effort of this stage in Pinterest’s growth was to get rid of things that didn’t work well, and to grow the things that did. This may seem like common sense, but the answers are only clear from the lessons learned during experimentation.
As a result, you see far fewer distinct technologies in the tech stack, but far more instances of each technology that remains:
- Amazon EC2 + S3 + Akamai, ELB
- 90 Web Engines + 50 API Engines
- 66 MySQL DBs (m1.xlarge) + 1 slave each
- 59 Redis Instances
- 51 Memcache Instances
- 1 Redis Task Manager + 25 Task Processors
- Sharded Solr
- 6 Engineers
October 2012 – The Age of Return
I liken this stage to the best part of driving a race car. You’ve spent some time to understand the controls, and now it’s time to lay on the gas pedal. Unsurprisingly, this is the best part of system design. Rapid scaling of the tech stack is designed to match rapid scaling of the user base.
You can see that the tech stack is composed of the same components, but in far greater numbers:
- Amazon EC2 + S3 + Edge Cast, Akamai, Level 3
- 180 Web Engines + 240 API Engines
- 88 MySQL DBs (cc2.8xlarge) + 1 slave each
- 110 Redis Instances
- 200 Memcache Instances
- 4 Redis Task Managers + 80 Task Processors
- Sharded Solr
- 40 Engineers (and growing)
The most important lesson here is that once the stack works, you can grow by simply adding more of the same thing. As the article says, “you want to be able to scale by throwing money at the problem”. At that stage, there is no need to evaluate tech X against tech Y, to do much cost analysis, or to train your team on something unfamiliar. By identifying what works well and simply growing it, Pinterest embraced one of the core tenets of scalable technology.
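To make "more of the same thing" concrete, the counts from the January 2012 and October 2012 lists can be compared directly. The short Python sketch below is my own arithmetic on the figures quoted in this post; the dictionary keys and the function name are labels I chose for illustration.

```python
# Component counts copied from the two stack listings in this post.
JAN_2012 = {
    "web engines": 90,
    "API engines": 50,
    "MySQL DBs": 66,
    "Redis instances": 59,
    "Memcache instances": 51,
    "task processors": 25,
    "engineers": 6,
}
OCT_2012 = {
    "web engines": 180,
    "API engines": 240,
    "MySQL DBs": 88,
    "Redis instances": 110,
    "Memcache instances": 200,
    "task processors": 80,
    "engineers": 40,
}


def growth_factors(before: dict, after: dict) -> dict:
    """Return after/before for every component, rounded to 2 decimals."""
    return {name: round(after[name] / before[name], 2) for name in before}


if __name__ == "__main__":
    for name, factor in growth_factors(JAN_2012, OCT_2012).items():
        print(f"{name}: x{factor}")
```

Web engines doubled and API engines grew 4.8 times over those nine months, while the set of technologies stayed the same. The MySQL count understates the change, since the instance type also moved from m1.xlarge to cc2.8xlarge.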
Feature photo by Detlef Hansen – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=86439417