In 2018, Etsy began migrating its services to cloud infrastructure and faced the challenge of overprovisioning in its database tier. The move aimed to improve deployment processes and handle large volumes of data more efficiently.
The Overprovisioning Challenge:
- Etsy’s payments databases were growing rapidly, and by the end of 2020 two of them could no longer be scaled vertically: they had already reached the highest resource tier on Google Cloud Platform (GCP).
- The risk was that additional traffic spikes could lead to performance issues or even transaction losses.
- Etsy needed a long-term solution to address this overprovisioning challenge.
The Solution: Sharding with Vitess:
- Over the course of a year, Etsy migrated 23 tables (containing over 40 billion rows) from four payments databases into a single sharded environment managed by Vitess.
- Vitess is an open-source sharding system for MySQL, originally developed by YouTube.
- The migration involved changing the data model, managing risk, and transitioning to Vitess.
- The ideal data model for sharding is shallow, with a single root entity that all other entities reference via foreign keys. Sharding based on this root entity allows related records to be placed on the same shard, minimizing cross-shard operations.
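The co-location idea in the last point can be sketched in a few lines of Python. This is a minimal illustration, not Vitess's API and not Etsy's actual schema: the table names, the `shop_id` root key, the shard count, and the helpers `shard_for` and `place_rows` are all hypothetical. Vitess configures this kind of mapping through its own schema layer (vindexes); the sketch only shows why keying every table on the same root entity keeps related rows together.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(root_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a root entity ID to a shard index using a stable hash."""
    digest = hashlib.sha256(str(root_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical rows: every table carries the root entity's key (shop_id),
# so the shard is always derived from the same value.
rows = [
    ("shops", {"shop_id": 101}),
    ("payments", {"payment_id": 1, "shop_id": 101}),
    ("refunds", {"refund_id": 7, "shop_id": 101}),
    ("payments", {"payment_id": 2, "shop_id": 202}),
]


def place_rows(rows, num_shards: int = NUM_SHARDS) -> dict:
    """Group rows by the shard that their root entity hashes to."""
    shards = defaultdict(list)
    for table, row in rows:
        shards[shard_for(row["shop_id"], num_shards)].append((table, row))
    return dict(shards)


placement = place_rows(rows)
for shard, items in sorted(placement.items()):
    print(shard, [(table, row["shop_id"]) for table, row in items])
```

Because the shop, payment, and refund rows for shop 101 all hash on the same `shop_id`, they land on one shard, so a query or transaction scoped to a single shop does not need to cross shards.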
Results and Benefits:
- Compute Energy Savings: Etsy achieved over 50% savings in compute energy.
- Cost Reduction: Costs decreased by 42%, thanks to committed-use discounts and optimizations.
- Collaboration Tools: Etsy leveraged Google Workspace for efficient communication.
- Environmental Impact: The migration improved the sustainability of Etsy’s business.
Etsy’s journey exemplifies how strategic technology decisions can drive efficiency, scalability, and sustainability.