
Scaling a High-Throughput Text Messaging Platform for Black Friday and Cyber Monday Success

Vibes' Chief Technology Officer, Brian Garofola, shares 3 key strategies used for scaling The Vibes Mobile Engagement Platform during peak holiday shopping days.

Brian Garofola
Chief Technology Officer

As the shopping frenzy we saw on Black Friday and Cyber Monday continues, businesses need to ensure their communication channels can handle the surge in demand. Text messaging continues to be a high-growth, critical channel for reaching customers with both time-sensitive information, such as fraud notifications or one-time passwords, and marketing messages like time-limited sales.

For Vibes, scaling to meet peak volumes has been essential for providing a seamless and reliable experience for our customers. During Cyber Week 2023, Vibes sent 65% more messages than we did in the same week in 2022, and 2x as many messages as we send on an average, non-Cyber Week November day. Even more impressive, our platform routed these messages with no delays, delivering 99% of marketing messages in less than 30 seconds and 99% of time-sensitive transactional messages in less than 3 seconds. These numbers exceeded our service level objectives and demonstrate Vibes' commitment to quality, reliability, and performance.

Keep reading as we break down three key strategies for scaling a high-throughput text messaging platform like Vibes during these peak shopping days.

The Vibes platform is architected for resiliency and scale.

To handle the traffic we see on Black Friday and Cyber Monday, it's crucial that we architect our platform to be both scalable and resilient, minimizing bottlenecks and single points of failure. This involves a comprehensive analysis of the system to identify potential weak points that could hinder scalability. As we perform this analysis, we apply architecture patterns and practices that have been proven to help cloud-native platforms like ours scale to meet record demand.

One key pattern we apply at Vibes is the bulkhead pattern, which borrows its name from how ships are engineered. A ship's hull is divided into watertight compartments, so if the hull is compromised, only the damaged section fills with water, which prevents the ship from sinking. The Vibes platform stores all routing information in-memory across all instances of the platform to avoid the need for expensive database queries. Based on this routing information, each message is directed to the correct resource pool. If that specific resource pool encounters issues, all other resource pools are unaffected. More specifically, issues with one carrier or one Vibes customer don’t affect the performance of other carriers or customers.
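To make the isolation concrete, here is a minimal sketch of the bulkhead idea in Python. It is an illustration under stated assumptions, not the Vibes implementation: the `Bulkhead` and `Router` classes, the pool names, the carrier names, and the capacity of 2 are all hypothetical. Each pool has its own bounded capacity, and the routing table is a plain in-memory dictionary.

```python
from collections import deque


class Bulkhead:
    """A bounded resource pool (hypothetical). A full pool rejects new work
    instead of borrowing capacity from any other pool."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.queue = deque()

    def try_accept(self, message):
        # Reject when this pool is saturated; other pools are never touched.
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(message)
        return True


class Router:
    """Directs each message to a pool using an in-memory routing table."""

    def __init__(self, routes, pools):
        self.routes = routes  # hypothetical mapping: carrier -> pool name
        self.pools = pools    # pool name -> Bulkhead

    def route(self, carrier, message):
        pool = self.pools[self.routes[carrier]]
        return pool.try_accept(message)


pools = {"pool_a": Bulkhead("pool_a", 2), "pool_b": Bulkhead("pool_b", 2)}
router = Router({"carrier_a": "pool_a", "carrier_b": "pool_b"}, pools)

# Flood carrier_a: its pool fills up and starts rejecting messages.
flood = [router.route("carrier_a", f"msg-{i}") for i in range(5)]

# carrier_b is isolated from that pressure and still accepts traffic.
other = router.route("carrier_b", "otp-1")
print(flood, other)
```

In this sketch a saturated pool simply rejects work. A production system would more likely apply backpressure or retries, but the isolation property is the same: pressure on one pool never consumes capacity in another.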

While the bulkhead pattern helps us with resiliency, we also need to consider scalability. Scalability is the measure of a system's ability to handle varying amounts of work by adding or removing resources from the system. At Vibes we employ an approach known as diagonal scaling. Every component of the Vibes platform is designed to be independently scaled horizontally (scaled out) by adding instances. We've also found that scaling up, which involves adding more resources such as CPU and memory to existing instances, is valuable, and that an appropriate level of over-provisioning enables us to better handle unexpected short surges in traffic. By scaling both up and out, that is, diagonally, we were able to absorb the enormous demands of Black Friday and Cyber Monday and provide our customers with the delivery quality that they’ve come to trust.
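The trade-off between scaling out and scaling up can be shown with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function `instances_needed`, the peak rate, the per-instance throughput figures, and the 25% headroom are assumptions chosen for the example, not Vibes numbers.

```python
import math


def instances_needed(peak_rate, per_instance_rate, headroom=0.0):
    """Instances required to serve peak_rate (messages/sec), given the
    throughput of one instance and a fractional over-provisioning headroom."""
    return math.ceil(peak_rate * (1 + headroom) / per_instance_rate)


peak = 10_000  # hypothetical peak load, messages/sec

# Scale out only: small instances, no spare capacity.
out_only = instances_needed(peak, per_instance_rate=1_000)

# Scale out with 25% headroom to absorb short, unexpected surges.
out_with_headroom = instances_needed(peak, per_instance_rate=1_000, headroom=0.25)

# Diagonal: scale each instance up (assumed to double its throughput),
# then scale out with the same headroom.
diagonal = instances_needed(peak, per_instance_rate=2_000, headroom=0.25)

print(out_only, out_with_headroom, diagonal)
```

The example assumes throughput grows linearly with instance size, which real workloads rarely achieve exactly; measuring per-instance throughput at each size is the safer input.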

The Vibes cloud-based platform delivers speed and availability.

In the tech world there is talk of companies choosing to move away from the cloud and migrate back to on-premises infrastructure. For Vibes, the benefits of a public cloud provider – on-demand scalability, global availability, managed services, and more – were absolutely critical to our success during Cyber Week 2023.

As demand surged on Black Friday, we began to see elevated resource usage in our data stores. We were able to quickly respond to that condition by upscaling the data stores to alleviate the pressure. We also run a number of managed services, and because we don’t carry the operational burden of these services, our teams were able to focus their energy on the differentiation that Vibes provides to our customers.

Operating at this scale also requires a commitment to fast and reliable message delivery, from information on the latest sale to fraud notifications to one-time passwords. No matter the message type, just a few seconds of downtime results in many thousands of missed messages, which can make all the difference between a good customer experience and a bad one that a brand’s consumers won’t forget.

Our platform is deployed in a highly available configuration, running in 3 geo-diverse regions and in 3 availability zones per region. This configuration enables the platform to tolerate failures at the server, availability zone, or region level and ensures that customers’ messages are still delivered even during widespread cloud service provider outages.

A reliable messaging platform brands can count on.

Having the technology to manage the scale of Black Friday and Cyber Monday is one thing, yet it's also essential to understand the shape of the traffic over the course of the day. This is why our data science team uses discrete event simulation, a powerful tool for modeling and analyzing the behavior of complex systems, to help solve for many of the different variables involved in providing a high-quality messaging service. By modeling the rate of incoming traffic from our customers, messaging queue depths, and the rate of outgoing traffic to the wireless carriers, these simulations help to predict how the system will perform under different conditions.

These predictions were invaluable in planning for Cyber Week: they enabled us to identify potential bottlenecks and make informed decisions on resource allocation and system optimizations, ensuring that our platform was able to respond dynamically to changing conditions throughout peak messaging periods.
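For readers unfamiliar with the technique, the following is a minimal discrete event simulation of a single message queue, written in Python. It is a sketch under simplifying assumptions, not the model Vibes runs: the function `simulate`, its parameters, the Poisson arrival process, the fixed per-message send time, and the rates used are all hypothetical. Events are kept in a time-ordered heap and processed one at a time, which is the core of the method.

```python
import heapq
import random


def simulate(arrival_rate, service_rate, duration, seed=0):
    """Simulate one queue feeding one outbound carrier connection.

    arrival_rate: mean inbound messages/sec (assumed Poisson arrivals)
    service_rate: outbound messages/sec (assumed fixed time per message)
    duration:     seconds during which new messages arrive
    """
    rng = random.Random(seed)
    events = []  # min-heap of (time, sequence, kind)
    seq = 0

    # Schedule every arrival up front from exponential inter-arrival gaps.
    t = rng.expovariate(arrival_rate)
    while t < duration:
        heapq.heappush(events, (t, seq, "arrival"))
        seq += 1
        t += rng.expovariate(arrival_rate)

    service_time = 1.0 / service_rate
    waiting = 0    # messages queued, excluding the one being sent
    busy = False   # True while a message is being sent
    arrived = sent = max_depth = 0

    # Process events in time order until none remain (the queue drains).
    while events:
        now, _, kind = heapq.heappop(events)
        if kind == "arrival":
            arrived += 1
            if busy:
                waiting += 1
                max_depth = max(max_depth, waiting)
            else:
                busy = True
                heapq.heappush(events, (now + service_time, seq, "departure"))
                seq += 1
        else:  # departure: one message has been handed to the carrier
            sent += 1
            if waiting > 0:
                waiting -= 1
                heapq.heappush(events, (now + service_time, seq, "departure"))
                seq += 1
            else:
                busy = False

    return {"arrived": arrived, "sent": sent, "max_queue_depth": max_depth}


# Same inbound traffic (same seed), two hypothetical outbound capacities.
tight = simulate(arrival_rate=100, service_rate=110, duration=10, seed=1)
roomy = simulate(arrival_rate=100, service_rate=200, duration=10, seed=1)
print(tight, roomy)
```

Comparing the two runs shows how outbound capacity affects the deepest backlog for identical inbound traffic. A more realistic model would vary the arrival rate over the day and include multiple queues and carriers.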

Simulation output example

While this turned out to be a record for Vibes in terms of holiday message volume, we have been setting new records every holiday season for years as we see texting continue to soar as a marketing channel. And every year, our approach to this holiday week has been consistent: make sure that we deliver the uptime, speed, and deliverability our customers expect and deserve.

By eliminating bottlenecks, leveraging the advantages of public cloud services, and using discrete event simulation to forecast traffic surges, we were once again able to ensure a reliable and seamless experience for our customers and their text message subscribers during the busiest shopping days of the year.
