Consistency and Timeliness of Data in Distributed Computing
This story uses an example of dynamic pricing in the low-cost airline ticketing industry to show why the consistency and timeliness of data are essential to a concurrent system.
I’ll use my experience of trying to purchase a low-cost airfare to show how a failure to keep data synchronized led to a poor customer experience. I’ll then describe how I implemented a concurrent system that demonstrates big data concepts on a small scale.
This is an accessible article for technology managers, engineers, and others interested in distributed computing, big data, customer experience, and dynamic pricing.
What We Need from Data
When it comes to managing data in concurrent systems there are two key requirements: consistency and timeliness. In this article we use the example of low-cost airfares — which are based on big data — to illustrate what happens when a distributed computing platform fails to meet those two requirements.
Glossary of Terms
In my previous article, Ten Key Terms in Distributed Computing, I go over a basic vocabulary for understanding the low-cost airlines example.
- Multitasking & Multithreading
- Nodes & Clusters
- Distributed Computing
No Ticket at Any Price
A few weeks ago I found a one-way ticket to Barcelona for $205 on SkyScanner. SkyScanner is one of several travel fare aggregators and metasearch engines that list low-cost tickets from different airlines and resellers. I thought $205 was a great price, so I attempted to purchase the ticket.
I filled out the payment form with my credit card information, clicked the submit button, and received a response page with a confirmation number.
Then the web page displayed a message that there was a problem with my payment. When I tried to re-enter my payment details, the web server refreshed the page. My transaction, along with the confirmation number and payment form, had disappeared, and the price on the web page for the same seat had changed to $235.
I called the airline. The ticket agent said my confirmation number did not exist, but he did find the ticket for $235 and tried to buy it. It wasn’t available. But there was one available for $299. Actually, no: the ticket agent couldn’t buy that one either. But he was persistent and kept looking. Even though the tickets were listed as available for sale, he could not buy a ticket at any price. We quickly reached $399 for a one-way ticket to Spain, at which point we decided to try again later.
In another twist, I was charged for a ticket I didn’t purchase by an airline I didn’t recognize.
What’s Going on Here?
In my example, one concurrent process issued a confirmation number before another process completed processing my payment. Thousands of buyers competing for the same low-priced ticket, combined with the poorly implemented update to system-wide data, caused the system to keep raising the price of seats without enabling their purchase.
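To make the ordering problem concrete, here is a minimal sketch in Python of one way to avoid it. I have no knowledge of how the airline's or SkyScanner's systems are actually built, so everything here is hypothetical: the `SeatInventory` class, the `purchase` method, and the `charge_card` callback are names invented for illustration, and an in-memory lock stands in for whatever a real distributed platform would use. The point is only the sequence: hold the seat atomically, take payment, and issue a confirmation number last.

```python
import threading
import uuid


class SeatInventory:
    """Hypothetical in-memory inventory for one fare (illustration only)."""

    def __init__(self, seats, price):
        self._lock = threading.Lock()
        self._seats = seats
        self._price = price

    def purchase(self, charge_card):
        """Hold a seat, charge the card, and only then issue a confirmation."""
        with self._lock:
            if self._seats == 0:
                return None  # sold out: no confirmation number is issued
            self._seats -= 1  # hold the seat at the price the buyer was quoted
            price = self._price

        if not charge_card(price):
            with self._lock:
                self._seats += 1  # payment failed: release the held seat
            return None

        # The confirmation number exists only after payment has succeeded.
        return {"confirmation": uuid.uuid4().hex, "price": price}

    def seats_left(self):
        with self._lock:
            return self._seats


# Fifty buyers compete for a single $205 seat.
inventory = SeatInventory(seats=1, price=205)
results = []


def buyer():
    # Stub payment processor that always approves the charge (an assumption).
    results.append(inventory.purchase(lambda amount: True))


threads = [threading.Thread(target=buyer) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

confirmed = [r for r in results if r is not None]
print(len(confirmed), "confirmation(s) issued;", inventory.seats_left(), "seat(s) left")
```

In this sketch exactly one buyer receives a confirmation number, and the other forty-nine are told the seat is gone instead of being shown a confirmation that later vanishes. A production system would also need a time limit on held seats, which this example leaves out.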
Poorly Implemented Dynamic Pricing
Dynamic pricing is when a company changes its prices in response to demand. In theory this provides lower prices when demand is low, and higher prices when there are more buyers.
To effectively implement dynamic pricing, a software platform must concurrently track inventory and prices, current buyers, and market trends as a whole — including real-time competitor prices — then present the data to customers in a consistent and timely manner.
One Example of Big Data Concurrency at a Small Scale
Small-scale data still has to be correct in real time. Consider the following example.
I created a time clock app. Managers could see where HVAC contractors were in real time on a Google Map. To achieve this, the HVAC contractors’ apps continually updated a shared database. Business logic kept workers from logging in if they were already logged in.
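The rule that a worker cannot log in twice can be sketched as follows. This is not the code from the time clock app; it is a minimal illustration in Python that assumes a SQLite database, a hypothetical `open_shifts` table, and hypothetical `clock_in` and `clock_out` functions. The idea is to let the database enforce the rule with a primary key, so that two simultaneous login attempts cannot both succeed.

```python
import sqlite3

# In-memory database for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE open_shifts ("
    "  worker_id TEXT PRIMARY KEY,"  # at most one open shift per worker
    "  clocked_in_at TEXT NOT NULL"
    ")"
)


def clock_in(conn, worker_id, timestamp):
    """Return True if the worker was clocked in, False if already clocked in."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO open_shifts (worker_id, clocked_in_at) VALUES (?, ?)",
                (worker_id, timestamp),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a second open shift for this worker.
        return False


def clock_out(conn, worker_id):
    """Return True if an open shift was closed, False if there was none."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM open_shifts WHERE worker_id = ?", (worker_id,)
        )
    return cursor.rowcount == 1


first = clock_in(conn, "worker-17", "2024-01-01T08:00:00")
second = clock_in(conn, "worker-17", "2024-01-01T08:00:05")
print(first, second)
```

Putting the check inside the database, instead of reading the table first and inserting afterwards, avoids a window in which two requests could both see "not logged in" and both proceed.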
Thirty timesheets per day is small data. Yet the implementation of this system demonstrates how data integrity enables a concurrent application to deliver both efficiency and a high-quality user experience.
Concurrency at Scale
This map shows all American Airlines flights. There are so many AA flights in the USA that the country is blocked out.
American Airlines has to manage all this flight data concurrently. This means providing consistent and timely data to travel agents, ticket salespeople, web sites, resellers, affiliates, and aggregators like SkyScanner.
And this map shows the scale of just one airline’s data. (Map source: https://www.flightsfrom.com/AA)
Have It Both Ways
My experience illustrates what happens when all the users of a highly distributed system don’t have access to the most current data. When data is managed properly, the software platform both optimizes revenues and delivers a quality customer experience.