New hardware is the most visible sign of technological progress as computers become more powerful. But even with the shiniest hardware, the software that plays a critical role in many systems is often far too old.

Southwest Airlines wasn't able to return to business as usual the way other airlines did after the winter storm. A week after the storm passed, more than 2,300 flights were canceled.

It has been an open secret within Southwest that the company desperately needed to update its scheduling systems. Southwest unions had warned about the software flaws that contributed to earlier, smaller-scale meltdowns. Without more government regulation and oversight, we may see more fiascos like this one, which appears to have stranded hundreds of thousands of Southwest passengers over Christmas week. And the problem is not limited to a single company or industry.

Relying on older or deficient software that needs updating is known as incurring technical debt: there is a gap between what the software needs to be and what it is. Aging code is a common cause of technical debt in older companies, but it can also be found in newer systems, because software can be written in a rapid and shoddy way rather than a careful and resilient one. The rapid and shoddy way is cheaper and quicker.

It is like constructing a building. If you had the option, it would be cheaper and quicker to skip strict earthquake or fire codes. As long as there was no earthquake or fire, the building would look and feel the same. But if an earthquake or fire did come, the inhabitants of the building would be the ones to pay the debt.

Back to Southwest. The flight attendants' union picketed in front of airports as part of its negotiations. One of the protest signs carried a graphic of a stuck software progress bar. A few months ago, the union put the sign on the side of a truck and drove it around Love Field in Dallas and the nearby Southwest headquarters. In March, the union ranked updating the creaking scheduling technology above its demands for increased pay.

During an earlier Southwest cancellation crisis, the president of the pilots' union pointed out that the antiquated crew-scheduling technology was leading to cascading disruptions. Gary Kelly, then the airline's C.E.O., conceded that Southwest's tools could use improvement even as he objected to the pilots' claims.

It appears that the improvement didn't happen.

According to the president of Southwest's flight attendants' union, when there is a weather event, the employees have to go through a lengthy process to get things fixed.

If a crew from Buffalo doesn't arrive in Baltimore because their flight was canceled, the employees have had to manually call in to let the company know where they are and get hotels arranged for them.

Lyn Montgomery told me that employees were left on the phone for three, six, seven, eight, 12 hours, and in one case 17 hours, just to let the company know where they were and get hotel rooms arranged. The Federal Aviation Administration requires a certain amount of rest between flights. So even if crew members were at an airport with a flight that needed them, they weren't allowed to fly once they finally got through to the company. Employee accounts of such misery can be found in online forums.

Southwest would have to find a new crew in Baltimore to replace the one that didn't arrive from Buffalo. Potential candidates in Baltimore may be on hold for hours trying to let the company know of their location.

This week it cascaded to a systemwide halt.

You might be wondering why anyone has to call in at all, since the company should know exactly which flights got canceled and who flew where, based on passenger lists. Montgomery says that Southwest had an old system that broke down, forcing employees to call in.

The crews can notify the company of their actual location via an app or website, but they can't get their hotel assignments that way. The vice president for product strategy at Arcos told me that the company sells workforce management software to airlines and other companies. There is more than one layer of software that has to be written and integrated into an airline's scheduling software.

Southwest acknowledges that technology played a role in the debacle, but it doesn't acknowledge the past decisions that contributed to why this happened now. Chris Perry, a Southwest spokesman, told me that the airline's systems were overwhelmed by the disruption. He said the magnitude and scale of the disruptions made it difficult for the company's technology to align its resources, and that crew schedulers tackled the issue manually, a tedious, long process that takes time and trained resources to complete.

Customers at the check-in area for Southwest Airlines at Denver International Airport on Thursday. Credit: Matthew Staver for The New York Times

External events like weather, along with the fact that Southwest runs more "point-to-point" flights than most airlines, can contribute to such breakdowns. But the point-to-point model doesn't fully explain why Southwest couldn't fly its regular schedule until a week after the storm.

Southwest didn't update its systems.

If you are a corporate executive whose compensation is tied to stock prices and earnings statements released every three months, there are strong incentives to address any immediate problem by simply adding duct tape and wire to what you already have. You can cross your fingers that if something bad happens, it will happen under someone else's watch. The plight of a company's customers and employees has been decoupled from the fortunes of its current top executives.

Kelly's compensation as Southwest's C.E.O. was a record $9.2 million in 2020, even though the company lost more than $3 billion because of the pandemic and compensation for the median employee fell. The company said his compensation was set before the pandemic hit. The company also spent $8.5 billion of its excess cash buying back its own stock, a common practice among airlines that helps increase the stock's value. Southwest received billions of dollars in grants and low-interest loans from the government after the pandemic struck. Kelly, an accountant who became the C.E.O. of Southwest in 2004, retired earlier this year with an estimated net worth in the tens of millions of dollars.

Many people point to the Y2K scare when talking about technical debt. When computer memory was expensive, software programs used two digits, instead of four, to indicate the year. That wasn't going to work in the new millennium, when confusion between 1905 and 2005 could cause programs to glitch or crash.
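To make the two-digit shortcut concrete, here is a minimal sketch in Python. It is an illustration only, not code from any real system: the function names are hypothetical, and the "windowing" repair with a pivot of 50 is just one assumed way of widening two-digit years back to four.

```python
def years_elapsed_two_digit(start_yy: int, end_yy: int) -> int:
    """Elapsed years when only the last two digits of each year are stored."""
    return end_yy - start_yy


def expand_with_window(yy: int, pivot: int = 50) -> int:
    """Hypothetical 'windowing' repair: treat 00-49 as 20xx and 50-99 as 19xx.

    The pivot of 50 is an assumption chosen for illustration.
    """
    return 2000 + yy if yy < pivot else 1900 + yy


# 1905 and 2005 become indistinguishable once the century is dropped.
same_stored_value = (1905 % 100) == (2005 % 100)

# Something that started in 1998, checked in 2005, using two-digit years.
naive = years_elapsed_two_digit(98, 5)

# The same calculation after expanding both years to four digits.
repaired = expand_with_window(5) - expand_with_window(98)

print(same_stored_value, naive, repaired)
```

With two-digit storage the elapsed time comes out negative, which is the kind of result that can make a program glitch or crash. The windowed version gives the expected answer, but only for years that fall inside the assumed window.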

Some people take this to imply that technical debt is not a big deal. But we made it through Y2K intact precisely because we didn't ignore the problem: $100 billion was spent in the U.S. to fix the underlying issue. These are the kinds of efforts that don't get a lot of attention.

The Y2K episode is also a great example of how quickly hardware has advanced. My not-that-fancy phone has 128 gigabytes of memory, which would once have been unimaginable. Yet software written in the era when saving two digits mattered is still running many systems today. Software doesn't come along by itself, and that shouldn't be overlooked.

We haven't built a regulatory environment in which companies have incentives to address technical debt rather than passing the burden on to customers, employees or the next management.

What would such incentives look like? They would differ by industry. Airlines could be held responsible for the problems they cause the public. They could be required to compensate passengers for delayed or canceled flights that are not caused by weather or other events outside their control. Where such rules have been tried, though, implementation has hit a lot of problems.

It is possible for companies to be fined for major failures. The fines will be seen as a cost of doing business if they are too small.

Equifax agreed to pay a penalty of at least $575 million to the FTC after it failed to install a security update for its software. That came to just a few dollars per affected customer and a small portion of the company's revenue in the year after the hack. The company would surely have preferred not to be fined, but it was a cost it could absorb. Richard Smith resigned from his position as C.E.O. He still collected $18 million in pension money despite the failure and the fine.

We can't keep turning over the operation of more and more of our infrastructure and lives to outdated software and self-interested executives. Technical debt is real debt. Someone is going to pay it eventually. If we don't hold companies and executives accountable for preventable failures, we'll be the ones who have to pay.