Facebook Says Outage Was Caused by a Series of Errors

A series of errors made during maintenance on Facebook’s network caused the outage that took its services offline on Monday, the company said in a blog post published on Tuesday.

Facebook’s family of apps, which includes Instagram, WhatsApp and Messenger, was offline for more than five hours as employees scrambled to repair the damage. More than 3.5 billion people use Facebook’s services to communicate with friends and family around the world, distribute political messages, and expand their businesses through advertising and outreach.

Santosh Janardhan, Facebook’s vice president of infrastructure, wrote in the blog post that the first problem occurred in what he called the “backbone” network, which connects Facebook’s data centers around the world.

During maintenance of that network, a command was issued to assess how much capacity was available. But the command backfired, disconnecting the network and preventing Facebook’s data centers from communicating with one another, Mr. Janardhan said. An audit tool designed to catch incorrect commands failed to detect the error, he added.

But that was just the beginning of the problems. “This change has resulted in a complete disconnection of our server connections between our data centers and the internet,” Mr. Janardhan wrote. “And this complete loss of connection caused a second problem that made things worse.”

With Facebook’s data centers offline, the company’s servers that manage its web addresses were unreachable as well. “This made it impossible for the rest of the internet to find our servers,” Mr. Janardhan wrote.

As the extent of the outage became clear, Facebook engineers struggled to restore service because the company’s data centers were heavily secured and employees could not gain immediate access.

“We’ve done extensive work on hardening our systems to prevent unauthorized access, and it was interesting to see how this hardening slowed us down as we tried to recover from an outage that was caused not by malicious activity but by our own mistake,” Mr. Janardhan wrote.

Once engineers gained access to Facebook’s data centers and got to work, they were able to restore the network. But Mr. Janardhan said they had to bring servers back online gradually so as not to overwhelm the system.

The company added that it plans to examine how the outage occurred and create drills that will allow employees to fix Facebook’s systems faster.
