13 January: GitBook search incident post-mortem
Between Friday 10th and Monday 13th January, sites published with GitBook suffered a degraded search experience. During these 3 days, we estimate that around a quarter of searches returned "no results" - even though content matching the search existed.
What happened
On the 10th, we updated a third-party library used in the infrastructure of GitBook sites. The new version of the dependency contained an undocumented change that caused search requests to occasionally return no results. This didn't come up during our automated tests or QA due to the inconsistent nature of the issue. Later that day, after realising this update was the source of the issue, we reverted the change whilst we investigated further.
To keep your docs loading fast, we ensure GitBook content is available in multiple locations across the globe. Normally, we see changes deployed across all data centres within a few minutes. However, even after the revert, we continued to receive reports of empty searches later in the day on the 10th.
Over the weekend, we investigated why our CDN was lagging behind and identified a configuration issue with our CDN provider that prevented the revert from being pushed out as quickly as normal. After we resolved the issue with our CDN provider on Sunday, changes were pushed out at the speed we expected. By the end of Monday, all data centres were up-to-date.
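To illustrate the kind of check that can reveal lagging data centres, here is a minimal sketch. It is not our actual tooling: the location names, the version strings, and the `get_version` callable are all hypothetical, and a real check would query each location over the network.

```python
from typing import Callable, Dict, List


def find_stale_locations(
    locations: List[str],
    get_version: Callable[[str], str],
    expected_version: str,
) -> Dict[str, str]:
    """Return {location: served_version} for every location that is
    not yet serving expected_version."""
    stale: Dict[str, str] = {}
    for location in locations:
        served = get_version(location)
        if served != expected_version:
            stale[location] = served
    return stale


# Hypothetical snapshot: two locations have picked up the revert, one has not.
snapshot = {"eu-west": "v41", "us-east": "v41", "ap-south": "v42"}
stale = find_stale_locations(list(snapshot), snapshot.__getitem__, "v41")
print(stale)
```

Passing the lookup in as a callable keeps the comparison logic separate from how each location is actually queried.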
Action items
As the incident took place over the weekend, our response time was slightly slower - especially as we had to reach out to our CDN provider. We'll work internally on a better procedure for handling issues outside of working hours.
We're working much more closely with our CDN provider to audit our configuration and make sure GitBook content can continue to scale across the globe without the risk of downtime.
We've improved the robustness of our QA passes for published content - including specific tests for search and Ask AI.
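As a rough illustration of why an intermittent failure can slip past a single-shot test, the sketch below repeats each query many times and measures how often the result is empty. It is an assumption-laden example rather than our real QA suite: `empty_result_rate`, `make_flaky_search`, the query text, and the threshold are all hypothetical.

```python
from typing import Callable, List, Sequence


def empty_result_rate(
    search: Callable[[str], Sequence[object]],
    queries: List[str],
    attempts: int = 20,
) -> float:
    """Run every query `attempts` times and return the fraction of
    calls that came back with no results."""
    total = 0
    empty = 0
    for query in queries:
        for _ in range(attempts):
            total += 1
            if len(search(query)) == 0:
                empty += 1
    return empty / total if total else 0.0


def make_flaky_search(fail_every: int) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a search backend: every `fail_every`-th
    call returns no results, even though matching content exists."""
    calls = {"count": 0}

    def search(query: str) -> List[str]:
        calls["count"] += 1
        if calls["count"] % fail_every == 0:
            return []
        return ["page matching " + query]

    return search


rate = empty_result_rate(make_flaky_search(4), ["install"], attempts=20)
MAX_EMPTY_RATE = 0.01  # hypothetical threshold for queries known to match content
print(rate, rate <= MAX_EMPTY_RATE)
```

A test that runs each query once would pass three times out of four against this stand-in, which is why the sketch asserts on a rate over many attempts instead.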