The incident is resolved. We are making improvements to prevent similar issues in the future.
Posted Dec 07, 2020 - 20:25 UTC
Monitoring
The situation is relatively stable, though we still see some timeouts. We continue monitoring to ensure everything works as expected.
Posted Dec 07, 2020 - 19:11 UTC
Update
Many agents are connected, but a high percentage still appear unresponsive, so we will need to reconnect them gradually to reach a stable state.
Posted Dec 07, 2020 - 17:25 UTC
Update
The system is recovering, but it will need more time to stabilize and for all agents to reconnect.
Posted Dec 07, 2020 - 15:02 UTC
Identified
We have identified the component that is causing the issue and are working on a fix.