This incident is fully resolved and CDRs are caught up.
Following an overnight configuration change, one of our customers was generating repeated outbound call attempts, sometimes thousands, back to their own number for each inbound call it received. This was configured across multiple numbers, two of which were affected in separate incidents this morning. The consequence was tens of thousands of additional outbound calls in flight at any one time, which were then coming back in to the Simwood network. To make matters worse, their outbound calls were also egressing over other routes, looping back, and coming into the Simwood network from various other carriers. We show over 1 million such calls in the first hours of this morning.
The customer's equipment was then overloaded, causing calls to eventually time out, which of course compounded the number in flight. Calls were rejected here due to rate and channel limits, but the rate and amplification were such that this did not fully alleviate the problem. On the Simwood side, this caused load issues, predominantly in London, which manifested as increased post-dial delay (PDD) for customers. Based on reports so far and what we have seen, other Availability Zones were unaffected, although we were seeing this traffic across all of them.
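As a rough illustration of why rate and channel limits alone do not remove the load from a call loop, here is a minimal sketch of a per-customer limiter. It is not Simwood's implementation: the `CallLimiter` class, its parameters, and the numbers used are all hypothetical. It combines a token bucket (rate limit) with a cap on concurrent calls (channel limit).

```python
from dataclasses import dataclass


@dataclass
class CallLimiter:
    """Hypothetical limiter: token-bucket rate limit plus a channel cap."""
    rate_per_sec: float   # sustained rate of new calls allowed (assumed)
    burst: float          # token bucket capacity (assumed)
    max_channels: int     # maximum concurrent calls (assumed)
    tokens: float = 0.0
    last: float = 0.0
    active: int = 0

    def __post_init__(self) -> None:
        # Start with a full bucket.
        self.tokens = self.burst

    def try_start(self, now: float) -> bool:
        """Return True if a new call may start at time `now` (seconds)."""
        # Refill tokens for the time elapsed since the last attempt.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.active >= self.max_channels or self.tokens < 1.0:
            return False  # rejected, but the attempt still had to be handled
        self.tokens -= 1.0
        self.active += 1
        return True

    def end(self) -> None:
        """Release a channel when a call finishes."""
        if self.active > 0:
            self.active -= 1


def simulate_burst(attempts: int) -> tuple[int, int]:
    """Offer `attempts` simultaneous calls; return (accepted, rejected)."""
    limiter = CallLimiter(rate_per_sec=10, burst=10, max_channels=5)
    accepted = sum(limiter.try_start(0.0) for _ in range(attempts))
    return accepted, attempts - accepted
```

With these assumed limits, `simulate_burst(1000)` accepts 5 calls and rejects 995. Every rejected attempt still has to be received and answered with a rejection, so a sufficiently amplified loop creates load even when almost all of its calls are refused.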
In both cases the numbers were blocked here, which restored the situation to normal. Separately, BT NMC mitigated a problem this caused for them by 'gapping' (a.k.a. rate-limiting) the affected number range on their network for calls destined for Simwood that had originated from other carriers we do not have bilaterals with.
Posted Jul 31, 2019 - 09:54 UTC
Calls should be completing normally now but we’re aware there is a delay in CDRs being processed.
Posted Jul 31, 2019 - 08:03 UTC
We are investigating sporadic reports of calls failing. More information will be provided as soon as it becomes available.