[getdns-api] UDP failover improvements

Robert Groenenberg robert.groenenberg at broadforward.com
Wed Feb 28 10:56:39 UTC 2018


Hi Willem, Sara,

To improve (in our view) the failover/retry behaviour of getdns towards 
UDP upstreams, we've made one fix and two enhancements:

1) Restrict the back_off value of an upstream to a configurable maximum. 
This prevents the back_off value (doubled at each timeout for an 
upstream) from growing until it rolls over. We didn't want the 
interval for retrying an upstream to grow to values like 2^16 or bigger 
when that upstream had an outage. Note that the retry interval is still 
counted in 'query attempts'; perhaps we want to make it time-based at 
some point.

2) When an upstream has been unavailable and is later found to be OK, 
its back_off value is currently not reset. So on a subsequent timeout 
the back_off continues from the value left by the previous failure. We 
consider this a bug, and our fix resets the back_off value once the 
upstream responds again.

3) When all configured upstreams of a context are unavailable, in our 
view it makes more sense to retry them in a round-robin fashion instead 
of sticking to the back_off values (especially when one became 
unavailable earlier than another). With the original backoff mechanism, 
one unavailable upstream may be tried hundreds or thousands of times 
before another one is given a try, even though the latter may be 
available again. Switching to round-robin once all upstreams have been 
unavailable for a number of attempts leads to faster recovery.
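
As a rough illustration of the three changes above, here is a small 
Python sketch. It is not the getdns C code. The names MAX_BACKOFF, 
RR_THRESHOLD, Upstream, Selector and skip, the starting back_off of 1, 
and the exact selection order are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List

MAX_BACKOFF = 16   # hypothetical configurable cap, in query attempts
RR_THRESHOLD = 3   # hypothetical: all-down attempts before round-robin


@dataclass
class Upstream:
    name: str
    back_off: int = 1  # current back-off interval, in query attempts
    skip: int = 0      # attempts left before a normal retry (0 = usable)


def record_timeout(up: Upstream) -> None:
    # Change 1: double the back-off on each timeout, but never past the cap.
    up.back_off = min(up.back_off * 2, MAX_BACKOFF)
    up.skip = up.back_off


def record_success(up: Upstream) -> None:
    # Change 2: a working upstream starts from a clean back-off again.
    up.back_off = 1
    up.skip = 0


class Selector:
    def __init__(self, upstreams: List[Upstream]) -> None:
        self.upstreams = upstreams
        self.all_down_attempts = 0
        self.rr_index = 0

    def pick(self) -> Upstream:
        # Every query attempt brings backed-off upstreams closer to a retry.
        for up in self.upstreams:
            if up.skip > 0:
                up.skip -= 1

        # Normal case: use the first upstream that is not backed off.
        for up in self.upstreams:
            if up.skip == 0:
                self.all_down_attempts = 0
                return up

        # All upstreams are backed off.
        self.all_down_attempts += 1
        if self.all_down_attempts >= RR_THRESHOLD:
            # Change 3: stop honouring back-off and rotate through them.
            up = self.upstreams[self.rr_index % len(self.upstreams)]
            self.rr_index += 1
            return up

        # Before the threshold, try the upstream closest to its retry.
        return min(self.upstreams, key=lambda u: u.skip)
```

In this model the caller invokes record_timeout() or record_success() 
on whatever pick() returned, so a single success makes that upstream 
usable again and ends the round-robin phase.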

I have these changes available on top of the latest 'develop' branch. 
Shall I create pull requests for them?
(Credits also go to my colleague Shikha Sharma)

Cheers,
Robert