[getdns-api] Error handling in getdns
sara at sinodun.com
Thu Jan 7 13:21:31 UTC 2016
Willem and I have been discussing the error handling in getdns and would like to propose some changes based on our implementation experience.
1) Asynchronous error handling
At the moment the async callback types are specified as below:
GETDNS_CALLBACK_COMPLETE - The response has the requested data in it
GETDNS_CALLBACK_CANCEL - The calling program cancelled the callback; response is NULL
GETDNS_CALLBACK_TIMEOUT - The requested action timed out; response is filled in with empty structures
GETDNS_CALLBACK_ERROR - The requested action had an error; response is NULL
With this approach there is no mechanism to provide any more fine-grained information to the user when an ERROR is returned, because the response in the callback is NULL.
For a mainly UDP-based approach this is probably sufficient, but to cater for TCP and TLS (with authentication) we would like to change the above so that the various non-timeout errors that can occur can be communicated to the caller.
We would like to propose that the ERROR case is changed to have the same response as the TIMEOUT case, i.e.:
GETDNS_CALLBACK_ERROR - The requested action had an error; response is filled in with empty structures
And so the response structure would look similar to this example for a TIMEOUT:
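Separately from that response example, the caller's side of this proposal can be sketched in C. This is only an illustration under stated assumptions: the SKETCH_ enum, the sketch_response struct and status_from_callback() are hypothetical stand-ins so that the code compiles without the getdns headers, and they are not the getdns API or its numeric values. The point is that once ERROR carries a response, the callback can read a status from it in the same way it does for TIMEOUT.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the four callback types listed above. */
typedef enum {
    SKETCH_CALLBACK_COMPLETE,
    SKETCH_CALLBACK_CANCEL,
    SKETCH_CALLBACK_TIMEOUT,
    SKETCH_CALLBACK_ERROR
} sketch_callback_type;

/* Hypothetical stand-in for the response dict: only a status field. */
typedef struct {
    int status;
} sketch_response;

#define SKETCH_STATUS_UNKNOWN (-1)

/* Return the status a caller could report for a finished request,
 * or SKETCH_STATUS_UNKNOWN when no response is available. */
int status_from_callback(sketch_callback_type type,
                         const sketch_response *response)
{
    switch (type) {
    case SKETCH_CALLBACK_CANCEL:
        /* Response is NULL for a cancelled callback. */
        return SKETCH_STATUS_UNKNOWN;
    case SKETCH_CALLBACK_COMPLETE:
    case SKETCH_CALLBACK_TIMEOUT:
    case SKETCH_CALLBACK_ERROR:
        /* Under the proposal ERROR carries a response like TIMEOUT.
         * The NULL check also covers the current behaviour, where
         * ERROR delivers no response at all. */
        return response != NULL ? response->status
                                : SKETCH_STATUS_UNKNOWN;
    }
    return SKETCH_STATUS_UNKNOWN;
}
```

With the current specification the ERROR branch always falls through to the unknown result; with the proposed change it yields whatever status the library filled in.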
2) New GETDNS_RESPSTATUS codes
The new GETDNS_RESPSTATUS_ error cases we would like to add at this time are:
TRANSPORT_SETUP_FAILED - for the case where no connection could be made over any specified transport to any upstream (for example, only TLS is specified but none of the available upstreams support it).
TLS_FEATURE_NOT_SUPPORTED - for the cases where getdns can’t support the configured transport/authentication options at runtime because the available TLS library doesn’t have the required functionality (for example, support for TLS 1.2 or hostname verification methods).
TLS_AUTH_FAILED - for the case when using TLS only and authentication is required but fails. This is strictly a sub-case of TRANSPORT_SETUP_FAILED but seems worthy of a separate status code.
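As a rough sketch of how an application might consume these three codes, the C fragment below maps each one to a short description and encodes the note that TLS_AUTH_FAILED is a sub-case of TRANSPORT_SETUP_FAILED. The enum, its values and both helper functions are hypothetical names made up for this illustration; no numeric values have been assigned to the proposed codes here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the proposed status codes.
 * The values are placeholders chosen by the compiler. */
typedef enum {
    SKETCH_RESPSTATUS_TRANSPORT_SETUP_FAILED,
    SKETCH_RESPSTATUS_TLS_FEATURE_NOT_SUPPORTED,
    SKETCH_RESPSTATUS_TLS_AUTH_FAILED
} sketch_respstatus;

/* Short human-readable text for each proposed status. */
const char *sketch_respstatus_text(sketch_respstatus s)
{
    switch (s) {
    case SKETCH_RESPSTATUS_TRANSPORT_SETUP_FAILED:
        return "no connection over any specified transport to any upstream";
    case SKETCH_RESPSTATUS_TLS_FEATURE_NOT_SUPPORTED:
        return "TLS library lacks functionality required by the configuration";
    case SKETCH_RESPSTATUS_TLS_AUTH_FAILED:
        return "TLS-only transport with required authentication failed";
    }
    return "unknown status";
}

/* TLS_AUTH_FAILED is described as a sub-case of
 * TRANSPORT_SETUP_FAILED, so both count as transport failures. */
int sketch_is_transport_failure(sketch_respstatus s)
{
    return s == SKETCH_RESPSTATUS_TRANSPORT_SETUP_FAILED
        || s == SKETCH_RESPSTATUS_TLS_AUTH_FAILED;
}
```

A caller that only cares whether any upstream could be reached can test sketch_is_transport_failure(), while one that wants to tell the user to check credentials or pins can look for the TLS_AUTH_FAILED case specifically.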
3) Synchronous timeouts
When calling the API synchronously, the return type of the functions is getdns_return_t. There is currently no GETDNS_RETURN_TIMEOUT value, and the spec does not clearly specify how the sync calls should behave on a timeout. Our implementation therefore currently returns GETDNS_RETURN_GOOD (and returns the response dict as in 1 above); the best alternative among the existing codes would be GETDNS_RETURN_GENERIC_ERROR. So we would like to propose adding the value GETDNS_RETURN_TIMEOUT to the getdns_return_t type.
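To show why a distinct return value helps a synchronous caller, here is a small C sketch. Everything in it is an assumption made for illustration: sketch_return_t stands in for getdns_return_t, the enum values are placeholders, and the retry-on-timeout policy is just one plausible choice a caller might make.

```c
/* Hypothetical stand-in for getdns_return_t with the proposed
 * timeout value added. Values are placeholders. */
typedef enum {
    SKETCH_RETURN_GOOD,
    SKETCH_RETURN_GENERIC_ERROR,
    SKETCH_RETURN_TIMEOUT        /* the proposed addition */
} sketch_return_t;

/* What a caller might do next after a synchronous call returns. */
typedef enum {
    SKETCH_ACTION_USE_RESPONSE,
    SKETCH_ACTION_RETRY,
    SKETCH_ACTION_FAIL
} sketch_action;

/* With a dedicated timeout code the caller can branch on the return
 * value alone, without inspecting the response dict to find out
 * whether a "good" return was really a timeout. */
sketch_action sketch_action_for(sketch_return_t r)
{
    switch (r) {
    case SKETCH_RETURN_GOOD:
        return SKETCH_ACTION_USE_RESPONSE;
    case SKETCH_RETURN_TIMEOUT:
        return SKETCH_ACTION_RETRY;   /* assumed policy: retry */
    case SKETCH_RETURN_GENERIC_ERROR:
        return SKETCH_ACTION_FAIL;
    }
    return SKETCH_ACTION_FAIL;
}
```

Without the extra value, a timeout has to be reported either as GOOD or as GENERIC_ERROR, and in both cases the caller loses the ability to treat timeouts differently from the return code alone.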
As a future activity we note that the above mechanisms can only relay a single error code. Since completing an API call can involve
- performing multiple DNS queries
- using multiple upstreams
- using multiple transports
- TLS authentication that can fail for various reasons
- DNSSEC validations
- TSIG validation
we are considering adding an ‘error log trail’ utility that would be recorded during execution and could be returned in the response dict. Feedback on this is welcomed.