[getdns-api] async comments (0.268)

William Chan (陈智昌) willchan at chromium.org
Wed Feb 6 10:17:59 MST 2013


Just a few more points of information.

On Thu, Feb 7, 2013 at 12:36 AM, Phillip Hallam-Baker <hallam at gmail.com> wrote:
>
>
> On Tue, Feb 5, 2013 at 11:41 PM, William Chan (陈智昌) <willchan at chromium.org>
> wrote:
>>
>> On Wed, Feb 6, 2013 at 1:27 PM, Dan Winship <dan.winship at gmail.com> wrote:
>> > On 02/05/2013 08:59 PM, Paul Hoffman wrote:
>> >>> getdns() implementation authors (which, in the long run really means
>> >>> "libc/libresolv maintainers")
>> >>
>> >> Stop right there. There is *no* assumption that this will be part of
>> >> libc or libresolv any time soon.
>> >
>> > "It is important that the implementation should try to replicate as best
>> > as possible the logic of a local getaddrinfo() when creating a new
>> > context."
>> >
>> > In the short term, you might be able to hardcode the behavior of
>> > getaddrinfo() on specific releases of specific OSes that you care about,
>> > but that doesn't seem like a plausible long-term solution. (None of the
>> > other async DNS libraries have managed to achieve this, why would getdns
>> > be any different?)
>>
>> FWIW, Chromium's async DNS library successfully (as far as we can
>> tell) replicates the behavior from getaddrinfo() on Mac and Linux and
>> we're wrapping up Windows support (there's some trickery in detecting
>> dual stack support in the same way Windows does...we're working on
>> it).
>>
>> >
>> >> That's exactly right: they don't want to, and can't. If there was only
>> >> one async library that I had to worry about, you would be correct, but it is
>> >> very clear different people want to use different async libraries. This
>> >> leaves four choices:
>> >> a) I pick one and ignore the users of all other async libraries
>> >> b) I make a generic hole for those libraries and try to shoehorn every
>> >> possible library's calls into that hole
>> >> c) I don't do async
>> >> d) I leave it up to the implementer, who will certainly hear from the
>> >> application developers about which libraries they want supported
>> >> I chose (d).
>> >
>> >> It sounds like you want (b) that slouches into (a).
>> >
>> > I want exactly (b). (And I don't think (d) would actually work, at all.)
>> >
>> >>> On unixy platforms, all getdns() implementations are going to be based
>> >>> on sockets and timeouts, and all event loops are going to be based on
>> >>> poll() or something equivalent.
>> >>
>> >> Err, no. There are many more choices than that. In fact, today, I
>> >> suspect that many more applications use { libevent | libev } instead of
>> >> polling.
>> >
>> > I wasn't saying the applications were based on poll, I was saying the
>> > event loops are. Sure, libevent can use poll or epoll or kqueue or
>> > whatever, but those are all just variations on the theme of "let me know
>> > when one of these file descriptors is ready". So if there's an API that
>> > lets getdns tell the local event loop what fds it cares about, then that
>> > would let it integrate with libevent, libuv, GLib, Qt, Twisted, etc.
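
(Illustrative aside on the "tell the event loop what fds it cares about"
idea quoted above. This is only a minimal sketch, assuming a POSIX system
with poll(). The sketch_watch structure, sketch_drain() and
sketch_loop_once() are hypothetical names invented for this example; they
are not part of getdns, libevent, or any other library mentioned in this
thread. A real event loop would keep the watches registered across many
iterations instead of taking them as an argument.)

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: what a resolver library could hand to an event loop. */
typedef struct {
    int fd;                               /* descriptor the library waits on */
    short events;                         /* POLLIN and/or POLLOUT */
    void (*on_ready)(int fd, void *ctx);  /* library-side handler */
    void *ctx;                            /* opaque pointer passed back */
} sketch_watch;

/* Example handler: reads whatever is available and counts the bytes. */
void sketch_drain(int fd, void *ctx)
{
    char tmp[256];
    ssize_t n = read(fd, tmp, sizeof tmp);
    if (n > 0)
        *(size_t *)ctx += (size_t)n;
}

/* Stand-in for one iteration of an application event loop.
 * Returns the number of handlers run, 0 on timeout, -1 on poll() error. */
int sketch_loop_once(const sketch_watch *w, size_t n, int timeout_ms)
{
    struct pollfd pfds[16];
    assert(w != NULL && n <= 16);

    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = w[i].fd;
        pfds[i].events = w[i].events;
        pfds[i].revents = 0;
    }

    int ready = poll(pfds, (nfds_t)n, timeout_ms);
    if (ready <= 0)
        return ready;

    int fired = 0;
    for (size_t i = 0; i < n; i++) {
        if (pfds[i].revents & (w[i].events | POLLERR | POLLHUP)) {
            w[i].on_ready(w[i].fd, w[i].ctx);
            fired++;
        }
    }
    return fired;
}
```

The same watch list could in principle be fed to epoll, kqueue, or a
GLib/libevent source instead of poll(); only sketch_loop_once() would
change.
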
>
>
> Looking at the Docs, I think Chromium is doing more what I suggested:
>
> http://www.chromium.org/developers/design-documents/dns-prefetching
>
> Since some DNS resolutions can take a long time, it is paramount that such
> delays in one resolution should not cause delays in other resolutions.
> Toward this end (on Windows, where there is no native support for
> asynchronous DNS resolution), Chromium currently employs 8 completely
> asynchronous worker threads to do nothing but perform DNS prefetch
> resolution. Each worker thread simply waits on a queue, gets the next
> requested domain name, and then blocks on a synchronous Windows resolution
> function.   Eventually the operating system responds with a DNS resolution,
> the thread then discards it (leaving the OS cache warmed!), and waits for
> the next prefetch request.  With 8 threads, it is rare that more than one or
> two threads will block extensively, and most resolutions proceed rather
> quickly (or as quickly as DNS can service them!).  On Debug builds, the
> "about:histograms/DNS.PrefetchQueue" has current stats on the queueing
> delay.

This is our old code, which we're working on killing off. It uses
getaddrinfo(). We've written our own async stub resolver that we're
slowly rolling out, and we hope it will eventually replace this
getaddrinfo()-based code.

>
>
> I interpret that to mean that the basic unit of construction here is a
> blocking call to the DNS resolver. Imagine it is something like the
> following:
>
> void dns_lookup (dns_work_item *work) {
>     // construct request
>     // blocking udp send
>     // blocking udp wait for response with time out
>     // parse response
>     }
>
> Then there is a separate worker process which is essentially a dispatch loop
> on dns_lookup:
>
> void farm_tasks (void (*delegate)(void *work),
>                  void **work_items, int number_items, int max_threads) {
>    int i, threads;
>    threads = 0;
>    // create lock here
>    // dispatch() and wait() are placeholders, not real library calls
>    for (i = 0; i < number_items; i++) {
>        if (threads >= max_threads) {
>            wait (lock);      // block until one worker finishes
>            threads--;
>            }
>        dispatch (delegate, work_items[i], lock);
>        threads++;
>        }
>     }
>
> Since none of the code in the farm_tasks routine is DNS specific and most of
> that code is actually code that would be useful in a wide range of farmed
> task applications, it does not look to me like it is something that you
> would need or want in a DNS library. It is probably something you would want
> in a companion library but not something I would personally find useful in
> the DNS library because my DNS lookup routines are going to involve more
> than just the common DNS code, there is going to be a (small) amount of code
> to put the returned results into the data structures I use in my
> applications.
>
> In general the rule on async programming should probably be that a service
> API like a DNS API should neither create nor terminate threads.
>
> If you have an environment like .NET where the UDP calls already have async
> versions, you might want to splice the 'handle DNS response' work into the
> callback initiated by the UDP call.
>
>
> I think this discussion is demonstrating the real reason that writing a
> DNSAPI is hard. The problem is that there are many styles of async
> programming and the C programming environment does not force a particular
> choice on the programmer. When selecting an API to use inside a program, one
> of the chief design considerations is going to be whether the async approach
> used by the API is compatible with the style the programmer is already using
> elsewhere.
>
> This is one of the main reasons the programming world in general has moved
> on from C and pretty much decided to barf on C++. It is perfectly easy to
> write a C library to handle structures like linked lists. But there isn't a
> linked list implementation in the base language. So every C API that needs
> linked lists has to fashion them for itself, each library takes a
> different approach, and when you combine a dozen APIs you end up with a dozen
> different linked list handling approaches.
>
>
> I don't think that finding the perfect C async API is possible here. But it
> probably isn't necessary either. The only parts of the API that are async
> are the I/O calls which are actually quite easy to code. The hard parts of
> the API are the calls to encode and decode messages. I therefore suggest the
> following as an 80:20 solution:
>
> Core blocks:
>
> dns_get_host_resolver  - Return a structure with the IP addresses of the
> host DNS services
> dns_encode_message - convert a dns_message structure to an array of bytes
> dns_decode_message - convert an array of bytes to a dns_message structure
> dns_decode_record - convert the RDATA portion of a DNS record to an
> appropriate C structure
>
> dns_query_blocking - make a single DNS query and block on the response or a
> timeout with no internal caching
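
(Illustrative aside: to give a feel for the encoding step, here is a
minimal sketch of the simplest case that something like the quoted
dns_encode_message would have to handle, namely a single-question query in
RFC 1035 wire format. The function name sketch_encode_query and its
signature are hypothetical and were made up for this example; no
dns_message structure has been defined in this thread, so the sketch takes
the name and type directly. Name compression, EDNS, and additional
sections are deliberately left out.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a one-question DNS query (class IN) into buf.
 * Returns the number of bytes written, or 0 if the name is malformed or
 * the buffer is too small. */
size_t sketch_encode_query(const char *name, uint16_t qtype, uint16_t id,
                           uint8_t *buf, size_t buflen)
{
    size_t pos = 12;                      /* the DNS header is 12 bytes */
    assert(name != NULL && buf != NULL);
    if (buflen < pos)
        return 0;

    memset(buf, 0, 12);
    buf[0] = (uint8_t)(id >> 8);          /* ID, big-endian */
    buf[1] = (uint8_t)(id & 0xff);
    buf[2] = 0x01;                        /* flags: RD (recursion desired) */
    buf[5] = 0x01;                        /* QDCOUNT = 1 */

    /* QNAME: length-prefixed labels, terminated by a zero-length label. */
    while (*name) {
        const char *dot = strchr(name, '.');
        size_t len = dot ? (size_t)(dot - name) : strlen(name);
        if (len == 0 || len > 63)         /* empty or oversized label */
            return 0;
        if (pos + 1 + len > buflen)
            return 0;
        buf[pos++] = (uint8_t)len;
        memcpy(buf + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;                       /* skip the separator */
    }

    if (pos + 5 > buflen)                 /* root label + QTYPE + QCLASS */
        return 0;
    buf[pos++] = 0;                       /* root label */
    if (pos - 12 > 255)                   /* encoded name limit */
        return 0;
    buf[pos++] = (uint8_t)(qtype >> 8);
    buf[pos++] = (uint8_t)(qtype & 0xff);
    buf[pos++] = 0;                       /* QCLASS = IN (1) */
    buf[pos++] = 1;
    return pos;
}
```

Decoding is the harder direction, since responses use name compression
and carry many record types; that part is not sketched here.
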
>
>
> The core blocks are required to build a higher level interface regardless of
> what that interface might look like. So we might as well expose them to the
> programmer who might want to use them. I often find it less difficult to
> roll my own code than to work out which particular style of async
> approach another API uses.
>
> One higher level interface might look something like:
>
> dns_client_init - initialize a DNS client
> dns_client_query - blocking client query
> dns_client_query_start - initiate an async client query (non blocking)
> dns_client_query_end - complete an async client query (blocking)
> dns_client_query_prefetch - queue up a domain anticipating a future query
> (non blocking)
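
(Illustrative aside on the start/end split in the quoted
dns_client_query_start / dns_client_query_end pair. The following is a
minimal sketch only, assuming a POSIX system, a datagram socket that the
caller has already connected to a resolver, and a request that has already
been encoded. The sketch_query type and both function names are
hypothetical. MSG_DONTWAIT is widely available on Linux, the BSDs and
macOS, but is an assumption about the platform.)

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical handle for one in-flight query. */
typedef struct {
    int fd;       /* connected datagram socket, owned by the caller */
    int pending;  /* 1 between a successful start and the matching end */
} sketch_query;

/* Non-blocking half: hand the encoded request to the socket and return.
 * Returns 0 on success, -1 if the datagram could not be sent. */
int sketch_query_start(sketch_query *q, int fd, const void *req, size_t len)
{
    assert(q != NULL && req != NULL);
    q->fd = fd;
    q->pending = 0;
    if (send(fd, req, len, MSG_DONTWAIT) != (ssize_t)len)
        return -1;
    q->pending = 1;
    return 0;
}

/* Blocking half: wait up to timeout_ms for the response datagram.
 * Returns the response length, or -1 on timeout or error. A query that
 * times out is abandoned; the caller must start a new one to retry. */
ssize_t sketch_query_end(sketch_query *q, void *resp, size_t cap,
                         int timeout_ms)
{
    assert(q != NULL && resp != NULL);
    if (!q->pending)
        return -1;
    q->pending = 0;

    struct pollfd pfd = { .fd = q->fd, .events = POLLIN, .revents = 0 };
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;
    return recv(q->fd, resp, cap, 0);
}
```

Between the two calls the application is free to do other work, and no
thread is created by the library, which fits the rule suggested earlier in
the quoted message. Matching responses to query IDs and retransmission are
omitted.
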
>
>
> Adding prefetching to the model has some interesting consequences. It means
> that schemes like Certificate Transparency make a lot more sense. It might
> also explain why Chrome has a habit of freezing on certain web pages from my
> home network. I think that what is happening is that the simultaneous
> prefetches are overwhelming the DNS resolver.

Yes, this has historically been known to be true. If it still happens
and you can reproduce it, we'd like to know. Please file a bug
report with us by following the instructions at
https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details.

>
> --
> Website: http://hallambaker.com/


