From robert.groenenberg at broadforward.com Thu Sep 8 13:13:41 2016
From: robert.groenenberg at broadforward.com (Robert Groenenberg)
Date: Thu, 8 Sep 2016 15:13:41 +0200
Subject: [getdns-api] Segmentation fault under load
Message-ID: <5eeceab4-125a-24d4-dd82-7faccb1d58b8@broadforward.com>

Hi,

An application that sends out ENUM queries using getdns (v1.0.0b2) and
libevent2 on CentOS 6 runs fine at a low request rate.
However, when sending ~50 requests per second, a segmentation fault
occurs in _getdns_rbtree_insert() after a few minutes (in some runs the
SEGV occurs in the compare function instead).

> Program terminated with signal 11, Segmentation fault.
> #0 _getdns_rbtree_insert (rbtree=0x104a518, data=0x7f80f8004cc0)
>    at util/rbtree.c:240
> 240 if ((r = rbtree->cmp(data->key, node->key)) == 0) {
> #0 _getdns_rbtree_insert (rbtree=0x104a518, data=0x7f80f8004cc0)
>    at util/rbtree.c:240
> #1 0x00007f80cc1dc0ef in _getdns_context_track_outbound_request
>    (dnsreq=0x7f80f8004cc0)
>    at ./context.c:3080
> #2 0x00007f80cc1c9fed in getdns_general_ns (context=0x10493e0,
>    loop=0x1024940,
>    name=, request_type=35, extensions=<value optimized out>,
>    userarg=0x7f80f8006b50, return_netreq_p=0x7f809c5cbb88,
>    callbackfn=0x7f80cc44f7d0 , internal_cb=0,
>    usenamespaces=0)
>    at ./general.c:452
> #3 0x00007f80cc1ca3f1 in _getdns_general_loop (context=<value optimized out>,
>    loop=, name=,
>    request_type=,
>    extensions=, userarg=,
>    netreq_p=0x7f809c5cbb88,
>    callback=0x7f80cc44f7d0 , internal_cb=0)
>    at ./general.c:517
> #4 0x00007f80cc1ca454 in getdns_general (context=,
>    name=, request_type=,
>    extensions=, userarg=,
>    transaction_id=0x7f80f8006b78, callbackfn=0x7f80cc44f7d0 )
>    at ./general.c:674

I suspect this to be a threading issue: the rbtree being accessed for
both insert and delete from different threads. The call to
getdns_general() is protected by a mutex in my application, so only
one thread issues a query at a time.
However, the event base runs in its own thread, as usual with libevent,
so the problem probably lies in entries being deleted from the rbtree
when the response is handled.

Is it supposed to be possible with getdns to run the event base in its
own thread?

Thanks,
Robert

From willem at nlnetlabs.nl Thu Sep 8 13:39:03 2016
From: willem at nlnetlabs.nl (Willem Toorop)
Date: Thu, 8 Sep 2016 15:39:03 +0200
Subject: [getdns-api] Segmentation fault under load
In-Reply-To: <5eeceab4-125a-24d4-dd82-7faccb1d58b8@broadforward.com>
References: <5eeceab4-125a-24d4-dd82-7faccb1d58b8@broadforward.com>
Message-ID:

On 08-09-16 at 15:13, Robert Groenenberg wrote:
> [...]
> I suspect this to be a threading issue: the rbtree being accessed for
> both insert and delete from different threads. The call to
> getdns_general() is protected by a mutex in my application, so only
> one thread issues a query at a time. However, the event base runs in its
> own thread, as usual with libevent, so the problem probably lies in
> entries being deleted from the rbtree when the response is handled.
>
> Is it supposed to be possible with getdns to run the event base in its
> own thread?

No, getdns does not anticipate this modus operandi (as you already found
out yourself). Running and scheduling should be done from the same
thread.
-- Willem

> Thanks,
> Robert
>
> _______________________________________________
> spec mailing list
> spec at getdnsapi.net

From robert.groenenberg at broadforward.com Fri Sep 9 09:08:28 2016
From: robert.groenenberg at broadforward.com (Robert Groenenberg)
Date: Fri, 9 Sep 2016 11:08:28 +0200
Subject: [getdns-api] Segmentation fault under load
In-Reply-To: <96628419-3382-2395-e1ff-33fe150bb0f5@nlnetlabs.nl>
References: <5eeceab4-125a-24d4-dd82-7faccb1d58b8@broadforward.com> <96628419-3382-2395-e1ff-33fe150bb0f5@nlnetlabs.nl>
Message-ID:

FYI:
Queries are now posted to a queue and trigger an event by means of
event_active(); an event callback function then picks up the queued
queries and fires them off.
As the event callback is run by the same thread as the getdns callbacks,
all getdns context access now happens from the same thread. No more
concurrent access to the rbtree and no SEGV :-)

Perhaps a note in the part of the documentation describing how to use
getdns with libevent would be useful.

Kind regards,
Robert

On 09/08/2016 04:38 PM, Willem Toorop wrote:
> On 08-09-16 at 15:48, Robert Groenenberg wrote:
>> Hi Willem,
>>
>> Thanks for the fast response.
>>
>> What is then the whole point of having an asynchronous API if the same
>> thread that initiates the query also has to run the event loop to handle
>> the responses?
>> I.e. this:
>>
>>> else if ((r = getdns_address( context, query_name, extensions
>>>     , userarg, &transaction_id, callback)))
>>>         fprintf(stderr, "Error scheduling asynchronous request");
>>> else {
>>>         printf("Request with transaction ID %"PRIu64" scheduled.\n",
>>>             transaction_id);
>>>         if (event_base_dispatch(event_base) < 0)
>>>                 fprintf(stderr, "Error dispatching events\n");
>>> }
>>
>> is pretty much the same as a synchronous API.
>>
>> Oh, wait, the solution is probably to have another callback of the same
>> event base initiate the query and have the other threads trigger that
>> callback via an event.
>> I'll explore that path.
>
> That's the idea yes...
> You could schedule a timeout event that walks through a list of
> requests to schedule? And then experiment with what rate still gives
> good performance.
>
> Or reschedule from the answer processing. For example, schedule in
> such a way that you only ever have a certain number of outstanding
> requests.
>
> Cheers,
> -- Willem
>
>> Cheers,
>> Robert
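
The hand-off Robert describes (worker threads only enqueue, and the
event-loop thread alone touches the getdns context) can be sketched
roughly as below. This is a minimal illustration and not code from the
thread: the names pending_query, query_queue, queue_push, queue_drain,
issue_fn and count_issued are all hypothetical, the queue is a plain
mutex-protected list, and the actual getdns and libevent calls are left
out so that the block compiles with only the C library and pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One queued lookup (hypothetical type). A real program would store
 * whatever its getdns call needs: name, request type, userarg, ... */
struct pending_query {
    char name[256];
    struct pending_query *next;
};

/* FIFO shared between the worker threads and the event-loop thread. */
struct query_queue {
    pthread_mutex_t lock;
    struct pending_query *head, *tail;
};

/* Hypothetical hook, invoked on the loop thread for each queued query.
 * This is where the application would call into getdns. */
typedef void (*issue_fn)(const char *name, void *ctx);

void queue_init(struct query_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->head = q->tail = NULL;
}

/* May be called from any thread. Returns 0 on success, -1 if out of
 * memory. After a successful push the caller would wake the loop
 * thread (Robert uses event_active() for that). */
int queue_push(struct query_queue *q, const char *name)
{
    struct pending_query *pq;

    assert(q != NULL && name != NULL);
    pq = calloc(1, sizeof *pq);            /* zeroed, so name stays terminated */
    if (pq == NULL)
        return -1;
    strncpy(pq->name, name, sizeof pq->name - 1);

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = pq;
    else
        q->head = pq;
    q->tail = pq;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Must be called only from the event-loop thread, e.g. from the event
 * callback. Detaches the whole list under the lock, then issues the
 * queries with the lock released. Returns the number issued. */
size_t queue_drain(struct query_queue *q, issue_fn issue, void *ctx)
{
    struct pending_query *pq, *next;
    size_t n = 0;

    assert(q != NULL && issue != NULL);
    pthread_mutex_lock(&q->lock);
    pq = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    for (; pq != NULL; pq = next) {
        next = pq->next;
        issue(pq->name, ctx);
        free(pq);
        n++;
    }
    return n;
}

/* Stand-in hook that only counts; it takes the place of the real
 * getdns call in this sketch. */
void count_issued(const char *name, void *ctx)
{
    (void)name;
    ++*(size_t *)ctx;
}
```

In a libevent program, queue_drain() would presumably be called from the
callback of an event created with event_new(), and each worker would
call event_active() on that event after queue_push(). For cross-thread
activation to be safe, libevent needs its threading support enabled
(evthread_use_pthreads()) before the event base is created. Detaching
the list before issuing keeps the mutex from being held while getdns
runs, so workers are never blocked behind a lookup.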