From: "eric.newton at gmail dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug libc/17977] New: gethostbyname_r hangs forever
Date: Fri, 13 Feb 2015 20:52:00 -0000

https://sourceware.org/bugzilla/show_bug.cgi?id=17977

            Bug ID: 17977
           Summary: gethostbyname_r hangs forever
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: eric.newton at gmail dot com
                CC: drepper.fsp at gmail dot com

Created attachment 8126
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8126&action=edit
proposed patch for the bug

A large multi-threaded (Java) server process was found to hang in calls to
gethostbyname_r. Further investigation showed that it hung only when
/etc/host.conf contained "reorder on".

Inspecting the source of _res_hconf_reorder_addrs makes the bug easy to see.
Assume three threads execute the function at the same time. All three see
that num_ifs is -1 at line 407 and try to take the lock at line 422. One
thread gets the lock at line 422, initializes the static data structure, and
unlocks the lock. The next thread then gets the lock and re-checks the value
of num_ifs at line 425. Seeing that it is now greater than 0, it skips the
initialization, but it never unlocks the lock. The last thread therefore
blocks on the lock forever.
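
The locking pattern described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the glibc source: it uses a plain pthread mutex, and the names init_interfaces_once and lock_leaked_after_two_callers, the "num_ifs <= 0" test, and the interface count of 2 are all hypothetical. To stay deterministic and avoid an actual hang, it models the second thread as a second sequential call that is assumed to have already passed the unlocked check, then probes the mutex with pthread_mutex_trylock.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the static state described in the report. */
static int num_ifs = -1;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Models a caller that has ALREADY passed the unlocked check on num_ifs.
   buggy != 0: unlock only inside the initialization branch (the reported
   pattern). buggy == 0: unlock on every path (the general shape of a fix). */
static void init_interfaces_once(int buggy)
{
    pthread_mutex_lock(&lock);

    /* Re-check under the lock; another caller may have done the work. */
    if (num_ifs <= 0) {
        num_ifs = 2;                      /* assumed: two interfaces found */
        if (buggy)
            pthread_mutex_unlock(&lock);  /* only this branch unlocks */
    }

    if (!buggy)
        pthread_mutex_unlock(&lock);      /* every path unlocks */
}

/* Returns 1 if the mutex is still held after two callers have run,
   0 if it was released, -1 on an unexpected trylock error. */
int lock_leaked_after_two_callers(int buggy)
{
    num_ifs = -1;
    init_interfaces_once(buggy);  /* first caller performs the initialization */
    init_interfaces_once(buggy);  /* second caller skips the initialization */

    int rc = pthread_mutex_trylock(&lock);
    if (rc == 0) {
        pthread_mutex_unlock(&lock);      /* undo the probe */
        return 0;
    }
    if (rc != EBUSY)
        return -1;

    /* Leaked by the second call, which ran on this thread, so this thread
       owns the mutex and may release it to restore a clean state. */
    pthread_mutex_unlock(&lock);
    return 1;
}
```

Calling lock_leaked_after_two_callers(1) reports the leaked lock that a third thread would block on, while lock_leaked_after_two_callers(0) reports that the mutex was released. Whether the attached patch takes exactly this shape is not stated in the report.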