From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 19650 invoked by alias); 2 Mar 2007 14:58:22 -0000
Received: (qmail 19633 invoked by uid 22791); 2 Mar 2007 14:58:21 -0000
X-Spam-Check-By: sourceware.org
Received: from sunsite.ms.mff.cuni.cz (HELO sunsite.mff.cuni.cz) (195.113.15.26)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 02 Mar 2007 14:58:17 +0000
Received: from sunsite.mff.cuni.cz (localhost.localdomain [127.0.0.1])
    by sunsite.mff.cuni.cz (8.13.8/8.13.8) with ESMTP id l22ExxfC005149;
    Fri, 2 Mar 2007 15:59:59 +0100
Received: (from jakub@localhost)
    by sunsite.mff.cuni.cz (8.13.8/8.13.8/Submit) id l22ExsY4005145;
    Fri, 2 Mar 2007 15:59:54 +0100
Date: Fri, 02 Mar 2007 14:58:00 -0000
From: Jakub Jelinek
To: Ulrich Drepper
Cc: Glibc hackers, kuznet@ms2.inr.ac.ru
Subject: [PATCH] Fix ifaddrs error handling
Message-ID: <20070302145954.GF1826@sunsite.mff.cuni.cz>
Reply-To: Jakub Jelinek
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.2i
Mailing-List: contact libc-hacker-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-hacker-owner@sourceware.org
X-SW-Source: 2007-03/txt/msg00001.txt.bz2

Hi!

netlink apparently only allows one pending dumper for one netlink
connection:

        /* A dump is in progress... */
        spin_lock(&nlk->cb_lock);
        if (nlk->cb) {
                spin_unlock(&nlk->cb_lock);
                netlink_destroy_callback(cb);
                sock_put(sk);
                return -EBUSY;
        }
        nlk->cb = cb;
        spin_unlock(&nlk->cb_lock);

If some box has too many interfaces and the 4K default buffer isn't
sufficient, some messages can be truncated (with MSG_TRUNC set in
flags).  glibc in this case increments the sequence counter, resends
the request and ignores all responses with older sequence numbers.
If the old responses were all in the message that got truncated, maybe
with NLMSG_DONE alone in a next message, then this will work just fine,
but if MSG_TRUNC happens say on the 1st out of 3 response messages,
then when we reissue the request the old dump is still in progress and
a NLMSG_ERR -EBUSY response is queued.

The following patch fixes that by retrying with a new socket (should
be very rare, most people don't have so many interfaces and even if
they do, the incremented buffer size is remembered within the
application, so further getifaddrs etc. calls will start with a big
enough buffer).

Alternatively, when we see MSG_TRUNC we could perhaps call recvmsg in
a loop until we see NLMSG_DONE with that seq number (though I'm not
sure we have a guarantee that NLMSG_DONE was not in a MSG_TRUNC
message).

This is reproducible on ia64 with ~ 80 interfaces, or e.g. on x86_64
with 80 interfaces too, if the initial buf_size in __netlink_request is
artificially lowered to say 250.

2007-03-02  Jakub Jelinek

	* sysdeps/unix/sysv/linux/ifaddrs.c (__netlink_request): Retry
	with a new netlink socket if NLMSG_ERR -EBUSY is seen after some
	MSG_TRUNC message.

--- libc/sysdeps/unix/sysv/linux/ifaddrs.c.jj	2007-03-02 14:52:11.000000000 +0100
+++ libc/sysdeps/unix/sysv/linux/ifaddrs.c	2007-03-02 15:14:22.000000000 +0100
@@ -135,6 +135,7 @@ __netlink_request (struct netlink_handle
     return -1;
 
   size_t this_buf_size = buf_size;
+  size_t orig_this_buf_size = this_buf_size;
   if (__libc_use_alloca (this_buf_size))
     buf = alloca (this_buf_size);
   else
@@ -236,6 +237,36 @@ __netlink_request (struct netlink_handle
               struct nlmsgerr *nlerr = (struct nlmsgerr *) NLMSG_DATA (nlmh);
               if (nlmh->nlmsg_len < NLMSG_LENGTH (sizeof (struct nlmsgerr)))
                 errno = EIO;
+              else if (nlerr->error == -EBUSY
+                       && orig_this_buf_size != this_buf_size)
+                {
+                  /* If EBUSY and MSG_TRUNC was seen, try again with a new
+                     netlink socket.  */
+                  struct netlink_handle hold = *h;
+                  if (__netlink_open (h) < 0)
+                    {
+                      *h = hold;
+                      goto out_fail;
+                    }
+                  __netlink_close (&hold);
+                  orig_this_buf_size = this_buf_size;
+                  nlm_next = *new_nlm_list;
+                  while (nlm_next != NULL)
+                    {
+                      struct netlink_res *tmpptr;
+
+                      tmpptr = nlm_next->next;
+                      free (nlm_next);
+                      nlm_next = tmpptr;
+                    }
+                  *new_nlm_list = NULL;
+                  count = 0;
+                  h->seq++;
+
+                  if (__netlink_sendreq (h, type) < 0)
+                    goto out_fail;
+                  break;
+                }
               else
                 errno = -nlerr->error;
               goto out_fail;

	Jakub
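
For illustration only, here is a rough sketch of the check the
alternative approach would need: after MSG_TRUNC, keep calling recvmsg
and stop once a datagram contains NLMSG_DONE for the current sequence
number.  saw_done_for_seq is a made-up helper name, not anything in
glibc or in this patch, and the sketch does not answer the open
question of NLMSG_DONE itself arriving in a truncated datagram.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): walk the netlink messages
   in one received datagram BUF of LEN bytes and report whether it
   contains an NLMSG_DONE header carrying sequence number SEQ.
   LEN is an int on purpose: NLMSG_NEXT subtracts the aligned message
   length from it, and NLMSG_OK relies on it going negative rather
   than wrapping around.  BUF must be suitably aligned for
   struct nlmsghdr.  */
static bool
saw_done_for_seq (void *buf, int len, uint32_t seq)
{
  assert (buf != NULL);

  struct nlmsghdr *nlmh = (struct nlmsghdr *) buf;
  for (; NLMSG_OK (nlmh, len); nlmh = NLMSG_NEXT (nlmh, len))
    if (nlmh->nlmsg_type == NLMSG_DONE && nlmh->nlmsg_seq == seq)
      return true;

  return false;
}
```

A caller would invoke this on every datagram read after the truncated
one and only resend the request (with the larger buffer) once it
returns true for the old sequence number, so no dump is still pending
on the socket when the new request arrives.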