public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libc/11432] New: getsysstat.c returns erroneous results
@ 2010-03-25 16:02 rsa at us dot ibm dot com
  2010-03-25 16:05 ` [Bug libc/11432] " rsa at us dot ibm dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: rsa at us dot ibm dot com @ 2010-03-25 16:02 UTC (permalink / raw)
  To: glibc-bugs

The following commit introduced an error in fetching the number of cpus from
/proc/stat on a system with a large number of cpus (though I think the problem
has more to do with how the buffer gets filled):

http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=6a3d03ff58742430a252beac4a1917506512e319

I'm going to attach a test-case and an example /proc/stat file which triggers
the erroneous condition after this explanation.

I believe that the first part of the commit is fine.  The second part is
erroneous and redundant, namely:

+  else if (nl + 5 >= *re)
+    {
+      memmove (buffer, nl, *re - nl);
+      *re = buffer + (*re - nl);
+      nl = *cp = buffer;
+
+      ssize_t n = read_not_cancel (fd, *re, buffer_end - *re);
+      if (n < 0)
+       return NULL;
+
+      *re += n;
+    }

This is meant to "Make sure there are always at least 4 bytes in the returned
line.", i.e. to pick up a minimum of "cpu*" line.

If the current buffer doesn't contain enough characters to complete a
"cpu[:digit:]" line (or any other new line of 4 characters) following the last
'\n' it'll try to move what's partially in the end of the buffer to the front
(preserving that portion) and then read in some more, hoping that a '\n' is in
the next read.

This else block has a critical mistake.  It's not processing on the correct
line.  The next_line() function is supposed to be processing the line *before*
the '\n' that is found with memchr(), but this else block overwrites this line
with the fragment of the line *after* the \n.

This, of course, results in incorrect incrementing of the number of cpu* lines
when next_line returns a pointer to the line after the found '\n', not the
current '\n'.

We could attempt to fix it so that it pre-emptively shifts the current & partial
next line to the front of the buffer so we can read a valid next line in the
next iteration as follows (which has been tested and fixes the problem):

  else if (nl + 5 >= *re)
    {
     /* Pre-emptively shift current & partial next line to front of the buffer
        so we can read a full next line in the next iteration.  */

      memmove (buffer, *cp, *re - *cp);
      *re = buffer + (*re - *cp);

      /* Current line is now at the front of the buffer.  */
      *cp = buffer;

      /* Find where the current line's \n moved.  */
      nl = memchr (*cp, '\n', *re - *cp);

      /* Something would be very very wrong with our memmove if we can't find
         the current line's '\n' after the move.  */
      assert (nl == NULL);

      ssize_t n = read_not_cancel (fd, *re, buffer_end - *re);
      if (n < 0)
        return NULL;

      *re += n;
      /* return current line from front of buffer.  */
      res = *cp;
    }

But I think this whole else block is redundant.  The fixed version does exactly
the same thing as Jakub's chunk of the patch but Jakub's chunk also accounts for
lines that are too long:

+	      nl = memchr (*cp, '\n', *re - *cp);
+	      while (nl == NULL && *re == buffer_end)
+		{
+		  /* Truncate too long lines.  */
+		  *re = buffer + 3 * (buffer_end - buffer) / 4;
+		  n = read_not_cancel (fd, *re, buffer_end - *re);
+		  if (n < 0)
+		    return NULL;
+
+		  nl = memchr (*re, '\n', n);
+		  **re = '\n';
+		  *re += n;
+		}
 	    }
+	  else
+	    nl = memchr (*cp, '\n', *re - *cp);

 	  res = *cp;
-	  nl = memchr (*cp, '\n', *re - *cp);

I'm going to attach a patch which removes the erroneous else block.

-- 
           Summary: getsysstat.c returns erroneous results
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: rsa at us dot ibm dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=11432

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libc/11432] getsysstat.c returns erroneous results
  2010-03-25 16:02 [Bug libc/11432] New: getsysstat.c returns erroneous results rsa at us dot ibm dot com
@ 2010-03-25 16:05 ` rsa at us dot ibm dot com
  2010-03-25 16:10 ` rsa at us dot ibm dot com
  2010-04-04  2:21 ` drepper at redhat dot com
  2 siblings, 0 replies; 5+ messages in thread
From: rsa at us dot ibm dot com @ 2010-03-25 16:05 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rsa at us dot ibm dot com  2010-03-25 16:04 -------
Created an attachment (id=4680)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=4680&action=view)
example /proc/stat file which shows the error

The following test case:

#include<stdio.h>
#include<sys/sysinfo.h>
int main()
{
int lcpus=get_nprocs();
printf("logical cpus = %d\n",lcpus);
return 0;
}

Can be run against the attached /proc/stat file which is known to reproduce the

problem:

This stat file can be used with the testcase by bind mounting it over
/proc/stat:
cp stat /dev/shm/stat
mount --bind /dev/shm/stat /proc/stat

When run this should show:

logical cpus = 1024

since cpu1024 is high cpu number.  But it shows something like (which may vary
based on the size of the 'buffer' in the next_line algorithm):

logical cpus = 137

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11432

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libc/11432] getsysstat.c returns erroneous results
  2010-03-25 16:02 [Bug libc/11432] New: getsysstat.c returns erroneous results rsa at us dot ibm dot com
  2010-03-25 16:05 ` [Bug libc/11432] " rsa at us dot ibm dot com
@ 2010-03-25 16:10 ` rsa at us dot ibm dot com
  2010-04-04  2:21 ` drepper at redhat dot com
  2 siblings, 0 replies; 5+ messages in thread
From: rsa at us dot ibm dot com @ 2010-03-25 16:10 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rsa at us dot ibm dot com  2010-03-25 16:09 -------
Created an attachment (id=4681)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=4681&action=view)
patch to remove erroneous else block from next_line() function.

This patch removes the erroneous else block.  It has been shown in testing to
eliminate the error and the test returns the correct number of CPUs.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=11432

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libc/11432] getsysstat.c returns erroneous results
  2010-03-25 16:02 [Bug libc/11432] New: getsysstat.c returns erroneous results rsa at us dot ibm dot com
  2010-03-25 16:05 ` [Bug libc/11432] " rsa at us dot ibm dot com
  2010-03-25 16:10 ` rsa at us dot ibm dot com
@ 2010-04-04  2:21 ` drepper at redhat dot com
  2 siblings, 0 replies; 5+ messages in thread
From: drepper at redhat dot com @ 2010-04-04  2:21 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2010-04-04 02:21 -------
I've checked in the change.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://sourceware.org/bugzilla/show_bug.cgi?id=11432

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libc/11432] getsysstat.c returns erroneous results
       [not found] <bug-11432-131@http.sourceware.org/bugzilla/>
@ 2014-06-30 18:23 ` fweimer at redhat dot com
  0 siblings, 0 replies; 5+ messages in thread
From: fweimer at redhat dot com @ 2014-06-30 18:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=11432

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-06-30 18:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-25 16:02 [Bug libc/11432] New: getsysstat.c returns erroneous results rsa at us dot ibm dot com
2010-03-25 16:05 ` [Bug libc/11432] " rsa at us dot ibm dot com
2010-03-25 16:10 ` rsa at us dot ibm dot com
2010-04-04  2:21 ` drepper at redhat dot com
     [not found] <bug-11432-131@http.sourceware.org/bugzilla/>
2014-06-30 18:23 ` fweimer at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).