public inbox for frysk@sourceware.org
 help / color / mirror / Atom feed
* [patch] Stat vs slurp vs linux 2.6.22
@ 2007-07-25 11:30 Mark Wielaard
  2007-07-25 13:30 ` Andrew Cagney
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2007-07-25 11:30 UTC (permalink / raw)
  To: frysk


[-- Attachment #1.1: Type: text/plain, Size: 1726 bytes --]

Hi,

On newer kernels (2.6.22) not just init (process id 1) can have a zero
parent pid, but other kernel processes/threads also have a parent pid of
zero (in particular kthreadd). This was done for systems with thousands
of CPUs which create ten-thousand (kernel) processes and having them all
children of init was not scaling (and not really necessary for the
kernel-threads, which don't need to be reaped by init in the first
place). This exposed a bug in our Stat parsing through LinuxHost. See
http://sourceware.org/bugzilla/show_bug.cgi?id=4838

Stat uses slurp which was written to handle these kind of issues
where /proc/<pid>/stat might not exist or disappear in between calls.
But it relied on the slurp() functions returning a failure value (NULL
or -1). A recent rewrite of slurp to use Errno.tryOpen() which throws an
Exception on error, broke Stat handling in such cases. Besides that
slurp() had a bug in that it could free() a given buffer that it hadn't
created itself (Stat was the owner of that buffer), this could lead to
strange memory corruption. Both issues were fixed by:

2007-07-25  Mark Wielaard  <mark@klomp.org>

    Fixes bug #4838
    * cni/slurp.cxx (uslurp): Catch Errno after tryOpen().
    (slurp): Likewise and don't free given buffer and return -1.
    (slurp_thread): Likewise.

Tested on x86_64 (Fedora Core 6) and x86 (Fedora 7 - with new 2.6.22
kernel). The failing tests reported in the bug report PASS with this
test on the new kernel. And introduces no new FAILs.

It might make sense to rewrite slurp as a normal java class so these
cross C/Java error handling issues. Is there a reason for slurp to have
to be written in C/CNI?

Cheers,

Mark

[-- Attachment #1.2: slurp.patch --]
[-- Type: text/x-patch, Size: 1919 bytes --]

Index: frysk-sys/frysk/sys/proc/cni/slurp.cxx
===================================================================
RCS file: /cvs/frysk/frysk-sys/frysk/sys/proc/cni/slurp.cxx,v
retrieving revision 1.10
diff -u -r1.10 slurp.cxx
--- frysk-sys/frysk/sys/proc/cni/slurp.cxx	18 Jul 2007 18:23:52 -0000	1.10
+++ frysk-sys/frysk/sys/proc/cni/slurp.cxx	25 Jul 2007 10:53:47 -0000
@@ -1,6 +1,6 @@
 // This file is part of the program FRYSK.
 //
-// Copyright 2005, Red Hat Inc.
+// Copyright 2005, 2007 Red Hat Inc.
 //
 // FRYSK is free software; you can redistribute it and/or modify it
 // under the terms of the GNU General Public License as published by
@@ -47,6 +47,7 @@
 
 #include <gcj/cni.h>
 
+#include "frysk/sys/Errno.h"
 #include "frysk/sys/cni/Errno.hxx"
 #include "frysk/sys/proc/cni/slurp.hxx"
 
@@ -73,7 +74,15 @@
   }
 
   // Open the file file.
-  int fd = tryOpen (file, O_RDONLY, 0);
+  int fd;
+  try
+    {
+      fd = tryOpen (file, O_RDONLY, 0);
+    }
+  catch (frysk::sys::Errno *ignored)
+    {
+      fd = 0;
+    }
   if (!fd)
     {
       ::free (buf);
@@ -127,12 +136,18 @@
     throwRuntimeException ("snprintf: buffer overflow");
   
   // Open the file file.
-  int fd = tryOpen (file, O_RDONLY, 0);
-
+  int fd;
+  try
+    {
+      fd = tryOpen (file, O_RDONLY, 0);
+    }
+  catch (frysk::sys::Errno *ignored)
+    {
+      fd = 0;
+    }
   if (!fd)
     {
-      ::free (buf);
-      return 0;
+      return -1;
     }
 
 
@@ -165,12 +180,18 @@
     throwRuntimeException ("snprintf: buffer overflow");
   
   // Open the file file.
-  int fd = tryOpen (file, O_RDONLY, 0);
-
+  int fd;
+  try
+    {
+      fd = tryOpen (file, O_RDONLY, 0);
+    }
+  catch (frysk::sys::Errno *ignored)
+    {
+      fd = 0;
+    }
   if (!fd)
     {
-      ::free (buf);
-      return 0;
+      return -1;
     }
 
 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Stat vs slurp vs linux 2.6.22
  2007-07-25 11:30 [patch] Stat vs slurp vs linux 2.6.22 Mark Wielaard
@ 2007-07-25 13:30 ` Andrew Cagney
  2007-07-25 14:59   ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cagney @ 2007-07-25 13:30 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: frysk

Mark Wielaard wrote:
> Hi,
>
> On newer kernels (2.6.22) not just init (process id 1) can have a zero
> parent pid, but other kernel processes/threads also have a parent pid of
> zero (in particular kthreadd). This was done for systems with thousands
> of CPUs which create ten-thousand (kernel) processes and having them all
> children of init was not scaling (and not really necessary for the
> kernel-threads, which don't need to be reaped by init in the first
> place). This exposed a bug in our Stat parsing through LinuxHost. See
> http://sourceware.org/bugzilla/show_bug.cgi?id=4838
>
> Stat uses slurp which was written to handle these kind of issues
> where /proc/<pid>/stat might not exist or disappear in between calls.
> But it relied on the slurp() functions returning a failure value (NULL
> or -1). A recent rewrite of slurp to use Errno.tryOpen() which throws an
> Exception on error, broke Stat handling in such cases. Besides that
> slurp() had a bug in that it could free() a given buffer that it hadn't
> created itself (Stat was the owner of that buffer), this could lead to
> strange memory corruption. Both issues were fixed by:
>
>   
> 2007-07-25  Mark Wielaard  <mark@klomp.org>
>
>     Fixes bug #4838
>     * cni/slurp.cxx (uslurp): Catch Errno after tryOpen().
>     (slurp): Likewise and don't free given buffer and return -1.
>     (slurp_thread): Likewise.
>
> Tested on x86_64 (Fedora Core 6) and x86 (Fedora 7 - with new 2.6.22
> kernel). The failing tests reported in the bug report PASS with this
> test on the new kernel. And introduces no new FAILs.
>
> It might make sense to rewrite slurp as a normal java class so these
> cross C/Java error handling issues. Is there a reason for slurp to have
> to be written in C/CNI?
>   

-> memory pressure; notice how it does a callback
-> performance

it was written in java
> Cheers,
>
> Mark
>   
> ------------------------------------------------------------------------
>
> Index: frysk-sys/frysk/sys/proc/cni/slurp.cxx
> ===================================================================
> RCS file: /cvs/frysk/frysk-sys/frysk/sys/proc/cni/slurp.cxx,v
> retrieving revision 1.10
> diff -u -r1.10 slurp.cxx
> --- frysk-sys/frysk/sys/proc/cni/slurp.cxx	18 Jul 2007 18:23:52 -0000	1.10
> +++ frysk-sys/frysk/sys/proc/cni/slurp.cxx	25 Jul 2007 10:53:47 -0000
> @@ -1,6 +1,6 @@
>  // This file is part of the program FRYSK.
>  //
> -// Copyright 2005, Red Hat Inc.
> +// Copyright 2005, 2007 Red Hat Inc.
>  //
>  // FRYSK is free software; you can redistribute it and/or modify it
>  // under the terms of the GNU General Public License as published by
> @@ -47,6 +47,7 @@
>  
>  #include <gcj/cni.h>
>  
> +#include "frysk/sys/Errno.h"
>  #include "frysk/sys/cni/Errno.hxx"
>  #include "frysk/sys/proc/cni/slurp.hxx"
>  
> @@ -73,7 +74,15 @@
>    }
>  
>    // Open the file file.
> -  int fd = tryOpen (file, O_RDONLY, 0);
> +  int fd;
> +  try
> +    {
> +      fd = tryOpen (file, O_RDONLY, 0);
> +    }
> +  catch (frysk::sys::Errno *ignored)
> +    {
> +      fd = 0;
> +    }
>    if (!fd)
>      {
>        ::free (buf);
> @@ -127,12 +136,18 @@
>      throwRuntimeException ("snprintf: buffer overflow");
>    
>    // Open the file file.
> -  int fd = tryOpen (file, O_RDONLY, 0);
> -
> +  int fd;
> +  try
> +    {
> +      fd = tryOpen (file, O_RDONLY, 0);
> +    }
> +  catch (frysk::sys::Errno *ignored)
> +    {
> +      fd = 0;
> +    }
>    if (!fd)
>      {
> -      ::free (buf);
> -      return 0;
> +      return -1;
>      }
>  
>  
> @@ -165,12 +180,18 @@
>      throwRuntimeException ("snprintf: buffer overflow");
>    
>    // Open the file file.
> -  int fd = tryOpen (file, O_RDONLY, 0);
> -
> +  int fd;
> +  try
> +    {
> +      fd = tryOpen (file, O_RDONLY, 0);
> +    }
> +  catch (frysk::sys::Errno *ignored)
> +    {
> +      fd = 0;
> +    }
>    if (!fd)
>      {
> -      ::free (buf);
> -      return 0;
> +      return -1;
>      }
>  
>  
>   

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Stat vs slurp vs linux 2.6.22
  2007-07-25 13:30 ` Andrew Cagney
@ 2007-07-25 14:59   ` Mark Wielaard
  2007-07-25 15:39     ` Andrew Cagney
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2007-07-25 14:59 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: frysk

Hi Andrew,

On Wed, 2007-07-25 at 09:30 -0400, Andrew Cagney wrote:
> > It might make sense to rewrite slurp as a normal java class so these
> > cross C/Java error handling issues. Is there a reason for slurp to have
> > to be written in C/CNI?
> >   
> -> memory pressure; notice how it does a callback
> -> performance

Interesting. It does feel a little like a premature optimization though,
seeing that we are finding bugs in it because of the Java/C/CNI mixture
of error and memory handling that are hard to track down.

How precisely do callbacks work in slurp  (I have to admit that I
haven't noticed them) and why can't you do that with a normal java
implementation?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Stat vs slurp vs linux 2.6.22
  2007-07-25 14:59   ` Mark Wielaard
@ 2007-07-25 15:39     ` Andrew Cagney
  2007-07-26  7:24       ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cagney @ 2007-07-25 15:39 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: frysk

Mark Wielaard wrote:
> Hi Andrew,
>
> On Wed, 2007-07-25 at 09:30 -0400, Andrew Cagney wrote:
>   
>>> It might make sense to rewrite slurp as a normal java class so these
>>> cross C/Java error handling issues. Is there a reason for slurp to have
>>> to be written in C/CNI?
>>>   
>>>       
>> -> memory pressure; notice how it does a callback
>> -> performance
>>     
>
> Interesting. It does feel a little like a premature optimization though,
> seeing that we are finding bugs in it because of the Java/C/CNI mixture
> of error and memory handling that are hard to track down.
>
>   
We're talking here about Scan, not slurp; the first implementation did 
use File et.al.  It was changed from java to a builder to reduce memory 
pressure and produced a dramatic performance boost.

Scan, by not double allocating memory read also keeps memory pressure down.

Going forward, the less memory dropped on the floor by the core, the 
easier it will be to push more of this into C++.

Andrew

>
>   

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Stat vs slurp vs linux 2.6.22
  2007-07-25 15:39     ` Andrew Cagney
@ 2007-07-26  7:24       ` Mark Wielaard
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Wielaard @ 2007-07-26  7:24 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: frysk

[-- Attachment #1: Type: text/plain, Size: 2155 bytes --]

Hi Andrew,

On Wed, 2007-07-25 at 11:40 -0400, Andrew Cagney wrote:
> Mark Wielaard wrote:
> >>> It might make sense to rewrite slurp as a normal java class so these
> >>> cross C/Java error handling issues. Is there a reason for slurp to have
> >>> to be written in C/CNI?
> >>>   
> >> -> memory pressure; notice how it does a callback
> >> -> performance
> >>     
> > Interesting. It does feel a little like a premature optimization though,
> > seeing that we are finding bugs in it because of the Java/C/CNI mixture
> > of error and memory handling that are hard to track down.
> >
> We're talking here about Scan, not slurp; the first implementation did 
> use File et.al.  It was changed from java to a builder to reduce memory 
> pressure and produced a dramatic performance boost.
> 
> Scan, by not double allocating memory read also keeps memory pressure down.

Does Scan refer to some old class used before slurp? I couldn't easily
track the history in CVS since the files seem to have moved a lot. Or
are you talking about the scanInt() and scanLong() functions of slurp?

In the later case there is no reason why reading the whole thing into a
buffer and then extracting the data out of it would be more or less
efficient in some other language. If you would really like to keep the
memory pressure down you could even just use a DataInputStream so you
don't need to read in a buffer at all and get reading primitive
datatypes for free.

> Going forward, the less memory dropped on the floor by the core, the 
> easier it will be to push more of this into C++.

I am not sure I follow what you are saying here. I agree that writing
memory efficient code is good, but I don't see where C++ comes in. Since
most of frysk is actually written in java, we want to provide a higher
level interface in java for the core, most of the nasty bugs come from
the C side with memory resource corruption and exception handling errors
and we might want to look at using JNI instead of CNI it makes sense to
rely less on C instead of more to make things easier for us all and
safer for the user.

Cheers,

Mark

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-07-26  7:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-25 11:30 [patch] Stat vs slurp vs linux 2.6.22 Mark Wielaard
2007-07-25 13:30 ` Andrew Cagney
2007-07-25 14:59   ` Mark Wielaard
2007-07-25 15:39     ` Andrew Cagney
2007-07-26  7:24       ` Mark Wielaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).