public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [Patch] Fix AIX shared library load broken during fork ().
@ 2023-07-10 11:35 Aditya Kamath1
  2023-07-10 14:11 ` Alan Modra
  2023-07-10 16:00 ` Tom Tromey
  0 siblings, 2 replies; 7+ messages in thread
From: Aditya Kamath1 @ 2023-07-10 11:35 UTC (permalink / raw)
  To: Ulrich Weigand, Aditya Kamath1 via Gdb-patches
  Cc: Alan Modra, Alan Modra, Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 9651 bytes --]

Respected Ulrich and GDB community members,

Hi,

Please find attached a patch. {See: 0001-Fix-AIX-shared-library-load-broken-during-fork.patch}
This is a patch to fix the loading of the shared library during the fork () event.

In the recent commit {https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=de7b90610e9e939c49290229c282eb9171c560b9} the committer mentions that he has added a add_range () function which now adds a start, end range which now guards anyone from calling or having two or more file pointer pointing to the same file in the archive. So, if this is the case, then we BFD goes to execute the following condition
+  if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
+      || !add_range (abfd, start, abfd->where + ret->parsed_size))
+    {
+      free (ret);
+      return NULL;
+    }


We will satisy the add_range () part of the OR condition and free the ret variable and return NULL. Kindly note this happens in the coff-rs6000.c file in bfd directory.

As an impact of this in our GDB code in solib-aix.c file during a fork () event, after the parent shared libraries are loaded and the child shared library in going to be loaded, when GDB executes the line.
gdb_bfd_ref_ptr object_bfd
    (gdb_bfd_openr_next_archived_file (archive_bfd.get (), NULL));

in search of the object file in the archive, but object_bfd we receive is NULL. Then the while (object_bfd != NULL) does not satisfy and we cannot load the shared the library.

As result in AIX for Example code 1 {pasted below this email}  we see the output as shown below.

------------------------------------------------------------------------------------------------------------------------------------
bash-5.1$ ./gdb ~/gdb_tests/multi-thread-fork
GNU gdb (GDB) 14.0.50.20230602-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64-ibm-aix7.2.0.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/aditya/gdb_tests/multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 25166094)]
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 3 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
I am parent
[New process 10551700]
[New inferior 3 (process 9306438)]
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 3 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
I am parent

Thread 1.4 received signal SIGINT, Interrupt.
[Switching to process 10551700]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 25166094] (/home/aditya/gdb_tests/multi-thread-fork)]
[Switching to thread 2.1 (process 25166094)]
#0  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb)

------------------------------------------------------------------------------------------------------------------------------------
So, what I tried in this patch is to keep the pointers pointing to a shared object file stored in a vector and when BFD does not find any object file use the already stored pointer pointing to the shared object for the new inferior.

After applying the patch, the output is as below:-

------------------------------------------------------------------------------------------------------------------------------------
bash-5.1$ ./gdb ~/gdb_tests/multi-thread-fork
GNU gdb (GDB) 14.0.50.20230602-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64-ibm-aix7.2.0.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/aditya/gdb_tests/multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 27984338)]
[New process 10551730]
[New inferior 3 (process 28049732)]
I am parent
I am parent

Thread 1.4 received signal SIGINT, Interrupt.
[Switching to process 10551730]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 27984338] (/home/aditya/gdb_tests/multi-thread-fork)]
[Switching to thread 2.1 (process 27984338)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100dc0  0xd0575123  Yes (*)     /usr/lib/libcrypt.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) info threads
  Id   Target Id                          Frame
  1.1  Thread 1 (tid 25625035, running)   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 39256515, running) thread_function (arg=0x0) at /home/aditya/gdb_tests/multi-thread-fork.c:27
  1.3  Thread 515 (tid 35193173, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0) at /home/aditya/gdb_tests/multi-thread-fork.c:27
  1.4  process 10551730                   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 27984338                   0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  process 28049732                   0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
------------------------------------------------------------------------------------------------------------------------------------

So, the idea was almost close to solving it but still at distance from it. But I see that the parent process 10551730 which was a thread target is now reflecting as a process target as an impact of the same. In fact, the thread target is being set correctly for the parent process while I debugged the issue.

So, I currently, do not know why this target mismatch has happened and is it because of using the same pointers for the child process as well.

Since you are all experts, kindly guide me on where my analysis has gone wrong and what we can do to fix this issue. Also kindly review this patch.

Waiting for a reply,

Have a nice day ahead.

Thanks and regards,
Aditya.


------------------------------------------------------------------------------------------------------------------------------------
Example Code 1:-

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
    printf (" Iam child \n");

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

[-- Attachment #2: 0001-Fix-AIX-shared-library-load-broken-during-fork.patch --]
[-- Type: application/octet-stream, Size: 2820 bytes --]

From dfbe9cdf99f33267a1ccf30e8f8725f40835d7c9 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Mon, 10 Jul 2023 05:57:02 -0500
Subject: [PATCH] Fix AIX shared library load broken during fork ().

This patch is a fix to the AIX shared library not loading
for a new inferior during a fork () event.  The reason being in
AIX we archive shared libraries and we are not allowed to have many
pointers pointing to the same archive file as per a recent commit in bfd.

So we keep track of every pointer pointing to a shared object in an archive
such that the new inferior can reuse them.
---
 gdb/solib-aix.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index 93aa6c4e040..00c12a6e314 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -29,6 +29,15 @@
 #include "gdbcmd.h"
 #include "gdbsupport/scope-exit.h"
 
+/* Vector to keep the pointer to object in an archive.  */
+
+/* Once accessed we are not able to refer the same object
+   with a different pointer due to add_range () in bfd.
+   So we maintain the pointers in this vector to access
+   them during every shared library load for a new inferior.  */
+
+std::vector<gdb_bfd_ref_ptr> object_bfd_vector;
+
 /* Our private data in struct so_list.  */
 
 struct lm_info_aix : public lm_info_base
@@ -605,10 +614,33 @@ solib_aix_bfd_open (const char *pathname)
 
   gdb_bfd_ref_ptr object_bfd
     (gdb_bfd_openr_next_archived_file (archive_bfd.get (), NULL));
+  if (object_bfd == NULL)
+    {
+      auto it = object_bfd_vector.begin ();
+      while (it != object_bfd_vector.end ())
+	{
+	  gdb_bfd_ref_ptr object_bfd_tmp = *it;
+	  std::string s = bfd_get_filename (object_bfd_tmp.get ());
+	  if (s.find ('(') != std::string::npos)
+	    {
+	      int pos = s.find ('(');
+	      int len = s.find (')') - s.find ('(');
+	      /* For a new inferior which cannot access the file in 
+		 its parent would be having a pointer to the same
+		 file, we will return the parent's bdf ref ptr.  */
+	      if (s.substr (pos+1, len-1) == member_name)
+		return object_bfd_tmp;
+	    }
+	  it++;
+	}
+    }
   while (object_bfd != NULL)
     {
       if (member_name == bfd_get_filename (object_bfd.get ()))
-	break;
+	{
+	  object_bfd_vector.push_back (object_bfd);
+	  break;
+	}
 
       std::string s = bfd_get_filename (object_bfd.get ());
 
@@ -621,7 +653,10 @@ solib_aix_bfd_open (const char *pathname)
 	  int pos = s.find ('(');
 	  int len = s.find (')') - s.find ('(');
 	  if (s.substr (pos+1, len-1) == member_name)
-	    return object_bfd;
+	    {
+	      object_bfd_vector.push_back (object_bfd);
+	      return object_bfd;
+	    }
 	}
 
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
-- 
2.38.3


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 11:35 [Patch] Fix AIX shared library load broken during fork () Aditya Kamath1
@ 2023-07-10 14:11 ` Alan Modra
  2023-07-10 14:29   ` Aditya Kamath1
  2023-07-10 16:00 ` Tom Tromey
  1 sibling, 1 reply; 7+ messages in thread
From: Alan Modra @ 2023-07-10 14:11 UTC (permalink / raw)
  To: Aditya Kamath1
  Cc: Ulrich Weigand, Aditya Kamath1 via Gdb-patches, Alan Modra,
	Sangamesh Mallayya

On Mon, Jul 10, 2023 at 11:35:09AM +0000, Aditya Kamath1 wrote:
> In the recent commit {https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=de7b90610e9e939c49290229c282eb9171c560b9} the committer mentions that he has added a add_range () function which now adds a start, end range which now guards anyone from calling or having two or more file pointer pointing to the same file in the archive. So, if this is the case, then we BFD goes to execute the following condition
> +  if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
> +      || !add_range (abfd, start, abfd->where + ret->parsed_size))
> +    {
> +      free (ret);
> +      return NULL;
> +    }
> 
> 
> We will satisy the add_range () part of the OR condition and free the ret variable and return NULL. Kindly note this happens in the coff-rs6000.c file in bfd directory.

As the comment in add_range says, you should be able to read the same
element more than once due to _bfd_get_elf_at_filepos calling
_bfd_look_for_bfd_in_cache, which caches elements by file position.

So why isn't that working?

Oh, I think I see the bug.  no_element_cache is set when reading the
first element of an archive to figure out the proper target.  If
no_element_cache is set then we can't call add_range.

Can you try this patch?

	* coff-rs6000.c (_bfd_xcoff_read_ar_hdr): Don't add_range when
	no_element_cache is set.

diff --git a/bfd/coff-rs6000.c b/bfd/coff-rs6000.c
index 271a24fff69..bb9604b0780 100644
--- a/bfd/coff-rs6000.c
+++ b/bfd/coff-rs6000.c
@@ -1744,7 +1744,8 @@ _bfd_xcoff_read_ar_hdr (bfd *abfd)
 
   /* Skip over the XCOFFARFMAG at the end of the file name.  */
   if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
-      || !add_range (abfd, start, abfd->where + ret->parsed_size))
+      || (!abfd->no_element_cache
+	  && !add_range (abfd, start, abfd->where + ret->parsed_size)))
     {
       free (ret);
       return NULL;


-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 14:11 ` Alan Modra
@ 2023-07-10 14:29   ` Aditya Kamath1
  2023-07-10 15:04     ` Alan Modra
  0 siblings, 1 reply; 7+ messages in thread
From: Aditya Kamath1 @ 2023-07-10 14:29 UTC (permalink / raw)
  To: Alan Modra
  Cc: Ulrich Weigand, Aditya Kamath1 via Gdb-patches, Alan Modra,
	Sangamesh Mallayya

[-- Attachment #1: Type: text/plain, Size: 6642 bytes --]

Hi Alan, Ulrich and community,

Thank you for the reply.

>As the comment in add_range says, you should be able to read the same
>element more than once due to _bfd_get_elf_at_filepos calling
>_bfd_look_for_bfd_in_cache, which caches elements by file position.

>So why isn't that working?

>Oh, I think I see the bug.  no_element_cache is set when reading the
>first element of an archive to figure out the proper target.  If
>no_element_cache is set then we can't call add_range.

>Can you try this patch?

So actually it does not work. Kindly check the output pasted below and the patch applied.

Kindly let me know what we can do next..

Thanks and regards,
Aditya.

bash-5.1$ ./gdb ~/gdb_tests/multi-thread-fork
GNU gdb (GDB) 14.0.50.20230602-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64-ibm-aix7.2.0.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/aditya/gdb_tests/multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 20054400)]
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 3 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
I am parent
[New process 22282610]
[New inferior 3 (process 17564062)]
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 3 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
I am parent

Thread 1.4 received signal SIGINT, Interrupt.
[Switching to process 22282610]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 20054400] (/home/aditya/gdb_tests/multi-thread-fork)]
[Switching to thread 2.1 (process 20054400)]
#0  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) q
A debugging session is active.

        Inferior 1 [process 22282610] will be killed.
        Inferior 2 [process 20054400] will be killed.
        Inferior 3 [process 17564062] will be killed.

Quit anyway? (y or n) y
bash-5.1$
bash-5.1$ diff -u ../bfd/coff-rs6000.c_orig ../bfd/coff-rs6000.c
--- ../bfd/coff-rs6000.c_orig   2023-07-10 09:20:47.209312912 +0000
+++ ../bfd/coff-rs6000.c        2023-07-10 09:25:54.078326997 +0000
@@ -1744,7 +1744,8 @@
   /* Skip over the XCOFFARFMAG at the end of the file name.  */
   if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
-      || !add_range (abfd, start, abfd->where + ret->parsed_size))
+     || (!abfd->no_element_cache
+     && !add_range (abfd, start, abfd->where + ret->parsed_size)))
     {
       free (ret);
       return NULL;

From: Alan Modra <amodra@gmail.com>
Date: Monday, 10 July 2023 at 7:41 PM
To: Aditya Kamath1 <Aditya.Kamath1@ibm.com>
Cc: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>, Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>, Alan Modra <amodra@au1.ibm.com>, Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [Patch] Fix AIX shared library load broken during fork ().
On Mon, Jul 10, 2023 at 11:35:09AM +0000, Aditya Kamath1 wrote:
> In the recent commit {https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=de7b90610e9e939c49290229c282eb9171c560b9 } the committer mentions that he has added a add_range () function which now adds a start, end range which now guards anyone from calling or having two or more file pointer pointing to the same file in the archive. So, if this is the case, then we BFD goes to execute the following condition
> +  if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
> +      || !add_range (abfd, start, abfd->where + ret->parsed_size))
> +    {
> +      free (ret);
> +      return NULL;
> +    }
>
>
> We will satisy the add_range () part of the OR condition and free the ret variable and return NULL. Kindly note this happens in the coff-rs6000.c file in bfd directory.

As the comment in add_range says, you should be able to read the same
element more than once due to _bfd_get_elf_at_filepos calling
_bfd_look_for_bfd_in_cache, which caches elements by file position.

So why isn't that working?

Oh, I think I see the bug.  no_element_cache is set when reading the
first element of an archive to figure out the proper target.  If
no_element_cache is set then we can't call add_range.

Can you try this patch?

        * coff-rs6000.c (_bfd_xcoff_read_ar_hdr): Don't add_range when
        no_element_cache is set.

diff --git a/bfd/coff-rs6000.c b/bfd/coff-rs6000.c
index 271a24fff69..bb9604b0780 100644
--- a/bfd/coff-rs6000.c
+++ b/bfd/coff-rs6000.c
@@ -1744,7 +1744,8 @@ _bfd_xcoff_read_ar_hdr (bfd *abfd)

   /* Skip over the XCOFFARFMAG at the end of the file name.  */
   if (bfd_seek (abfd, (namlen & 1) + SXCOFFARFMAG, SEEK_CUR) != 0
-      || !add_range (abfd, start, abfd->where + ret->parsed_size))
+      || (!abfd->no_element_cache
+         && !add_range (abfd, start, abfd->where + ret->parsed_size)))
     {
       free (ret);
       return NULL;


--
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 14:29   ` Aditya Kamath1
@ 2023-07-10 15:04     ` Alan Modra
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Modra @ 2023-07-10 15:04 UTC (permalink / raw)
  To: Aditya Kamath1
  Cc: Ulrich Weigand, Aditya Kamath1 via Gdb-patches, Alan Modra,
	Sangamesh Mallayya

On Mon, Jul 10, 2023 at 02:29:46PM +0000, Aditya Kamath1 wrote:
> Hi Alan, Ulrich and community,
> 
> Thank you for the reply.
> 
> >As the comment in add_range says, you should be able to read the same
> >element more than once due to _bfd_get_elf_at_filepos calling
> >_bfd_look_for_bfd_in_cache, which caches elements by file position.
> 
> >So why isn't that working?
> 
> >Oh, I think I see the bug.  no_element_cache is set when reading the
> >first element of an archive to figure out the proper target.  If
> >no_element_cache is set then we can't call add_range.
> 
> >Can you try this patch?
> 
> So actually it does not work.

Yes, I'm not surprised.  It isn't the effect of no_element_cache.  We
just can't rely on the element cache.  I'll look into fixing the bfd
problem tomorrow, I need some sleep.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 11:35 [Patch] Fix AIX shared library load broken during fork () Aditya Kamath1
  2023-07-10 14:11 ` Alan Modra
@ 2023-07-10 16:00 ` Tom Tromey
  2023-07-10 23:17   ` Alan Modra
  1 sibling, 1 reply; 7+ messages in thread
From: Tom Tromey @ 2023-07-10 16:00 UTC (permalink / raw)
  To: Aditya Kamath1 via Gdb-patches
  Cc: Ulrich Weigand, Aditya Kamath1, Alan Modra, Alan Modra,
	Sangamesh Mallayya

>>>>> Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org> writes:

> +std::vector<gdb_bfd_ref_ptr> object_bfd_vector;

I think it's a mistake to add a new global like this.
It seems to me that this would interact poorly with multi-inferior or
multi-target debugging.

Also, since nothing ever clears the vector, these BFDs will never be
closed.

Tom

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 16:00 ` Tom Tromey
@ 2023-07-10 23:17   ` Alan Modra
  2023-07-11  5:08     ` Aditya Kamath1
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Modra @ 2023-07-10 23:17 UTC (permalink / raw)
  To: Tom Tromey
  Cc: Aditya Kamath1 via Gdb-patches, Ulrich Weigand, Aditya Kamath1,
	Alan Modra, Sangamesh Mallayya

On Mon, Jul 10, 2023 at 10:00:40AM -0600, Tom Tromey wrote:
> >>>>> Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> > +std::vector<gdb_bfd_ref_ptr> object_bfd_vector;
> 
> I think it's a mistake to add a new global like this.
> It seems to me that this would interact poorly with multi-inferior or
> multi-target debugging.
> 
> Also, since nothing ever clears the vector, these BFDs will never be
> closed.

The proper place to fix this of course is in bfd.  Aditya, I think the
following ought to cure the problem you're seeing.  Did you open a PR?

	* coff-rs6000.c (add_range): Revise comment, noting possible fail.
	(_bfd_xcoff_openr_next_archived_file): Start with clean ranges.

diff --git a/bfd/coff-rs6000.c b/bfd/coff-rs6000.c
index 271a24fff69..59b9743356f 100644
--- a/bfd/coff-rs6000.c
+++ b/bfd/coff-rs6000.c
@@ -1598,14 +1598,21 @@ _bfd_xcoff_archive_p (bfd *abfd)
 
 /* Track file ranges occupied by elements.  Add [START,END) to the
    list of ranges and return TRUE if there is no overlap between the
-   new and any other element or the archive file header.  Note that
-   this would seem to preclude calling _bfd_get_elt_at_filepos twice
-   for the same element, but we won't get to _bfd_xcoff_read_ar_hdr if
-   an element is read more than once.  See _bfd_get_elt_at_filepos use
-   of _bfd_look_for_bfd_in_cache.  Also, the xcoff archive code
+   new and any other element or the archive file header.  This relies
+   on _bfd_xcoff_read_ar_hdr not being called more than once for the
+   same element, but that should be true (*).  The xcoff archive code
    doesn't call _bfd_read_ar_hdr when reading the armap, nor does it
    need to use extended name tables.  So those other routines in
-   archive.c that call _bfd_read_ar_hdr are unused.  */
+   archive.c that call _bfd_read_ar_hdr are unused.
+
+   *) There is one case where this might fail, but I think it is
+   sufficently unusual that it doesn't seem worth fixing:  When
+   scanning over archive elements using openr_next_archived_file, if
+   openr_next_archived_file is called twice with the same arguments
+   *and* the element returned is bfd_close'd between those calls then
+   we'll return false here.  The _bfd_look_for_bfd_in_cache use in
+   _bfd_get_elt_at_filepos stops this happening in the case where an
+   element is not closed.  */
 
 static bool
 add_range (bfd *abfd, ufile_ptr start, ufile_ptr end)
@@ -1770,12 +1777,14 @@ _bfd_xcoff_openr_next_archived_file (bfd *archive, bfd *last_file)
     {
       if (last_file == NULL)
 	{
+	  /* If we are scanning over elements twice in an open archive,
+	     which can happen in gdb after a fork, ensure we start the
+	     second scan with clean ranges.  */
+	  x_artdata (archive)->ranges.start = 0;
+	  x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
+	  x_artdata (archive)->ranges.next = NULL;
+	  x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
 	  filestart = bfd_ardata (archive)->first_file_filepos;
-	  if (x_artdata (archive)->ar_hdr_size == 0)
-	    {
-	      x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
-	      x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
-	    }
 	}
       else
 	GET_VALUE_IN_FIELD (filestart, arch_xhdr (last_file)->nextoff, 10);
@@ -1794,12 +1803,11 @@ _bfd_xcoff_openr_next_archived_file (bfd *archive, bfd *last_file)
     {
       if (last_file == NULL)
 	{
+	  x_artdata (archive)->ranges.start = 0;
+	  x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
+	  x_artdata (archive)->ranges.next = NULL;
+	  x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
 	  filestart = bfd_ardata (archive)->first_file_filepos;
-	  if (x_artdata (archive)->ar_hdr_size == 0)
-	    {
-	      x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
-	      x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
-	    }
 	}
       else
 	GET_VALUE_IN_FIELD (filestart, arch_xhdr_big (last_file)->nextoff, 10);

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [Patch] Fix AIX shared library load broken during fork ().
  2023-07-10 23:17   ` Alan Modra
@ 2023-07-11  5:08     ` Aditya Kamath1
  0 siblings, 0 replies; 7+ messages in thread
From: Aditya Kamath1 @ 2023-07-11  5:08 UTC (permalink / raw)
  To: Alan Modra, Tom Tromey
  Cc: Aditya Kamath1 via Gdb-patches, Ulrich Weigand, Alan Modra,
	Sangamesh Mallayya

[-- Attachment #1: Type: text/plain, Size: 11300 bytes --]

Respected Alan, Ulrich, Tom and community members,

Hi and Morning,

>The proper place to fix this of course is in bfd.  Aditya, I think the
>following ought to cure the problem you're seeing.  Did you open a PR?

>        * coff-rs6000.c (add_range): Revise comment, noting possible fail.
>       (_bfd_xcoff_openr_next_archived_file): Start with clean ranges.

Yes, this fixes the issue. Kindly see the output pasted below this email. No, I have not opened a PR. And I also did not apply any changes I had made which I had mentioned in my first email of this thread. I only applied your changes in coff-rs6000.c .. Thank you for figuring this out.

Also, let me know if you would like me to open a PR. Usually I attach the git format patch. Whichever works for you. So I think this is good to commit from our end.

Thank you Alan and community for your support once again. Thank you Tom for your suggestion to not write unfreeable global vectors. Will keep it in mind for the upcoming patches soon.

Have a nice day ahead.

Thanks and regards,
Aditya.

bash-5.1$ ./gdb ~/gdb_tests/multi-thread-fork
GNU gdb (GDB) 14.0.50.20230602-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64-ibm-aix7.2.0.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/aditya/gdb_tests/multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 27132274)]
[New inferior 3 (process 22020358)]
I am parent
I am parent

Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 27132274] (/home/aditya/gdb_tests/multi-thread-fork)]
[Switching to thread 2.1 (process 27132274)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) inferior 3
[Switching to inferior 3 [process 22020358] (/home/aditya/gdb_tests/multi-thread-fork)]
[Switching to thread 3.1 (process 22020358)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) q
A debugging session is active.

        Inferior 1 [process 18350382] will be killed.
        Inferior 2 [process 27132274] will be killed.
        Inferior 3 [process 22020358] will be killed.

Quit anyway? (y or n) y
bash-5.1$
bash-5.1$ diff -u ../bfd/coff-rs6000.c_orig ../bfd/coff-rs6000.c
--- ../bfd/coff-rs6000.c_orig   2023-07-10 09:20:47.209312912 +0000
+++ ../bfd/coff-rs6000.c        2023-07-10 23:54:41.714469979 +0000
@@ -1770,12 +1770,14 @@
     {
       if (last_file == NULL)
        {
+         /* If we are scanning over elements twice in an open archive,
+            which can happen in gdb after a fork, ensure we start the
+            second scan with clean ranges.  */
+         x_artdata (archive)->ranges.start = 0;
+         x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
+         x_artdata (archive)->ranges.next = NULL;
+         x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
          filestart = bfd_ardata (archive)->first_file_filepos;
-         if (x_artdata (archive)->ar_hdr_size == 0)
-           {
-             x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
-             x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
-           }
        }
       else
        GET_VALUE_IN_FIELD (filestart, arch_xhdr (last_file)->nextoff, 10);
@@ -1794,12 +1796,11 @@
     {
       if (last_file == NULL)
        {
+         x_artdata (archive)->ranges.start = 0;
+         x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
+         x_artdata (archive)->ranges.next = NULL;
+         x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
          filestart = bfd_ardata (archive)->first_file_filepos;
-         if (x_artdata (archive)->ar_hdr_size == 0)
-           {
-             x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
-             x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
-           }
        }
       else
        GET_VALUE_IN_FIELD (filestart, arch_xhdr_big (last_file)->nextoff, 10);

bash-5.1$ cat ~/gdb_tests/multi-thread-fork.c
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
    printf (" Iam child \n");

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}
From: Alan Modra <amodra@gmail.com>
Date: Tuesday, 11 July 2023 at 4:47 AM
To: Tom Tromey <tom@tromey.com>
Cc: Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>, Ulrich Weigand <Ulrich.Weigand@de.ibm.com>, Aditya Kamath1 <Aditya.Kamath1@ibm.com>, Alan Modra <amodra@au1.ibm.com>, Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [Patch] Fix AIX shared library load broken during fork ().
On Mon, Jul 10, 2023 at 10:00:40AM -0600, Tom Tromey wrote:
> >>>>> Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> > +std::vector<gdb_bfd_ref_ptr> object_bfd_vector;
>
> I think it's a mistake to add a new global like this.
> It seems to me that this would interact poorly with multi-inferior or
> multi-target debugging.
>
> Also, since nothing ever clears the vector, these BFDs will never be
> closed.

The proper place to fix this of course is in bfd.  Aditya, I think the
following ought to cure the problem you're seeing.  Did you open a PR?

        * coff-rs6000.c (add_range): Revise comment, noting possible fail.
        (_bfd_xcoff_openr_next_archived_file): Start with clean ranges.

diff --git a/bfd/coff-rs6000.c b/bfd/coff-rs6000.c
index 271a24fff69..59b9743356f 100644
--- a/bfd/coff-rs6000.c
+++ b/bfd/coff-rs6000.c
@@ -1598,14 +1598,21 @@ _bfd_xcoff_archive_p (bfd *abfd)

 /* Track file ranges occupied by elements.  Add [START,END) to the
    list of ranges and return TRUE if there is no overlap between the
-   new and any other element or the archive file header.  Note that
-   this would seem to preclude calling _bfd_get_elt_at_filepos twice
-   for the same element, but we won't get to _bfd_xcoff_read_ar_hdr if
-   an element is read more than once.  See _bfd_get_elt_at_filepos use
-   of _bfd_look_for_bfd_in_cache.  Also, the xcoff archive code
+   new and any other element or the archive file header.  This relies
+   on _bfd_xcoff_read_ar_hdr not being called more than once for the
+   same element, but that should be true (*).  The xcoff archive code
    doesn't call _bfd_read_ar_hdr when reading the armap, nor does it
    need to use extended name tables.  So those other routines in
-   archive.c that call _bfd_read_ar_hdr are unused.  */
+   archive.c that call _bfd_read_ar_hdr are unused.
+
+   *) There is one case where this might fail, but I think it is
+   sufficently unusual that it doesn't seem worth fixing:  When
+   scanning over archive elements using openr_next_archived_file, if
+   openr_next_archived_file is called twice with the same arguments
+   *and* the element returned is bfd_close'd between those calls then
+   we'll return false here.  The _bfd_look_for_bfd_in_cache use in
+   _bfd_get_elt_at_filepos stops this happening in the case where an
+   element is not closed.  */

 static bool
 add_range (bfd *abfd, ufile_ptr start, ufile_ptr end)
@@ -1770,12 +1777,14 @@ _bfd_xcoff_openr_next_archived_file (bfd *archive, bfd *last_file)
     {
       if (last_file == NULL)
         {
+         /* If we are scanning over elements twice in an open archive,
+            which can happen in gdb after a fork, ensure we start the
+            second scan with clean ranges.  */
+         x_artdata (archive)->ranges.start = 0;
+         x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
+         x_artdata (archive)->ranges.next = NULL;
+         x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
           filestart = bfd_ardata (archive)->first_file_filepos;
-         if (x_artdata (archive)->ar_hdr_size == 0)
-           {
-             x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR;
-             x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR;
-           }
         }
       else
         GET_VALUE_IN_FIELD (filestart, arch_xhdr (last_file)->nextoff, 10);
@@ -1794,12 +1803,11 @@ _bfd_xcoff_openr_next_archived_file (bfd *archive, bfd *last_file)
     {
       if (last_file == NULL)
         {
+         x_artdata (archive)->ranges.start = 0;
+         x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
+         x_artdata (archive)->ranges.next = NULL;
+         x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
           filestart = bfd_ardata (archive)->first_file_filepos;
-         if (x_artdata (archive)->ar_hdr_size == 0)
-           {
-             x_artdata (archive)->ranges.end = SIZEOF_AR_FILE_HDR_BIG;
-             x_artdata (archive)->ar_hdr_size = SIZEOF_AR_HDR_BIG;
-           }
         }
       else
         GET_VALUE_IN_FIELD (filestart, arch_xhdr_big (last_file)->nextoff, 10);

--
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-07-11  5:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-10 11:35 [Patch] Fix AIX shared library load broken during fork () Aditya Kamath1
2023-07-10 14:11 ` Alan Modra
2023-07-10 14:29   ` Aditya Kamath1
2023-07-10 15:04     ` Alan Modra
2023-07-10 16:00 ` Tom Tromey
2023-07-10 23:17   ` Alan Modra
2023-07-11  5:08     ` Aditya Kamath1

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).