public inbox for gcc-bugs@sourceware.org
* [Bug c++/109164] New: aarch64 thread_local initialization error with -ftree-pre and -foptimize-sibling-calls
@ 2023-03-16 23:09 loganh at synopsys dot com
  2023-03-16 23:25 ` [Bug tree-optimization/109164] thread_local initialization error with -ftree-pre pinskia at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: loganh at synopsys dot com @ 2023-03-16 23:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109164

            Bug ID: 109164
           Summary: aarch64 thread_local initialization error with
                    -ftree-pre and -foptimize-sibling-calls
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: loganh at synopsys dot com
  Target Milestone: ---

Created attachment 54687
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54687&action=edit
Bash script that reproduces the issue

With -ftree-pre, -foptimize-sibling-calls, and -O1 enabled on
aarch64-linux-gnu, GCC 12.1.0 can generate code that accesses parts of a
thread_local variable before the corresponding TLS init function is called, if
the variable is accessed from a different TU than the one it is defined in.
This reordering could likely cause a number of different issues, but the one
that I've run into is that when:
- the compiler generates code to call a virtual function on a reference to
a global thread_local instance of an object defined in a different
translation unit, and
- the calling function calls itself in at least one branch,
the address of the object is fetched from TLS before it's initialized, and
when the vtable lookup is attempted on that object to call the virtual
function, the program segfaults.

Here's an example of the kind of code that will trip it up:

struct Struct {
  virtual void virtual_func();
};

extern thread_local Struct& thread_local_ref;

bool other_func(void);

bool test_func(void) {
  thread_local_ref.virtual_func();
  return other_func() && test_func();
}

When this is compiled (on aarch64-linux-gnu, with -O1 and -ftree-pre and
-foptimize-sibling-calls) to an object file and then dumped with objdump -C -d,
this is the code produced:

0000000000000000 <test_func()>:
    0: a9be7bfd  stp x29, x30, [sp, #-32]!
    4: 910003fd  mov x29, sp
    8: a90153f3  stp x19, x20, [sp, #16]
    c: 90000000  adrp  x0, 0 <thread_local_ref>
   10: f9400000  ldr x0, [x0]
   14: d53bd041  mrs x1, tpidr_el0
   18: f8606834  ldr x20, [x1, x0]
   1c: 90000013  adrp  x19, 0 <TLS init function for thread_local_ref>
   20: f9400273  ldr x19, [x19]
   24: b4000053  cbz x19, 2c <test_func()+0x2c>
   28: 94000000  bl  0 <TLS init function for thread_local_ref>
   2c: f9400280  ldr x0, [x20]
   30: f9400001  ldr x1, [x0]
   34: aa1403e0  mov x0, x20
   38: d63f0020  blr x1
   3c: 94000000  bl  0 <other_func()>
   40: 12001c00  and w0, w0, #0xff
   44: 35ffff00  cbnz  w0, 24 <test_func()+0x24>
   48: a94153f3  ldp x19, x20, [sp, #16]
   4c: a8c27bfd  ldp x29, x30, [sp], #32
   50: d65f03c0  ret

Looking at addresses 0x14 through 0x18, you can see that the address stored in
`thread_local_ref` is read from the TLS block for the thread. The first time
this function is called, this leaves register x20 containing zero, since the
TLS block isn't initialized until the function call at 0x28. Directly after
that, at location 0x2c, a read is attempted from the address in register x20
(zero), causing a segfault. Without -ftree-pre and -foptimize-sibling-calls,
and without letting `test_func` call itself on at least one path, the code to
get the address of `thread_local_ref` is generated after the TLS init call, so
the problem does not occur.
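
To make the ordering problem concrete, here is a minimal sketch in plain C++
that models the two orders of operations. It is not GCC's generated code or
its actual TLS wrapper. The names `slot`, `tls_init`, `tls_init_ptr`,
`load_then_init` and `init_then_load` are all hypothetical stand-ins: `slot`
plays the role of the TLS location read at 0x18, and `tls_init` plays the role
of the TLS init function called at 0x28. `virtual_func` returns a value here
only so the result can be checked.

```cpp
#include <cassert>

struct Struct {
  virtual int virtual_func() { return 42; }
};

// Hypothetical stand-ins for the TLS machinery (assumptions, not GCC internals).
static Struct instance;
static thread_local Struct* slot = nullptr;   // models the TLS slot: zero until initialized
static void tls_init() { slot = &instance; }  // models the TLS init function
static void (*tls_init_ptr)() = tls_init;     // models the possibly-null init symbol

// Order seen in the disassembly: read the slot first, then run the init.
// On the first call in a thread, the returned pointer is null.
Struct* load_then_init() {
  Struct* early = slot;
  if (tls_init_ptr) tls_init_ptr();
  return early;
}

// Expected order: run the init first, then read the slot.
Struct* init_then_load() {
  if (tls_init_ptr) tls_init_ptr();
  return slot;
}
```

In this model, dereferencing the result of the first `load_then_init()` call
corresponds to the faulting load at 0x2c, while `init_then_load()` always
yields a usable pointer.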

I've attached a script that will reproduce what I've shown here, as well as
demonstrate the issue in action with a full executable that will produce the
segfault I've described.
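
The attachment itself is not reproduced in this message. As a rough idea of
what the remaining pieces of a full executable need to provide, here is a
hypothetical sketch (not the contents of attachment 54687). It is collapsed
into a single file so that it compiles on its own; to exercise the cross-TU
access described above, everything except `test_func` and its declarations
would have to live in a separate translation unit. The names `make_instance`,
`virtual_calls` and `other_calls`, and the choice to stop after three calls,
are assumptions made for illustration.

```cpp
#include <cassert>

struct Struct {
  virtual void virtual_func();
};

// Illustrative counters (assumptions) so the behaviour can be observed.
static int virtual_calls = 0;
static int other_calls = 0;

void Struct::virtual_func() { ++virtual_calls; }

// A non-constant initializer, so thread_local_ref needs dynamic
// initialization and users in other TUs must run the TLS init function first.
Struct& make_instance() {
  static thread_local Struct instance;
  return instance;
}

thread_local Struct& thread_local_ref = make_instance();

// Returns true for the first two calls, so test_func recurses twice.
bool other_func(void) { return ++other_calls < 3; }

bool test_func(void) {
  thread_local_ref.virtual_func();
  return other_func() && test_func();
}
```

With a correctly ordered initialization, calling `test_func()` once runs
`virtual_func` three times and returns false.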

Thread overview: 12+ messages
2023-03-16 23:09 [Bug c++/109164] New: aarch64 thread_local initialization error with -ftree-pre and -foptimize-sibling-calls loganh at synopsys dot com
2023-03-16 23:25 ` [Bug tree-optimization/109164] thread_local initialization error with -ftree-pre pinskia at gcc dot gnu.org
2023-03-16 23:29 ` pinskia at gcc dot gnu.org
2023-03-16 23:31 ` pinskia at gcc dot gnu.org
2023-03-17 10:12 ` [Bug c++/109164] wrong code with thread_local reference, loops and -ftree-pre rguenth at gcc dot gnu.org
2023-03-17 12:01 ` jakub at gcc dot gnu.org
2023-03-20 19:32 ` cvs-commit at gcc dot gnu.org
2023-03-20 19:36 ` jakub at gcc dot gnu.org
2023-04-18  7:15 ` cvs-commit at gcc dot gnu.org
2023-05-02 20:16 ` cvs-commit at gcc dot gnu.org
2023-05-03 15:22 ` cvs-commit at gcc dot gnu.org
2023-05-04  7:26 ` jakub at gcc dot gnu.org
