public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
@ 2014-12-02 19:19 Uros Bizjak
  2014-12-02 19:39 ` H.J. Lu
  2014-12-02 19:40 ` H.J. Lu
  0 siblings, 2 replies; 63+ messages in thread
From: Uros Bizjak @ 2014-12-02 19:19 UTC
  To: gcc-patches; +Cc: Sriraman Tallam, H.J. Lu, Jakub Jelinek

Hello!

> Ping.
>> Ping.
>>> Ping.
>>>> Ping.

It would probably help reviewers if you pointed to the actual patch
submission [1], which unfortunately contains the explanation in the
patch itself [2], which further explains that this functionality is
currently only supported with gold, patched with [3].

[1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
[2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
[3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

After a bit of the above detective work, I think that a new gcc option
is not necessary. Configure should detect whether the new functionality
is supported in the linker, and auto-configure gcc to use it when
appropriate.

I have also added a couple of linker experts in the CC.

Uros.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 19:19 [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations Uros Bizjak
@ 2014-12-02 19:39 ` H.J. Lu
  2014-12-02 19:40 ` H.J. Lu
  1 sibling, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 19:39 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>> Ping.
>>> Ping.
>>>> Ping.
>>>>> Ping.
>
> It would probably help reviewers if you pointed to actual path
> submission [1], which unfortunately contains the explanation in the
> patch itself [2], which further explains that this functionality is
> currently only supported with gold, patched with [3].
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>
> After a bit of the above detective work, I think that new gcc option
> is not necessary. The configure should detect if new functionality is
> supported in the linker, and auto-configure gcc to use it when
> appropriate.
>
> I have also added a couple of linker experts in the CC.

I don't think i386_binds_local_p is correct.  What does it
return for a hidden external variable?  I think it should be

bool local = default_binds_local_p (exp);
if (!local)
   local = ...
return local;


H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 19:19 [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations Uros Bizjak
  2014-12-02 19:39 ` H.J. Lu
@ 2014-12-02 19:40 ` H.J. Lu
  2014-12-02 20:01   ` Uros Bizjak
  1 sibling, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 19:40 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>> Ping.
>>> Ping.
>>>> Ping.
>>>>> Ping.
>
> It would probably help reviewers if you pointed to actual path
> submission [1], which unfortunately contains the explanation in the
> patch itself [2], which further explains that this functionality is
> currently only supported with gold, patched with [3].
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>
> After a bit of the above detective work, I think that new gcc option
> is not necessary. The configure should detect if new functionality is
> supported in the linker, and auto-configure gcc to use it when
> appropriate.

I think a GCC option is needed, since one can use -fuse-ld= to
change the linker.


-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 19:40 ` H.J. Lu
@ 2014-12-02 20:01   ` Uros Bizjak
  2014-12-02 20:43     ` H.J. Lu
  2014-12-03 13:47     ` H.J. Lu
  0 siblings, 2 replies; 63+ messages in thread
From: Uros Bizjak @ 2014-12-02 20:01 UTC
  To: H.J. Lu; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> Hello!
>>
>>> Ping.
>>>> Ping.
>>>>> Ping.
>>>>>> Ping.
>>
>> It would probably help reviewers if you pointed to actual path
>> submission [1], which unfortunately contains the explanation in the
>> patch itself [2], which further explains that this functionality is
>> currently only supported with gold, patched with [3].
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>
>> After a bit of the above detective work, I think that new gcc option
>> is not necessary. The configure should detect if new functionality is
>> supported in the linker, and auto-configure gcc to use it when
>> appropriate.
>
> I think GCC option is needed since one can use -fuse-ld= to
> change linker.

IMO, nobody will use this highly special x86_64-only option. It would
be best for gnu-ld to reach feature parity with gold as far as this
functionality is concerned. In this case, the optimization would be
auto-configured, and would fire automatically, without any user
intervention.

Uros.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 20:43     ` H.J. Lu
@ 2014-12-02 20:19       ` Jakub Jelinek
  2014-12-02 22:14         ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Jelinek @ 2014-12-02 20:19 UTC
  To: H.J. Lu; +Cc: Uros Bizjak, gcc-patches, Sriraman Tallam

On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
> > IMO, nobody will use this highly special x86_64-only option. It would
> > be best for gnu-ld to reach feature parity with gold as far as this
> > functionality is concerned. In this case, the optimization would be
> > auto-configured, and would fire automatically, without any user
> > intervention.
> 
> I will implement it in ld after its support is checked into GCC.

I think it would be better to do it the other way around, so that gcc can be
configured against the right ld from the start.

	Jakub

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 20:01   ` Uros Bizjak
@ 2014-12-02 20:43     ` H.J. Lu
  2014-12-02 20:19       ` Jakub Jelinek
  2014-12-03 13:47     ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 20:43 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Tue, Dec 2, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> Hello!
>>>
>>>> Ping.
>>>>> Ping.
>>>>>> Ping.
>>>>>>> Ping.
>>>
>>> It would probably help reviewers if you pointed to actual path
>>> submission [1], which unfortunately contains the explanation in the
>>> patch itself [2], which further explains that this functionality is
>>> currently only supported with gold, patched with [3].
>>>
>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>
>>> After a bit of the above detective work, I think that new gcc option
>>> is not necessary. The configure should detect if new functionality is
>>> supported in the linker, and auto-configure gcc to use it when
>>> appropriate.
>>
>> I think GCC option is needed since one can use -fuse-ld= to
>> change linker.
>
> IMO, nobody will use this highly special x86_64-only option. It would
> be best for gnu-ld to reach feature parity with gold as far as this
> functionality is concerned. In this case, the optimization would be
> auto-configured, and would fire automatically, without any user
> intervention.

I will implement it in ld after its support is checked into GCC.

-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 20:19       ` Jakub Jelinek
@ 2014-12-02 22:14         ` H.J. Lu
  2014-12-02 23:21           ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 22:14 UTC
  To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches, Sriraman Tallam

On Tue, Dec 2, 2014 at 12:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
>> > IMO, nobody will use this highly special x86_64-only option. It would
>> > be best for gnu-ld to reach feature parity with gold as far as this
>> > functionality is concerned. In this case, the optimization would be
>> > auto-configured, and would fire automatically, without any user
>> > intervention.
>>
>> I will implement it in ld after its support is checked into GCC.
>
> I think it would be better to do it the other way around, so that gcc can be
> configured against the right ld from the start.
>

Consider it done.  I will check it into binutils this week.


-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 22:14         ` H.J. Lu
@ 2014-12-02 23:21           ` H.J. Lu
  0 siblings, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 23:21 UTC
  To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches, Sriraman Tallam

On Tue, Dec 2, 2014 at 2:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Dec 2, 2014 at 12:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
>>> > IMO, nobody will use this highly special x86_64-only option. It would
>>> > be best for gnu-ld to reach feature parity with gold as far as this
>>> > functionality is concerned. In this case, the optimization would be
>>> > auto-configured, and would fire automatically, without any user
>>> > intervention.
>>>
>>> I will implement it in ld after its support is checked into GCC.
>>
>> I think it would be better to do it the other way around, so that gcc can be
>> configured against the right ld from the start.
>>
>
> Consider it is done.  I will check it into binutils ths week.
>

It is on binutils master branch now:

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=9a926d55ab4b6667f6c35b518d59b902fe490d9d

-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-02 20:01   ` Uros Bizjak
  2014-12-02 20:43     ` H.J. Lu
@ 2014-12-03 13:47     ` H.J. Lu
  2014-12-03 15:01       ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-03 13:47 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Tue, Dec 2, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> Hello!
>>>
>>>> Ping.
>>>>> Ping.
>>>>>> Ping.
>>>>>>> Ping.
>>>
>>> It would probably help reviewers if you pointed to actual path
>>> submission [1], which unfortunately contains the explanation in the
>>> patch itself [2], which further explains that this functionality is
>>> currently only supported with gold, patched with [3].
>>>
>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>
>>> After a bit of the above detective work, I think that new gcc option
>>> is not necessary. The configure should detect if new functionality is
>>> supported in the linker, and auto-configure gcc to use it when
>>> appropriate.
>>
>> I think GCC option is needed since one can use -fuse-ld= to
>> change linker.
>
> IMO, nobody will use this highly special x86_64-only option. It would
> be best for gnu-ld to reach feature parity with gold as far as this
> functionality is concerned. In this case, the optimization would be
> auto-configured, and would fire automatically, without any user
> intervention.
>

Let's do it.  I implemented the same feature in the bfd linker on both
master and the 2.25 branch.

-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-03 13:47     ` H.J. Lu
@ 2014-12-03 15:01       ` H.J. Lu
  2014-12-03 21:35         ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-03 15:01 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Wed, Dec 3, 2014 at 5:47 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Dec 2, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>> Hello!
>>>>
>>>>> Ping.
>>>>>> Ping.
>>>>>>> Ping.
>>>>>>>> Ping.
>>>>
>>>> It would probably help reviewers if you pointed to actual path
>>>> submission [1], which unfortunately contains the explanation in the
>>>> patch itself [2], which further explains that this functionality is
>>>> currently only supported with gold, patched with [3].
>>>>
>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>
>>>> After a bit of the above detective work, I think that new gcc option
>>>> is not necessary. The configure should detect if new functionality is
>>>> supported in the linker, and auto-configure gcc to use it when
>>>> appropriate.
>>>
>>> I think GCC option is needed since one can use -fuse-ld= to
>>> change linker.
>>
>> IMO, nobody will use this highly special x86_64-only option. It would
>> be best for gnu-ld to reach feature parity with gold as far as this
>> functionality is concerned. In this case, the optimization would be
>> auto-configured, and would fire automatically, without any user
>> intervention.
>>
>
> Let's do it.  I implemented the same feature in bfd linker on both
> master and 2.25 branch.
>

+bool
+i386_binds_local_p (const_tree exp)
+{
+  /* Globals marked extern are treated as local when linker copy relocations
+     support is available with -f{pie|PIE}.  */
+  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
+      && TREE_CODE (exp) == VAR_DECL
+      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
+    return true;
+  return default_binds_local_p (exp);
+}
+

It returns true with -fPIE and false without -fPIE.  It is lying to the compiler.
Maybe legitimate_pic_address_disp_p is a better place.

-- 
H.J.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-03 15:01       ` H.J. Lu
@ 2014-12-03 21:35         ` H.J. Lu
  2014-12-04 12:44           ` Uros Bizjak
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2014-12-03 21:35 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2260 bytes --]

On Wed, Dec 3, 2014 at 7:01 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Dec 3, 2014 at 5:47 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Dec 2, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>>> Hello!
>>>>>
>>>>>> Ping.
>>>>>>> Ping.
>>>>>>>> Ping.
>>>>>>>>> Ping.
>>>>>
>>>>> It would probably help reviewers if you pointed to actual path
>>>>> submission [1], which unfortunately contains the explanation in the
>>>>> patch itself [2], which further explains that this functionality is
>>>>> currently only supported with gold, patched with [3].
>>>>>
>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>
>>>>> After a bit of the above detective work, I think that new gcc option
>>>>> is not necessary. The configure should detect if new functionality is
>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>> appropriate.
>>>>
>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>> change linker.
>>>
>>> IMO, nobody will use this highly special x86_64-only option. It would
>>> be best for gnu-ld to reach feature parity with gold as far as this
>>> functionality is concerned. In this case, the optimization would be
>>> auto-configured, and would fire automatically, without any user
>>> intervention.
>>>
>>
>> Let's do it.  I implemented the same feature in bfd linker on both
>> master and 2.25 branch.
>>
>
> +bool
> +i386_binds_local_p (const_tree exp)
> +{
> +  /* Globals marked extern are treated as local when linker copy relocations
> +     support is available with -f{pie|PIE}.  */
> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
> +      && TREE_CODE (exp) == VAR_DECL
> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
> +    return true;
> +  return default_binds_local_p (exp);
> +}
> +
>
> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
> Maybe legitimate_pic_address_disp_p is a better place.
>

Something like this?

-- 
H.J.

[-- Attachment #2: copyreloc.patch --]
[-- Type: text/x-patch, Size: 4904 bytes --]

diff --git a/gcc/config.in b/gcc/config.in
index 65d5e42..f34adb5 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1411,6 +1411,12 @@
 #endif
 
 
+/* Define 0/1 if your linker supports -pie option with copy reloc. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_LD_PIE_COPYRELOC
+#endif
+
+
 /* Define if your linker links a mix of read-only and read-write sections into
    a read-write section. */
 #ifndef USED_FOR_TARGET
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 211c9e6..eb43bc6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13113,7 +13113,10 @@ legitimate_pic_address_disp_p (rtx disp)
 		return true;
 	    }
 	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-		   && SYMBOL_REF_LOCAL_P (op0)
+		   && (SYMBOL_REF_LOCAL_P (op0)
+		       || (HAVE_LD_PIE_COPYRELOC
+			   && flag_pie
+			   && !SYMBOL_REF_FUNCTION_P (op0)))
 		   && ix86_cmodel != CM_LARGE_PIC)
 	    return true;
 	  break;
diff --git a/gcc/configure b/gcc/configure
index 6b46bbb..811f05d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -27025,6 +27025,53 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie" >&5
 $as_echo "$gcc_cv_ld_pie" >&6; }
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking linker PIE support with copy reloc" >&5
+$as_echo_n "checking linker PIE support with copy reloc... " >&6; }
+gcc_cv_ld_pie_copyreloc=no
+if test $gcc_cv_ld_pie = yes ; then
+  if test $in_tree_ld = yes ; then
+    if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 25 -o "$gcc_cv_gld_major_version" -gt 2; then
+      gcc_cv_ld_pie_copyreloc=yes
+    fi
+  elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+    # Check if linker supports -pie option with copy reloc
+    case "$target" in
+    i?86-*-linux* | x86_64-*-linux*)
+      cat > conftest1.s <<EOF
+	.globl	a_glob
+	.data
+	.type	a_glob, @object
+	.size	a_glob, 4
+a_glob:
+	.long	2
+EOF
+      cat > conftest2.s <<EOF
+	.text
+	.globl	main
+	.type	main, @function
+main:
+	movl	%eax, a_glob(%rip)
+	.size	main, .-main
+EOF
+      if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
+         && $gcc_cv_as --64 -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -pie -melf_x86_64 -o conftest conftest2.o conftest1.so > /dev/null 2>&1; then
+        gcc_cv_ld_pie_copyreloc=yes
+      fi
+      rm -f conftest conftest1.so conftest1.o conftest2.o conftest1.s conftest2.s
+      ;;
+    esac
+  fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_LD_PIE_COPYRELOC `if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`
+_ACEOF
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie_copyreloc" >&5
+$as_echo "$gcc_cv_ld_pie_copyreloc" >&6; }
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker EH-compatible garbage collection of sections" >&5
 $as_echo_n "checking linker EH-compatible garbage collection of sections... " >&6; }
 gcc_cv_ld_eh_gc_sections=no
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 48c8000..a33f3a5 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4693,6 +4693,49 @@ if test x"$gcc_cv_ld_pie" = xyes; then
 fi
 AC_MSG_RESULT($gcc_cv_ld_pie)
 
+AC_MSG_CHECKING(linker PIE support with copy reloc)
+gcc_cv_ld_pie_copyreloc=no
+if test $gcc_cv_ld_pie = yes ; then
+  if test $in_tree_ld = yes ; then
+    if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 25 -o "$gcc_cv_gld_major_version" -gt 2; then
+      gcc_cv_ld_pie_copyreloc=yes
+    fi
+  elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+    # Check if linker supports -pie option with copy reloc
+    case "$target" in
+    i?86-*-linux* | x86_64-*-linux*)
+      cat > conftest1.s <<EOF
+	.globl	a_glob
+	.data
+	.type	a_glob, @object
+	.size	a_glob, 4
+a_glob:
+	.long	2
+EOF
+      cat > conftest2.s <<EOF
+	.text
+	.globl	main
+	.type	main, @function
+main:
+	movl	%eax, a_glob(%rip)
+	.size	main, .-main
+EOF
+      if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
+         && $gcc_cv_as --64 -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -pie -melf_x86_64 -o conftest conftest2.o conftest1.so > /dev/null 2>&1; then
+        gcc_cv_ld_pie_copyreloc=yes
+      fi
+      rm -f conftest conftest1.so conftest1.o conftest2.o conftest1.s conftest2.s
+      ;;
+    esac
+  fi
+  AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
+    [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
+    [Define 0/1 if your linker supports -pie option with copy reloc.])
+fi
+AC_MSG_RESULT($gcc_cv_ld_pie_copyreloc)
+
 AC_MSG_CHECKING(linker EH-compatible garbage collection of sections)
 gcc_cv_ld_eh_gc_sections=no
 if test $in_tree_ld = yes ; then

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-03 21:35         ` H.J. Lu
@ 2014-12-04 12:44           ` Uros Bizjak
  2014-12-04 16:46             ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Uros Bizjak @ 2014-12-04 12:44 UTC
  To: H.J. Lu; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>>> It would probably help reviewers if you pointed to actual path
>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>> patch itself [2], which further explains that this functionality is
>>>>>> currently only supported with gold, patched with [3].
>>>>>>
>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>
>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>> appropriate.
>>>>>
>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>> change linker.
>>>>
>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>> functionality is concerned. In this case, the optimization would be
>>>> auto-configured, and would fire automatically, without any user
>>>> intervention.
>>>>
>>>
>>> Let's do it.  I implemented the same feature in bfd linker on both
>>> master and 2.25 branch.
>>>
>>
>> +bool
>> +i386_binds_local_p (const_tree exp)
>> +{
>> +  /* Globals marked extern are treated as local when linker copy relocations
>> +     support is available with -f{pie|PIE}.  */
>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>> +      && TREE_CODE (exp) == VAR_DECL
>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>> +    return true;
>> +  return default_binds_local_p (exp);
>> +}
>> +
>>
>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>> Maybe legitimate_pic_address_disp_p is a better place.

Agreed.

> Something like this?

Yes.

OK, if Jakub doesn't have any objections here. Please also add
Sriraman as author to ChangeLog entry.

Thanks,
Uros.

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-04 12:44           ` Uros Bizjak
@ 2014-12-04 16:46             ` H.J. Lu
  2014-12-04 19:32               ` Uros Bizjak
                                 ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: H.J. Lu @ 2014-12-04 16:46 UTC
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 4771 bytes --]

On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>>>>> It would probably help reviewers if you pointed to actual path
>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>
>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>
>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>> appropriate.
>>>>>>
>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>> change linker.
>>>>>
>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>> functionality is concerned. In this case, the optimization would be
>>>>> auto-configured, and would fire automatically, without any user
>>>>> intervention.
>>>>>
>>>>
>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>> master and 2.25 branch.
>>>>
>>>
>>> +bool
>>> +i386_binds_local_p (const_tree exp)
>>> +{
>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>> +     support is available with -f{pie|PIE}.  */
>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>> +      && TREE_CODE (exp) == VAR_DECL
>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>> +    return true;
>>> +  return default_binds_local_p (exp);
>>> +}
>>> +
>>>
>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>> Maybe legitimate_pic_address_disp_p is a better place.
>
> Agreed.
>
>> Something like this?
>
> Yes.
>
> OK, if Jakub doesn't have any objections here. Please also add
> Sriraman as author to ChangeLog entry.
>
> Thanks,
> Uros.

Here is the patch.   OK to install?

Thanks.

-- 
H.J.
---
Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
module using the GOT.  This is two instructions, one to get the address
of the global from the GOT and the other to get the value.  If it turns
out that the global gets defined in the executable at link-time, it still
needs to go through the GOT as it is too late then to generate a direct
access.

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern, defined elsewhere
}

With -O2 -fpie -pie, the generated code accesses the global via the GOT,
using two memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.
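
As a rough illustration of the two access patterns above (a sketch in
plain C, not compiler output): the GOT sequence behaves like a load
through a pointer kept in a table slot, while the PC-relative sequence
reads the variable directly.  The names got_slot, load_direct and
load_via_got are hypothetical and exist only for this model.

```c
/* Model of the two access patterns; this is not actual GCC output.
   got_slot is a hypothetical stand-in for the GOT entry of a_glob.  */
int a_glob = 2;

/* In a real PIE the dynamic linker fills the GOT entry; here it is
   just a pointer.  volatile keeps the compiler from folding it away.  */
static int *volatile got_slot = &a_glob;

/* One memory load: read the variable itself.  */
int
load_direct (void)
{
  return a_glob;
}

/* Two memory loads: fetch the address from the slot, then the value.  */
int
load_via_got (void)
{
  int *addr = got_slot;
  return *addr;
}
```

Both functions return the same value; the second one simply pays for
the extra address load on every access.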

Some experiments on Google benchmarks show that the extra memory loads
affect performance by 1% to 5%.

Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that
the global will be defined in the executable.  For globals that are truly
extern (come from shared objects), the linker will create copy relocations
and have them defined in the executable.  The result is that no global
access needs to go through the GOT, which improves performance.

This optimization only applies to undefined, non-weak global data.
Undefined, weak global data access still must go through the GOT.
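
A small sketch of the distinction, assuming GCC-style attribute syntax.
The names strong_glob, weak_glob, read_strong and read_weak_or are
hypothetical.  An undefined weak symbol may legitimately resolve to
address zero at run time, so there may be nothing for a copy relocation
to copy, which is presumably why it keeps going through the GOT.

```c
#include <stddef.h>

/* Non-weak extern data: the case this optimization targets.  */
extern int strong_glob;

/* Weak extern data: may stay undefined, in which case its address is
   null at run time.  */
extern int weak_glob __attribute__ ((weak));

/* Defined here only so that the sketch links on its own; in the
   scenario described above it would come from another object.  */
int strong_glob = 7;

int
read_strong (void)
{
  return strong_glob;
}

/* A weak reference has to be tested before use.  */
int
read_weak_or (int fallback)
{
  return &weak_glob != NULL ? weak_glob : fallback;
}
```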

This patch checks at configure time whether the linker supports PIE with
copy relocs, which is enabled in gold and the bfd linker in binutils 2.25,
and enables this optimization if the linker support is available.

gcc/

* configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
Linux/x86-64 linker supports PIE with copy reloc.
* config.in: Regenerated.
* configure: Likewise.

* config/i386/i386.c (legitimate_pic_address_disp_p): Allow
pc-relative address for undefined, non-weak, non-function
symbol reference in 64-bit PIE if linker supports PIE with
copy reloc.

* doc/sourcebuild.texi: Document pie_copyreloc target.

gcc/testsuite/

* gcc.target/i386/pie-copyrelocs-1.c: New test.
* gcc.target/i386/pie-copyrelocs-2.c: Likewise.
* gcc.target/i386/pie-copyrelocs-3.c: Likewise.
* gcc.target/i386/pie-copyrelocs-4.c: Likewise.

* lib/target-supports.exp (check_effective_target_pie_copyreloc):
New procedure.
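
The i386.c entry above can be read as the following predicate.  This is
a simplified model, not the code in the patch: sym_model and pcrel_ok_p
are hypothetical names, plain booleans stand in for the SYMBOL_REF_*
checks, the far-address and CM_LARGE_PIC tests are left out, and the
weak and 64-bit conditions are taken from the ChangeLog wording rather
than from the patch text.

```c
#include <stdbool.h>

/* Assumed default for this sketch; in GCC the value comes from
   configure as 0 or 1.  */
#ifndef HAVE_LD_PIE_COPYRELOC
#define HAVE_LD_PIE_COPYRELOC 1
#endif

/* Hypothetical stand-in for the properties read off a symbol.  */
struct sym_model
{
  bool binds_local;   /* resolved within this module */
  bool is_function;
  bool is_weak;
};

/* True if a PC-relative reference may be used instead of the GOT.  */
bool
pcrel_ok_p (const struct sym_model *sym, bool target_64bit, bool flag_pie)
{
  if (sym->binds_local)
    return true;

  /* Undefined data in 64-bit PIE: rely on the linker to create a copy
     relocation, but never for functions or weak symbols.  */
  return (HAVE_LD_PIE_COPYRELOC
          && target_64bit
          && flag_pie
          && !sym->is_function
          && !sym->is_weak);
}
```

Keeping the linker capability as a 0/1 macro lets the whole condition
fold to the old behavior at compile time when the linker lacks support.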

[-- Attachment #2: 0001-x86-64-Optimize-access-to-globals-in-PIE-with-copy-r.patch --]
[-- Type: text/x-patch, Size: 15082 bytes --]

From d5559a969c541e5375da9372f6925f40b87df5f3 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 4 Dec 2014 08:27:22 -0800
Subject: [PATCH] x86-64: Optimize access to globals in PIE with copy reloc

Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
module using the GOT.  This is two instructions, one to get the address
of the global from the GOT and the other to get the value.  If it turns
out that the global gets defined in the executable at link-time, it still
needs to go through the GOT as it is too late then to generate a direct
access.

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern, defined elsewhere
}

With -O2 -fpie -pie, the generated code accesses global via GOT using
two memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.

Some experiments on Google benchmarks show that the extra memory loads
affect performance by 1% to 5%.

Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that
the global will be defined in the executable.  For globals that are truly
extern (come from shared objects), the linker will create copy relocations
and have them defined in the executable.  As a result, no global access
needs to go through the GOT, which improves performance.

This optimization only applies to undefined, non-weak global data.
Undefined, weak global data access still must go through the GOT.

This patch checks at configure time whether the linker supports PIE with
copy reloc, which is enabled in gold and the bfd linker in binutils 2.25,
and enables this optimization if the linker support is available.

gcc/

	* configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
	Linux/x86-64 linker supports PIE with copy reloc.
	* config.in: Regenerated.
	* configure: Likewise.

	* config/i386/i386.c (legitimate_pic_address_disp_p): Allow
	pc-relative address for undefined, non-weak, non-function
	symbol reference in 64-bit PIE if linker supports PIE with
	copy reloc.

	* doc/sourcebuild.texi: Document pie_copyreloc target.

gcc/testsuite/

	* gcc.target/i386/pie-copyrelocs-1.c: New test.
	* gcc.target/i386/pie-copyrelocs-2.c: Likewise.
	* gcc.target/i386/pie-copyrelocs-3.c: Likewise.
	* gcc.target/i386/pie-copyrelocs-4.c: Likewise.

	* lib/target-supports.exp (check_effective_target_pie_copyreloc):
	New procedure.
---
 gcc/ChangeLog                                    | 15 +++++++
 gcc/config.in                                    |  6 +++
 gcc/config/i386/i386.c                           |  6 ++-
 gcc/configure                                    | 47 ++++++++++++++++++++++
 gcc/configure.ac                                 | 43 ++++++++++++++++++++
 gcc/doc/sourcebuild.texi                         |  3 ++
 gcc/testsuite/ChangeLog                          | 11 +++++
 gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c | 14 +++++++
 gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c | 14 +++++++
 gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c | 14 +++++++
 gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c | 17 ++++++++
 gcc/testsuite/lib/target-supports.exp            | 51 ++++++++++++++++++++++++
 12 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 928b6b8..7835ab0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2014-12-04  Sriraman Tallam  <tmsriram@google.com>
+	    H.J. Lu  <hongjiu.lu@intel.com>
+
+	* configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
+	Linux/x86-64 linker supports PIE with copy reloc.
+	* config.in: Regenerated.
+	* configure: Likewise.
+
+	* config/i386/i386.c (legitimate_pic_address_disp_p): Allow
+	pc-relative address for undefined, non-weak, non-function
+	symbol reference in 64-bit PIE if linker supports PIE with
+	copy reloc.
+
+	* doc/sourcebuild.texi: Document pie_copyreloc target.
+
 2014-12-03  Michael Meissner  <meissner@linux.vnet.ibm.com>
 
 	PR target/64019
diff --git a/gcc/config.in b/gcc/config.in
index 65d5e42..f34adb5 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1411,6 +1411,12 @@
 #endif
 
 
+/* Define 0/1 if your linker supports -pie option with copy reloc. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_LD_PIE_COPYRELOC
+#endif
+
+
 /* Define if your linker links a mix of read-only and read-write sections into
    a read-write section. */
 #ifndef USED_FOR_TARGET
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 211c9e6..4f1a18b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
 		return true;
 	    }
 	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-		   && SYMBOL_REF_LOCAL_P (op0)
+		   && (SYMBOL_REF_LOCAL_P (op0)
+		       || (HAVE_LD_PIE_COPYRELOC
+			   && flag_pie
+			   && !SYMBOL_REF_WEAK (op0)
+			   && !SYMBOL_REF_FUNCTION_P (op0)))
 		   && ix86_cmodel != CM_LARGE_PIC)
 	    return true;
 	  break;
diff --git a/gcc/configure b/gcc/configure
index 6b46bbb..811f05d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -27025,6 +27025,53 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie" >&5
 $as_echo "$gcc_cv_ld_pie" >&6; }
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking linker PIE support with copy reloc" >&5
+$as_echo_n "checking linker PIE support with copy reloc... " >&6; }
+gcc_cv_ld_pie_copyreloc=no
+if test $gcc_cv_ld_pie = yes ; then
+  if test $in_tree_ld = yes ; then
+    if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 25 -o "$gcc_cv_gld_major_version" -gt 2; then
+      gcc_cv_ld_pie_copyreloc=yes
+    fi
+  elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+    # Check if linker supports -pie option with copy reloc
+    case "$target" in
+    i?86-*-linux* | x86_64-*-linux*)
+      cat > conftest1.s <<EOF
+	.globl	a_glob
+	.data
+	.type	a_glob, @object
+	.size	a_glob, 4
+a_glob:
+	.long	2
+EOF
+      cat > conftest2.s <<EOF
+	.text
+	.globl	main
+	.type	main, @function
+main:
+	movl	%eax, a_glob(%rip)
+	.size	main, .-main
+EOF
+      if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
+         && $gcc_cv_as --64 -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -pie -melf_x86_64 -o conftest conftest2.o conftest1.so > /dev/null 2>&1; then
+        gcc_cv_ld_pie_copyreloc=yes
+      fi
+      rm -f conftest conftest1.so conftest1.o conftest2.o conftest1.s conftest2.s
+      ;;
+    esac
+  fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_LD_PIE_COPYRELOC `if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`
+_ACEOF
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie_copyreloc" >&5
+$as_echo "$gcc_cv_ld_pie_copyreloc" >&6; }
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker EH-compatible garbage collection of sections" >&5
 $as_echo_n "checking linker EH-compatible garbage collection of sections... " >&6; }
 gcc_cv_ld_eh_gc_sections=no
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 48c8000..a33f3a5 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4693,6 +4693,49 @@ if test x"$gcc_cv_ld_pie" = xyes; then
 fi
 AC_MSG_RESULT($gcc_cv_ld_pie)
 
+AC_MSG_CHECKING(linker PIE support with copy reloc)
+gcc_cv_ld_pie_copyreloc=no
+if test $gcc_cv_ld_pie = yes ; then
+  if test $in_tree_ld = yes ; then
+    if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 25 -o "$gcc_cv_gld_major_version" -gt 2; then
+      gcc_cv_ld_pie_copyreloc=yes
+    fi
+  elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+    # Check if linker supports -pie option with copy reloc
+    case "$target" in
+    i?86-*-linux* | x86_64-*-linux*)
+      cat > conftest1.s <<EOF
+	.globl	a_glob
+	.data
+	.type	a_glob, @object
+	.size	a_glob, 4
+a_glob:
+	.long	2
+EOF
+      cat > conftest2.s <<EOF
+	.text
+	.globl	main
+	.type	main, @function
+main:
+	movl	%eax, a_glob(%rip)
+	.size	main, .-main
+EOF
+      if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
+         && $gcc_cv_as --64 -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -pie -melf_x86_64 -o conftest conftest2.o conftest1.so > /dev/null 2>&1; then
+        gcc_cv_ld_pie_copyreloc=yes
+      fi
+      rm -f conftest conftest1.so conftest1.o conftest2.o conftest1.s conftest2.s
+      ;;
+    esac
+  fi
+  AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
+    [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
+    [Define 0/1 if your linker supports -pie option with copy reloc.])
+fi
+AC_MSG_RESULT($gcc_cv_ld_pie_copyreloc)
+
 AC_MSG_CHECKING(linker EH-compatible garbage collection of sections)
 gcc_cv_ld_eh_gc_sections=no
 if test $in_tree_ld = yes ; then
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 20a206d..98ba1a6 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1717,6 +1717,9 @@ or @code{EM_SPARCV9} executables.
 
 @item vect_cmdline_needed
 Target requires a command line argument to enable a SIMD instruction set.
+
+@item pie_copyreloc
+The x86-64 target linker supports PIE with copy reloc.
 @end table
 
 @subsubsection Environment attributes
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 0b4d31f..c31a0d9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,14 @@
+2014-12-04  Sriraman Tallam  <tmsriram@google.com>
+	    H.J. Lu  <hongjiu.lu@intel.com>
+
+	* gcc.target/i386/pie-copyrelocs-1.c: New test.
+	* gcc.target/i386/pie-copyrelocs-2.c: Likewise.
+	* gcc.target/i386/pie-copyrelocs-3.c: Likewise.
+	* gcc.target/i386/pie-copyrelocs-4.c: Likewise.
+
+	* lib/target-supports.exp (check_effective_target_pie_copyreloc):
+	New procedure.
+
 2014-12-03  Paolo Carlini  <paolo.carlini@oracle.com>
 
 	PR c++/63558
diff --git a/gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
new file mode 100644
index 0000000..67711e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
@@ -0,0 +1,14 @@
+/* Check that GOTPCREL isn't used to access glob_a.  */
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-require-effective-target pie_copyreloc } */
+/* { dg-options "-O2 -fpie" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should never be accessed with a GOTPCREL.  */ 
+/* { dg-final { scan-assembler-not "glob_a@GOTPCREL" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
new file mode 100644
index 0000000..923bd68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
@@ -0,0 +1,14 @@
+/* Check that GOTPCREL isn't used to access glob_a.  */
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-require-effective-target pie_copyreloc } */
+/* { dg-options "-O2 -fpie" } */
+
+int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should never be accessed with a GOTPCREL.  */ 
+/* { dg-final { scan-assembler-not "glob_a@GOTPCREL" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
new file mode 100644
index 0000000..3d695f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
@@ -0,0 +1,14 @@
+/* Check that PLT is used to access glob_a.  */
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-require-effective-target pie_copyreloc } */
+/* { dg-options "-O2 -fpie" } */
+
+extern int glob_a (void);
+
+int foo ()
+{
+  return glob_a ();
+}
+
+/* glob_a should be accessed with a PLT.  */ 
+/* { dg-final { scan-assembler "glob_a@PLT" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
new file mode 100644
index 0000000..8066e1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
@@ -0,0 +1,17 @@
+/* Check that GOTPCREL is used to access glob_a.  */
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-require-effective-target pie_copyreloc } */
+/* { dg-options "-O2 -fpie" } */
+
+extern int glob_a  __attribute__((weak));
+
+int foo ()
+{
+  if (&glob_a != 0)
+    return glob_a;
+  else
+    return 0;
+}
+
+/* weak glob_a should be accessed with a GOTPCREL.  */ 
+/* { dg-final { scan-assembler "glob_a@GOTPCREL" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ac04d95..8169865 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6090,3 +6090,54 @@ proc force_conventional_output_for { test } {
     }
 }
 
+# Return 1 if the x86-64 target supports PIE with copy reloc, 0
+# otherwise.  Cache the result.
+
+proc check_effective_target_pie_copyreloc { } {
+    global pie_copyreloc_available_saved
+    global tool
+    global GCC_UNDER_TEST
+
+    if { !([istarget x86_64-*-*] || [istarget i?86-*-*]) } {
+	return 0
+    }
+
+    # Need auto-host.h to check linker support.
+    if { ![file exists ../../auto-host.h ] } {
+	return 0
+    }
+
+    if [info exists pie_copyreloc_available_saved] {
+	verbose "check_effective_target_pie_copyreloc returning saved $pie_copyreloc_available_saved" 2
+    } else {
+	# Set up and compile to see if linker supports PIE with copy
+	# reloc.  Include the current process ID in the file names to
+	# prevent conflicts with invocations for multiple testsuites.
+
+	set src pie[pid].c
+	set obj pie[pid].o
+
+	set f [open $src "w"]
+	puts $f "#include \"../../auto-host.h\""
+	puts $f "#if HAVE_LD_PIE_COPYRELOC == 0"
+	puts $f "# error Linker does not support PIE with copy reloc."
+	puts $f "#endif"
+	close $f
+
+	verbose "check_effective_target_pie_copyreloc compiling testfile $src" 2
+	set lines [${tool}_target_compile $src $obj object ""]
+
+	file delete $src
+	file delete $obj
+
+	if [string match "" $lines] then {
+	    verbose "check_effective_target_pie_copyreloc testfile compilation passed" 2
+	    set pie_copyreloc_available_saved 1
+	} else {
+	    verbose "check_effective_target_pie_copyreloc testfile compilation failed" 2
+	    set pie_copyreloc_available_saved 0
+	}
+    }
+
+    return $pie_copyreloc_available_saved
+}
-- 
1.9.3


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-04 16:46             ` H.J. Lu
@ 2014-12-04 19:32               ` Uros Bizjak
  2015-02-03 19:25               ` Sriraman Tallam
  2015-02-27 23:39               ` H.J. Lu
  2 siblings, 0 replies; 63+ messages in thread
From: Uros Bizjak @ 2014-12-04 19:32 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Thu, Dec 4, 2014 at 5:46 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>>>>> It would probably help reviewers if you pointed to the actual patch
>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>
>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>
>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>> appropriate.
>>>>>>>
>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>> change linker.
>>>>>>
>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>> auto-configured, and would fire automatically, without any user
>>>>>> intervention.
>>>>>>
>>>>>
>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>> master and 2.25 branch.
>>>>>
>>>>
>>>> +bool
>>>> +i386_binds_local_p (const_tree exp)
>>>> +{
>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>> +     support is available with -f{pie|PIE}.  */
>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>> +    return true;
>>>> +  return default_binds_local_p (exp);
>>>> +}
>>>> +
>>>>
>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>
>> Agreed.
>>
>>> Something like this?
>>
>> Yes.
>>
>> OK, if Jakub doesn't have any objections here. Please also add
>> Sriraman as author to ChangeLog entry.
>>
>> Thanks,
>> Uros.
>
> Here is the patch.   OK to install?
>
> Thanks.
>
> --
> H.J.
> ---
> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern, defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable.  As a result, no global access
> needs to go through the GOT, which improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks at configure time whether the linker supports PIE with
> copy reloc, which is enabled in gold and the bfd linker in binutils 2.25,
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

OK.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-04 16:46             ` H.J. Lu
  2014-12-04 19:32               ` Uros Bizjak
@ 2015-02-03 19:25               ` Sriraman Tallam
  2015-02-03 19:26                 ` Sriraman Tallam
  2015-02-03 19:36                 ` Jakub Jelinek
  2015-02-27 23:39               ` H.J. Lu
  2 siblings, 2 replies; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-03 19:25 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>>>>>>> It would probably help reviewers if you pointed to the actual patch
>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>
>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>
>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>> appropriate.
>>>>>>>
>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>> change linker.
>>>>>>
>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>> auto-configured, and would fire automatically, without any user
>>>>>> intervention.
>>>>>>
>>>>>
>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>> master and 2.25 branch.
>>>>>
>>>>
>>>> +bool
>>>> +i386_binds_local_p (const_tree exp)
>>>> +{
>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>> +     support is available with -f{pie|PIE}.  */
>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>> +    return true;
>>>> +  return default_binds_local_p (exp);
>>>> +}
>>>> +
>>>>
>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>
>> Agreed.
>>
>>> Something like this?
>>
>> Yes.
>>
>> OK, if Jakub doesn't have any objections here. Please also add
>> Sriraman as author to ChangeLog entry.
>>
>> Thanks,
>> Uros.
>
> Here is the patch.   OK to install?
>
> Thanks.
>
> --
> H.J.
> ---
> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern, defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable.  As a result, no global access
> needs to go through the GOT, which improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.

Hi H.J.,

This was the original patch to i386.c to let global accesses take
advantage of copy relocations and avoid the GOT.


@@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
  return true;
     }
   else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-   && SYMBOL_REF_LOCAL_P (op0)
+   && (SYMBOL_REF_LOCAL_P (op0)
+       || (HAVE_LD_PIE_COPYRELOC
+   && flag_pie
+   && !SYMBOL_REF_WEAK (op0)
+   && !SYMBOL_REF_FUNCTION_P (op0)))
    && ix86_cmodel != CM_LARGE_PIC)

I do not understand here why weak global data access must go through
the GOT and not use copy relocations. Ultimately, there is only going
to be one copy of the global either defined in the executable or the
shared object right?

Can we remove the check for SYMBOL_REF_WEAK?

Thanks
Sri



>
> This patch checks at configure time whether the linker supports PIE with
> copy reloc, which is enabled in gold and the bfd linker in binutils 2.25,
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 19:25               ` Sriraman Tallam
@ 2015-02-03 19:26                 ` Sriraman Tallam
  2015-02-03 19:36                 ` Jakub Jelinek
  1 sibling, 0 replies; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-03 19:26 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek, David Li, Cary Coutant

+davidxl +ccoutant

On Tue, Feb 3, 2015 at 11:25 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>
>>>>>>>>> It would probably help reviewers if you pointed to the actual patch
>>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>>
>>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>>
>>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>>> appropriate.
>>>>>>>>
>>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>>> change linker.
>>>>>>>
>>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>>> auto-configured, and would fire automatically, without any user
>>>>>>> intervention.
>>>>>>>
>>>>>>
>>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>>> master and 2.25 branch.
>>>>>>
>>>>>
>>>>> +bool
>>>>> +i386_binds_local_p (const_tree exp)
>>>>> +{
>>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>>> +     support is available with -f{pie|PIE}.  */
>>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>>> +    return true;
>>>>> +  return default_binds_local_p (exp);
>>>>> +}
>>>>> +
>>>>>
>>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
>>>> Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch.   OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern, defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using
>> two memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable.  As a result, no global access
>> needs to go through the GOT, which improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>
> Hi H.J.,
>
> This was the original patch to i386.c to let global accesses take
> advantage of copy relocations and avoid the GOT.
>
>
> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>   return true;
>      }
>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -   && SYMBOL_REF_LOCAL_P (op0)
> +   && (SYMBOL_REF_LOCAL_P (op0)
> +       || (HAVE_LD_PIE_COPYRELOC
> +   && flag_pie
> +   && !SYMBOL_REF_WEAK (op0)
> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>     && ix86_cmodel != CM_LARGE_PIC)
>
> I do not understand here why weak global data access must go through
> the GOT and not use copy relocations. Ultimately, there is only going
> to be one copy of the global either defined in the executable or the
> shared object right?
>
> Can we remove the check for SYMBOL_REF_WEAK?
>
> Thanks
> Sri
>
>
>
>>
>> This patch checks if linker supports PIE with copy reloc, which is
>> enabled in gold and bfd linker in binutils 2.25, at configure time
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 19:25               ` Sriraman Tallam
  2015-02-03 19:26                 ` Sriraman Tallam
@ 2015-02-03 19:36                 ` Jakub Jelinek
  2015-02-03 21:20                   ` Sriraman Tallam
  1 sibling, 1 reply; 63+ messages in thread
From: Jakub Jelinek @ 2015-02-03 19:36 UTC (permalink / raw)
  To: Sriraman Tallam; +Cc: H.J. Lu, Uros Bizjak, gcc-patches

On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
> This was the original patch to i386.c to let global accesses take
> advantage of copy relocations and avoid the GOT.
> 
> 
> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>   return true;
>      }
>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -   && SYMBOL_REF_LOCAL_P (op0)
> +   && (SYMBOL_REF_LOCAL_P (op0)
> +       || (HAVE_LD_PIE_COPYRELOC
> +   && flag_pie
> +   && !SYMBOL_REF_WEAK (op0)
> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>     && ix86_cmodel != CM_LARGE_PIC)
> 
> I do not understand here why weak global data access must go through
> the GOT and not use copy relocations. Ultimately, there is only going
> to be one copy of the global either defined in the executable or the
> shared object right?
> 
> Can we remove the check for SYMBOL_REF_WEAK?

So, what will then happen if the weak undef symbol isn't defined anywhere?
In non-PIE binaries that is fine, the linker will store 0.
But in PIE binaries, the 0 would be biased by the PIE load bias and thus
wouldn't be NULL.
You can only optimize weak vars if there is some weak definition in the
current TU.
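
To make the failure mode concrete, here is a minimal sketch, assuming an
ELF target with GNU-style weak symbols.  The names maybe_missing and
read_or_default are made up for illustration.  Code like this relies on
the address of an undefined weak variable comparing equal to NULL, which
is what going through the GOT preserves in a PIE:

```c
#include <stddef.h>

/* Hypothetical symbol: declared weak and never defined in any object.  */
extern int maybe_missing __attribute__ ((weak));

/* Return the value of the weak variable, or FALLBACK if nothing
   defines it.  The address test is only reliable if an undefined
   weak symbol really resolves to a null address.  */
int
read_or_default (int fallback)
{
  if (&maybe_missing != NULL)
    return maybe_missing;
  return fallback;
}
```

If the access were PC-relative in a PIE, the address would presumably be
the load bias plus 0, so the NULL check above would no longer do what
the programmer intended.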

	Jakub

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 19:36                 ` Jakub Jelinek
@ 2015-02-03 21:20                   ` Sriraman Tallam
  2015-02-03 21:29                     ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-03 21:20 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: H.J. Lu, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>> This was the original patch to i386.c to let global accesses take
>> advantage of copy relocations and avoid the GOT.
>>
>>
>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>   return true;
>>      }
>>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>> -   && SYMBOL_REF_LOCAL_P (op0)
>> +   && (SYMBOL_REF_LOCAL_P (op0)
>> +       || (HAVE_LD_PIE_COPYRELOC
>> +   && flag_pie
>> +   && !SYMBOL_REF_WEAK (op0)
>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>     && ix86_cmodel != CM_LARGE_PIC)
>>
>> I do not understand here why weak global data access must go through
>> the GOT and not use copy relocations. Ultimately, there is only going
>> to be one copy of the global either defined in the executable or the
>> shared object right?
>>
>> Can we remove the check for SYMBOL_REF_WEAK?
>
> So, what will then happen if the weak undef symbol isn't defined anywhere?
> In non-PIE binaries that is fine, the linker will store 0.
> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
> wouldn't be NULL.

Thanks for clarifying.

> You can only optimize weak vars if there is some weak definition in the
> current TU.

Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with

!(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
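
To spell out which cases the proposed test lets through, here is a small
stand-alone model, not GCC code.  The struct sym_flags and the function
pcrel_ok are hypothetical names; the bools stand in for the SYMBOL_REF_*
queries on op0, and the SYMBOL_REF_FAR_ADDR_P and CM_LARGE_PIC checks
are left out for brevity:

```c
#include <stdbool.h>

/* Plain-bool stand-ins for the SYMBOL_REF_* queries on op0.  */
struct sym_flags
{
  bool local;     /* models SYMBOL_REF_LOCAL_P */
  bool weak;      /* models SYMBOL_REF_WEAK */
  bool external;  /* models SYMBOL_REF_EXTERNAL_P */
  bool function;  /* models SYMBOL_REF_FUNCTION_P */
};

/* Model of the proposed condition: true if a PC-relative access
   would be allowed instead of a load through the GOT.  */
bool
pcrel_ok (struct sym_flags s, bool have_copyreloc, bool pie)
{
  return s.local
         || (have_copyreloc
             && pie
             && !(s.weak && s.external)
             && !s.function);
}
```

In this model a weak variable that is not external passes, while a
reference that is both weak and external still goes through the GOT.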

Thanks
Sri

>
>         Jakub

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 21:20                   ` Sriraman Tallam
@ 2015-02-03 21:29                     ` H.J. Lu
  2015-02-03 21:36                       ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-03 21:29 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>> This was the original patch to i386.c to let global accesses take
>>> advantage of copy relocations and avoid the GOT.
>>>
>>>
>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>   return true;
>>>      }
>>>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>> +       || (HAVE_LD_PIE_COPYRELOC
>>> +   && flag_pie
>>> +   && !SYMBOL_REF_WEAK (op0)
>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>>     && ix86_cmodel != CM_LARGE_PIC)
>>>
>>> I do not understand here why weak global data access must go through
>>> the GOT and not use copy relocations. Ultimately, there is only going
>>> to be one copy of the global either defined in the executable or the
>>> shared object right?
>>>
>>> Can we remove the check for SYMBOL_REF_WEAK?
>>
>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>> In non-PIE binaries that is fine, the linker will store 0.
>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>> wouldn't be NULL.
>
> Thanks for clarifying.
>
>> You can only optimize weak vars if there is some weak definition in the
>> current TU.
>
> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>
> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>

The full condition is:

                  && (SYMBOL_REF_LOCAL_P (op0)
                       || (HAVE_LD_PIE_COPYRELOC
                           && flag_pie
                           && !SYMBOL_REF_WEAK (op0)
                           && !SYMBOL_REF_FUNCTION_P (op0)))

If the weak op0 is defined in the current TU, shouldn't
SYMBOL_REF_LOCAL_P (op0)  be true for PIE?

-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 21:29                     ` H.J. Lu
@ 2015-02-03 21:36                       ` Sriraman Tallam
  2015-02-03 22:03                         ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-03 21:36 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 1:29 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>>> This was the original patch to i386.c to let global accesses take
>>>> advantage of copy relocations and avoid the GOT.
>>>>
>>>>
>>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>   return true;
>>>>      }
>>>>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>>> +       || (HAVE_LD_PIE_COPYRELOC
>>>> +   && flag_pie
>>>> +   && !SYMBOL_REF_WEAK (op0)
>>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>     && ix86_cmodel != CM_LARGE_PIC)
>>>>
>>>> I do not understand here why weak global data access must go through
>>>> the GOT and not use copy relocations. Ultimately, there is only going
>>>> to be one copy of the global either defined in the executable or the
>>>> shared object right?
>>>>
>>>> Can we remove the check for SYMBOL_REF_WEAK?
>>>
>>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>>> In non-PIE binaries that is fine, the linker will store 0.
>>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>>> wouldn't be NULL.
>>
>> Thanks for clarifying.
>>
>>> You can only optimize weak vars if there is some weak definition in the
>>> current TU.
>>
>> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>>
>> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>>
>
> The full condition is:
>
>                   && (SYMBOL_REF_LOCAL_P (op0)
>                        || (HAVE_LD_PIE_COPYRELOC
>                            && flag_pie
>                            && !SYMBOL_REF_WEAK (op0)
>                            && !SYMBOL_REF_FUNCTION_P (op0)))
>
> If the weak op0 is defined in the current TU, shouldn't
> SYMBOL_REF_LOCAL_P (op0)  be true for PIE?

That's not what I see for this:

zap.cc
---------
#include <stdio.h>

__attribute__((weak))
int glob;

int main()
{
   printf("%d\n", glob);
}

(gdb) p debug_rtx(op0)
(symbol_ref/i:DI ("glob") <var_decl 0x7ffff74f51c8 glob>)

(gdb) p SYMBOL_REF_LOCAL_P(op0)
$4 = false

(gdb) p SYMBOL_REF_WEAK (op0)
$5 = 1

(gdb) p SYMBOL_REF_EXTERNAL_P (op0)
$6 = false

Thanks
Sri




>
> --
> H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 21:36                       ` Sriraman Tallam
@ 2015-02-03 22:03                         ` H.J. Lu
  2015-02-03 22:19                           ` Jakub Jelinek
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-03 22:03 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 1:35 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Tue, Feb 3, 2015 at 1:29 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>>>> This was the original patch to i386.c to let global accesses take
>>>>> advantage of copy relocations and avoid the GOT.
>>>>>
>>>>>
>>>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>>   return true;
>>>>>      }
>>>>>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>>>> +       || (HAVE_LD_PIE_COPYRELOC
>>>>> +   && flag_pie
>>>>> +   && !SYMBOL_REF_WEAK (op0)
>>>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>>     && ix86_cmodel != CM_LARGE_PIC)
>>>>>
>>>>> I do not understand here why weak global data access must go through
>>>>> the GOT and not use copy relocations. Ultimately, there is only going
>>>>> to be one copy of the global either defined in the executable or the
>>>>> shared object right?
>>>>>
>>>>> Can we remove the check for SYMBOL_REF_WEAK?
>>>>
>>>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>>>> In non-PIE binaries that is fine, the linker will store 0.
>>>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>>>> wouldn't be NULL.
>>>
>>> Thanks for clarifying.
>>>
>>>> You can only optimize weak vars if there is some weak definition in the
>>>> current TU.
>>>
>>> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>>>
>>> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>>>
>>
>> The full condition is:
>>
>>                   && (SYMBOL_REF_LOCAL_P (op0)
>>                        || (HAVE_LD_PIE_COPYRELOC
>>                            && flag_pie
>>                            && !SYMBOL_REF_WEAK (op0)
>>                            && !SYMBOL_REF_FUNCTION_P (op0)))
>>
>> If the weak op0 is defined in the current TU, shouldn't
>> SYMBOL_REF_LOCAL_P (op0)  be true for PIE?
>
> That's not what I see for this:
>
> zap.cc
> ---------
> __attribute__((weak))
> int glob;
>
> int main()
> {
>    printf("%d\n", glob);
> }
>
> (gdb) p debug_rtx(op0)
> (symbol_ref/i:DI ("glob") <var_decl 0x7ffff74f51c8 glob>)
>
> (gdb) p SYMBOL_REF_LOCAL_P(op0)
> $4 = false
>
> (gdb) p SYMBOL_REF_WEAK (op0)
> $5 = 1
>
> (gdb) p SYMBOL_REF_EXTERNAL_P (op0)
> $6 = false
>
> Thanks

So op0 is neither SYMBOL_REF_EXTERNAL_P nor
SYMBOL_REF_LOCAL_P.  What do we reference?



-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 22:03                         ` H.J. Lu
@ 2015-02-03 22:19                           ` Jakub Jelinek
  2015-02-04  1:16                             ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Jelinek @ 2015-02-03 22:19 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Sriraman Tallam, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
> So we aren't SYMBOL_REF_EXTERNAL_P nor
> SYMBOL_REF_LOCAL_P.  What do we reference?

That is reasonable.  There is no guarantee the extern weak symbol is local,
it could very well be non-local.  All that you know about the symbol is
that its address is non-NULL in that case.

	Jakub

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-03 22:19                           ` Jakub Jelinek
@ 2015-02-04  1:16                             ` H.J. Lu
  2015-02-04 18:27                               ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04  1:16 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Sriraman Tallam, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 2:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
>> So we aren't SYMBOL_REF_EXTERNAL_P nor
>> SYMBOL_REF_LOCAL_P.  What do we reference?
>
> That is reasonable.  There is no guarantee the extern weak symbol is local,
> it could very well be non-local.  All that you know about the symbols is
> that its address is non-NULL in that case.
>

This may be true for a shared library.  But it isn't true for PIE:

[hjl@gnu-6 copyreloc-3]$ cat x.c
__attribute__((weak))
int a;

extern void bar (void);

int main()
{
  if (a != 0)
    __builtin_abort();
  bar ();
  if (a != 30)
    __builtin_abort();
  return 0;
}
[hjl@gnu-6 copyreloc-3]$ cat bar.c
int a = -1;

void
bar ()
{
  a = 30;
}
[hjl@gnu-6 copyreloc-3]$ make
gcc -pie -O3 -g -fuse-ld=gold -fpie  -c x.i
gcc -pie -O3 -g -fuse-ld=gold -fpic    -c -o bar.o bar.c
gcc -pie  -shared -o libbar.so bar.o
gcc -pie -O3 -g -fuse-ld=gold -o x x.o libbar.so -Wl,-R,.
./x
[hjl@gnu-6 copyreloc-3]$

Even if a common symbol, a, is weak, all references to
a within PIE are local.

-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04  1:16                             ` H.J. Lu
@ 2015-02-04 18:27                               ` Sriraman Tallam
  2015-02-04 18:31                                 ` Jakub Jelinek
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-04 18:27 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Tue, Feb 3, 2015 at 5:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Feb 3, 2015 at 2:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
>>> So we aren't SYMBOL_REF_EXTERNAL_P nor
>>> SYMBOL_REF_LOCAL_P.  What do we reference?
>>
>> That is reasonable.  There is no guarantee the extern weak symbol is local,
>> it could very well be non-local.  All that you know about the symbols is
>> that its address is non-NULL in that case.
>>
>
> This may be true for shared library.  But it isn't true for PIE:


Also, gcc and g++ are inconsistent about something even simpler:

$ cat x.c

#include <stdio.h>

int a;

int main() {
  printf("%d\n", a);
}

With gcc -fPIE x.c
SYMBOL_REF_LOCAL_P(op0) = false

With g++ -fPIE x.c
SYMBOL_REF_LOCAL_P(op0) = true



Sri


>
> [hjl@gnu-6 copyreloc-3]$ cat x.c
> __attribute__((weak))
> int a;
>
> extern void bar (void);
>
> int main()
> {
>   if (a != 0)
>     __builtin_abort();
>   bar ();
>   if (a != 30)
>     __builtin_abort();
>   return 0;
> }
> [hjl@gnu-6 copyreloc-3]$ cat bar.c
> int a = -1;
>
> void
> bar ()
> {
>   a = 30;
> }
> [hjl@gnu-6 copyreloc-3]$ make
> gcc -pie -O3 -g -fuse-ld=gold -fpie  -c x.i
> gcc -pie -O3 -g -fuse-ld=gold -fpic    -c -o bar.o bar.c
> gcc -pie  -shared -o libbar.so bar.o
> gcc -pie -O3 -g -fuse-ld=gold -o x x.o libbar.so -Wl,-R,.
> ./x
> [hjl@gnu-6 copyreloc-3]$
>
> Even if a common symbol, a, is weak, all references to
> a within PIE is local.
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:27                               ` Sriraman Tallam
@ 2015-02-04 18:31                                 ` Jakub Jelinek
  2015-02-04 18:38                                   ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Jelinek @ 2015-02-04 18:31 UTC (permalink / raw)
  To: Sriraman Tallam; +Cc: H.J. Lu, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 04, 2015 at 10:27:34AM -0800, Sriraman Tallam wrote:
> On Tue, Feb 3, 2015 at 5:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Tue, Feb 3, 2015 at 2:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> >> On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
> >>> So we aren't SYMBOL_REF_EXTERNAL_P nor
> >>> SYMBOL_REF_LOCAL_P.  What do we reference?
> >>
> >> That is reasonable.  There is no guarantee the extern weak symbol is local,
> >> it could very well be non-local.  All that you know about the symbols is
> >> that its address is non-NULL in that case.
> >>
> >
> > This may be true for shared library.  But it isn't true for PIE:
> 
> 
> Also, gcc and g++ are inconsistent about something even more simple:
> 
> $ cat x.c
> 
> int a;
> 
> int main() {
>   printf("%d\n", a);
> }
> 
> With gcc -fPIE x.c
> SYMBOL_REF_LOCAL_P(op0) = false
> 
> With g++ -fPIE x.c
> SYMBOL_REF_LOCAL_P(op0) = true

Try -fno-common for C and you'll get the same result as in C++.
Common symbols can't be considered SYMBOL_REF_LOCAL_P, they might resolve
to a non-common symbol from a different TU.
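
For readers less familiar with common symbols: in C, a file-scope
declaration with no initializer and no storage-class specifier is a
tentative definition, and GCC emits it as a common symbol when -fcommon
is in effect.  C++ has no tentative definitions, which is why g++ treats
the same line as an ordinary definition.  A minimal sketch (the variable
and function names are made up for illustration):

```c
/* Tentative definition: no initializer, no storage-class specifier.
   With -fcommon this becomes a common symbol that the linker may merge
   with a definition from another TU; with -fno-common it is an
   ordinary zero-initialized definition placed in .bss.  */
int tentative_counter;

/* An initializer makes this a real definition regardless of
   -fcommon, so it is never a common symbol.  */
int defined_counter = 0;

/* Both objects have static storage duration and start out as zero.  */
int
sum_counters (void)
{
  return tentative_counter + defined_counter;
}
```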

	Jakub

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:31                                 ` Jakub Jelinek
@ 2015-02-04 18:38                                   ` H.J. Lu
  2015-02-04 18:42                                     ` Jakub Jelinek
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 18:38 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Sriraman Tallam, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 10:31 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Feb 04, 2015 at 10:27:34AM -0800, Sriraman Tallam wrote:
>> On Tue, Feb 3, 2015 at 5:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Tue, Feb 3, 2015 at 2:19 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> >> On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
>> >>> So we aren't SYMBOL_REF_EXTERNAL_P nor
>> >>> SYMBOL_REF_LOCAL_P.  What do we reference?
>> >>
>> >> That is reasonable.  There is no guarantee the extern weak symbol is local,
>> >> it could very well be non-local.  All that you know about the symbols is
>> >> that its address is non-NULL in that case.
>> >>
>> >
>> > This may be true for shared library.  But it isn't true for PIE:
>>
>>
>> Also, gcc and g++ are inconsistent about something even more simple:
>>
>> $ cat x.c
>>
>> int a;
>>
>> int main() {
>>   printf("%d\n", a);
>> }
>>
>> With gcc -fPIE x.c
>> SYMBOL_REF_LOCAL_P(op0) = false
>>
>> With g++ -fPIE x.c
>> SYMBOL_REF_LOCAL_P(op0) = true
>
> Try -fno-common for C and you'll get the same result as in C++.
> Common symbols can't be considered SYMBOL_REF_LOCAL_P, they might resolve
> to a non-common symbol from different TU.

Common symbol should be resolved locally for PIE.


-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:38                                   ` H.J. Lu
@ 2015-02-04 18:42                                     ` Jakub Jelinek
  2015-02-04 18:45                                       ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Jelinek @ 2015-02-04 18:42 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Sriraman Tallam, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
> Common symbol should be resolved locally for PIE.

binds_local_p yes, binds_to_current_def_p no.

	Jakub

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:42                                     ` Jakub Jelinek
@ 2015-02-04 18:45                                       ` H.J. Lu
  2015-02-04 18:51                                         ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 18:45 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Sriraman Tallam, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>> Common symbol should be resolved locally for PIE.
>
> binds_local_p yes, binds_to_current_def_p no.
>

Is SYMBOL_REF_LOCAL_P set to binds_local_p or
binds_to_current_def_p?


-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:45                                       ` H.J. Lu
@ 2015-02-04 18:51                                         ` Sriraman Tallam
  2015-02-04 18:57                                           ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-04 18:51 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>> Common symbol should be resolved locally for PIE.
>>
>> binds_local_p yes, binds_to_current_def_p no.
>>
>
> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
> binds_to_current_def_p?

Looks like binds_local_p:

varasm.c:
void
default_encode_section_info (tree decl, rtx rtl, int first ATTRIBUTE_UNUSED)
{
  ...
  if (targetm.binds_local_p (decl))
    flags |= SYMBOL_FLAG_LOCAL;

>
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:51                                         ` Sriraman Tallam
@ 2015-02-04 18:57                                           ` H.J. Lu
  2015-02-04 21:53                                             ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 18:57 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>> Common symbol should be resolved locally for PIE.
>>>
>>> binds_local_p yes, binds_to_current_def_p no.
>>>
>>
>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>> binds_to_current_def_p?
>
> Looks like binds_local_p:
>
> varasm.c:
> void
> default_encode_section_info (tree decl, rtx rtl, int first ATTRIBUTE_UNUSED)
> {
>   ...
>   if (targetm.binds_local_p (decl))
>     flags |= SYMBOL_FLAG_LOCAL;
>

Why is SYMBOL_REF_LOCAL_P false?


-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 18:57                                           ` H.J. Lu
@ 2015-02-04 21:53                                             ` Sriraman Tallam
  2015-02-04 22:37                                               ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-04 21:53 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>> Common symbol should be resolved locally for PIE.
>>>>
>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>
>>>
>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>> binds_to_current_def_p?
>>
>> Looks like binds_local_p:
>>
>> varasm.c:
>> void
>> default_encode_section_info (tree decl, rtx rtl, int first ATTRIBUTE_UNUSED)
>> {
>>   ...
>>   if (targetm.binds_local_p (decl))
>>     flags |= SYMBOL_FLAG_LOCAL;
>>
>
> Why is SYMBOL_REF_LOCAL_P false?

In varasm.c, default_binds_local_p_1


 /* Default visibility weak data can be overridden by a strong symbol
     in another module and so are not local.  */
  else if (DECL_WEAK (exp)
  && !resolved_locally)
    local_p = false;

For a weak definition, local_p is set to false here.

Sri


>
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 21:53                                             ` Sriraman Tallam
@ 2015-02-04 22:37                                               ` H.J. Lu
  2015-02-04 22:47                                                 ` Bernhard Reutner-Fischer
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 22:37 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Wed, Feb 4, 2015 at 1:53 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>>> Common symbol should be resolved locally for PIE.
>>>>>
>>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>>
>>>>
>>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>>> binds_to_current_def_p?
>>>
>>> Looks like binds_local_p:
>>>
>>> varasm.c:
>>> void
>>> default_encode_section_info (tree decl, rtx rtl, int first ATTRIBUTE_UNUSED)
>>> {
>>>   ...
>>>   if (targetm.binds_local_p (decl))
>>>     flags |= SYMBOL_FLAG_LOCAL;
>>>
>>
>> Why is SYMBOL_REF_LOCAL_P false?
>
> In varasm.c, default_binds_local_p_1
>
>
>  /* Default visibility weak data can be overridden by a strong symbol
>      in another module and so are not local.  */
>   else if (DECL_WEAK (exp)
>   && !resolved_locally)
           ^^^^^^^^^^^^^^^^^^^
Why is resolved_locally false? It should be true for a common
symbol when compiling for PIE.

>     local_p = false;
>
> For weak definition, it is set to false here.
>

-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 22:37                                               ` H.J. Lu
@ 2015-02-04 22:47                                                 ` Bernhard Reutner-Fischer
  2015-02-04 23:10                                                   ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Bernhard Reutner-Fischer @ 2015-02-04 22:47 UTC (permalink / raw)
  To: H.J. Lu, Sriraman Tallam
  Cc: Jakub Jelinek, Uros Bizjak, gcc-patches, David Li, Cary Coutant

On February 4, 2015 11:37:01 PM GMT+01:00, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>On Wed, Feb 4, 2015 at 1:53 PM, Sriraman Tallam <tmsriram@google.com>
>wrote:
>> On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam
><tmsriram@google.com> wrote:
>>>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com>
>wrote:
>>>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com>
>wrote:
>>>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>>>> Common symbol should be resolved locally for PIE.
>>>>>>
>>>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>>>
>>>>>
>>>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>>>> binds_to_current_def_p?
>>>>
>>>> Looks like binds_local_p:
>>>>
>>>> varasm.c:
>>>> void
>>>> default_encode_section_info (tree decl, rtx rtl, int first
>ATTRIBUTE_UNUSED)
>>>> {
>>>>   ...
>>>>   if (targetm.binds_local_p (decl))
>>>>     flags |= SYMBOL_FLAG_LOCAL;
>>>>
>>>
>>> Why is SYMBOL_REF_LOCAL_P false?
>>
>> In varasm.c, default_binds_local_p_1
>>
>>
>>  /* Default visibility weak data can be overridden by a strong symbol
>>      in another module and so are not local.  */
>>   else if (DECL_WEAK (exp)
>>   && !resolved_locally)
>           ^^^^^^^^^^^^^^^^^^^
>Why is resolved_locally false? It should be true for common
>symbol when compiling for PIE.
>
>>     local_p = false;
>>
>> For weak definition, it is set to false here.

Yes, and I think this is still wrong; it is known as
http://gcc.gnu.org/PR32219

Thanks


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 22:47                                                 ` Bernhard Reutner-Fischer
@ 2015-02-04 23:10                                                   ` H.J. Lu
  2015-02-04 23:29                                                     ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 23:10 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer
  Cc: Sriraman Tallam, Jakub Jelinek, Uros Bizjak, gcc-patches,
	David Li, Cary Coutant

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

On Wed, Feb 4, 2015 at 2:47 PM, Bernhard Reutner-Fischer
<rep.dot.nop@gmail.com> wrote:
> On February 4, 2015 11:37:01 PM GMT+01:00, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>On Wed, Feb 4, 2015 at 1:53 PM, Sriraman Tallam <tmsriram@google.com>
>>wrote:
>>> On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam
>><tmsriram@google.com> wrote:
>>>>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com>
>>wrote:
>>>>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com>
>>wrote:
>>>>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>>>>> Common symbol should be resolved locally for PIE.
>>>>>>>
>>>>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>>>>
>>>>>>
>>>>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>>>>> binds_to_current_def_p?
>>>>>
>>>>> Looks like binds_local_p:
>>>>>
>>>>> varasm.c:
>>>>> void
>>>>> default_encode_section_info (tree decl, rtx rtl, int first
>>ATTRIBUTE_UNUSED)
>>>>> {
>>>>>   ...
>>>>>   if (targetm.binds_local_p (decl))
>>>>>     flags |= SYMBOL_FLAG_LOCAL;
>>>>>
>>>>
>>>> Why is SYMBOL_REF_LOCAL_P false?
>>>
>>> In varasm.c, default_binds_local_p_1
>>>
>>>
>>>  /* Default visibility weak data can be overridden by a strong symbol
>>>      in another module and so are not local.  */
>>>   else if (DECL_WEAK (exp)
>>>   && !resolved_locally)
>>           ^^^^^^^^^^^^^^^^^^^
>>Why is resolved_locally false? It should be true for common
>>symbol when compiling for PIE.
>>
>>>     local_p = false;
>>>
>>> For weak definition, it is set to false here.
>
> Yea and i think this is still wrong and known as
> http://gcc.gnu.org/PR32219
>

Try this.


-- 
H.J.

[-- Attachment #2: pr32219.patch --]
[-- Type: text/x-patch, Size: 887 bytes --]

diff --git a/gcc/varasm.c b/gcc/varasm.c
index eb65b1f..c95eebd 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -6826,7 +6826,15 @@ default_binds_local_p_1 (const_tree exp, int shlib)
       && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
     {
       varpool_node *vnode = varpool_node::get (exp);
-      if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition))
+      /* If not building shared library, common or initialized symbols
+	 are also resolved locally, regardless they are weak or not.  */
+      if ((!shlib
+	   && (DECL_COMMON (exp)
+	       || (DECL_INITIAL (exp) != NULL
+		   && (in_lto_p
+		       || DECL_INITIAL (exp) != error_mark_node))))
+	  || (vnode && (resolution_local_p (vnode->resolution)
+			|| vnode->in_other_partition)))
 	resolved_locally = true;
       if (vnode
 	  && resolution_to_local_definition_p (vnode->resolution))

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 23:10                                                   ` H.J. Lu
@ 2015-02-04 23:29                                                     ` H.J. Lu
  2015-02-05 16:57                                                       ` Bernhard Reutner-Fischer
  2015-02-05 18:54                                                       ` Richard Henderson
  0 siblings, 2 replies; 63+ messages in thread
From: H.J. Lu @ 2015-02-04 23:29 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer
  Cc: Sriraman Tallam, Jakub Jelinek, Uros Bizjak, gcc-patches,
	David Li, Cary Coutant

[-- Attachment #1: Type: text/plain, Size: 1859 bytes --]

On Wed, Feb 4, 2015 at 3:10 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Feb 4, 2015 at 2:47 PM, Bernhard Reutner-Fischer
> <rep.dot.nop@gmail.com> wrote:
>> On February 4, 2015 11:37:01 PM GMT+01:00, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>>>On Wed, Feb 4, 2015 at 1:53 PM, Sriraman Tallam <tmsriram@google.com>
>>>wrote:
>>>> On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam
>>><tmsriram@google.com> wrote:
>>>>>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com>
>>>wrote:
>>>>>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek <jakub@redhat.com>
>>>wrote:
>>>>>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>>>>>> Common symbol should be resolved locally for PIE.
>>>>>>>>
>>>>>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>>>>>
>>>>>>>
>>>>>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>>>>>> binds_to_current_def_p?
>>>>>>
>>>>>> Looks like binds_local_p:
>>>>>>
>>>>>> varasm.c:
>>>>>> void
>>>>>> default_encode_section_info (tree decl, rtx rtl, int first
>>>ATTRIBUTE_UNUSED)
>>>>>> {
>>>>>>   ...
>>>>>>   if (targetm.binds_local_p (decl))
>>>>>>     flags |= SYMBOL_FLAG_LOCAL;
>>>>>>
>>>>>
>>>>> Why is SYMBOL_REF_LOCAL_P false?
>>>>
>>>> In varasm.c, default_binds_local_p_1
>>>>
>>>>
>>>>  /* Default visibility weak data can be overridden by a strong symbol
>>>>      in another module and so are not local.  */
>>>>   else if (DECL_WEAK (exp)
>>>>   && !resolved_locally)
>>>           ^^^^^^^^^^^^^^^^^^^
>>>Why is resolved_locally false? It should be true for common
>>>symbol when compiling for PIE.
>>>
>>>>     local_p = false;
>>>>
>>>> For weak definition, it is set to false here.
>>
>> Yea and i think this is still wrong and known as
>> http://gcc.gnu.org/PR32219
>>
>

I am testing this patch.



-- 
H.J.

[-- Attachment #2: pr32219.patch --]
[-- Type: text/x-patch, Size: 1605 bytes --]

diff --git a/gcc/varasm.c b/gcc/varasm.c
index eb65b1f..36fd393 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -6826,11 +6826,17 @@ default_binds_local_p_1 (const_tree exp, int shlib)
       && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
     {
       varpool_node *vnode = varpool_node::get (exp);
-      if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition))
-	resolved_locally = true;
-      if (vnode
-	  && resolution_to_local_definition_p (vnode->resolution))
-	resolved_to_local_def = true;
+      /* If not building shared library, common or initialized symbols
+	 are also resolved locally, regardless they are weak or not.  */
+      if (vnode)
+	{
+	  if ((!shlib && vnode->definition)
+	      || vnode->in_other_partition
+	      || resolution_local_p (vnode->resolution))
+	    resolved_locally = true;
+	  if (resolution_to_local_definition_p (vnode->resolution))
+	    resolved_to_local_def = true;
+	}
     }
   else if (TREE_CODE (exp) == FUNCTION_DECL && TREE_PUBLIC (exp))
     {
@@ -6880,13 +6886,6 @@ default_binds_local_p_1 (const_tree exp, int shlib)
      symbols resolved from other modules.  */
   else if (shlib)
     local_p = false;
-  /* Uninitialized COMMON variable may be unified with symbols
-     resolved from other modules.  */
-  else if (DECL_COMMON (exp)
-	   && !resolved_locally
-	   && (DECL_INITIAL (exp) == NULL
-	       || (!in_lto_p && DECL_INITIAL (exp) == error_mark_node)))
-    local_p = false;
   /* Otherwise we're left with initialized (or non-common) global data
      which is of necessity defined locally.  */
   else
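
For readers without a GCC tree at hand, here is a minimal standalone
sketch of the condition that the first hunk above changes.  It is only a
model under stated assumptions: struct var_info and both function names
are hypothetical stand-ins, not GCC code, and the four fields merely
mirror vnode, vnode->definition, vnode->in_other_partition and
resolution_local_p (vnode->resolution) from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the varpool_node facts the hunk consults.  */
struct var_info
{
  bool has_node;           /* models vnode != NULL */
  bool definition;         /* models vnode->definition */
  bool in_other_partition; /* models vnode->in_other_partition */
  bool resolution_local;   /* models resolution_local_p (vnode->resolution) */
};

/* Condition before the patch: only the recorded resolution or another
   LTO partition marks the symbol as resolved locally.  */
static bool
resolved_locally_before (const struct var_info *v)
{
  return v->has_node && (v->resolution_local || v->in_other_partition);
}

/* Condition after the patch: when not building a shared library, a
   definition in this unit also counts, whether it is weak or not.  */
static bool
resolved_locally_after (const struct var_info *v, bool shlib)
{
  if (!v->has_node)
    return false;
  return (!shlib && v->definition)
         || v->in_other_partition
         || v->resolution_local;
}

/* The weak-definition case from this thread: defined in this unit, no
   local resolution recorded, not in another partition.  */
static void
check_weak_definition_case (void)
{
  struct var_info weak_def = { true, true, false, false };
  assert (!resolved_locally_before (&weak_def));
  assert (resolved_locally_after (&weak_def, false)); /* executable or PIE */
  assert (!resolved_locally_after (&weak_def, true)); /* shared library */
}
```

In this model the only input whose answer changes is a symbol defined in
the current unit when shlib is false, which is the weak and common case
being discussed.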

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 23:29                                                     ` H.J. Lu
@ 2015-02-05 16:57                                                       ` Bernhard Reutner-Fischer
  2015-02-05 18:54                                                       ` Richard Henderson
  1 sibling, 0 replies; 63+ messages in thread
From: Bernhard Reutner-Fischer @ 2015-02-05 16:57 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Sriraman Tallam, Jakub Jelinek, Uros Bizjak, gcc-patches,
	David Li, Cary Coutant

On February 5, 2015 12:29:40 AM GMT+01:00, "H.J. Lu" <hjl.tools@gmail.com> wrote:
>On Wed, Feb 4, 2015 at 3:10 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Feb 4, 2015 at 2:47 PM, Bernhard Reutner-Fischer
>> <rep.dot.nop@gmail.com> wrote:
>>> On February 4, 2015 11:37:01 PM GMT+01:00, "H.J. Lu"
><hjl.tools@gmail.com> wrote:
>>>>On Wed, Feb 4, 2015 at 1:53 PM, Sriraman Tallam
><tmsriram@google.com>
>>>>wrote:
>>>>> On Wed, Feb 4, 2015 at 10:57 AM, H.J. Lu <hjl.tools@gmail.com>
>wrote:
>>>>>> On Wed, Feb 4, 2015 at 10:51 AM, Sriraman Tallam
>>>><tmsriram@google.com> wrote:
>>>>>>> On Wed, Feb 4, 2015 at 10:45 AM, H.J. Lu <hjl.tools@gmail.com>
>>>>wrote:
>>>>>>>> On Wed, Feb 4, 2015 at 10:42 AM, Jakub Jelinek
><jakub@redhat.com>
>>>>wrote:
>>>>>>>>> On Wed, Feb 04, 2015 at 10:38:48AM -0800, H.J. Lu wrote:
>>>>>>>>>> Common symbol should be resolved locally for PIE.
>>>>>>>>>
>>>>>>>>> binds_local_p yes, binds_to_current_def_p no.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Is SYMBOL_REF_LOCAL_P set to binds_local_p or
>>>>>>>> binds_to_current_def_p?
>>>>>>>
>>>>>>> Looks like binds_local_p:
>>>>>>>
>>>>>>> varasm.c:
>>>>>>> void
>>>>>>> default_encode_section_info (tree decl, rtx rtl, int first
>>>>ATTRIBUTE_UNUSED)
>>>>>>> {
>>>>>>>   ...
>>>>>>>   if (targetm.binds_local_p (decl))
>>>>>>>     flags |= SYMBOL_FLAG_LOCAL;
>>>>>>>
>>>>>>
>>>>>> Why is SYMBOL_REF_LOCAL_P false?
>>>>>
>>>>> In varasm.c, default_binds_local_p_1
>>>>>
>>>>>
>>>>>  /* Default visibility weak data can be overridden by a strong
>symbol
>>>>>      in another module and so are not local.  */
>>>>>   else if (DECL_WEAK (exp)
>>>>>   && !resolved_locally)
>>>>           ^^^^^^^^^^^^^^^^^^^
>>>>Why is resolved_locally false? It should be true for common
>>>>symbol when compiling for PIE.
>>>>
>>>>>     local_p = false;
>>>>>
>>>>> For weak definition, it is set to false here.
>>>
>>> Yea and i think this is still wrong and known as
>>> http://gcc.gnu.org/PR32219
>>>
>>
>
>I am testing this patch.

I cannot test it ATM, sorry.

Please make sure to add the test case from PR32219, comment 13:
https://gcc.gnu.org/bugzilla/attachment.cgi?id=27716&action=diff#gcc-4_7-branch/gcc/testsuite/gcc.dg/visibility-21.c_sec1

PR32219 should be marked as a 4.8, 4.9, 5.0 regression, too.

Thanks for taking care of this one!


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-04 23:29                                                     ` H.J. Lu
  2015-02-05 16:57                                                       ` Bernhard Reutner-Fischer
@ 2015-02-05 18:54                                                       ` Richard Henderson
  2015-02-05 19:01                                                         ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: Richard Henderson @ 2015-02-05 18:54 UTC (permalink / raw)
  To: H.J. Lu, Bernhard Reutner-Fischer
  Cc: Sriraman Tallam, Jakub Jelinek, Uros Bizjak, gcc-patches,
	David Li, Cary Coutant

On 02/04/2015 03:29 PM, H.J. Lu wrote:
> +++ b/gcc/varasm.c
> @@ -6826,11 +6826,17 @@ default_binds_local_p_1 (const_tree exp, int shlib)
>        && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>      {
>        varpool_node *vnode = varpool_node::get (exp);
> -      if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition))
> -	resolved_locally = true;
> -      if (vnode
> -	  && resolution_to_local_definition_p (vnode->resolution))
> -	resolved_to_local_def = true;
> +      /* If not building shared library, common or initialized symbols
> +	 are also resolved locally, regardless they are weak or not.  */
> +      if (vnode)
> +	{
> +	  if ((!shlib && vnode->definition)
> +	      || vnode->in_other_partition
> +	      || resolution_local_p (vnode->resolution))
> +	    resolved_locally = true;
> +	  if (resolution_to_local_definition_p (vnode->resolution))
> +	    resolved_to_local_def = true;
> +	}

This is only true if the target uses COPY relocations, which is not universally
true for all ELF targets.

You can't just make this change here in varasm.c and change everyone.


r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 18:54                                                       ` Richard Henderson
@ 2015-02-05 19:01                                                         ` H.J. Lu
  2015-02-05 19:59                                                           ` Richard Henderson
  0 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-05 19:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Bernhard Reutner-Fischer, Sriraman Tallam, Jakub Jelinek,
	Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Thu, Feb 5, 2015 at 10:54 AM, Richard Henderson <rth@redhat.com> wrote:
> On 02/04/2015 03:29 PM, H.J. Lu wrote:
>> +++ b/gcc/varasm.c
>> @@ -6826,11 +6826,17 @@ default_binds_local_p_1 (const_tree exp, int shlib)
>>        && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>>      {
>>        varpool_node *vnode = varpool_node::get (exp);
>> -      if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition))
>> -     resolved_locally = true;
>> -      if (vnode
>> -       && resolution_to_local_definition_p (vnode->resolution))
>> -     resolved_to_local_def = true;
>> +      /* If not building shared library, common or initialized symbols
>> +      are also resolved locally, regardless they are weak or not.  */
>> +      if (vnode)
>> +     {
>> +       if ((!shlib && vnode->definition)
>> +           || vnode->in_other_partition
>> +           || resolution_local_p (vnode->resolution))
>> +         resolved_locally = true;
>> +       if (resolution_to_local_definition_p (vnode->resolution))
>> +         resolved_to_local_def = true;
>> +     }
>
> This is only true if the target uses COPY relocations, which is not universally
> true for all ELF targets.
>

Can you elaborate why it depends on COPY relocation?  There
is no COPY relocation on x86-64.


-- 
H.J.
---
[hjl@gnu-6 copyreloc-3]$ cat x.c
__attribute__((weak))
int foo;

extern void bar (void);

int main()
{
  if (foo != 0)
    __builtin_abort();
  bar ();
  if (foo != 30)
    __builtin_abort();
  return 0;
}
[hjl@gnu-6 copyreloc-3]$ cat bar.c
int foo = -1;

void
bar ()
{
  foo = 30;
}
[hjl@gnu-6 copyreloc-3]$ make x
gcc -pie -fpie -O3 -g -fuse-ld=gold -fpie    -c -o x.o x.c
gcc -pie -fpie -O3 -g -fuse-ld=gold -fpic    -c -o bar.o bar.c
gcc -pie -fpie  -shared -o libbar.so bar.o
gcc -pie -fpie -O3 -g -fuse-ld=gold -o x x.o libbar.so -Wl,-R,.
[hjl@gnu-6 copyreloc-3]$ ./x
[hjl@gnu-6 copyreloc-3]$ readelf -rW x.o | grep foo
0000000000000004  0000001100000009 R_X86_64_GOTPCREL      0000000000000000 foo - 4
0000000000000079  0000001100000001 R_X86_64_64            0000000000000000 foo + 0
[hjl@gnu-6 copyreloc-3]$ readelf -rW x | grep foo
[hjl@gnu-6 copyreloc-3]$ readelf -rW libbar.so | grep foo
00000000002008c8  0000000900000006 R_X86_64_GLOB_DAT      0000000000200900 foo + 0
[hjl@gnu-6 copyreloc-3]$

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 19:01                                                         ` H.J. Lu
@ 2015-02-05 19:59                                                           ` Richard Henderson
  2015-02-05 22:05                                                             ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Henderson @ 2015-02-05 19:59 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Bernhard Reutner-Fischer, Sriraman Tallam, Jakub Jelinek,
	Uros Bizjak, gcc-patches, David Li, Cary Coutant

On 02/05/2015 11:01 AM, H.J. Lu wrote:
> Can you elaborate why it depends on COPY relocation?  There
> is no COPY relocation on x86-64.

Ho hum, we appear to have switched topics mid-thread.

I agree that we cannot override a weak symbol in the executable with even a
non-weak symbol in a shared library.


r~
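
A minimal sketch of why the weak definition in the executable wins,
assuming the default glibc lookup behaviour (LD_DYNAMIC_WEAK unset): the
dynamic linker takes the first definition it finds in lookup order, the
executable comes first, and the weak binding is not consulted.  The
struct and function names below are hypothetical and only model that
rule; they are not taken from any loader source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One symbol definition provided by one loaded module (hypothetical).  */
struct definition
{
  const char *module;
  const char *symbol;
  bool weak;
};

/* Return the first definition of SYMBOL in lookup order, or NULL.
   The weak flag is deliberately ignored, as in the assumed default.  */
static const struct definition *
first_definition (const struct definition *defs, size_t n,
                  const char *symbol)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (defs[i].symbol, symbol) == 0)
      return &defs[i];
  return NULL;
}

/* Mirrors the x / libbar.so test: weak foo in the executable, strong
   foo in the shared library.  */
static void
check_weak_in_executable_wins (void)
{
  const struct definition scope[] = {
    { "x", "foo", true },          /* executable, searched first */
    { "libbar.so", "foo", false }, /* shared library, searched second */
  };
  const struct definition *d = first_definition (scope, 2, "foo");
  assert (d != NULL && strcmp (d->module, "x") == 0);
}
```

This matches the test run earlier in the thread, where ./x exits
normally because bar () in libbar.so stores to the foo that lives in
the executable.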

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 19:59                                                           ` Richard Henderson
@ 2015-02-05 22:05                                                             ` Sriraman Tallam
  2015-02-05 22:47                                                               ` H.J. Lu
  2015-02-06 16:25                                                               ` H.J. Lu
  0 siblings, 2 replies; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-05 22:05 UTC (permalink / raw)
  To: Richard Henderson
  Cc: H.J. Lu, Bernhard Reutner-Fischer, Jakub Jelinek, Uros Bizjak,
	gcc-patches, David Li, Cary Coutant

On Thu, Feb 5, 2015 at 11:59 AM, Richard Henderson <rth@redhat.com> wrote:
> On 02/05/2015 11:01 AM, H.J. Lu wrote:
>> Can you elaborate why it depends on COPY relocation?  There
>> is no COPY relocation on x86-64.
>
> Ho hum, we appear to have switched topics mid-thread.
>
> I agree that we cannot override a weak symbol in the executable with even a
> non-weak symbol in a shared library.

Hi HJ,

   Is your patch supposed to fix weak symbols too?  Will
SYMBOL_REF_LOCAL_P evaluate to true for weak defined symbols with this
patch?  I tested this in gcc-4_9 and it didn't seem to do that.

Thanks
Sri

>
>
> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 22:05                                                             ` Sriraman Tallam
@ 2015-02-05 22:47                                                               ` H.J. Lu
  2015-02-05 22:48                                                                 ` Sriraman Tallam
  2015-02-06 16:25                                                               ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-05 22:47 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Richard Henderson, Bernhard Reutner-Fischer, Jakub Jelinek,
	Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Thu, Feb 5, 2015 at 2:05 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Feb 5, 2015 at 11:59 AM, Richard Henderson <rth@redhat.com> wrote:
>> On 02/05/2015 11:01 AM, H.J. Lu wrote:
>>> Can you elaborate why it depends on COPY relocation?  There
>>> is no COPY relocation on x86-64.
>>
>> Ho hum, we appear to have switched topics mid-thread.
>>
>> I agree that we cannot override a weak symbol in the executable with even a
>> non-weak symbol in a shared library.
>
> Hi HJ,
>
>    Is your patch supposed to fix weak symbols too?  Will
> SYMBOL_REF_LOCAL_P evaluate to true for weak defined symbols with this
> patch?  I tested this in gcc-4_9 and it didnt seem to do that.

I am working on a comprehensive patch.  I will post it
after testing is finished.

-- 
H.J.
--
[hjl@gnu-6 copyreloc-3]$  cat initweak.i
__attribute__((weak))
int xxxxxxxxxxxx = -1;

int
foo ()
{
  return xxxxxxxxxxxx;
}
[hjl@gnu-6 copyreloc-3]$ cat commonweak.i
__attribute__((weak))
int xxxxxxxxxxxx;

int
foo ()
{
  return xxxxxxxxxxxx;
}
[hjl@gnu-6 copyreloc-3]$ make initweak.s commonweak.s
/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -pie -fpie -O3 -fuse-ld=gold -S initweak.i
/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -pie -fpie -O3 -fuse-ld=gold -S commonweak.i
[hjl@gnu-6 copyreloc-3]$ cat commonweak.s initweak.s
.file "commonweak.i"
.section .text.unlikely,"ax",@progbits
.LCOLDB0:
.text
.LHOTB0:
.p2align 4,,15
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
movl xxxxxxxxxxxx(%rip), %eax
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.section .text.unlikely
.LCOLDE0:
.text
.LHOTE0:
.weak xxxxxxxxxxxx
.bss
.align 4
.type xxxxxxxxxxxx, @object
.size xxxxxxxxxxxx, 4
xxxxxxxxxxxx:
.zero 4
.ident "GCC: (GNU) 5.0.0 20150205 (experimental)"
.section .note.GNU-stack,"",@progbits
.file "initweak.i"
.section .text.unlikely,"ax",@progbits
.LCOLDB0:
.text
.LHOTB0:
.p2align 4,,15
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
movl xxxxxxxxxxxx(%rip), %eax
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.section .text.unlikely
.LCOLDE0:
.text
.LHOTE0:
.weak xxxxxxxxxxxx
.data
.align 4
.type xxxxxxxxxxxx, @object
.size xxxxxxxxxxxx, 4
xxxxxxxxxxxx:
.long -1
.ident "GCC: (GNU) 5.0.0 20150205 (experimental)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-6 copyreloc-3]$

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 22:47                                                               ` H.J. Lu
@ 2015-02-05 22:48                                                                 ` Sriraman Tallam
  0 siblings, 0 replies; 63+ messages in thread
From: Sriraman Tallam @ 2015-02-05 22:48 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Richard Henderson, Bernhard Reutner-Fischer, Jakub Jelinek,
	Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Thu, Feb 5, 2015 at 2:23 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Feb 5, 2015 at 2:05 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Thu, Feb 5, 2015 at 11:59 AM, Richard Henderson <rth@redhat.com> wrote:
>>> On 02/05/2015 11:01 AM, H.J. Lu wrote:
>>>> Can you elaborate why it depends on COPY relocation?  There
>>>> is no COPY relocation on x86-64.
>>>
>>> Ho hum, we appear to have switched topics mid-thread.
>>>
>>> I agree that we cannot override a weak symbol in the executable with even a
>>> non-weak symbol in a shared library.
>>
>> Hi HJ,
>>
>>    Is your patch supposed to fix weak symbols too?  Will
>> SYMBOL_REF_LOCAL_P evaluate to true for weak defined symbols with this
>> patch?  I tested this in gcc-4_9 and it didnt seem to do that.
>
> I am working on a comprehensive patch.  I will post it
> after testing is finished.


Thanks!

Sri

>
> --
> H.J.
> --
> [hjl@gnu-6 copyreloc-3]$  cat initweak.i
> __attribute__((weak))
> int xxxxxxxxxxxx = -1;
>
> int
> foo ()
> {
>   return xxxxxxxxxxxx;
> }
> [hjl@gnu-6 copyreloc-3]$ cat commonweak.i
> __attribute__((weak))
> int xxxxxxxxxxxx;
>
> int
> foo ()
> {
>   return xxxxxxxxxxxx;
> }
> [hjl@gnu-6 copyreloc-3]$ make initweak.s commonweak.s
> /export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -pie -fpie -O3
> -fuse-ld=gold -S initweak.i
> /export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -pie -fpie -O3
> -fuse-ld=gold -S commonweak.i
> [hjl@gnu-6 copyreloc-3]$ cat commonweak.s initweak.s
> .file "commonweak.i"
> .section .text.unlikely,"ax",@progbits
> .LCOLDB0:
> .text
> .LHOTB0:
> .p2align 4,,15
> .globl foo
> .type foo, @function
> foo:
> .LFB0:
> .cfi_startproc
> movl xxxxxxxxxxxx(%rip), %eax
> ret
> .cfi_endproc
> .LFE0:
> .size foo, .-foo
> .section .text.unlikely
> .LCOLDE0:
> .text
> .LHOTE0:
> .weak xxxxxxxxxxxx
> .bss
> .align 4
> .type xxxxxxxxxxxx, @object
> .size xxxxxxxxxxxx, 4
> xxxxxxxxxxxx:
> .zero 4
> .ident "GCC: (GNU) 5.0.0 20150205 (experimental)"
> .section .note.GNU-stack,"",@progbits
> .file "initweak.i"
> .section .text.unlikely,"ax",@progbits
> .LCOLDB0:
> .text
> .LHOTB0:
> .p2align 4,,15
> .globl foo
> .type foo, @function
> foo:
> .LFB0:
> .cfi_startproc
> movl xxxxxxxxxxxx(%rip), %eax
> ret
> .cfi_endproc
> .LFE0:
> .size foo, .-foo
> .section .text.unlikely
> .LCOLDE0:
> .text
> .LHOTE0:
> .weak xxxxxxxxxxxx
> .data
> .align 4
> .type xxxxxxxxxxxx, @object
> .size xxxxxxxxxxxx, 4
> xxxxxxxxxxxx:
> .long -1
> .ident "GCC: (GNU) 5.0.0 20150205 (experimental)"
> .section .note.GNU-stack,"",@progbits
> [hjl@gnu-6 copyreloc-3]$

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-05 22:05                                                             ` Sriraman Tallam
  2015-02-05 22:47                                                               ` H.J. Lu
@ 2015-02-06 16:25                                                               ` H.J. Lu
  1 sibling, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2015-02-06 16:25 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Richard Henderson, Bernhard Reutner-Fischer, Jakub Jelinek,
	Uros Bizjak, gcc-patches, David Li, Cary Coutant

On Thu, Feb 5, 2015 at 2:05 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Feb 5, 2015 at 11:59 AM, Richard Henderson <rth@redhat.com> wrote:
>> On 02/05/2015 11:01 AM, H.J. Lu wrote:
>>> Can you elaborate why it depends on COPY relocation?  There
>>> is no COPY relocation on x86-64.
>>
>> Ho hum, we appear to have switched topics mid-thread.
>>
>> I agree that we cannot override a weak symbol in the executable with even a
>> non-weak symbol in a shared library.
>
> Hi HJ,
>
>    Is your patch supposed to fix weak symbols too?  Will
> SYMBOL_REF_LOCAL_P evaluate to true for weak defined symbols with this
> patch?  I tested this in gcc-4_9 and it didnt seem to do that.

A patch is posted at

https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html

-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-04 16:46             ` H.J. Lu
  2014-12-04 19:32               ` Uros Bizjak
  2015-02-03 19:25               ` Sriraman Tallam
@ 2015-02-27 23:39               ` H.J. Lu
  2015-02-27 23:46                 ` H.J. Lu
  2 siblings, 1 reply; 63+ messages in thread
From: H.J. Lu @ 2015-02-27 23:39 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>>>>>>> It would probably help reviewers if you pointed to actual patch
>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>
>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>
>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>> appropriate.
>>>>>>>
>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>> change linker.
>>>>>>
>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>> auto-configured, and would fire automatically, without any user
>>>>>> intervention.
>>>>>>
>>>>>
>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>> master and 2.25 branch.
>>>>>
>>>>
>>>> +bool
>>>> +i386_binds_local_p (const_tree exp)
>>>> +{
>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>> +     support is available with -f{pie|PIE}.  */
>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>> +    return true;
>>>> +  return default_binds_local_p (exp);
>>>> +}
>>>> +
>>>>
>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>
>> Agreed.
>>
>>> Something like this?
>>
>> Yes.
>>
>> OK, if Jakub doesn't have any objections here. Please also add
>> Sriraman as author to ChangeLog entry.
>>
>> Thanks,
>> Uros.
>
> Here is the patch.   OK to install?
>
> Thanks.
>
> --
> H.J.
> ---
> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // defined outside this file
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable. Result is that no global access
> needs to go through the GOT and hence improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks if linker supports PIE with copy reloc, which is
> enabled in gold and bfd linker in binutils 2.25, at configure time
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248

Should we turn it off by default?


-- 
H.J.
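
For reference, a minimal sketch of the rule stated in the quoted
ChangeLog entry for legitimate_pic_address_disp_p: a pc-relative address
is allowed for an undefined, non-weak, non-function symbol in 64-bit PIE
when the linker supports PIE with copy reloc.  This is a standalone
model, not the i386.c code; struct sym_ref, the function names and the
boolean parameters are hypothetical stand-ins for TARGET_64BIT, flag_pie
and HAVE_LD_PIE_COPYRELOC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a symbol reference.  */
struct sym_ref
{
  bool is_function; /* function rather than data */
  bool is_weak;     /* weak declaration */
  bool is_defined;  /* defined in the current translation unit */
};

/* True when the copy-relocation rule, as worded in the ChangeLog,
   lets the compiler use a pc-relative access instead of the GOT.  */
static bool
copyreloc_allows_pcrel (const struct sym_ref *s, bool target_64bit,
                        bool pie, bool ld_pie_copyreloc)
{
  return target_64bit && pie && ld_pie_copyreloc
         && !s->is_defined && !s->is_weak && !s->is_function;
}

/* The cases called out in the patch description.  */
static void
check_changelog_cases (void)
{
  struct sym_ref extern_data = { false, false, false };
  struct sym_ref extern_weak = { false, true, false };
  struct sym_ref extern_func = { true, false, false };

  assert (copyreloc_allows_pcrel (&extern_data, true, true, true));
  /* Undefined weak data must still go through the GOT.  */
  assert (!copyreloc_allows_pcrel (&extern_weak, true, true, true));
  /* Functions are not covered by this rule.  */
  assert (!copyreloc_allows_pcrel (&extern_func, true, true, true));
  /* Without linker support the rule never fires.  */
  assert (!copyreloc_allows_pcrel (&extern_data, true, true, false));
}
```

Turning the optimization off by default would correspond to the last
case: with the linker-support input false, every extern data reference
in the model falls back to the GOT.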

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2015-02-27 23:39               ` H.J. Lu
@ 2015-02-27 23:46                 ` H.J. Lu
  0 siblings, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2015-02-27 23:46 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Sriraman Tallam, Jakub Jelinek

On Fri, Feb 27, 2015 at 3:23 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>
>>>>>>>>> It would probably help reviewers if you pointed to actual patch
>>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>>
>>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>>
>>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>>> appropriate.
>>>>>>>>
>>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>>> change linker.
>>>>>>>
>>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>>> auto-configured, and would fire automatically, without any user
>>>>>>> intervention.
>>>>>>>
>>>>>>
>>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>>> master and 2.25 branch.
>>>>>>
>>>>>
>>>>> +bool
>>>>> +i386_binds_local_p (const_tree exp)
>>>>> +{
>>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>>> +     support is available with -f{pie|PIE}.  */
>>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>>> +    return true;
>>>>> +  return default_binds_local_p (exp);
>>>>> +}
>>>>> +
>>>>>
>>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
>>>> Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch.   OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern, defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using
>> two memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. Result is that no global access
>> needs to go through the GOT and hence improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks if linker supports PIE with copy reloc, which is
>> enabled in gold and bfd linker in binutils 2.25, at configure time
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248
>
> Should we turn it off by default?
>

Or we can provide a command line option to turn it off.


-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-12-04 22:19 Dominique Dhumieres
@ 2014-12-04 23:54 ` H.J. Lu
  0 siblings, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2014-12-04 23:54 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: GCC Patches

On Thu, Dec 4, 2014 at 2:19 PM, Dominique Dhumieres <dominiq@lps.ens.fr> wrote:
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern, defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using
>> two memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. Result is that no global access
>> needs to go through the GOT and hence improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks if linker supports PIE with copy reloc, which is
>> enabled in gold and bfd linker in binutils 2.25, at configure time
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.
>
> It caused pr64189.
>

I checked in this as an obvious fix.  Sorry for the inconvenience.

-- 
H.J.
---
Index: ChangeLog
===================================================================
--- ChangeLog (revision 218407)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2014-12-04  H.J. Lu  <hongjiu.lu@intel.com>
+
+ PR bootstrap/64189
+ * configure.ac (HAVE_LD_PIE_COPYRELOC): Always define.
+ * configure: Regenerated.
+
 2014-12-04  Manuel López-Ibáñez  <manu@gcc.gnu.org>

  * diagnostic.c (diagnostic_color_init): New.
Index: configure
===================================================================
--- configure (revision 218407)
+++ configure (working copy)
@@ -27063,12 +27063,12 @@ EOF
       ;;
     esac
   fi
+fi

 cat >>confdefs.h <<_ACEOF
 #define HAVE_LD_PIE_COPYRELOC `if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`
 _ACEOF

-fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie_copyreloc" >&5
 $as_echo "$gcc_cv_ld_pie_copyreloc" >&6; }

Index: configure.ac
===================================================================
--- configure.ac (revision 218407)
+++ configure.ac (working copy)
@@ -4730,10 +4730,10 @@ EOF
       ;;
     esac
   fi
-  AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
-    [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
-    [Define 0/1 if your linker supports -pie option with copy reloc.])
 fi
+AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
+  [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
+  [Define 0/1 if your linker supports -pie option with copy reloc.])
 AC_MSG_RESULT($gcc_cv_ld_pie_copyreloc)

 AC_MSG_CHECKING(linker EH-compatible garbage collection of sections)
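
A minimal sketch of why it matters that the macro is always defined,
assuming it is used as an ordinary 0/1 value inside a C expression
rather than only under #ifdef.  If configure skipped the definition on
hosts that never run the linker check, such an expression would
presumably not compile there at all.  The helper names and parameters
below are hypothetical and are not the actual i386.c code; the fallback
#define only stands in for the configure-generated header.

```c
#include <stdbool.h>

/* configure is expected to emit "#define HAVE_LD_PIE_COPYRELOC 0" or "1"
   into the generated config header.  This fallback only stands in for
   that header so the sketch compiles on its own.  */
#ifndef HAVE_LD_PIE_COPYRELOC
#define HAVE_LD_PIE_COPYRELOC 0
#endif

/* Hypothetical helper: true if a symbol reference may use a direct
   PC-relative access, given whether the linker supports PIE with
   copy relocations.  */
static bool
pie_copyreloc_ok_p (int ld_support, bool target_64bit, bool pie,
                    bool is_function, bool is_weak)
{
  return ld_support && target_64bit && pie && !is_function && !is_weak;
}

/* Hypothetical wrapper that plugs in the configure-time result.  The
   macro appears in a plain C expression, so it must expand to a value
   on every host, not only on Linux/x86-64.  */
static bool
use_pc_relative_p (bool target_64bit, bool pie,
                   bool is_function, bool is_weak)
{
  return pie_copyreloc_ok_p (HAVE_LD_PIE_COPYRELOC, target_64bit, pie,
                             is_function, is_weak);
}
```

With the macro defined to 0, use_pc_relative_p simply returns false and
the existing GOT path is kept; no preprocessor conditionals are needed
at the use site.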

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
@ 2014-12-04 22:19 Dominique Dhumieres
  2014-12-04 23:54 ` H.J. Lu
  0 siblings, 1 reply; 63+ messages in thread
From: Dominique Dhumieres @ 2014-12-04 22:19 UTC (permalink / raw)
  To: gcc-patches; +Cc: hjl.tools

> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern, defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable. Result is that no global access
> needs to go through the GOT and hence improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks if linker supports PIE with copy reloc, which is
> enabled in gold and bfd linker in binutils 2.25, at configure time
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

It caused pr64189.

Dominique.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-08 22:19         ` Sriraman Tallam
  2014-09-19 21:11           ` Sriraman Tallam
@ 2014-12-02 19:06           ` H.J. Lu
  1 sibling, 0 replies; 63+ messages in thread
From: H.J. Lu @ 2014-12-02 19:06 UTC (permalink / raw)
  To: Sriraman Tallam
  Cc: Richard Henderson, GCC Patches, David Li, Cary Coutant,
	Ian Lance Taylor, Paul Pluzhnikov

On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>> Index: config/i386/i386.c
>>> ===================================================================
>>> --- config/i386/i386.c        (revision 211826)
>>> +++ config/i386/i386.c        (working copy)
>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>               return true;
>>>           }
>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>           return true;
>>>         break;
>>
>> This is the wrong place to patch.
>>
>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>> TARGET_BINDS_LOCAL_P.
>
> I have done this in the new attached patch, I added a new function
> i386_binds_local_p which will check for this and call
> default_binds_local_p otherwise.
>
>>
>> Note in particular that I believe that you are doing the wrong thing with weak
>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>
> I added an extra check to not do this for WEAK symbols. I also added a
> check for DECL_EXTERNAL so I believe this will also not be called for
> COMMON symbols.
>
>>
>> Note the complexity of default_binds_local_p_1, and the fact that all you
>> really want to modify is
>>
>>   /* If PIC, then assume that any global name can be overridden by
>>      symbols resolved from other modules.  */
>>   else if (shlib)
>>     local_p = false;
>>
>> near the bottom of that function.
>
> I did not understand what you mean here? Were you suggesting an
> alternative way of doing this?
>
> Thanks for reviewing

I'd like to see a few testcases:

1. One test to show it does the right thing for external variable.
2. One test to show it does the right thing for common symbol.
3. One test to show it does the right thing for weak symbol.
4. One test to show it does the right thing for external function.
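
For illustration only, the four cases could be put side by side in one
translation unit roughly like this.  It is a sketch, not taken from any
actual testsuite file: the identifiers common_var and weak_var and the
helper functions are made up, and environ and getpid merely stand in for
data and code that come from a shared library (assuming a Linux/glibc
system).

```c
#include <unistd.h>

/* 1. External variable: undefined, non-weak data, here provided by libc.  */
extern char **environ;

/* 2. Common symbol: a tentative definition (emitted as common only when
      compiling with -fcommon).  */
int common_var;

/* 3. Weak symbol: an undefined weak reference; its address may be null
      when nothing defines it.  */
extern int weak_var __attribute__ ((weak));

/* 4. External function: getpid, declared in <unistd.h> and provided by
      libc.  */

int
read_external_var (void)
{
  return environ != 0;
}

int
read_common_var (void)
{
  return common_var;
}

int
weak_var_defined (void)
{
  return &weak_var != 0;
}

int
call_external_func (void)
{
  return getpid () > 0;
}
```

Compiling such a file with -O2 -fpie -S and looking at how each symbol
is addressed is one way to see which references still go through the
GOT.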

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-11-10 23:35                 ` Sriraman Tallam
@ 2014-12-02 18:01                   ` Sriraman Tallam
  0 siblings, 0 replies; 63+ messages in thread
From: Sriraman Tallam @ 2014-12-02 18:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Mon, Nov 10, 2014 at 3:22 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Mon, Oct 6, 2014 at 1:43 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Ping.
>>
>> On Mon, Sep 29, 2014 at 10:57 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Ping.
>>>
>>> On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Hi Richard,
>>>>
>>>> I also ran the gcc testsuite with
>>>> RUNTESTFLAGS="--tool_opts=-mcopyrelocs" to check for issues.  The only
>>>> test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
>>>> -pie and BFD ld to link. Since BFD ld does not support copy
>>>> relocations with -pie, it does not link. I linked with gold to make
>>>> the test pass.
>>>>
>>>> Could you please take another look at this patch?
>>>>
>>>> Thanks
>>>> Sri
>>>>
>>>> On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>>>>>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>>>>>> Index: config/i386/i386.c
>>>>>>> ===================================================================
>>>>>>> --- config/i386/i386.c        (revision 211826)
>>>>>>> +++ config/i386/i386.c        (working copy)
>>>>>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>>>>               return true;
>>>>>>>           }
>>>>>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>>>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>>>>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>>>>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>>>>>           return true;
>>>>>>>         break;
>>>>>>
>>>>>> This is the wrong place to patch.
>>>>>>
>>>>>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>>>>>> TARGET_BINDS_LOCAL_P.
>>>>>
>>>>> I have done this in the new attached patch, I added a new function
>>>>> i386_binds_local_p which will check for this and call
>>>>> default_binds_local_p otherwise.
>>>>>
>>>>>>
>>>>>> Note in particular that I believe that you are doing the wrong thing with weak
>>>>>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>>>>>
>>>>> I added an extra check to not do this for WEAK symbols. I also added a
>>>>> check for DECL_EXTERNAL so I believe this will also not be called for
>>>>> COMMON symbols.
>>>>>
>>>>>>
>>>>>> Note the complexity of default_binds_local_p_1, and the fact that all you
>>>>>> really want to modify is
>>>>>>
>>>>>>   /* If PIC, then assume that any global name can be overridden by
>>>>>>      symbols resolved from other modules.  */
>>>>>>   else if (shlib)
>>>>>>     local_p = false;
>>>>>>
>>>>>> near the bottom of that function.
>>>>>
>>>>> I did not understand what you mean here? Were you suggesting an
>>>>> alternative way of doing this?
>>>>>
>>>>> Thanks for reviewing
>>>>> Sri
>>>>>
>>>>>>
>>>>>>
>>>>>> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-10-06 20:43               ` Sriraman Tallam
@ 2014-11-10 23:35                 ` Sriraman Tallam
  2014-12-02 18:01                   ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-11-10 23:35 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Mon, Oct 6, 2014 at 1:43 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Mon, Sep 29, 2014 at 10:57 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Ping.
>>
>> On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Hi Richard,
>>>
>>> I also ran the gcc testsuite with
>>> RUNTESTFLAGS="--tool_opts=-mcopyrelocs" to check for issues.  The only
>>> test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
>>> -pie and BFD ld to link. Since BFD ld does not support copy
>>> relocations with -pie, it does not link. I linked with gold to make
>>> the test pass.
>>>
>>> Could you please take another look at this patch?
>>>
>>> Thanks
>>> Sri
>>>
>>> On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>>>>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>>>>> Index: config/i386/i386.c
>>>>>> ===================================================================
>>>>>> --- config/i386/i386.c        (revision 211826)
>>>>>> +++ config/i386/i386.c        (working copy)
>>>>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>>>               return true;
>>>>>>           }
>>>>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>>>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>>>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>>>>           return true;
>>>>>>         break;
>>>>>
>>>>> This is the wrong place to patch.
>>>>>
>>>>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>>>>> TARGET_BINDS_LOCAL_P.
>>>>
>>>> I have done this in the new attached patch, I added a new function
>>>> i386_binds_local_p which will check for this and call
>>>> default_binds_local_p otherwise.
>>>>
>>>>>
>>>>> Note in particular that I believe that you are doing the wrong thing with weak
>>>>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>>>>
>>>> I added an extra check to not do this for WEAK symbols. I also added a
>>>> check for DECL_EXTERNAL so I believe this will also not be called for
>>>> COMMON symbols.
>>>>
>>>>>
>>>>> Note the complexity of default_binds_local_p_1, and the fact that all you
>>>>> really want to modify is
>>>>>
>>>>>   /* If PIC, then assume that any global name can be overridden by
>>>>>      symbols resolved from other modules.  */
>>>>>   else if (shlib)
>>>>>     local_p = false;
>>>>>
>>>>> near the bottom of that function.
>>>>
>>>> I did not understand what you mean here? Were you suggesting an
>>>> alternative way of doing this?
>>>>
>>>> Thanks for reviewing
>>>> Sri
>>>>
>>>>>
>>>>>
>>>>> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-29 17:57             ` Sriraman Tallam
@ 2014-10-06 20:43               ` Sriraman Tallam
  2014-11-10 23:35                 ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-10-06 20:43 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Mon, Sep 29, 2014 at 10:57 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi Richard,
>>
>> I also ran the gcc testsuite with
>> RUNTESTFLAGS="--tool_opts=-mcopyrelocs" to check for issues.  The only
>> test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
>> -pie and BFD ld to link. Since BFD ld does not support copy
>> relocations with -pie, it does not link. I linked with gold to make
>> the test pass.
>>
>> Could you please take another look at this patch?
>>
>> Thanks
>> Sri
>>
>> On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>>>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>>>> Index: config/i386/i386.c
>>>>> ===================================================================
>>>>> --- config/i386/i386.c        (revision 211826)
>>>>> +++ config/i386/i386.c        (working copy)
>>>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>>               return true;
>>>>>           }
>>>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>>>           return true;
>>>>>         break;
>>>>
>>>> This is the wrong place to patch.
>>>>
>>>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>>>> TARGET_BINDS_LOCAL_P.
>>>
>>> I have done this in the new attached patch, I added a new function
>>> i386_binds_local_p which will check for this and call
>>> default_binds_local_p otherwise.
>>>
>>>>
>>>> Note in particular that I believe that you are doing the wrong thing with weak
>>>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>>>
>>> I added an extra check to not do this for WEAK symbols. I also added a
>>> check for DECL_EXTERNAL so I believe this will also not be called for
>>> COMMON symbols.
>>>
>>>>
>>>> Note the complexity of default_binds_local_p_1, and the fact that all you
>>>> really want to modify is
>>>>
>>>>   /* If PIC, then assume that any global name can be overridden by
>>>>      symbols resolved from other modules.  */
>>>>   else if (shlib)
>>>>     local_p = false;
>>>>
>>>> near the bottom of that function.
>>>
>>> I did not understand what you mean here? Were you suggesting an
>>> alternative way of doing this?
>>>
>>> Thanks for reviewing
>>> Sri
>>>
>>>>
>>>>
>>>> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-19 21:11           ` Sriraman Tallam
@ 2014-09-29 17:57             ` Sriraman Tallam
  2014-10-06 20:43               ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-09-29 17:57 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi Richard,
>
> I also ran the gcc testsuite with
> RUNTESTFLAGS="--tool_opts=-mcopyrelocs" to check for issues.  The only
> test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
> -pie and BFD ld to link. Since BFD ld does not support copy
> relocations with -pie, it does not link. I linked with gold to make
> the test pass.
>
> Could you please take another look at this patch?
>
> Thanks
> Sri
>
> On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>>> Index: config/i386/i386.c
>>>> ===================================================================
>>>> --- config/i386/i386.c        (revision 211826)
>>>> +++ config/i386/i386.c        (working copy)
>>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>               return true;
>>>>           }
>>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>>           return true;
>>>>         break;
>>>
>>> This is the wrong place to patch.
>>>
>>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>>> TARGET_BINDS_LOCAL_P.
>>
>> I have done this in the new attached patch, I added a new function
>> i386_binds_local_p which will check for this and call
>> default_binds_local_p otherwise.
>>
>>>
>>> Note in particular that I believe that you are doing the wrong thing with weak
>>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>>
>> I added an extra check to not do this for WEAK symbols. I also added a
>> check for DECL_EXTERNAL so I believe this will also not be called for
>> COMMON symbols.
>>
>>>
>>> Note the complexity of default_binds_local_p_1, and the fact that all you
>>> really want to modify is
>>>
>>>   /* If PIC, then assume that any global name can be overridden by
>>>      symbols resolved from other modules.  */
>>>   else if (shlib)
>>>     local_p = false;
>>>
>>> near the bottom of that function.
>>
>> I did not understand what you mean here? Were you suggesting an
>> alternative way of doing this?
>>
>> Thanks for reviewing
>> Sri
>>
>>>
>>>
>>> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-08 22:19         ` Sriraman Tallam
@ 2014-09-19 21:11           ` Sriraman Tallam
  2014-09-29 17:57             ` Sriraman Tallam
  2014-12-02 19:06           ` H.J. Lu
  1 sibling, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-09-19 21:11 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Hi Richard,

I also ran the gcc testsuite with
RUNTESTFLAGS="--tool_opts=-mcopyrelocs" to check for issues.  The only
test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
-pie and BFD ld to link. Since BFD ld does not support copy
relocations with -pie, it does not link. I linked with gold to make
the test pass.

Could you please take another look at this patch?

Thanks
Sri

On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
>> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>>> Index: config/i386/i386.c
>>> ===================================================================
>>> --- config/i386/i386.c        (revision 211826)
>>> +++ config/i386/i386.c        (working copy)
>>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>>               return true;
>>>           }
>>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>> -                && SYMBOL_REF_LOCAL_P (op0)
>>> +                && (SYMBOL_REF_LOCAL_P (op0)
>>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>>                  && ix86_cmodel != CM_LARGE_PIC)
>>>           return true;
>>>         break;
>>
>> This is the wrong place to patch.
>>
>> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>> TARGET_BINDS_LOCAL_P.
>
> I have done this in the new attached patch, I added a new function
> i386_binds_local_p which will check for this and call
> default_binds_local_p otherwise.
>
>>
>> Note in particular that I believe that you are doing the wrong thing with weak
>> and COMMON symbols, in that you probably ought not force a copy reloc there.
>
> I added an extra check to not do this for WEAK symbols. I also added a
> check for DECL_EXTERNAL so I believe this will also not be called for
> COMMON symbols.
>
>>
>> Note the complexity of default_binds_local_p_1, and the fact that all you
>> really want to modify is
>>
>>   /* If PIC, then assume that any global name can be overridden by
>>      symbols resolved from other modules.  */
>>   else if (shlib)
>>     local_p = false;
>>
>> near the bottom of that function.
>
> I did not understand what you mean here? Were you suggesting an
> alternative way of doing this?
>
> Thanks for reviewing
> Sri
>
>>
>>
>> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-02 20:40       ` Richard Henderson
  2014-09-03  7:25         ` Bernhard Reutner-Fischer
@ 2014-09-08 22:19         ` Sriraman Tallam
  2014-09-19 21:11           ` Sriraman Tallam
  2014-12-02 19:06           ` H.J. Lu
  1 sibling, 2 replies; 63+ messages in thread
From: Sriraman Tallam @ 2014-09-08 22:19 UTC (permalink / raw)
  To: Richard Henderson
  Cc: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

[-- Attachment #1: Type: text/plain, Size: 1861 bytes --]

On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson <rth@redhat.com> wrote:
> On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>> Index: config/i386/i386.c
>> ===================================================================
>> --- config/i386/i386.c        (revision 211826)
>> +++ config/i386/i386.c        (working copy)
>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>               return true;
>>           }
>>         else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>> -                && SYMBOL_REF_LOCAL_P (op0)
>> +                && (SYMBOL_REF_LOCAL_P (op0)
>> +                    || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>> +                        && !SYMBOL_REF_FUNCTION_P (op0)))
>>                  && ix86_cmodel != CM_LARGE_PIC)
>>           return true;
>>         break;
>
> This is the wrong place to patch.
>
> You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
> TARGET_BINDS_LOCAL_P.

I have done this in the new attached patch.  I added a new function,
i386_binds_local_p, which checks for this case and otherwise calls
default_binds_local_p.

>
> Note in particular that I believe that you are doing the wrong thing with weak
> and COMMON symbols, in that you probably ought not force a copy reloc there.

I added an extra check so that this is not done for WEAK symbols. I also
added a check for DECL_EXTERNAL, so I believe COMMON symbols will not be
affected either.
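
For readers who want to poke at the condition outside of GCC, here is a
rough standalone model of the check described above.  It is only a sketch:
struct decl_model, struct target_model and model_binds_local_p are made-up
names for illustration, the fields merely stand in for the GCC macros named
in the comments, and default_local stands in for whatever
default_binds_local_p would have returned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few decl properties the check looks at.
   This is not GCC's tree representation.  */
struct decl_model
{
  bool is_variable;    /* models TREE_CODE (exp) == VAR_DECL */
  bool is_external;    /* models DECL_EXTERNAL (exp) */
  bool is_weak;        /* models DECL_WEAK (exp) */
  bool default_local;  /* models the answer of default_binds_local_p */
};

/* Hypothetical stand-in for the target and option state involved.  */
struct target_model
{
  bool target_64bit;   /* models TARGET_64BIT */
  bool copyrelocs;     /* models ix86_copyrelocs (-mcopyrelocs) */
  bool pie;            /* models flag_pie (-fpie / -fPIE) */
};

/* Treat a non-weak extern variable as local when 64-bit, -mcopyrelocs and
   PIE are all in effect; otherwise fall back to the default answer.  */
bool
model_binds_local_p (const struct target_model *t, const struct decl_model *d)
{
  if (t->target_64bit && t->copyrelocs && t->pie
      && d->is_variable && d->is_external && !d->is_weak)
    return true;
  return d->default_local;
}

/* Exercise the cases discussed in this thread.  */
void
model_self_check (void)
{
  const struct target_model on = { true, true, true };
  const struct target_model off = { true, false, true };
  const struct decl_model extern_var = { true, true, false, false };
  const struct decl_model weak_var = { true, true, true, false };
  const struct decl_model extern_fn = { false, true, false, false };

  assert (model_binds_local_p (&on, &extern_var));   /* copy reloc case */
  assert (!model_binds_local_p (&off, &extern_var)); /* option disabled */
  assert (!model_binds_local_p (&on, &weak_var));    /* weak is excluded */
  assert (!model_binds_local_p (&on, &extern_fn));   /* functions excluded */
}
```

The model only changes the answer for non-weak extern variables; every other
case keeps the default result.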

>
> Note the complexity of default_binds_local_p_1, and the fact that all you
> really want to modify is
>
>   /* If PIC, then assume that any global name can be overridden by
>      symbols resolved from other modules.  */
>   else if (shlib)
>     local_p = false;
>
> near the bottom of that function.

I did not understand what you meant here.  Were you suggesting an
alternative way of doing this?

Thanks for reviewing
Sri

>
>
> r~

[-- Attachment #2: gcc_pie_copyrelocs_patch.txt --]
[-- Type: text/plain, Size: 5193 bytes --]

Optimize access to globals with -fpie, x86_64 only:

Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
using the GOT.  This is two instructions, one to get the address of the global
from the GOT and the other to get the value.  If it turns out that the global
gets defined in the executable at link-time, it still needs to go through the
GOT as it is too late then to generate a direct access. 

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern here, defined elsewhere
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two
memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.
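
To make the difference between the two sequences concrete, the following is
a rough model in plain C of the two access patterns.  It is only an
illustration under assumptions: model_got_slot, read_direct and read_via_got
are made-up names, and a real GOT is set up by the linker and the dynamic
loader, not by user code.

```c
#include <assert.h>

int a_glob = 42;

/* Hypothetical one-entry "GOT": a slot that holds the address of a_glob.  */
int *model_got_slot = &a_glob;

/* Direct access: a single load of the value, as in the PC-relative case.  */
int
read_direct (void)
{
  return a_glob;
}

/* GOT-style access: first load the address from the slot, then load the
   value through it, i.e. two dependent memory loads.  */
int
read_via_got (void)
{
  int *addr = model_got_slot;
  return *addr;
}

/* Both paths must observe the same object, before and after a store.  */
void
got_model_self_check (void)
{
  assert (read_direct () == 42);
  assert (read_via_got () == 42);
  a_glob = 7;
  assert (read_direct () == 7);
  assert (read_via_got () == 7);
}
```

Both functions return the same value; the second one pays for an extra
dependent load, which is the cost the patch tries to avoid.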

Some experiments on Google benchmarks show that the extra memory loads affect
performance by 1% to 5%.


Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that the
global will be defined in the executable.  For globals that are truly extern
(come from shared objects), the linker will create copy relocations and have
them defined in the executable.  The result is that no global access needs to
go through the GOT, which improves performance.

This patch to the gold linker:
https://sourceware.org/ml/binutils/2014-05/msg00092.html
submitted recently, allows gold to generate copy relocations for -pie mode when
necessary.

I have added the option -mcopyrelocs which, when combined with -fpie, enables
this.  Note that the BFD linker does not support PIE copy relocations yet, so
this option cannot be used with it.

Please review.


ChangeLog:

	* config/i386/i386.opt (mcopyrelocs): New option.
	* config/i386/i386.c (i386_binds_local_p): New function.
	(TARGET_BINDS_LOCAL_P): Define.
	* testsuite/gcc.target/i386/pie-copyrelocs-1.c: New test.
	* testsuite/gcc.target/i386/pie-copyrelocs-2.c: New test.


Index: testsuite/gcc.target/i386/pie-copyrelocs-2.c
===================================================================
--- testsuite/gcc.target/i386/pie-copyrelocs-2.c	(revision 0)
+++ testsuite/gcc.target/i386/pie-copyrelocs-2.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mno-copyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mno-copyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should always be accessed via GOT  */ 
+/* { dg-final { scan-assembler "glob_a\\@GOT" { target { x86_64-*-* } } } } */
Index: testsuite/gcc.target/i386/pie-copyrelocs-1.c
===================================================================
--- testsuite/gcc.target/i386/pie-copyrelocs-1.c	(revision 0)
+++ testsuite/gcc.target/i386/pie-copyrelocs-1.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mcopyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mcopyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should never be accessed with a GOTPCREL  */ 
+/* { dg-final { scan-assembler-not "glob_a\\@GOTPCREL" { target { x86_64-*-* } } } } */
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 214973)
+++ config/i386/i386.c	(working copy)
@@ -12642,6 +12642,18 @@ legitimate_pic_operand_p (rtx x)
     }
 }
 
+bool
+i386_binds_local_p (const_tree exp)
+{
+  /* Globals marked extern are treated as local when linker copy relocations
+     support is available with -f{pie|PIE}.  */
+  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
+      && TREE_CODE (exp) == VAR_DECL
+      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
+    return true;
+  return default_binds_local_p (exp);
+}
+
 /* Determine if a given CONST RTX is a valid memory displacement
    in PIC mode.  */
 
@@ -47157,6 +47169,9 @@ ix86_atomic_assign_expand_fenv (tree *hold, tree *
 #undef TARGET_MS_BITFIELD_LAYOUT_P
 #define TARGET_MS_BITFIELD_LAYOUT_P ix86_ms_bitfield_layout_p
 
+#undef TARGET_BINDS_LOCAL_P
+#define TARGET_BINDS_LOCAL_P i386_binds_local_p
+
 #if TARGET_MACHO
 #undef TARGET_BINDS_LOCAL_P
 #define TARGET_BINDS_LOCAL_P darwin_binds_local_p
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 214973)
+++ config/i386/i386.opt	(working copy)
@@ -108,6 +108,10 @@ int x_ix86_dump_tunes
 TargetSave
 int x_ix86_force_align_arg_pointer
 
+;; -mcopyrelocs
+TargetSave
+int x_ix86_copyrelocs
+
 ;; -mforce-drap= 
 TargetSave
 int x_ix86_force_drap
@@ -291,6 +295,10 @@ mfancy-math-387
 Target RejectNegative Report InverseMask(NO_FANCY_MATH_387, USE_FANCY_MATH_387) Save
 Generate sin, cos, sqrt for FPU
 
+mcopyrelocs
+Target Report Var(ix86_copyrelocs) Init(0)
+Use linker copy relocs for pie
+
 mforce-drap
 Target Report Var(ix86_force_drap)
 Always use Dynamic Realigned Argument Pointer (DRAP) to realign stack

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-09-02 20:40       ` Richard Henderson
@ 2014-09-03  7:25         ` Bernhard Reutner-Fischer
  2014-09-08 22:19         ` Sriraman Tallam
  1 sibling, 0 replies; 63+ messages in thread
From: Bernhard Reutner-Fischer @ 2014-09-03  7:25 UTC (permalink / raw)
  To: Richard Henderson, Sriraman Tallam, GCC Patches, David Li,
	Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

On 2 September 2014 22:40:50 CEST, Richard Henderson <rth@redhat.com> wrote:
>On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
>> Index: config/i386/i386.c
>> ===================================================================
>> --- config/i386/i386.c	(revision 211826)
>> +++ config/i386/i386.c	(working copy)
>> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>>  		return true;
>>  	    }
>>  	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>> -		   && SYMBOL_REF_LOCAL_P (op0)
>> +		   && (SYMBOL_REF_LOCAL_P (op0)
>> +		       || (TARGET_64BIT && ix86_copyrelocs && flag_pie
>> +			   && !SYMBOL_REF_FUNCTION_P (op0)))
>>  		   && ix86_cmodel != CM_LARGE_PIC)
>>  	    return true;
>>  	  break;
>
>This is the wrong place to patch.
>
>You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
>TARGET_BINDS_LOCAL_P.
>
>Note in particular that I believe that you are doing the wrong thing
>with weak
>and COMMON symbols, in that you probably ought not force a copy reloc
>there.
>
>Note the complexity of default_binds_local_p_1, and the fact that all
>you
>really want to modify is
>
>  /* If PIC, then assume that any global name can be overridden by
>     symbols resolved from other modules.  */
>  else if (shlib)
>    local_p = false;
>
>near the bottom of that function.

This reminds me of PR32219:
https://gcc.gnu.org/ml/gcc-patches/2010-03/msg00665.html
Admittedly that is not PIE-specific, but it still fails on current trunk.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-06-21  0:17     ` Sriraman Tallam
  2014-06-26 17:55       ` Sriraman Tallam
@ 2014-09-02 20:40       ` Richard Henderson
  2014-09-03  7:25         ` Bernhard Reutner-Fischer
  2014-09-08 22:19         ` Sriraman Tallam
  1 sibling, 2 replies; 63+ messages in thread
From: Richard Henderson @ 2014-09-02 20:40 UTC (permalink / raw)
  To: Sriraman Tallam, GCC Patches, David Li, Cary Coutant,
	Ian Lance Taylor, Paul Pluzhnikov

On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
> Index: config/i386/i386.c
> ===================================================================
> --- config/i386/i386.c	(revision 211826)
> +++ config/i386/i386.c	(working copy)
> @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
>  		return true;
>  	    }
>  	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -		   && SYMBOL_REF_LOCAL_P (op0)
> +		   && (SYMBOL_REF_LOCAL_P (op0)
> +		       || (TARGET_64BIT && ix86_copyrelocs && flag_pie
> +			   && !SYMBOL_REF_FUNCTION_P (op0)))
>  		   && ix86_cmodel != CM_LARGE_PIC)
>  	    return true;
>  	  break;

This is the wrong place to patch.

You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
TARGET_BINDS_LOCAL_P.

Note in particular that I believe that you are doing the wrong thing with weak
and COMMON symbols, in that you probably ought not force a copy reloc there.

Note the complexity of default_binds_local_p_1, and the fact that all you
really want to modify is

  /* If PIC, then assume that any global name can be overridden by
     symbols resolved from other modules.  */
  else if (shlib)
    local_p = false;

near the bottom of that function.


r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-07-11 17:42         ` Sriraman Tallam
@ 2014-09-02 18:15           ` Sriraman Tallam
  0 siblings, 0 replies; 63+ messages in thread
From: Sriraman Tallam @ 2014-09-02 18:15 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor,
	Paul Pluzhnikov, Uros Bizjak, Jan Hubicka, Jakub Jelinek

Ping.

On Fri, Jul 11, 2014 at 10:42 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Thu, Jun 26, 2014 at 10:54 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi Uros,
>>
>>    Could you please review this patch?
>>
>> Thanks
>> Sri
>>
>> On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Patch Updated.
>>>
>>> Sri
>>>
>>> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Ping.
>>>>
>>>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> Ping.
>>>>>
>>>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>>>
>>>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>>>>>> using the GOT.  This is two instructions, one to get the address of the global
>>>>>> from the GOT and the other to get the value.  If it turns out that the global
>>>>>> gets defined in the executable at link-time, it still needs to go through the
>>>>>> GOT as it is too late then to generate a direct access.
>>>>>>
>>>>>> Examples:
>>>>>>
>>>>>> foo.cc
>>>>>> ------
>>>>>> int a_glob;
>>>>>> int main () {
>>>>>>   return a_glob; // defined in this file
>>>>>> }
>>>>>>
>>>>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>>>>> PC-relative insn:
>>>>>>
>>>>>> 5e0   <main>:
>>>>>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>>>>
>>>>>> foo.cc
>>>>>> ------
>>>>>>
>>>>>> extern int a_glob;
>>>>>> int main () {
>>>>>>   return a_glob; // defined in this file
>>>>>> }
>>>>>>
>>>>>> With -O2 -fpie -pie, the generated code accesses global via GOT using two
>>>>>> memory loads:
>>>>>>
>>>>>> 6f0  <main>:
>>>>>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>>>>>    mov    (%rax),%eax
>>>>>>
>>>>>> This is true even if in the latter case the global was defined in the
>>>>>> executable through a different file.
>>>>>>
>>>>>> Some experiments on google benchmarks shows that the extra memory loads affects
>>>>>> performance by 1% to 5%.
>>>>>>
>>>>>>
>>>>>> Solution - Copy Relocations:
>>>>>>
>>>>>> When the linker supports copy relocations, GCC can always assume that the
>>>>>> global will be defined in the executable.  For globals that are truly extern
>>>>>> (come from shared objects), the linker will create copy relocations and have
>>>>>> them defined in the executable. Result is that no global access needs to go
>>>>>> through the GOT and hence improves performance.
>>>>>>
>>>>>> This patch to the gold linker :
>>>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>> submitted recently allows gold to generate copy relocations for -pie mode when
>>>>>> necessary.
>>>>>>
>>>>>> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
>>>>>> this.  Note that the BFD linker does not support pie copyrelocs yet and this
>>>>>> option cannot be used there.
>>>>>>
>>>>>> Please review.
>>>>>>
>>>>>>
>>>>>> ChangeLog:
>>>>>>
>>>>>> * config/i386/i36.opt (mld-pie-copyrelocs): New option.
>>>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>>>  address is still legitimate in the presence of copy relocations
>>>>>>  and -fpie.
>>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Patch attached.
>>>>>> Thanks
>>>>>> Sri

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-06-26 17:55       ` Sriraman Tallam
@ 2014-07-11 17:42         ` Sriraman Tallam
  2014-09-02 18:15           ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-07-11 17:42 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor,
	Paul Pluzhnikov, Uros Bizjak, Jan Hubicka, Jakub Jelinek

Ping.

On Thu, Jun 26, 2014 at 10:54 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi Uros,
>
>    Could you please review this patch?
>
> Thanks
> Sri
>
> On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Patch Updated.
>>
>> Sri
>>
>> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Ping.
>>>
>>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Ping.
>>>>
>>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>>
>>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>>>>> using the GOT.  This is two instructions, one to get the address of the global
>>>>> from the GOT and the other to get the value.  If it turns out that the global
>>>>> gets defined in the executable at link-time, it still needs to go through the
>>>>> GOT as it is too late then to generate a direct access.
>>>>>
>>>>> Examples:
>>>>>
>>>>> foo.cc
>>>>> ------
>>>>> int a_glob;
>>>>> int main () {
>>>>>   return a_glob; // defined in this file
>>>>> }
>>>>>
>>>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>>>> PC-relative insn:
>>>>>
>>>>> 5e0   <main>:
>>>>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>>>
>>>>> foo.cc
>>>>> ------
>>>>>
>>>>> extern int a_glob;
>>>>> int main () {
>>>>>   return a_glob; // defined in this file
>>>>> }
>>>>>
>>>>> With -O2 -fpie -pie, the generated code accesses global via GOT using two
>>>>> memory loads:
>>>>>
>>>>> 6f0  <main>:
>>>>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>>>>    mov    (%rax),%eax
>>>>>
>>>>> This is true even if in the latter case the global was defined in the
>>>>> executable through a different file.
>>>>>
>>>>> Some experiments on google benchmarks shows that the extra memory loads affects
>>>>> performance by 1% to 5%.
>>>>>
>>>>>
>>>>> Solution - Copy Relocations:
>>>>>
>>>>> When the linker supports copy relocations, GCC can always assume that the
>>>>> global will be defined in the executable.  For globals that are truly extern
>>>>> (come from shared objects), the linker will create copy relocations and have
>>>>> them defined in the executable. Result is that no global access needs to go
>>>>> through the GOT and hence improves performance.
>>>>>
>>>>> This patch to the gold linker :
>>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>> submitted recently allows gold to generate copy relocations for -pie mode when
>>>>> necessary.
>>>>>
>>>>> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
>>>>> this.  Note that the BFD linker does not support pie copyrelocs yet and this
>>>>> option cannot be used there.
>>>>>
>>>>> Please review.
>>>>>
>>>>>
>>>>> ChangeLog:
>>>>>
>>>>> * config/i386/i36.opt (mld-pie-copyrelocs): New option.
>>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>>  address is still legitimate in the presence of copy relocations
>>>>>  and -fpie.
>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>>
>>>>>
>>>>>
>>>>> Patch attached.
>>>>> Thanks
>>>>> Sri

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-06-21  0:17     ` Sriraman Tallam
@ 2014-06-26 17:55       ` Sriraman Tallam
  2014-07-11 17:42         ` Sriraman Tallam
  2014-09-02 20:40       ` Richard Henderson
  1 sibling, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-06-26 17:55 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor,
	Paul Pluzhnikov, Uros Bizjak, Jan Hubicka

Hi Uros,

   Could you please review this patch?

Thanks
Sri

On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Patch Updated.
>
> Sri
>
> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Ping.
>>
>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Ping.
>>>
>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>
>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>>>> using the GOT.  This is two instructions, one to get the address of the global
>>>> from the GOT and the other to get the value.  If it turns out that the global
>>>> gets defined in the executable at link-time, it still needs to go through the
>>>> GOT as it is too late then to generate a direct access.
>>>>
>>>> Examples:
>>>>
>>>> foo.cc
>>>> ------
>>>> int a_glob;
>>>> int main () {
>>>>   return a_glob; // defined in this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>>> PC-relative insn:
>>>>
>>>> 5e0   <main>:
>>>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>>
>>>> foo.cc
>>>> ------
>>>>
>>>> extern int a_glob;
>>>> int main () {
>>>>   return a_glob; // defined in this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code accesses global via GOT using two
>>>> memory loads:
>>>>
>>>> 6f0  <main>:
>>>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>>>    mov    (%rax),%eax
>>>>
>>>> This is true even if in the latter case the global was defined in the
>>>> executable through a different file.
>>>>
>>>> Some experiments on google benchmarks shows that the extra memory loads affects
>>>> performance by 1% to 5%.
>>>>
>>>>
>>>> Solution - Copy Relocations:
>>>>
>>>> When the linker supports copy relocations, GCC can always assume that the
>>>> global will be defined in the executable.  For globals that are truly extern
>>>> (come from shared objects), the linker will create copy relocations and have
>>>> them defined in the executable. Result is that no global access needs to go
>>>> through the GOT and hence improves performance.
>>>>
>>>> This patch to the gold linker :
>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>> submitted recently allows gold to generate copy relocations for -pie mode when
>>>> necessary.
>>>>
>>>> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
>>>> this.  Note that the BFD linker does not support pie copyrelocs yet and this
>>>> option cannot be used there.
>>>>
>>>> Please review.
>>>>
>>>>
>>>> ChangeLog:
>>>>
>>>> * config/i386/i36.opt (mld-pie-copyrelocs): New option.
>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>  address is still legitimate in the presence of copy relocations
>>>>  and -fpie.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>
>>>>
>>>>
>>>> Patch attached.
>>>> Thanks
>>>> Sri

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-06-09 22:55   ` Sriraman Tallam
@ 2014-06-21  0:17     ` Sriraman Tallam
  2014-06-26 17:55       ` Sriraman Tallam
  2014-09-02 20:40       ` Richard Henderson
  0 siblings, 2 replies; 63+ messages in thread
From: Sriraman Tallam @ 2014-06-21  0:17 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

[-- Attachment #1: Type: text/plain, Size: 2872 bytes --]

Patch Updated.

Sri

On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Ping.
>>
>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Optimize access to globals with -fpie, x86_64 only:
>>>
>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>>> using the GOT.  This is two instructions, one to get the address of the global
>>> from the GOT and the other to get the value.  If it turns out that the global
>>> gets defined in the executable at link-time, it still needs to go through the
>>> GOT as it is too late then to generate a direct access.
>>>
>>> Examples:
>>>
>>> foo.cc
>>> ------
>>> int a_glob;
>>> int main () {
>>>   return a_glob; // defined in this file
>>> }
>>>
>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>> PC-relative insn:
>>>
>>> 5e0   <main>:
>>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>
>>> foo.cc
>>> ------
>>>
>>> extern int a_glob;
>>> int main () {
>>>   return a_glob; // defined in this file
>>> }
>>>
>>> With -O2 -fpie -pie, the generated code accesses global via GOT using two
>>> memory loads:
>>>
>>> 6f0  <main>:
>>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>>    mov    (%rax),%eax
>>>
>>> This is true even if in the latter case the global was defined in the
>>> executable through a different file.
>>>
>>> Some experiments on google benchmarks shows that the extra memory loads affects
>>> performance by 1% to 5%.
>>>
>>>
>>> Solution - Copy Relocations:
>>>
>>> When the linker supports copy relocations, GCC can always assume that the
>>> global will be defined in the executable.  For globals that are truly extern
>>> (come from shared objects), the linker will create copy relocations and have
>>> them defined in the executable. Result is that no global access needs to go
>>> through the GOT and hence improves performance.
>>>
>>> This patch to the gold linker :
>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>> submitted recently allows gold to generate copy relocations for -pie mode when
>>> necessary.
>>>
>>> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
>>> this.  Note that the BFD linker does not support pie copyrelocs yet and this
>>> option cannot be used there.
>>>
>>> Please review.
>>>
>>>
>>> ChangeLog:
>>>
>>> * config/i386/i36.opt (mld-pie-copyrelocs): New option.
>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>  address is still legitimate in the presence of copy relocations
>>>  and -fpie.
>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>
>>>
>>>
>>> Patch attached.
>>> Thanks
>>> Sri

[-- Attachment #2: gcc_pie_copyrelocs_patch.txt --]
[-- Type: text/plain, Size: 6058 bytes --]

Optimize access to globals with -fpie, x86_64 only:

Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
using the GOT.  This is two instructions, one to get the address of the global
from the GOT and the other to get the value.  If it turns out that the global
gets defined in the executable at link-time, it still needs to go through the
GOT as it is too late then to generate a direct access. 

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern here, defined elsewhere
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two
memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.

Some experiments on Google benchmarks show that the extra memory loads affect
performance by 1% to 5%.


Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that the
global will be defined in the executable.  For globals that are truly extern
(come from shared objects), the linker will create copy relocations and have
them defined in the executable.  The result is that no global access needs to
go through the GOT, which improves performance.

This patch to the gold linker:
https://sourceware.org/ml/binutils/2014-05/msg00092.html
submitted recently, allows gold to generate copy relocations for -pie mode when
necessary.

I have added the option -mcopyrelocs which, when combined with -fpie, enables
this.  Note that the BFD linker does not support PIE copy relocations yet, so
this option cannot be used with it.

Please review.


ChangeLog:

	* config/i386/i386.opt (mcopyrelocs): New option.
	* config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
	  address is still legitimate in the presence of copy relocations
	  and -fpie.
	* doc/invoke.texi (mcopyrelocs): Document.
	* testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
	* testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.


Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 211826)
+++ config/i386/i386.c	(working copy)
@@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
 		return true;
 	    }
 	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-		   && SYMBOL_REF_LOCAL_P (op0)
+		   && (SYMBOL_REF_LOCAL_P (op0)
+		       || (TARGET_64BIT && ix86_copyrelocs && flag_pie
+			   && !SYMBOL_REF_FUNCTION_P (op0)))
 		   && ix86_cmodel != CM_LARGE_PIC)
 	    return true;
 	  break;
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 211826)
+++ config/i386/i386.opt	(working copy)
@@ -108,6 +108,10 @@ int x_ix86_dump_tunes
 TargetSave
 int x_ix86_force_align_arg_pointer
 
+;; -mcopyrelocs
+TargetSave
+int x_ix86_copyrelocs
+
 ;; -mforce-drap= 
 TargetSave
 int x_ix86_force_drap
@@ -291,6 +295,10 @@ mfancy-math-387
 Target RejectNegative Report InverseMask(NO_FANCY_MATH_387, USE_FANCY_MATH_387) Save
 Generate sin, cos, sqrt for FPU
 
+mcopyrelocs
+Target Report Var(ix86_copyrelocs) Init(0)
+Use copy relocations for pie when possible
+
 mforce-drap
 Target Report Var(ix86_force_drap)
 Always use Dynamic Realigned Argument Pointer (DRAP) to realign stack
Index: testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c
===================================================================
--- testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c	(revision 0)
+++ testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mcopyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mcopyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should never be accessed with a GOTPCREL  */ 
+/* { dg-final { scan-assembler-not "glob_a\\@GOTPCREL" { target { x86_64-*-* } } } } */
Index: testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c
===================================================================
--- testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c	(revision 0)
+++ testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mno-copyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mno-copyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should always be accessed via GOT  */ 
+/* { dg-final { scan-assembler "glob_a\\@GOT" { target { x86_64-*-* } } } } */
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 211826)
+++ doc/invoke.texi	(working copy)
@@ -688,7 +688,8 @@ Objective-C and Objective-C++ Dialects}.
 -m32 -m64 -mx32 -m16 -mlarge-data-threshold=@var{num} @gol
 -msse2avx -mfentry -m8bit-idiv @gol
 -mavx256-split-unaligned-load -mavx256-split-unaligned-store @gol
--mstack-protector-guard=@var{guard}}
+-mstack-protector-guard=@var{guard} @gol
+-mcopyrelocs}
 
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll @gol
@@ -15802,6 +15803,15 @@ locations are @samp{global} for global canary or @
 canary in the TLS block (the default).  This option has effect only when
 @option{-fstack-protector} or @option{-fstack-protector-all} is specified.
 
+@item -mcopyrelocs
+@itemx -mno-copyrelocs
+@opindex mcopyrelocs
+@opindex mno-copyrelocs
+With @option{-fpie} and @option{-fPIE}, copy relocations support allows the
+compiler to assume that all symbol references are local.  This allows the
+compiler to skip the GOT for global accesses.  This option applies only to
+the x86-64 architecture.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-05-19 18:11 ` Sriraman Tallam
@ 2014-06-09 22:55   ` Sriraman Tallam
  2014-06-21  0:17     ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-06-09 22:55 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Optimize access to globals with -fpie, x86_64 only:
>>
>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>> using the GOT.  This is two instructions, one to get the address of the global
>> from the GOT and the other to get the value.  If it turns out that the global
>> gets defined in the executable at link-time, it still needs to go through the
>> GOT as it is too late then to generate a direct access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern, defined in another file
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using two
>> memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads affect
>> performance by 1% to 5%.
>>
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that the
>> global will be defined in the executable.  For globals that are truly extern
>> (come from shared objects), the linker will create copy relocations and have
>> them defined in the executable.  The result is that no global access needs to
>> go through the GOT, which improves performance.
>>
>> This patch to the gold linker:
>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>> submitted recently allows gold to generate copy relocations for -pie mode when
>> necessary.
>>
>> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
>> this.  Note that the BFD linker does not support pie copyrelocs yet and this
>> option cannot be used there.
>>
>> Please review.
>>
>>
>> ChangeLog:
>>
>> * config/i386/i386.opt (mld-pie-copyrelocs): New option.
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>  address is still legitimate in the presence of copy relocations
>>  and -fpie.
>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>
>>
>>
>> Patch attached.
>> Thanks
>> Sri

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
  2014-05-15 18:34 Sriraman Tallam
@ 2014-05-19 18:11 ` Sriraman Tallam
  2014-06-09 22:55   ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-05-19 18:11 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov

Ping.

On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> Optimize access to globals with -fpie, x86_64 only:
>
> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
> using the GOT.  This is two instructions, one to get the address of the global
> from the GOT and the other to get the value.  If it turns out that the global
> gets defined in the executable at link-time, it still needs to go through the
> GOT as it is too late then to generate a direct access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern, defined in another file
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using two
> memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads affect
> performance by 1% to 5%.
>
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that the
> global will be defined in the executable.  For globals that are truly extern
> (come from shared objects), the linker will create copy relocations and have
> them defined in the executable.  The result is that no global access needs to
> go through the GOT, which improves performance.
>
> This patch to the gold linker:
> https://sourceware.org/ml/binutils/2014-05/msg00092.html
> submitted recently allows gold to generate copy relocations for -pie mode when
> necessary.
>
> I have added option -mld-pie-copyrelocs which when combined with -fpie would do
> this.  Note that the BFD linker does not support pie copyrelocs yet and this
> option cannot be used there.
>
> Please review.
>
>
> ChangeLog:
>
> * config/i386/i386.opt (mld-pie-copyrelocs): New option.
> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>  address is still legitimate in the presence of copy relocations
>  and -fpie.
> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>
>
>
> Patch attached.
> Thanks
> Sri

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
@ 2014-05-15 18:34 Sriraman Tallam
  2014-05-19 18:11 ` Sriraman Tallam
  0 siblings, 1 reply; 63+ messages in thread
From: Sriraman Tallam @ 2014-05-15 18:34 UTC (permalink / raw)
  To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor

[-- Attachment #1: Type: text/plain, Size: 2296 bytes --]

Optimize access to globals with -fpie, x86_64 only:

Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
using the GOT.  This is two instructions, one to get the address of the global
from the GOT and the other to get the value.  If it turns out that the global
gets defined in the executable at link-time, it still needs to go through the
GOT as it is too late then to generate a direct access.

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern, defined in another file
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two
memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.
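
The difference between the two access patterns can be modeled in plain C.
This is only a sketch of the idea, not what the compiler emits: the pointer
got_slot_a_glob is a hypothetical stand-in for the GOT entry that the dynamic
linker would fill in, and the function names are made up for illustration.

```c
#include <assert.h>

/* Model only: plain C standing in for what the compiler and the dynamic
   linker arrange.  All names below are hypothetical.  */

int a_glob = 42;                  /* the storage of the global */

/* Stand-in for the GOT slot holding the address of a_glob.  */
int *got_slot_a_glob = &a_glob;

/* Direct access: one load from a location known at link time.  */
int
load_direct (void)
{
  return a_glob;
}

/* GOT access: load the address from the slot, then load the value.  */
int
load_via_got (void)
{
  int *addr = got_slot_a_glob;    /* first load: the address */
  return *addr;                   /* second load: the value */
}

/* Both paths must observe the same object.  */
void
got_model_selfcheck (void)
{
  assert (load_direct () == 42);
  assert (load_via_got () == load_direct ());
  a_glob = 7;
  assert (load_via_got () == 7);
}
```

Both functions return the same value; the GOT variant simply pays for one
extra dependent load, which is the cost the measurements below refer to.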

Some experiments on Google benchmarks show that the extra memory loads affect
performance by 1% to 5%.


Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that the
global will be defined in the executable.  For globals that are truly extern
(come from shared objects), the linker will create copy relocations and have
them defined in the executable.  The result is that no global access needs to
go through the GOT, which improves performance.
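
As a rough illustration of what a copy relocation achieves, the load-time
behaviour can be modeled in a single C file.  This is a sketch under stated
assumptions: the variables stand in for sections of two separate modules, and
every name here (lib_initial_a_glob, exe_copy_a_glob, apply_copy_reloc, and
so on) is hypothetical rather than part of any linker or loader interface.

```c
#include <assert.h>
#include <string.h>

/* Model only.  Initial value as it sits in the shared object's data.  */
static const int lib_initial_a_glob = 5;

/* Space the link editor reserves in the executable for the copy.  */
static int exe_copy_a_glob;

/* The shared object's own GOT slot for a_glob; the library keeps using it.  */
static int *lib_got_a_glob;

/* What the dynamic linker does at startup for the copy relocation:
   copy the initial value into the executable and make every other
   module resolve the symbol to that copy.  */
void
apply_copy_reloc (void)
{
  memcpy (&exe_copy_a_glob, &lib_initial_a_glob, sizeof exe_copy_a_glob);
  lib_got_a_glob = &exe_copy_a_glob;
}

/* Executable code: direct access, no GOT involved.  */
int
exe_read (void)
{
  return exe_copy_a_glob;
}

void
exe_write (int v)
{
  exe_copy_a_glob = v;
}

/* Shared-object code: goes through its GOT and sees the same object.  */
int
lib_read (void)
{
  assert (lib_got_a_glob != 0);
  return *lib_got_a_glob;
}
```

After apply_copy_reloc () runs, a write through exe_write () is visible to
lib_read (), which is why the executable can safely treat the symbol as its
own.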

This patch to the gold linker:
https://sourceware.org/ml/binutils/2014-05/msg00092.html
submitted recently allows gold to generate copy relocations for -pie mode when
necessary.

I have added option -mld-pie-copyrelocs which when combined with -fpie would do
this.  Note that the BFD linker does not support pie copyrelocs yet and this
option cannot be used there.
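
The condition the attached patch adds to legitimate_pic_address_disp_p can be
restated as a standalone predicate.  This is a minimal sketch for reading the
logic, not GCC code: the two structs and the function direct_access_ok are
hypothetical, and each field only mirrors the macro or variable named in its
comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the properties the patch tests; these are
   not GCC data structures.  */
struct sym_info
{
  bool far_addr;           /* models SYMBOL_REF_FAR_ADDR_P */
  bool local;              /* models SYMBOL_REF_LOCAL_P */
  bool is_function;        /* models SYMBOL_REF_FUNCTION_P */
};

struct target_info
{
  bool is_64bit;           /* models TARGET_64BIT */
  bool ld_pie_copyrelocs;  /* models ix86_ld_pie_copyrelocs */
  bool pie;                /* models flag_pie */
  bool large_pic_model;    /* models ix86_cmodel == CM_LARGE_PIC */
};

/* True when the symbol may be addressed directly instead of via the GOT.  */
bool
direct_access_ok (const struct sym_info *s, const struct target_info *t)
{
  bool via_copyreloc = (t->is_64bit && t->ld_pie_copyrelocs && t->pie
                        && !s->is_function);
  return !s->far_addr && (s->local || via_copyreloc) && !t->large_pic_model;
}

/* A few cases that document the intended behaviour.  */
void
direct_access_selfcheck (void)
{
  struct target_info opt_on = { true, true, true, false };
  struct target_info opt_off = { true, false, true, false };
  struct sym_info extern_data = { false, false, false };
  struct sym_info extern_func = { false, false, true };

  assert (direct_access_ok (&extern_data, &opt_on));
  assert (!direct_access_ok (&extern_data, &opt_off));
  assert (!direct_access_ok (&extern_func, &opt_on));
}
```

Note that functions are excluded: only data symbols get the direct-access
treatment, since copy relocations apply to data objects.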

Please review.


ChangeLog:

* config/i386/i386.opt (mld-pie-copyrelocs): New option.
* config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
 address is still legitimate in the presence of copy relocations
 and -fpie.
* testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
* testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.



Patch attached.
Thanks
Sri

[-- Attachment #2: gcc_pie_copyrelocs_patch.txt --]
[-- Type: text/plain, Size: 4850 bytes --]

Optimize access to globals with -fpie, x86_64 only:

Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
using the GOT.  This is two instructions, one to get the address of the global
from the GOT and the other to get the value.  If it turns out that the global
gets defined in the executable at link-time, it still needs to go through the
GOT as it is too late then to generate a direct access. 

Examples:

foo.cc
------
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   <main>:
   mov    0x165a(%rip),%eax        # 1c40 <a_glob>

foo.cc
------

extern int a_glob;
int main () {
  return a_glob; // declared extern, defined in another file
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two
memory loads:

6f0  <main>:
   mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.

Some experiments on Google benchmarks show that the extra memory loads affect
performance by 1% to 5%.


Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that the
global will be defined in the executable.  For globals that are truly extern
(come from shared objects), the linker will create copy relocations and have
them defined in the executable.  The result is that no global access needs to
go through the GOT, which improves performance.

This patch to the gold linker:
https://sourceware.org/ml/binutils/2014-05/msg00092.html
submitted recently allows gold to generate copy relocations for -pie mode when
necessary.

I have added option -mld-pie-copyrelocs which when combined with -fpie would do
this.  Note that the BFD linker does not support pie copyrelocs yet and this
option cannot be used there.

Please review.


ChangeLog:

	* config/i386/i386.opt (mld-pie-copyrelocs): New option.
	* config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
	  address is still legitimate in the presence of copy relocations
	  and -fpie.
	* testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
	* testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.


Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 210437)
+++ config/i386/i386.opt	(working copy)
@@ -108,6 +108,10 @@ int x_ix86_dump_tunes
 TargetSave
 int x_ix86_force_align_arg_pointer
 
+;; -mld-pie-copyrelocs
+TargetSave
+int x_ix86_ld_pie_copyrelocs
+
 ;; -mforce-drap= 
 TargetSave
 int x_ix86_force_drap
@@ -291,6 +295,10 @@ mfancy-math-387
 Target RejectNegative Report InverseMask(NO_FANCY_MATH_387, USE_FANCY_MATH_387) Save
 Generate sin, cos, sqrt for FPU
 
+mld-pie-copyrelocs
+Target Report Var(ix86_ld_pie_copyrelocs) Init(0)
+Use linker copy relocs for pie
+
 mforce-drap
 Target Report Var(ix86_force_drap)
 Always use Dynamic Realigned Argument Pointer (DRAP) to realign stack
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 210437)
+++ config/i386/i386.c	(working copy)
@@ -12684,7 +12684,9 @@ legitimate_pic_address_disp_p (rtx disp)
 		return true;
 	    }
 	  else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-		   && SYMBOL_REF_LOCAL_P (op0)
+		   && (SYMBOL_REF_LOCAL_P (op0)
+		       || (TARGET_64BIT && ix86_ld_pie_copyrelocs && flag_pie
+			   && !SYMBOL_REF_FUNCTION_P (op0)))
 		   && ix86_cmodel != CM_LARGE_PIC)
 	    return true;
 	  break;
Index: testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c
===================================================================
--- testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c	(revision 0)
+++ testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mld-pie-copyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mld-pie-copyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should never be accessed with a GOTPCREL  */ 
+/* { dg-final { scan-assembler-not "glob_a\\@GOTPCREL" { target { x86_64-*-* } } } } */
Index: testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c
===================================================================
--- testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c	(revision 0)
+++ testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test if -mno-ld-pie-copyrelocs does the right thing. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie -mno-ld-pie-copyrelocs" } */
+
+extern int glob_a;
+
+int foo ()
+{
+  return glob_a;
+}
+
+/* glob_a should always be accessed via GOT  */ 
+/* { dg-final { scan-assembler "glob_a\\@GOT" { target { x86_64-*-* } } } } */


^ permalink raw reply	[flat|nested] 63+ messages in thread
