public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Add type_unknown/type_executable/type_pic to output_type
@ 2015-08-23  2:09 H.J. Lu
  2015-08-23 13:50 ` Alan Modra
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2015-08-23  2:09 UTC (permalink / raw)
  To: binutils

This patch adds to enum output_type so that bfd_link_executable and
bfd_link_pic can be simplifed a little bit.  On Linux/x86-64, it
reduces the linker code size by 168 bytes.

Any comments, feedbacks?

H.J.
---
include/

	* bfdlink.h (output_type): Add type_unknown, type_executable
	and type_pic.
	(bfd_link_executable): Updated.
	(bfd_link_pic): Likewise.
	(bfd_link_info): Change type to 3 bits.

ld/

	* lexsup.c (parse_args): Default output type to type_pde.
---
 include/bfdlink.h | 17 ++++++++++-------
 ld/lexsup.c       |  4 ++++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/bfdlink.h b/include/bfdlink.h
index 458a768..1d0ca35 100644
--- a/include/bfdlink.h
+++ b/include/bfdlink.h
@@ -263,18 +263,21 @@ struct bfd_elf_version_tree;
 
 enum output_type
 {
-  type_pde,
-  type_relocatable,
-  type_pie,
-  type_dll,
+  type_unknown = 0,
+  type_executable = 1 << 0,
+  type_relocatable = 1 << 1,
+  type_pic = 1 << 2,
+  type_dll = type_pic,
+  type_pde = type_executable,
+  type_pie = (type_executable | type_pic)
 };
 
 #define bfd_link_pde(info)	   ((info)->type == type_pde)
 #define bfd_link_dll(info)	   ((info)->type == type_dll)
 #define bfd_link_relocatable(info) ((info)->type == type_relocatable)
 #define bfd_link_pie(info)	   ((info)->type == type_pie)
-#define bfd_link_executable(info)  (bfd_link_pde (info) || bfd_link_pie (info))
-#define bfd_link_pic(info)	   (bfd_link_dll (info) || bfd_link_pie (info))
+#define bfd_link_executable(info)  (((info)->type & type_executable) != 0)
+#define bfd_link_pic(info)	   (((info)->type & type_pic) != 0)
 
 /* This structure holds all the information needed to communicate
    between BFD and the linker when doing a link.  */
@@ -282,7 +285,7 @@ enum output_type
 struct bfd_link_info
 {
   /* Output type.  */
-  ENUM_BITFIELD (output_type) type : 2;
+  ENUM_BITFIELD (output_type) type : 3;
 
   /* TRUE if BFD should pre-bind symbols in a shared object.  */
   unsigned int symbolic: 1;
diff --git a/ld/lexsup.c b/ld/lexsup.c
index ace1803..2700929 100644
--- a/ld/lexsup.c
+++ b/ld/lexsup.c
@@ -1518,6 +1518,10 @@ parse_args (unsigned argc, char **argv)
 	}
     }
 
+  /* Default output type to PDE.  */
+  if (link_info.type == type_unknown)
+    link_info.type = type_pde;
+
   if (command_line.soname && command_line.soname[0] == '\0')
     {
       einfo (_("%P: SONAME must not be empty string; ignored\n"));
-- 
2.4.3

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Add type_unknown/type_executable/type_pic to output_type
  2015-08-23  2:09 [PATCH] Add type_unknown/type_executable/type_pic to output_type H.J. Lu
@ 2015-08-23 13:50 ` Alan Modra
  2015-08-23 14:41   ` H.J. Lu
  0 siblings, 1 reply; 4+ messages in thread
From: Alan Modra @ 2015-08-23 13:50 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

Hmm, with x86_64 gcc 5.1.1 20150526, the size (as per size utility) of
an --enable-targets=all ld went from 11546435 to 11537515 bytes with
your patch.  I wondered where this was coming from, so added
  unsigned int pad: 1;
to struct bfd_link_info, after "type".
That gave me 11546275 bytes.  So some of the improvement from your
patch is due to better location of other fields.

Next I tried reordering the enum a little, as well as the pad.
That gave me 11522515 bytes, which is even better than the result I
obtained with your patch..

Is any of this worth doing?  Probably not.  Fiddling around without
proper analysis isn't that useful.  One thing I noticed is that both
of the following

#define bfd_link_executable(info)  (bfd_link_pde (info) || bfd_link_pie (info))
#define bfd_link_pic(info)	   (bfd_link_dll (info) || bfd_link_pie (info))

with

enum output_type
{
  type_pde,
  type_relocatable,
  type_pie,
  type_dll,
};

ought to simplify to a single bit test, the first ought to test
bit0 == 0, and the second to test bit1 == 1.  I spot checked a few
places and it appears that only bfd_link_executable(info) is ideal.

With my changes to enum output_type, both bfd_link_executable(info)
and bfd_link_pic(info) ought to still be a single bit test, but now
bfd_link_pic(info) generates ideal code but bfd_link_executable(info)
doesn't..  The gain I found is simply due to bfd_link_pic being more
common.

$ findfile bfd ld | xargs grep bfd_link_executable | wc -l
177
$ findfile bfd ld | xargs grep bfd_link_pic | wc -l
1028

I've opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67328

OK, so knowing that there is a gcc deficiency here, we could easily
work around that and write bfd_link_executable and bfd_link_pic as
directly testing the appropriate bit.  I don't think that is a good
idea, as it is a little less obvious code, but the following may as
well be committed.

	* bfdlink.h (enum output_type): Reorder enum.

diff --git a/include/bfdlink.h b/include/bfdlink.h
index 458a768..43bcc6a 100644
--- a/include/bfdlink.h
+++ b/include/bfdlink.h
@@ -264,8 +264,8 @@ struct bfd_elf_version_tree;
 enum output_type
 {
   type_pde,
-  type_relocatable,
   type_pie,
+  type_relocatable,
   type_dll,
 };
 

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Add type_unknown/type_executable/type_pic to output_type
  2015-08-23 13:50 ` Alan Modra
@ 2015-08-23 14:41   ` H.J. Lu
  2015-08-24  3:00     ` Alan Modra
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2015-08-23 14:41 UTC (permalink / raw)
  To: Alan Modra; +Cc: Binutils

On Sun, Aug 23, 2015 at 6:50 AM, Alan Modra <amodra@gmail.com> wrote:
> Hmm, with x86_64 gcc 5.1.1 20150526, the size (as per size utility) of
> an --enable-targets=all ld went from 11546435 to 11537515 bytes with
> your patch.  I wondered where this was coming from, so added
>   unsigned int pad: 1;
> to struct bfd_link_info, after "type".
> That gave me 11546275 bytes.  So some of the improvement from your
> patch is due to better location of other fields.
>

I compared the assembly outputs of elflink.c on x86-64 with GCC 5.2.1
using -O2 -g0:

[hjl@gnu-tools-1 bfd]$ wc -l old.s new.s
 27370 old.s
 27304 new.s

My patch removes 66 lines of assembly codes.  I looked at
the differences and found:

old:

        testb   $8, 100(%rdi)
        movq    16(%rsi), %rbp
        jne     .L631
        movq    0(%rbp), %rax
        xorl    %r13d, %r13d
        movzbl  (%rax), %eax
        movl    %eax, %edx
        andl    $3, %edx
        cmpb    $1, %dl
        jbe     .L863
.L631:
        cmpb    $7, 24(%rbx)
        je      .L864
.L632:
        movl    4(%r12), %r9d
        testl   %r9d, %r9d
        jne     .L865
        testl   %r13d, %r13d
        jne     .L739
.L635:
        movq    8(%rbp), %rax

new

        testb   $8, 100(%rdi)
        movq    16(%rsi), %rbp
        jne     .L628
        movq    0(%rbp), %rax
        xorl    %r13d, %r13d
        movzbl  (%rax), %eax
        testb   $1, %al
        jne     .L862
.L628:
        cmpb    $7, 24(%rbx)
        je      .L863
.L629:
        movl    4(%r12), %eax
        testl   %eax, %eax
        jne     .L864
        testl   %r13d, %r13d
        jne     .L733
.L632:
        movq    8(%rbp), %rax

for

static bfd_boolean
elf_link_output_extsym (struct bfd_hash_entry *bh, void *data)
{
  struct elf_link_hash_entry *h = (struct elf_link_hash_entry *) bh;
  struct elf_outext_info *eoinfo = (struct elf_outext_info *) data;
  struct elf_final_link_info *flinfo = eoinfo->flinfo;
  bfd_boolean strip;
  Elf_Internal_Sym sym;
  asection *input_sec;
  const struct elf_backend_data *bed;
  long indx;
  int ret;
  /* A symbol is bound locally if it is forced local or it is locally
     defined, hidden versioned, not referenced by shared library and
     not exported when linking executable.  */
  bfd_boolean local_bind = (h->forced_local
                            || (bfd_link_executable (flinfo->info)
                                && !flinfo->info->export_dynamic
                                && !h->dynamic
                                && !h->ref_dynamic
                                && h->def_regular
                                && h->versioned == versioned_hidden));

My patch improves the generated code by changing

        movzbl  (%rax), %eax
        movl    %eax, %edx
        andl    $3, %edx
        cmpb    $1, %dl

to

        movzbl  (%rax), %eax
        testb   $1, %al

for bfd_link_executable (flinfo->info).

-- 
H.J.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Add type_unknown/type_executable/type_pic to output_type
  2015-08-23 14:41   ` H.J. Lu
@ 2015-08-24  3:00     ` Alan Modra
  0 siblings, 0 replies; 4+ messages in thread
From: Alan Modra @ 2015-08-24  3:00 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On Sun, Aug 23, 2015 at 07:41:09AM -0700, H.J. Lu wrote:
> My patch improves the generated code by changing
> 
>         movzbl  (%rax), %eax
>         movl    %eax, %edx
>         andl    $3, %edx
>         cmpb    $1, %dl
> 
> to
> 
>         movzbl  (%rax), %eax
>         testb   $1, %al
> 
> for bfd_link_executable (flinfo->info).

This is due to gcc tree-ssa-reassoc.c not considering a single range
test for optimisation to a bit test.  As I said previously, we could
work around this by using
#define bfd_link_executable(info)  (((info)->type & 2) == 0)

That generates good code, and doesn't regress other places like your
patch does (I'm assuming regressions from the size increase), but it
just doesn't seem worth applying this sort of micro-optimisation.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-08-24  3:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-23  2:09 [PATCH] Add type_unknown/type_executable/type_pic to output_type H.J. Lu
2015-08-23 13:50 ` Alan Modra
2015-08-23 14:41   ` H.J. Lu
2015-08-24  3:00     ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).