public inbox for gcc-bugs@sourceware.org
* [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
@ 2011-07-08 11:51 jan.kratochvil at redhat dot com
  2011-07-08 14:07 ` [Bug debug/49676] " jakub at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jan.kratochvil at redhat dot com @ 2011-07-08 11:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

           Summary: inefficiency: DW_AT_GNU_call_site_value calculates
                    everything << 32
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: trivial
          Priority: P3
         Component: debug
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jan.kratochvil@redhat.com
                CC: jakub@redhat.com
            Target: x86_64-unknown-linux-gnu


It works but it is a bit size-inefficient/overcomplicated.

extern void d (int);
void __attribute__((noinline, noclone))
self (int i)
{
  if (i == 200)
    self (i + 1);
  else
    d (i + 2);
}

-g -O2
gcc (GCC) 4.7.0 20110708 (experimental)

DW_AT_GNU_call_site_value: 26 byte block: f3 1 55 23 2 8 cb f3 1 55 8 20 24 10
80 80 80 80 80 19 2e 28 1 0 16 13    (DW_OP_GNU_entry_value: (DW_OP_reg5
(rdi)); DW_OP_plus_uconst: 2; DW_OP_const1u: 203; DW_OP_GNU_entry_value:
(DW_OP_reg5 (rdi)); DW_OP_const1u: 32; DW_OP_shl; DW_OP_constu: 858993459200;
DW_OP_ne; DW_OP_bra: 1; DW_OP_swap; DW_OP_drop)

this is:

DW_OP_GNU_entry_value: (DW_OP_reg5 (rdi))
DW_OP_plus_uconst: 2
 = 202
DW_OP_const1u: 203
DW_OP_GNU_entry_value: (DW_OP_reg5 (rdi))
DW_OP_const1u: 32
  32, 200, 203, 202
DW_OP_shl
  200 << 32, 203, 202
DW_OP_constu: 858993459200
  200 << 32, 200 << 32, 203, 202
DW_OP_ne
DW_OP_bra: 1
DW_OP_swap
DW_OP_drop

858993459200 = 200 << 32

There should be no need to shift left by 32 and compute everything << 32.
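To make the walk-through above easier to check, here is a minimal sketch of a stack evaluator for just the opcodes that appear in the quoted block. The function name `eval_call_site_value` and the helper `read_uleb128` are hypothetical, the nested entry-value block is assumed to be exactly `DW_OP_reg5`, and this is not how any real DWARF consumer is structured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 26-byte block quoted in the report. */
static const uint8_t original_expr[] = {
    0xf3, 0x01, 0x55,                         /* DW_OP_GNU_entry_value (DW_OP_reg5) */
    0x23, 0x02,                               /* DW_OP_plus_uconst 2 */
    0x08, 0xcb,                               /* DW_OP_const1u 203 */
    0xf3, 0x01, 0x55,                         /* DW_OP_GNU_entry_value (DW_OP_reg5) */
    0x08, 0x20,                               /* DW_OP_const1u 32 */
    0x24,                                     /* DW_OP_shl */
    0x10, 0x80, 0x80, 0x80, 0x80, 0x80, 0x19, /* DW_OP_constu 200 << 32 */
    0x2e,                                     /* DW_OP_ne */
    0x28, 0x01, 0x00,                         /* DW_OP_bra +1 */
    0x16,                                     /* DW_OP_swap */
    0x13                                      /* DW_OP_drop */
};

/* Decode one ULEB128 value at p[*pos], advancing *pos past it. */
static uint64_t read_uleb128(const uint8_t *p, size_t *pos)
{
    uint64_t v = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        b = p[(*pos)++];
        v |= (uint64_t)(b & 0x7f) << shift;
        shift += 7;
    } while (b & 0x80);
    return v;
}

/* Hypothetical helper: evaluate the subset of DWARF ops used above.
   entry_rdi is the full 64-bit value of %rdi on function entry. */
uint64_t eval_call_site_value(const uint8_t *e, size_t len, uint64_t entry_rdi)
{
    uint64_t st[8], a, b;
    size_t sp = 0, pc = 0;

    while (pc < len) {
        uint8_t op = e[pc++];
        switch (op) {
        case 0xf3: /* DW_OP_GNU_entry_value: ULEB length, then nested block */
            a = read_uleb128(e, &pc);
            assert(a == 1 && e[pc] == 0x55); /* assumption: only DW_OP_reg5 */
            pc += a;
            assert(sp < 8);
            st[sp++] = entry_rdi;
            break;
        case 0x08: /* DW_OP_const1u */
            assert(sp < 8);
            st[sp++] = e[pc++];
            break;
        case 0x10: /* DW_OP_constu */
            assert(sp < 8);
            st[sp++] = read_uleb128(e, &pc);
            break;
        case 0x23: /* DW_OP_plus_uconst */
            assert(sp >= 1);
            st[sp - 1] += read_uleb128(e, &pc);
            break;
        case 0x24: /* DW_OP_shl: second entry shifted left by top entry */
            assert(sp >= 2);
            b = st[--sp];
            st[sp - 1] = b < 64 ? st[sp - 1] << b : 0;
            break;
        case 0x2e: /* DW_OP_ne */
            assert(sp >= 2);
            b = st[--sp];
            st[sp - 1] = (st[sp - 1] != b);
            break;
        case 0x28: { /* DW_OP_bra: pop, skip by signed 16-bit offset if nonzero */
            int16_t off = (int16_t)(e[pc] | (e[pc + 1] << 8));
            pc += 2;
            assert(sp >= 1);
            if (st[--sp] != 0)
                pc = (size_t)((ptrdiff_t)pc + off);
            break;
        }
        case 0x16: /* DW_OP_swap */
            assert(sp >= 2);
            a = st[sp - 1]; st[sp - 1] = st[sp - 2]; st[sp - 2] = a;
            break;
        case 0x13: /* DW_OP_drop */
            assert(sp >= 1);
            sp--;
            break;
        default:
            assert(!"unsupported opcode");
        }
    }
    assert(sp >= 1);
    return st[sp - 1];
}
```

Calling `eval_call_site_value(original_expr, sizeof original_expr, 200)` follows the stack states annotated above and leaves 203 on top; any entry value whose low 32 bits are not 200 takes the branch over DW_OP_swap and leaves the entry value plus 2 instead.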

Unrelated: the function was inlined (using `cmovne') even though it is marked
`noinline'.



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
@ 2011-07-08 14:07 ` jakub at gcc dot gnu.org
  2011-07-08 14:41 ` jan.kratochvil at redhat dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-08 14:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-08 14:06:53 UTC ---
The reason for that is that, when not emitting typed DWARF, we have the call
site value of %edi:

(if_then_else:SI (ne (entry_value:SI (reg:SI 5 di [ i ]))
                (const_int 200 [0xc8]))
            (plus:SI (entry_value:SI (reg:SI 5 di [ i ]))
                (const_int 2 [0x2]))
            (const_int 203 [0xcb]))

and x86-64 has 64-bit DWARF2_ADDR_SIZE and we want to compare just the low 32
bits of the register.  DW_OP_ne compares whole 64-bit (untyped) integers, so
by shifting it up by 32 bits it can be compared easily.  In this exact case,
where it can be done as an unsigned comparison too, we could very well do:
@@ -134,11 +134,11 @@ self:
        .byte   0xf3    # DW_OP_GNU_entry_value
        .uleb128 0x1
        .byte   0x55    # DW_OP_reg5
-       .byte   0x8     # DW_OP_const1u
-       .byte   0x20
-       .byte   0x24    # DW_OP_shl
-       .byte   0x10    # DW_OP_constu
-       .uleb128 0xc800000000
+       .byte   0xc     # DW_OP_const4u
+       .long   0xffffffff
+       .byte   0x1a    # DW_OP_and
+       .byte   0x8     # DW_OP_const1u
+       .byte   0xc8
        .byte   0x2e    # DW_OP_ne
        .byte   0x28    # DW_OP_bra
        .value  0x1

and save 2 bytes.  But for a signed >/>=/</<= comparison, I think comparing
in the most significant bits is still shorter.
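The difference between the two approaches can be sketched in plain C, treating the DWARF stack entries as 64-bit integers. All function names below are hypothetical, `reg` stands for the raw 64-bit entry value of %rdi, and the conversion from `uint64_t` to `int64_t` is assumed to wrap the way it does on mainstream two's-complement targets.

```c
#include <assert.h>
#include <stdint.h>

/* i != c, comparing in the most significant bits (the emitted sequence). */
int ne_by_shift(uint64_t reg, int32_t c)
{
    return (reg << 32) != ((uint64_t)(uint32_t)c << 32);
}

/* i != c, masking off the upper 32 bits (the shorter alternative). */
int ne_by_mask(uint64_t reg, int32_t c)
{
    return (reg & 0xffffffffu) != (uint64_t)(uint32_t)c;
}

/* i < c, signed: after the shift, bit 31 of i sits in bit 63, so a
   64-bit signed compare orders the values like a 32-bit signed one. */
int lt_by_shift(uint64_t reg, int32_t c)
{
    return (int64_t)(reg << 32) < (int64_t)((uint64_t)(uint32_t)c << 32);
}

/* Masking alone zero-extends i, so negative values look large and
   positive; a signed compare would need extra ops to sign-extend. */
int lt_by_mask_only(uint64_t reg, int32_t c)
{
    return (int64_t)(reg & 0xffffffffu) < (int64_t)c;
}

/* Encoded sizes of the two sequences in the diff above:
   const1u 32 (2) + shl (1) + constu 200 << 32 (1 + 6)  versus
   const4u 0xffffffff (5) + and (1) + const1u 200 (2). */
enum { SHIFT_SEQ_BYTES = 2 + 1 + 7, MASK_SEQ_BYTES = 5 + 1 + 2 };
static_assert(SHIFT_SEQ_BYTES - MASK_SEQ_BYTES == 2, "mask form is 2 bytes shorter");
```

For equality tests both forms agree whatever the upper half of the register holds. With the low 32 bits set to 0xffffffff (that is, i == -1), `lt_by_shift(reg, 0)` is true while `lt_by_mask_only(reg, 0)` is false, which is why the shift form stays attractive for signed ordering.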

BTW, self isn't inlined, it is tail recursion optimized.



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
  2011-07-08 14:07 ` [Bug debug/49676] " jakub at gcc dot gnu.org
@ 2011-07-08 14:41 ` jan.kratochvil at redhat dot com
  2011-07-08 15:32 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jan.kratochvil at redhat dot com @ 2011-07-08 14:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

Jan Kratochvil <jan.kratochvil at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #2 from Jan Kratochvil <jan.kratochvil at redhat dot com> 2011-07-08 14:40:54 UTC ---
So it is valid to call
  func (int i)
with the value 1 as:
  movq $0xdeaddead00000001, %rdi
I had not realized that.  Thanks for the explanation.

I would not have filed the DW_OP_shl vs. DW_OP_and "inefficiency" on its own.



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
  2011-07-08 14:07 ` [Bug debug/49676] " jakub at gcc dot gnu.org
  2011-07-08 14:41 ` jan.kratochvil at redhat dot com
@ 2011-07-08 15:32 ` jakub at gcc dot gnu.org
  2011-07-09 15:51 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-08 15:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-08 15:31:38 UTC ---
Created attachment 24716
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24716
gcc47-pr49676.patch

Attached patch saves those 2 bytes.

To answer the general question: for types smaller than DWARF2_ADDR_SIZE,
entry_value is simply untyped, and assuming whether it is signed or unsigned
is error-prone, especially if it is in another CU where we can't check it.
In some ABIs the registers in which arguments are passed are sign-extended or
zero-extended, in other ABIs they may contain garbage in the upper bits, and
in yet others it is a mess (e.g. x86_64, unfortunately).



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
                   ` (2 preceding siblings ...)
  2011-07-08 15:32 ` jakub at gcc dot gnu.org
@ 2011-07-09 15:51 ` jakub at gcc dot gnu.org
  2011-07-11 13:01 ` jakub at gcc dot gnu.org
  2011-07-11 16:58 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-09 15:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-09 15:48:46 UTC ---
Author: jakub
Date: Sat Jul  9 15:48:42 2011
New Revision: 176083

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176083
Log:
    PR debug/49676
    * dwarf2out.c (size_of_int_loc_descriptor): New function.
    (address_of_int_loc_descriptor): Use it.
    (scompare_loc_descriptor): Optimize EQ/NE comparison with
    constant.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dwarf2out.c



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
                   ` (3 preceding siblings ...)
  2011-07-09 15:51 ` jakub at gcc dot gnu.org
@ 2011-07-11 13:01 ` jakub at gcc dot gnu.org
  2011-07-11 16:58 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-11 13:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-11 13:00:57 UTC ---
Created attachment 24739
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24739
gcc47-pr49676-const.patch

I've noticed that int_loc_descriptor sometimes emits overly large ops to build
large constants.  Many constants can actually be emitted using shorter
sequences of more, but smaller, ops; e.g. DW_OP_lit31 DW_OP_lit31 DW_OP_shl
is just 3 bytes, while DW_OP_constu 0xf80000000 is 7 bytes long.
Similarly, DW_OP_plus_uconst isn't always a win.
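The size arithmetic behind that observation can be sketched as follows. The helpers `uleb128_size`, `single_op_size` and `shifted_size` are hypothetical and deliberately simplified (unsigned values only, shift fixed to the number of trailing zero bits); they are not the functions added to dwarf2out.c.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes needed to encode v as ULEB128. */
unsigned uleb128_size(uint64_t v)
{
    unsigned n = 0;
    do {
        n++;
        v >>= 7;
    } while (v != 0);
    return n;
}

/* Size of pushing v with one op: DW_OP_lit<n>, DW_OP_const1u/2u/4u,
   or the shorter of DW_OP_constu and DW_OP_const8u. */
unsigned single_op_size(uint64_t v)
{
    unsigned n;
    if (v <= 31)
        return 1;               /* DW_OP_lit0 .. DW_OP_lit31 */
    if (v <= 0xff)
        return 2;               /* DW_OP_const1u */
    if (v <= 0xffff)
        return 3;               /* DW_OP_const2u */
    if (v <= 0xffffffffu)
        return 5;               /* DW_OP_const4u */
    n = 1 + uleb128_size(v);    /* DW_OP_constu */
    return n < 9 ? n : 9;       /* DW_OP_const8u is always 9 bytes */
}

/* Size of "(v >> shift), shift, DW_OP_shl", taking shift to be the
   number of trailing zero bits of v (an assumption of this sketch). */
unsigned shifted_size(uint64_t v)
{
    unsigned shift = 0;
    assert(v != 0);
    while ((v & 1) == 0) {
        v >>= 1;
        shift++;
    }
    return single_op_size(v) + single_op_size(shift) + 1;
}
```

For 0xf80000000 this gives 7 bytes for the single op and 3 bytes for the shifted form, matching the figures above. An emitter would pick the shifted form only when `shifted_size(v) < single_op_size(v)`.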



* [Bug debug/49676] inefficiency: DW_AT_GNU_call_site_value calculates everything << 32
  2011-07-08 11:51 [Bug debug/49676] New: inefficiency: DW_AT_GNU_call_site_value calculates everything << 32 jan.kratochvil at redhat dot com
                   ` (4 preceding siblings ...)
  2011-07-11 13:01 ` jakub at gcc dot gnu.org
@ 2011-07-11 16:58 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-11 16:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49676

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-11 16:57:29 UTC ---
Author: jakub
Date: Mon Jul 11 16:57:25 2011
New Revision: 176167

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176167
Log:
    PR debug/49676
    * dwarf2out.c (int_shift_loc_descriptor): New function.
    (int_loc_descriptor): If shorter, emit i as
    (i >> shift), shift, DW_OP_shl for suitable shift value.
    Similarly, try to optimize large negative values using
    DW_OP_neg of a positive value if shorter.
    (size_of_int_shift_loc_descriptor): New function.
    (size_of_int_loc_descriptor): Adjust to match int_loc_descriptor
    changes.
    (mem_loc_descriptor) <case CONST_INT>: Emit zero-extended constants
    that fit into DWARF2_ADDR_SIZE bytes as int_loc_descriptor +
    DW_OP_GNU_convert instead of DW_OP_GNU_const_type if the former
    is shorter.
    (resolve_addr_in_expr): Optimize DW_OP_plus_uconst with a large
    addend as added DW_OP_plus if it is shorter.

    * gcc.dg/guality/csttest.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/guality/csttest.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dwarf2out.c
    trunk/gcc/testsuite/ChangeLog


