[Bug target/30354] -Os doesn't optimize a/CONST even if it saves size.


From: "vda dot linux at googlemail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/30354] -Os doesn't optimize a/CONST even if it saves size.
Date: Sun, 21 Jun 2009 16:47:00 -0000
Message-ID: <20090621164734.29861.qmail@sourceware.org>
In-Reply-To: <bug-30354-12956@http.gcc.gnu.org/bugzilla/>



------- Comment #11 from vda dot linux at googlemail dot com  2009-06-21 16:47 -------
In 32-bit code, there are indeed a few cases of code growth.

Here is a full list (id_XXX are signed divides, ud_XXX are unsigned ones):
-00000000 0000000f T id_x_4
+00000000 00000012 T id_x_4
-00000000 0000000f T id_x_8
+00000000 00000012 T id_x_8
-00000000 0000000f T id_x_16
+00000000 00000012 T id_x_16
-00000000 0000000f T id_x_32
+00000000 00000012 T id_x_32

-00000000 00000010 T ud_x_28
+00000000 00000015 T ud_x_28
-00000000 00000010 T ud_x_56
+00000000 00000015 T ud_x_56
-00000000 00000010 T ud_x_13952
+00000000 00000015 T ud_x_13952

They fall into two groups. Signed divisions by a power of 2 grew by 3 bytes,
but they are *much faster* now. Considering how often people type "x / 4" and
think "this will be optimized to a shift", forgetting that their x is signed
and that they will therefore get a divide insn (!), I see it as a good trade.
Code comparison:

 00000000 <id_x_16>:
-   0:  8b 44 24 04             mov    0x4(%esp),%eax
-   4:  ba 10 00 00 00          mov    $0x10,%edx
-   9:  89 d1                   mov    %edx,%ecx
-   b:  99                      cltd
-   c:  f7 f9                   idiv   %ecx
-   e:  c3                      ret
+   0:  8b 54 24 04             mov    0x4(%esp),%edx
+   4:  89 d0                   mov    %edx,%eax
+   6:  c1 f8 1f                sar    $0x1f,%eax
+   9:  83 e0 0f                and    $0xf,%eax
+   c:  01 d0                   add    %edx,%eax
+   e:  c1 f8 04                sar    $0x4,%eax
+  11:  c3                      ret
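
In C terms, the new sequence can be read roughly as follows. This is only a
sketch: the function names are made up for illustration, and it assumes that
>> on a negative int32_t is an arithmetic shift (true for GCC, but
implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free signed x / 16, mirroring the new instruction sequence.
   Assumption: >> on a negative int32_t is an arithmetic shift. */
static int32_t div16_by_shift(int32_t x)
{
    int32_t bias = x >> 31;   /* sar $0x1f: 0 for x >= 0, -1 for x < 0 */
    bias &= 15;               /* and $0xf:  0 or 15 (divisor - 1)      */
    return (x + bias) >> 4;   /* add, then sar $0x4                    */
}

/* Hypothetical self-check against the / operator. */
static void check_div16(void)
{
    for (int32_t x = -1000; x <= 1000; x++)
        assert(div16_by_shift(x) == x / 16);
    assert(div16_by_shift(INT32_MIN) == INT32_MIN / 16);
    assert(div16_by_shift(INT32_MAX) == INT32_MAX / 16);
}
```

The bias of 15 for negative x is what makes the shift round toward zero, as C
division requires, instead of toward minus infinity.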

The second group is just a few rare cases where the "multiply by reciprocal"
optimization happens to require more processing and the code is 5 bytes longer:

 00000000 <ud_x_56>:
-   0:  8b 44 24 04             mov    0x4(%esp),%eax
-   4:  ba 38 00 00 00          mov    $0x38,%edx
-   9:  89 d1                   mov    %edx,%ecx
-   b:  31 d2                   xor    %edx,%edx
-   d:  f7 f1                   div    %ecx
-   f:  c3                      ret
+   0:  53                      push   %ebx
+   1:  8b 4c 24 08             mov    0x8(%esp),%ecx
+   5:  bb 25 49 92 24          mov    $0x24924925,%ebx
+   a:  c1 e9 03                shr    $0x3,%ecx
+   d:  89 c8                   mov    %ecx,%eax
+   f:  f7 e3                   mul    %ebx
+  11:  5b                      pop    %ebx
+  12:  89 d0                   mov    %edx,%eax
+  14:  c3                      ret
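
As a sketch of what the longer sequence computes (the function name is
hypothetical, and this is my reading of the listing, not gcc's internal
algorithm): 56 = 8 * 7, so x is first shifted right by 3, then multiplied by
0x24924925, which is 2^32 / 7 rounded up, and the high 32 bits of the product
are the quotient.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned x / 56 as in the new listing: pre-shift by 3 (56 = 8 * 7),
   multiply by 2^32 / 7 rounded up, keep the high half of the product. */
static uint32_t div56_by_reciprocal(uint32_t x)
{
    uint32_t shifted = x >> 3;                           /* shr $0x3,%ecx */
    uint64_t product = (uint64_t)shifted * 0x24924925u;  /* mul %ebx      */
    return (uint32_t)(product >> 32);                    /* mov %edx,%eax */
}
```

The pre-shift keeps the multiplicand below 2^29, which is small enough for this
rounded-up constant to give exact quotients.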

This is rare: only three cases in the entire t.c.bz2.

They are far outweighed by the 474 cases where code got smaller.

Most of them save only one byte. For example, unsigned_x / 100:

 00000000 <ud_x_100>:
-   0:  8b 44 24 04             mov    0x4(%esp),%eax
-   4:  ba 64 00 00 00          mov    $0x64,%edx
-   9:  89 d1                   mov    %edx,%ecx
-   b:  31 d2                   xor    %edx,%edx
-   d:  f7 f1                   div    %ecx
-   f:  c3                      ret
+   0:  b8 1f 85 eb 51          mov    $0x51eb851f,%eax
+   5:  f7 64 24 04             mull   0x4(%esp)
+   9:  89 d0                   mov    %edx,%eax
+   b:  c1 e8 05                shr    $0x5,%eax
+   e:  c3                      ret
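
Read as C, the new code is a 32x32->64 multiply followed by a shift. The sketch
below uses a made-up function name; the constant 0x51eb851f is 2^37 / 100
rounded up, so taking the high word and shifting by 5 divides by 2^37 in total.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned x / 100 via multiply by reciprocal, mirroring the listing. */
static uint32_t div100_by_reciprocal(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0x51eb851fu;  /* mull 0x4(%esp) */
    uint32_t high = (uint32_t)(product >> 32);     /* mov %edx,%eax  */
    return high >> 5;                              /* shr $0x5,%eax  */
}
```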

Some cases got shorter by 2 or 4 bytes:
-00000000 00000010 T ud_x_3
+00000000 0000000e T ud_x_3
-00000000 00000010 T ud_x_9
+00000000 0000000e T ud_x_9
-00000000 00000010 T ud_x_67
+00000000 0000000e T ud_x_67
-00000000 00000010 T ud_x_641
+00000000 0000000c T ud_x_641
-00000000 00000010 T ud_x_6700417
+00000000 0000000c T ud_x_6700417

For example, unsigned_x / 9:

 00000000 <ud_x_9>:
-   0:  8b 44 24 04             mov    0x4(%esp),%eax
-   4:  ba 09 00 00 00          mov    $0x9,%edx
-   9:  89 d1                   mov    %edx,%ecx
-   b:  31 d2                   xor    %edx,%edx
-   d:  f7 f1                   div    %ecx
-   f:  c3                      ret
+   0:  b8 39 8e e3 38          mov    $0x38e38e39,%eax
+   5:  f7 64 24 04             mull   0x4(%esp)
+   9:  89 d0                   mov    %edx,%eax
+   b:  d1 e8                   shr    %eax
+   d:  c3                      ret

and unsigned_x / 641:

 00000000 <ud_x_641>:
-   0:  8b 44 24 04             mov    0x4(%esp),%eax
-   4:  ba 81 02 00 00          mov    $0x281,%edx
-   9:  89 d1                   mov    %edx,%ecx
-   b:  31 d2                   xor    %edx,%edx
-   d:  f7 f1                   div    %ecx
-   f:  c3                      ret
+   0:  b8 81 3d 66 00          mov    $0x663d81,%eax
+   5:  f7 64 24 04             mull   0x4(%esp)
+   9:  89 d0                   mov    %edx,%eax
+   b:  c3                      ret
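
The 641 case needs no shift at all because 641 * 6700417 = 2^32 + 1, so
0x663d81 (= 6700417) is an almost exact 2^32 / 641. A sketch of the same
computation in C, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned x / 641: the high 32 bits of x * 6700417 are the quotient,
   since 641 * 6700417 == 2^32 + 1. */
static uint32_t div641_by_reciprocal(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0x663d81u;  /* mull 0x4(%esp) */
    return (uint32_t)(product >> 32);            /* mov %edx,%eax  */
}
```

The same identity explains why ud_x_6700417 shrinks by 4 bytes as well: there
the roles of the two factors are swapped.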

I will attach t32.asm.diff now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30354
