public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Guenther <richard.guenther@gmail.com>
To: Tom de Vries <vries@codesourcery.com>
Cc: Steven Bosscher <stevenb.gcc@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, PR43864] Gimple level duplicate block cleanup.
Date: Wed, 08 Jun 2011 10:09:00 -0000	[thread overview]
Message-ID: <BANLkTimAvGy1jRniWftbEvLnGwEN_0Ma8Q@mail.gmail.com> (raw)
In-Reply-To: <4DEF4408.4040001@codesourcery.com>

On Wed, Jun 8, 2011 at 11:42 AM, Tom de Vries <vries@codesourcery.com> wrote:
> Hi Richard,
>
> I have a patch for PR43864. The patch adds a gimple level duplicate block
> cleanup. The patch has been bootstrapped and reg-tested on x86_64, and
> reg-tested on ARM. The size impact on ARM for spec2000 is shown in the following
> table (%, lower is better).
>
>                     none            pic
>                thumb1  thumb2  thumb1 thumb2
> spec2000          99.9    99.9    99.8   99.8
>
> PR43864 is currently marked as a duplicate of PR20070, but I'm not sure that the
> optimizations proposed in PR20070 would fix this PR.
>
> The problem in this PR is that when compiling with -O2, the example below should
> only have one call to free. The original problem is formulated in terms of -Os,
> but currently we generate one call to free with -Os, although still not the
> smallest code possible. I'll show here the -O2 case, since that's similar to the
> original PR.
>
> #include <stdio.h>
> void foo (char*, FILE*);
> char* hprofStartupp(char *outputFileName, char *ctx)
> {
>    char fileName[1000];
>    FILE *fp;
>    sprintf(fileName, outputFileName);
>    if (access(fileName, 1) == 0) {
>        free(ctx);
>        return 0;
>    }
>
>    fp = fopen(fileName, 0);
>    if (fp == 0) {
>        free(ctx);
>        return 0;
>    }
>
>    foo(outputFileName, fp);
>
>    return ctx;
> }
>
> AFAIU, there are 2 complementary methods of rtl optimizations proposed in PR20070.
> - Merging 2 blocks which are identical expect for input registers, by using a
>  conditional move to choose between the different input registers.
> - Merging 2 blocks which have different local registers, by ignoring those
>  differences
>
> Blocks .L6 and.L7 have no difference in local registers, but they have a
> difference in input registers: r3 and r1. Replacing the move to r5 by a
> conditional move would probably be benificial in terms of size, but it's not
> clear what condition the conditional move should be using. Calculating such a
> condition would add in size and increase the execution path.
>
> gcc -O2 -march=armv7-a -mthumb pr43864.c -S:
> ...
>        push    {r4, r5, lr}
>        mov     r4, r0
>        sub     sp, sp, #1004
>        mov     r5, r1
>        mov     r0, sp
>        mov     r1, r4
>        bl      sprintf
>        mov     r0, sp
>        movs    r1, #1
>        bl      access
>        mov     r3, r0
>        cbz     r0, .L6
>        movs    r1, #0
>        mov     r0, sp
>        bl      fopen
>        mov     r1, r0
>        cbz     r0, .L7
>        mov     r0, r4
>        bl      foo
> .L3:
>        mov     r0, r5
>        add     sp, sp, #1004
>        pop     {r4, r5, pc}
> .L6:
>        mov     r0, r5
>        mov     r5, r3
>        bl      free
>        b       .L3
> .L7:
>        mov     r0, r5
>        mov     r5, r1
>        bl      free
>        b       .L3
> ...
>
> The proposed patch solved the problem by dealing with the 2 blocks at a level
> when they are still identical: at gimple level. It detect that the 2 blocks are
> identical, and removes one of them.
>
> The following table shows the impact of the patch on the example in terms of
> size for -march=armv7-a:
>
>          without     with    delta
> Os      :     108      104       -4
> O2      :     120      104      -16
> Os thumb:      68       64       -4
> O2 thumb:      76       64      -12
>
> The gain in size for -O2 is that of removing the entire block, plus the
> replacement of 2 moves by a constant set, which also decreases the execution
> path. The patch ensures optimal code for both -O2 and -Os.
>
>
> By keeping track of equivalent definitions in the 2 blocks, we can ignore those
> differences in comparison. Without this feature, we would only match blocks with
> resultless operations, due the the ssa-nature of gimples.
> For example, with this feature, we reduce the following function to its minimum
> at gimple level, rather than at rtl level.
>
> int f(int c, int b, int d)
> {
>  int r, e;
>
>  if (c)
>    r = b + d;
>  else
>    {
>      e = b + d;
>      r = e;
>    }
>
>  return r;
> }
>
> ;; Function f (f)
>
> f (int c, int b, int d)
> {
>  int e;
>
> <bb 2>:
>  e_6 = b_3(D) + d_4(D);
>  return e_6;
>
> }
>
> I'll send the patch with the testcases in a separate email.
>
> OK for trunk?

I don't like that you hook this into cleanup_tree_cfg - that is called
_way_ too often.

This also duplicates the literal matching done on the RTL level - instead
I think this optimization would be more related to value-numbering
(either that of SCCVN/FRE/PRE or that of DOM which also does
jump-threading).

Richard.

> Thanks,
> - Tom
>
> 2011-06-08  Tom de Vries  <tom@codesourcery.com>
>
>        PR middle-end/43864
>        * tree-cfgcleanup.c (int_int_splay_lookup, int_int_splay_insert)
>        (int_int_splay_node_contained_in, int_int_splay_contained_in)
>        (equiv_lookup, equiv_insert, equiv_contained_in, equiv_init)
>        (equiv_delete, gimple_base_equal_p, pt_solution_equal_p, gimple_equal_p)
>        (bb_gimple_equal_p, update_debug_stmts, cleanup_duplicate_preds_1)
>        (same_or_local_phi_alternatives, cleanup_duplicate_preds): New function.
>        (cleanup_tree_cfg_bb): Use cleanup_duplicate_preds.
>

  parent reply	other threads:[~2011-06-08  9:55 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-08  9:49 Tom de Vries
2011-06-08  9:55 ` [PATCH, PR43864] Gimple level duplicate block cleanup - test cases Tom de Vries
2011-07-18  2:54   ` Tom de Vries
2011-08-19 18:38     ` Tom de Vries
2011-08-25 10:09       ` Richard Guenther
2011-06-08 10:09 ` Richard Guenther [this message]
2011-06-08 10:40   ` [PATCH, PR43864] Gimple level duplicate block cleanup Steven Bosscher
2011-06-10 17:16   ` Tom de Vries
2011-06-14 15:12     ` Richard Guenther
2011-07-12 12:21       ` Tom de Vries
2011-07-12 14:37         ` Richard Guenther
2011-07-18  0:41           ` Tom de Vries
2011-07-22 15:54             ` Richard Guenther
2011-08-19 18:33               ` Tom de Vries
2011-08-24  9:00               ` Tom de Vries
2011-08-25  1:07                 ` Ian Lance Taylor
2011-08-25  9:30                   ` Richard Guenther
2011-06-10 18:43 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BANLkTimAvGy1jRniWftbEvLnGwEN_0Ma8Q@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=stevenb.gcc@gmail.com \
    --cc=vries@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).