public inbox for gcc-patches@gcc.gnu.org
* RE: [0/7] Type promotion pass and elimination of zext/sext
       [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
@ 2015-09-07 10:51 ` Wilco Dijkstra
  2015-09-07 11:31   ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 10:51 UTC (permalink / raw)
  To: 'Kugan', Renlin Li
  Cc: 'GCC Patches', 'Richard Biener'

> Kugan wrote:
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
> fine if I remove -g. I am looking into it; it needs to be fixed as well.

This is a known assembler bug I found a while back; Renlin is looking into it.
Basically, when debug tables are inserted at the end of a code section, the
assembler doesn't align to the alignment required by the debug tables.

Wilco


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 10:51 ` [0/7] Type promotion pass and elimination of zext/sext Wilco Dijkstra
@ 2015-09-07 11:31   ` Kugan
  2015-09-07 12:17     ` pinskia
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-09-07 11:31 UTC (permalink / raw)
  To: Wilco Dijkstra, Renlin Li; +Cc: 'GCC Patches', 'Richard Biener'



On 07/09/15 20:46, Wilco Dijkstra wrote:
>> Kugan wrote:
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove -g. I am looking into it; it needs to be fixed as well.
> 
> This is a known assembler bug I found a while back; Renlin is looking into it.
> Basically, when debug tables are inserted at the end of a code section, the
> assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack).

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 11:31   ` Kugan
@ 2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  0 siblings, 2 replies; 28+ messages in thread
From: pinskia @ 2015-09-07 12:17 UTC (permalink / raw)
  To: Kugan; +Cc: Wilco Dijkstra, Renlin Li, GCC Patches, Richard Biener





> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> 
> 
> 
> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>> Kugan wrote:
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove -g. I am looking into it; it needs to be fixed as well.
>> 
>> This is a known assembler bug I found a while back; Renlin is looking into it.
>> Basically, when debug tables are inserted at the end of a code section, the
>> assembler doesn't align to the alignment required by the debug tables.
> 
> This is precisely what seems to be happening. Renlin, could you please
> let me know when you have a patch (even if it is a prototype or a hack).


I had noticed that, but I read through the assembler code and it sounded very much like it was designed this way: the compiler was not supposed to emit assembly like this, and should fix up the alignment itself.

Thanks,
Andrew

> 
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
@ 2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  1 sibling, 0 replies; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 12:49 UTC (permalink / raw)
  To: pinskia, Kugan; +Cc: Renlin Li, GCC Patches, Richard Biener

> pinskia@gmail.com wrote:
> > On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> >
> >
> >
> > On 07/09/15 20:46, Wilco Dijkstra wrote:
> >>> Kugan wrote:
> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> >>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
> >>> fine if I remove -g. I am looking into it; it needs to be fixed as well.
> >>
> >> This is a known assembler bug I found a while back; Renlin is looking into it.
> >> Basically, when debug tables are inserted at the end of a code section, the
> >> assembler doesn't align to the alignment required by the debug tables.
> >
> > This is precisely what seems to be happening. Renlin, could you please
> > let me know when you have a patch (even if it is a prototype or a hack).
> 
> 
> I had noticed that, but I read through the assembler code and it sounded very much
> like it was designed this way: the compiler was not supposed to emit assembly like
> this, and should fix up the alignment itself.

No, the bug is introduced solely by the assembler - there is no way to avoid it, as
you can't expect users to align the end of the code section to an unspecified debug
alignment (which could potentially vary depending on the generated debug info). The
assembler aligns unaligned instructions without a warning, and doesn't require the
section size to be a multiple of the section alignment, i.e. the design is that the
assembler can deal with any alignment.

Wilco


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
@ 2015-09-08  8:03       ` Renlin Li
  2015-09-08 12:37         ` Wilco Dijkstra
  1 sibling, 1 reply; 28+ messages in thread
From: Renlin Li @ 2015-09-08  8:03 UTC (permalink / raw)
  To: pinskia, Kugan
  Cc: Wilco Dijkstra, GCC Patches, Richard Biener, Nicholas Clifton

Hi Andrew,

Previously, there was a discussion thread on the binutils mailing list:

https://sourceware.org/ml/binutils/2015-04/msg00032.html

Nick proposed a way to fix it; Richard Henderson held a similar opinion to yours.

Regards,
Renlin

On 07/09/15 12:45, pinskia@gmail.com wrote:
>
>
>
>> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>>
>> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>>> Kugan wrote:
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove -g. I am looking into it; it needs to be fixed as well.
>>> This is a known assembler bug I found a while back; Renlin is looking into it.
>>> Basically, when debug tables are inserted at the end of a code section, the
>>> assembler doesn't align to the alignment required by the debug tables.
>> This is precisely what seems to be happening. Renlin, could you please
>> let me know when you have a patch (even if it is a prototype or a hack).
>
> I had noticed that, but I read through the assembler code and it sounded very much like it was designed this way: the compiler was not supposed to emit assembly like this, and should fix up the alignment itself.
>
> Thanks,
> Andrew
>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-08  8:03       ` Renlin Li
@ 2015-09-08 12:37         ` Wilco Dijkstra
  0 siblings, 0 replies; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-08 12:37 UTC (permalink / raw)
  To: Renlin Li, pinskia, Kugan; +Cc: GCC Patches, Richard Biener, nickc

> Renlin Li wrote:
> Hi Andrew,
> 
> Previously, there was a discussion thread on the binutils mailing list:
> 
> https://sourceware.org/ml/binutils/2015-04/msg00032.html
> 
> Nick proposed a way to fix it; Richard Henderson held a similar opinion to yours.

Both Nick and Richard H seem to think it is an issue with unaligned instructions
rather than an alignment bug in the debug code in the assembler (probably due to
the misleading error message). Although the proposed patch would work, it is not
the right fix for this issue, since we don't have or need unaligned instructions.

Anyway, aligning the debug tables correctly should be a safe and trivial fix.

Wilco



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-12-10  0:27                             ` Kugan
@ 2015-12-16 13:18                               ` Richard Biener
  0 siblings, 0 replies; 28+ messages in thread
From: Richard Biener @ 2015-12-16 13:18 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Dec 10, 2015 at 1:27 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> Thanks for the reviews.
>
> Since we have some unresolved issues here, I think it is best to aim for
> the next stage1. However, I would welcome any feedback so that I can
> continue to improve this.

Yeah, sorry, I've been distracted lately and am not sure I'll get to the
patch before the Christmas break.

> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
> to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
> there is any agreement on this. Or is there any better place to fix this?

I don't know enough in this area to suggest anything.

Richard.

> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-24  2:52                           ` Kugan
@ 2015-12-10  0:27                             ` Kugan
  2015-12-16 13:18                               ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-12-10  0:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,

Thanks for the reviews.

Since we have some unresolved issues here, I think it is best to aim for
the next stage1. However, I would welcome any feedback so that I can
continue to improve this.

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
there is any agreement on this. Or is there any better place to fix this?

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-18 15:06                         ` Richard Biener
@ 2015-11-24  2:52                           ` Kugan
  2015-12-10  0:27                             ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-11-24  2:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 17718 bytes --]

Hi Richard,

Thanks for your comments. I am attaching an updated patch with details
below.

On 19/11/15 02:06, Richard Biener wrote:
> On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Attached is the latest version of the patch. With the patches
>>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>>> 0002-Add-type-promotion-pass.patch and
>>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>>
>>> I did a bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>>> x86_64-linux-gnu and regression testing on ppc64-linux-gnu,
>>> aarch64-linux-gnu, arm64-linux-gnu and x86_64-linux-gnu. I ran into three
>>> issues in ppc64-linux-gnu regression testing. There are some other test
>>> cases which need adjustment, as they scan for patterns that are no
>>> longer valid.
>>>
>>> 1. rtl fwprop was going into an infinite loop. It works with the following patch:
>>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>>> index 16c7981..9cf4f43 100644
>>> --- a/gcc/fwprop.c
>>> +++ b/gcc/fwprop.c
>>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>>> new_rtx, rtx_insn *def_insn,
>>>    int old_cost = 0;
>>>    bool ok;
>>>
>>> +  /* Value to be substituted is the same, nothing to do.  */
>>> +  if (rtx_equal_p (*loc, new_rtx))
>>> +    return false;
>>> +
>>>    update_df_init (def_insn, insn);
>>>
>>>    /* forward_propagate_subreg may be operating on an instruction with
>>
>> Which testcase was this on?

After re-basing the trunk, I cannot reproduce it anymore.

>>
>>> 2. gcc.dg/torture/ftrapv-1.c fails
>>> This is because we are checking for the  SImode trapping. With the
>>> promotion of the operation to wider mode, this is i think expected. I
>>> think the testcase needs updating.
>>
>> No, it is not expected.  As said earlier you need to refrain from promoting
>> integer operations that trap.  You can use ! operation_no_trapping_overflow
>> for this.
>>

I have changed this.

>>> 3. gcc.dg/sms-3.c fails
>>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>>> am looking into it.
>>>
>>>
>>> I also have the following issues based on the previous review (as posted
>>> in the previous patch). Copying again for the review purpose.
>>>
>>> 1.
>>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>>> on DEFs and fixup_uses on USEs.
>>>
>>> I am doing this to promote SSA names that are defined with GIMPLE_NOP. Is
>>> there any way to iterate over these? I have added a gcc_assert to make sure
>>> that promote_ssa is called only once.
>>
>>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>>
>> with --disable-checking this will be compiled away so you need to do
>> the assert in a separate statement.
>>
>>> 2.
>>>> Instead of this you should, in promote_all_stmts, walk over all uses
>>> doing what
>>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>>
>>>> +    case GIMPLE_NOP:
>>>> +       {
>>>> +         if (SSA_NAME_VAR (def) == NULL)
>>>> +           {
>>>> +             /* Promote def by fixing its type for anonymous def.  */
>>>> +             TREE_TYPE (def) = promoted_type;
>>>> +           }
>>>> +         else
>>>> +           {
>>>> +             /* Create a promoted copy of parameters.  */
>>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>>
>>>> I think the uninitialized vars are somewhat tricky and it would be best
>>>> to create a new uninit anonymous SSA name for them.  You can
>>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>>> btw.
>>>
>>> I experimented with get_or_create_default_def. Here  we have to have a
>>> SSA_NAME_VAR (def) of promoted type.
>>>
>>> In the attached patch I am doing the following and seems to work. Does
>>> this looks OK?
>>>
>>> +         }
>>> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>>> +         {
>>> +           tree var = copy_node (SSA_NAME_VAR (def));
>>> +           TREE_TYPE (var) = promoted_type;
>>> +           TREE_TYPE (def) = promoted_type;
>>> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>> +         }
>>
>> I believe this will wreck the SSA default-def map so you should do
>>
>>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>>   tree var = create_tmp_reg (promoted_type);
>>   TREE_TYPE (def) = promoted_type;
>>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>   set_ssa_default_def (cfun, var, def);
>>
>> instead.
I have changed this.

>>
>>> I prefer to promote def as otherwise iterating over the uses and
>>> promoting can look complicated (have to look at all the different types
>>> of stmts again and do the right thing as It was in the earlier version
>>> of this before we move to this approach)
>>>
>>> 3)
>>>> you can also transparently handle constants for the cases where promoting
>>>> is required.  At the moment their handling is interwinded with the def
>>> promotion
>>>> code.  That makes the whole thing hard to follow.
>>>
>>>
>>> I have updated the comments with:
>>>
>>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>>> +   promote only the constants in conditions part of the COND_EXPR.
>>> +
>>> +   We promote the constants when the associated operands are promoted.
>>> +   This usually means that we promote the constants when we promote the
>>> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
>>> +   can promote only when we promote the other operand. Therefore, this
>>> +   is done during fixup_use.  */
>>>
>>>
>>> 4)
>>> I am handling gimple_debug separately to avoid any code difference with
>>> and without -g option. I have updated the comments for this.
>>>
>>> 5)
>>> I also noticed that tree-ssa-uninit sometimes gives false positives due
>>> to the assumptions
>>> it makes. Is it OK to move this pass before type promotion? I can do the
>>> testings and post a separate patch with this if this OK.
>>
>> Hmm, no, this needs more explanation (like a testcase).
There are a few issues I ran into. I will send a list with more info. For
example:

/* Test we do not warn about initializing variable with self. */
/* { dg-do compile } */
/* { dg-options "-O -Wuninitialized" } */

int f()
{
  int i = i;
  return i;
}


I now get:
kugan@kugan-desktop:~$
/home/kugan/work/builds/gcc-fsf-linaro/tools/bin/ppc64-none-linux-gnu-gcc -O
-Wuninitialized
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c
-fdump-tree-all
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c: In
function ‘f’:
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c:8:10:
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
   return i;


diff -u uninit-D.c.146t.veclower21  uninit-D.c.147t.promotion is:

--- uninit-D.c.146t.veclower21	2015-11-24 11:30:04.374203197 +1100
+++ uninit-D.c.147t.promotion	2015-11-24 11:30:04.374203197 +1100
@@ -1,13 +1,16 @@

 ;; Function f (f, funcdef_no=0, decl_uid=2271, cgraph_uid=0,
symbol_order=0)

 f ()
 {
+  signed long i;
   int i;
+  int _3;

   <bb 2>:
-  return i_1(D);
+  _3 = (int) i_1(D);
+  return _3;

 }



>>
>>> 6)
>>> I also removed the optimization that prevents some of the redundant
>>> truncations/extensions from the type promotion pass, as it doesn't do much
>>> as of now. I can send a proper follow-up patch. Is that OK?
>>
>> Yeah, that sounds fine.
>>
>>> I also did a simple test with coremark for the latest patch. I compared
>>> the code size for coremark for linux-gcc with -Os. Results are as
>>> reported by the "size" utility. I know this doesn't mean much but can
>>> give some indication.
>>>         Base            with pass       Percentage improvement
>>> ==============================================================
>>> arm     10476           10372           0.9927453226
>>> aarch64 9545            9521            0.2514405448
>>> ppc64   12236           12052           1.5037593985
>>>
>>>
>>> After resolving the above issues, I would like to propose that we commit
>>> the pass as not enabled by default (even though the patch as it stands
>>> enables it by default - I am doing that for testing purposes).
>>
>> Hmm, we don't like to have passes that are not enabled by default with any
>> optimization level or for any target.  Those tend to bitrot quickly :(
>>
>> Did you do any performance measurements yet?

Ok, I understand. I did performance testing on AArch64 and saw some good
improvement with the earlier version. I will do it again for more targets
after getting it reviewed.

>>
>> Looking over the pass in detail now (again).
> 
> Ok, so still looking at the basic operation scheme.
> 
>       FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
>         {
>           use = USE_FROM_PTR (op);
>           if (TREE_CODE (use) == SSA_NAME
>               && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
>             promote_ssa (use, &gsi);
>           fixup_use (stmt, &gsi, op, use);
>         }
> 
>       FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
>         promote_ssa (def, &gsi);
> 
> the GIMPLE_NOP handling in promote_ssa when processing uses looks
> backwards.  As those are implicitly defined in the entry block you may
> better just iterate over all default defs before the dominator walk, like so:
> 
>   unsigned n = num_ssa_names;
>   for (i = 1; i < n; ++i)
>     {
>       tree name = ssa_name (i);
>       if (name
>           && SSA_NAME_IS_DEFAULT_DEF (name)
>           && ! has_zero_uses (name))
>        promote_default_def (name);
>     }
> 

I have changed this.

> I see promote_cst_in_stmt in both promote_ssa and fixup_use.  Logically
> it belongs to use processing, but on a stmt granularity.  Thus between
> iterating over all uses and iteration over all defs call promote_cst_in_stmt
> on all stmts.  It's a bit awkward as it expects to be called from context
> that knows whether promotion is necessary or not.
> 
> /* Create an ssa with TYPE to copy ssa VAR.  */
> static tree
> make_promoted_copy (tree var, gimple *def_stmt, tree type)
> {
>   tree new_lhs = make_ssa_name (type, def_stmt);
>   if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
>     SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
>   return new_lhs;
> }
> 
> as you are generating a copy statement I don't see why you need to copy
> SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will
> be used in a PHI node directly AFAICS).  Merging make_promoted_copy
> and the usually following extension stmt generation plus insertion into
> a single helper would make that obvious.
> 

I have changed this.

> static unsigned int
> fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
>            use_operand_p op, tree use)
> {
>   ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
>   /* If USE is not promoted, nothing to do.  */
>   if (!info)
>     return 0;
> 
> You should use ->get (), not ->get_or_insert here.
> 
>       gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
>                                                use, NULL_TREE);
> 

Changed this.

> you can avoid the trailing NULL_TREE here.
> 
>         gimple *copy_stmt =
>           zero_sign_extend_stmt (temp, use,
>                                  TYPE_UNSIGNED (old_type),
>                                  TYPE_PRECISION (old_type));
> 
> coding style says the '=' goes to the next line, thus
> 
>     gimple *copy_stmt
>        = zero_sign_extend_stmt ...


Changed this.
> 
> /* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
>    Assign the zero/sign extended value in NEW_VAR.  gimple statement
>    that performs the zero/sign extension is returned.  */
> static gimple *
> zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
> {
> 
> looks like instead of unsigned_p/width you can pass in a  type instead.
> 
>     /* Sign extend.  */
>     stmt = gimple_build_assign (new_var,
>                                 SEXT_EXPR,
>                                 var, build_int_cst (TREE_TYPE (var), width));
> 
> use size_int (width) instead.
> 
> /* Convert constant CST to TYPE.  */
> static tree
> convert_int_cst (tree type, tree cst, signop sign = SIGNED)
> 
> no need for a default argument
> 
> {
>   wide_int wi_cons = fold_convert (type, cst);
>   wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
>   return wide_int_to_tree (type, wi_cons);
> }


For some of the operations, sign extended constants are created. For
example:

short unPack( unsigned char c )
{
    /* Only want lower four bit nibble */
    c = c & (unsigned char)0x0F ;

    if( c > 7 ) {
        /* Negative nibble */
        return( ( short )( c - 5 ) ) ;

    }
    else
    {
        /* positive nibble */
        return( ( short )c ) ;
    }
}


The subtraction of 5 above becomes an addition of -5. Therefore, sign-extending
the constant during promotion (even though its type is unsigned) results in
better code. There is no correctness issue.

I have now changed it based on your suggestions. Does this look better?


> 
> I wonder why this function is needed at all and you don't just call
> fold_convert (type, cst)?
> 
> /* Return true if the tree CODE needs the promoted operand to be
>    truncated (when stray bits are set beyond the original type in
>    promoted mode) to preserve the semantics.  */
> static bool
> truncate_use_p (enum tree_code code)
> {
> 
> a conservatively correct predicate would implement the inversion,
> not_truncated_use_p because if you miss any tree code the
> result will be unnecessary rather than missed truncations.
> 

Changed it.

> static bool
> type_precision_ok (tree type)
> {
>   return (TYPE_PRECISION (type)
>           == GET_MODE_PRECISION (TYPE_MODE (type)));
> }
> 
> /* Return the promoted type for TYPE.  */
> static tree
> get_promoted_type (tree type)
> {
>   tree promoted_type;
>   enum machine_mode mode;
>   int uns;
> 
>   if (POINTER_TYPE_P (type)
>       || !INTEGRAL_TYPE_P (type)
>       || !type_precision_ok (type))
> 
> the type_precision_ok check is because SEXT doesn't work
> properly for bitfield types?  I think we want to promote those
> to their mode precision anyway.  We just need to use
> sth different than SEXT here (the bitwise-and works of course)
> or expand SEXT from non-mode precision differently (see
> expr.c REDUCE_BIT_FIELD which expands it as a
> lshift/rshift combo).  Eventually this can be left for a followup
> though it might get you some extra testing coverage on
> non-promote-mode targets.

I will have a look at it.

> 
> /* Return true if ssa NAME is already considered for promotion.  */
> static bool
> ssa_promoted_p (tree name)
> {
>   if (TREE_CODE (name) == SSA_NAME)
>     {
>       unsigned int index = SSA_NAME_VERSION (name);
>       if (index < n_ssa_val)
>         return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
>     }
>   return true;
> 
> better than this default assert you pass in an SSA name.

Changed it.

> 
> isn't the bitmap somewhat redundant with the hash-map?
> And you could combine both by using a vec<ssa_name_info *> indexed
> by SSA_NAME_VERSION ()?
> 
>          if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
>                == tcc_comparison)
>               || truncate_use_p (gimple_assign_rhs_code (stmt)))
> 
> you always check for tcc_comparison when checking for truncate_use_p
> so just handle it there (well, as said above, implement conservative
> predicates).
> 
>   switch (gimple_code (stmt))
>     {
>     case GIMPLE_ASSIGN:
>       if (promote_cond
>           && gimple_assign_rhs_code (stmt) == COND_EXPR)
>         {
> 
> looking at all callers this condition is never true.
> 
>           tree new_op = build2 (TREE_CODE (op), type, op0, op1);
> 
> as tcc_comparison class trees are not shareable you don't
> need to build2 but can directly set TREE_OPERAND (op, ..) to the
> promoted value.  Note that rhs1 may still just be an SSA name
> and not a comparison.

Changed this.

> 
>     case GIMPLE_PHI:
>         {
>           /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
>           gphi *phi = as_a <gphi *> (stmt);
>           FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
>             {
>               op = USE_FROM_PTR (oprnd);
>               index = PHI_ARG_INDEX_FROM_USE (oprnd);
>               if (TREE_CODE (op) == INTEGER_CST)
>                 SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
>             }
> 
> static unsigned int
> fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
>            use_operand_p op, tree use)
> {
>   ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
>   /* If USE is not promoted, nothing to do.  */
>   if (!info)
>     return 0;
> 
>   tree promoted_type = info->promoted_type;
>   tree old_type = info->type;
>   bool do_not_promote = false;
> 
>   switch (gimple_code (stmt))
>     {
>  ....
>     default:
>       break;
>     }
> 
> do_not_promote = false is not conservative.  Please place a
> gcc_unreachable () in the default case.

We will have valid statements (which are not handled in the switch) for
which we don't have to do any fixups.

> 
> I see you handle debug stmts here but that case cannot be reached.
> 
> /* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
>    different sequence with and without -g.  This can  happen when promoting
>    SSA that are defined with GIMPLE_NOP.  */
> 
> but that's only because you choose to unconditionally handle GIMPLE_NOP uses...

I have removed this.

Thanks,
Kugan

> 
> Richard.
> 
> 
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>
>>>

[-- Attachment #2: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-patch, Size: 31362 bytes --]

From 89f526ea6f7878879fa65a2b869cac4c21dc7df0 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Fri, 20 Nov 2015 14:14:52 +1100
Subject: [PATCH 2/3] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 849 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 869 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0fd8d99..4e1444c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1512,6 +1512,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index c7aab42..f214331 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1257,7 +1257,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 3eb520e..582e8ee 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2407,6 +2407,10 @@ fsplit-paths
 Common Report Var(flag_split_paths) Init(0) Optimization
 Split paths leading to loop backedges.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7cef176..21f94a6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9142,6 +9142,16 @@ Split paths leading to loop backedges.  This can improve dead code
 elimination and common subexpression elimination.  This is enabled by
 default at @option{-O2} and above.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..5993e89
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,849 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of
+   subregs in RTL, which in turn results in the removal of redundant
+   zero/sign extensions.  This pass runs prior to VRP and DOM so that
+   they can optimise the redundant truncations and extensions.  It is
+   based on the discussion at
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* Name of the SSA_NAME.  */
+  tree type;		/* Original type of ssa.  */
+  tree promoted_type;	/* Promoted type of ssa.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  unsigned int index = SSA_NAME_VERSION (name);
+  if (index < n_ssa_val)
+    return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  unsigned int index = SSA_NAME_VERSION (name);
+  if (index < n_ssa_val)
+    bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+}
+
+/* Return true if the tree CODE does not need the promoted operand to be
+   truncated (even when stray bits are set beyond the original type in
+   the promoted mode) to preserve the semantics.  */
+static bool
+not_truncated_use_p (enum tree_code code)
+{
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR
+      || code == MAX_EXPR
+      || code == MIN_EXPR)
+    return false;
+  else
+    return true;
+}
+
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert and sign-extend constant CST to TYPE.  */
+static tree
+fold_convert_sext (tree type, tree cst)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), SIGNED);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (gimple_assign_rhs_code (stmt) == COND_EXPR
+	  && TREE_OPERAND_LENGTH (gimple_assign_rhs1 (stmt)) == 2)
+	{
+	  /* Promote INTEGER_CSTs that are tcc_comparison arguments.  */
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_TYPE (op0) != TREE_TYPE (op1))
+	    {
+	      if (TREE_CODE (op0) == INTEGER_CST)
+		TREE_OPERAND (op, 0) = fold_convert (type, op0);
+	      if (TREE_CODE (op1) == INTEGER_CST)
+		TREE_OPERAND (op, 1) = fold_convert (type, op1);
+	    }
+	}
+      /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+      if (not_truncated_use_p (gimple_assign_rhs_code (stmt)))
+	{
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, fold_convert_sext (type, op));
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, fold_convert_sext (type, op));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, fold_convert_sext (type, op));
+	}
+      else
+	{
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, fold_convert (type, op));
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, fold_convert (type, op));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, fold_convert (type, op));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, fold_convert (type, op));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, fold_convert (type, op));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, fold_convert (type, op));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Zero/sign extend VAR and truncate it to INNER_TYPE.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, tree inner_type)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > TYPE_PRECISION (inner_type));
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (inner_type))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (TYPE_PRECISION (inner_type), false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var,
+				build_int_cst (TREE_TYPE (var),
+					       TYPE_PRECISION (inner_type)));
+  return stmt;
+}
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines DEF
+   is DEF_STMT, make the type of DEF PROMOTED_TYPE.  If the stmt is such
+   that the result of DEF_STMT cannot be of PROMOTED_TYPE, create a NEW_DEF
+   of the original type and make DEF_STMT assign its value to NEW_DEF.
+   Then create a NOP_EXPR to convert NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+					  sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gimple_set_location (copy_stmt, gimple_location (def_stmt));
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	gcc_unreachable ();
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	tree rhs = gimple_assign_rhs1 (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (def))
+		&& !operation_no_trapping_overflow (TREE_TYPE (def), code))
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || TREE_CODE_CLASS (code) == tcc_comparison
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_MIN_EXPR
+	    || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (!type_precision_ok (TREE_TYPE (rhs)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we traverse statements in dominator order, the
+		   arguments of def_stmt will be visited before def.  If RHS
+		   is already promoted and its type is compatible, we can
+		   convert this into a ZERO/SIGN EXTEND stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    copy_stmt = zero_sign_extend_stmt (def, rhs, type);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted OR their types are not
+		   compatible, create a NOP_EXPR that converts
+		   RHS to the promoted DEF type and perform a
+		   ZERO/SIGN EXTEND to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    copy_stmt = zero_sign_extend_stmt (new_def, rhs, type);
+		    gimple_set_location (copy_stmt, gimple_location (def_stmt));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code, new_def);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and make def anonymous.  */
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def);
+      gimple_set_location (copy_stmt, gimple_location (def_stmt));
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  gimple *copy_stmt;
+  ssa_name_info **info = ssa_name_info_map->get (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info || *info == NULL)
+    return 0;
+
+  tree promoted_type = (*info)->promoted_type;
+  tree old_type = (*info)->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		&& !operation_no_trapping_overflow (TREE_TYPE (lhs), code))
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (!not_truncated_use_p (code))
+	  {
+	    /* Promote the constant in a comparison when the other
+	       comparison operand is promoted.  All other constants are
+	       promoted as part of promoting the definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type);
+	    /* In some stmts, the value in USE has to be zero/sign
+	       extended based on the original type for a correct
+	       result.  */
+	    tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	    copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	    gimple_set_location (copy_stmt, gimple_location (stmt));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code)
+	    || code == FLOAT_EXPR)
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		copy_stmt = zero_sign_extend_stmt (lhs, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (!tobe_promoted_p (lhs)
+		     || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		     || (TYPE_UNSIGNED (TREE_TYPE (use)) != TYPE_UNSIGNED (TREE_TYPE (lhs))))
+	      {
+		tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+		copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+		SET_USE (op, temp);
+		update_stmt (stmt);
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, the value in USE has to be zero/sign
+	   extended based on the original type for a correct
+	   result.  */
+	tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	gimple_set_location (copy_stmt, gimple_location (stmt));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create an
+	 original-type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      copy_stmt = gimple_build_assign (temp, NOP_EXPR, use);
+      gimple_set_location (copy_stmt, gimple_location (stmt));
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+static void
+promote_all_ssa_defined_with_nop ()
+{
+  unsigned n = num_ssa_names, i;
+  gimple_stmt_iterator gsi2;
+  tree new_def;
+  basic_block bb;
+  gimple *copy_stmt;
+
+  for (i = 1; i < n; ++i)
+    {
+      tree name = ssa_name (i);
+      if (name
+	  && gimple_code (SSA_NAME_DEF_STMT (name)) == GIMPLE_NOP
+	  && tobe_promoted_p (name)
+	  && !has_zero_uses (name))
+	{
+	  tree promoted_type = get_promoted_type (TREE_TYPE (name));
+	  ssa_name_info *info;
+	  set_ssa_promoted (name);
+	  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+						  sizeof (ssa_name_info));
+	  info->type = TREE_TYPE (name);
+	  info->promoted_type = promoted_type;
+	  info->ssa = name;
+	  ssa_name_info_map->put (name, info);
+
+	  if (SSA_NAME_VAR (name) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (name) = promoted_type;
+	    }
+	  else if (TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL)
+	    {
+	      tree var = create_tmp_reg (promoted_type);
+	      DECL_NAME (var) = DECL_NAME (SSA_NAME_VAR (name));
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), NULL_TREE);
+	      TREE_TYPE (name) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (name, var);
+	      set_ssa_default_def (cfun, var, name);
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi2 = gsi_after_labels (bb);
+	      /* Create new_def of the original type and set that to be the
+		 parameter.  */
+	      new_def = copy_ssa_name (name);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), new_def);
+	      copy_default_ssa (new_def, name);
+
+	      /* Now promote the def and copy the value from parameter.  */
+	      TREE_TYPE (name) = promoted_type;
+	      copy_stmt = gimple_build_assign (name, NOP_EXPR, new_def);
+	      SSA_NAME_DEF_STMT (name) = copy_stmt;
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	    }
+	  reset_flow_sensitive_info (name);
+	}
+    }
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  promote_all_ssa_defined_with_nop ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 1702778..26838f3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -276,6 +276,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_split_paths);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 45e3b70..da7f2d5 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -279,6 +279,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index dcd2d5e..376ad7d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -441,6 +441,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-18 14:04                       ` Richard Biener
@ 2015-11-18 15:06                         ` Richard Biener
  2015-11-24  2:52                           ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-11-18 15:06 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Attached is the latest version of the patch. With the patches
>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>> 0002-Add-type-promotion-pass.patch and
>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>
>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>> issues in ppc64-linux-gnu regression testing. There are some other test
>> cases which needs adjustment for scanning for some patterns that are not
>> valid now.
>>
>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>> index 16c7981..9cf4f43 100644
>> --- a/gcc/fwprop.c
>> +++ b/gcc/fwprop.c
>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>> new_rtx, rtx_insn *def_insn,
>>    int old_cost = 0;
>>    bool ok;
>>
>> +  /* Value to be substituted is the same, nothing to do.  */
>> +  if (rtx_equal_p (*loc, new_rtx))
>> +    return false;
>> +
>>    update_df_init (def_insn, insn);
>>
>>    /* forward_propagate_subreg may be operating on an instruction with
>
> Which testcase was this on?
>
>> 2. gcc.dg/torture/ftrapv-1.c fails
>> This is because we are checking for the  SImode trapping. With the
>> promotion of the operation to wider mode, this is i think expected. I
>> think the testcase needs updating.
>
> No, it is not expected.  As said earlier you need to refrain from promoting
> integer operations that trap.  You can use ! operation_no_trapping_overflow
> for this.
>
>> 3. gcc.dg/sms-3.c fails
>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>> am looking into it.
>>
>>
>> I also have the following issues based on the previous review (as posted
>> in the previous patch). Copying again for the review purpose.
>>
>> 1.
>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>> on DEFs and fixup_uses on USEs.
>>
>> I am doing this to promote SSA names that are defined with GIMPLE_NOP.  Is
>> there any way to iterate over these?  I have added gcc_assert to make sure
>> that promote_ssa is called only once.
>
>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>
> with --disable-checking this will be compiled away so you need to do
> the assert in a separate statement.
>
>> 2.
>>> Instead of this you should, in promote_all_stmts, walk over all uses
>> doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +    case GIMPLE_NOP:
>>> +       {
>>> +         if (SSA_NAME_VAR (def) == NULL)
>>> +           {
>>> +             /* Promote def by fixing its type for anonymous def.  */
>>> +             TREE_TYPE (def) = promoted_type;
>>> +           }
>>> +         else
>>> +           {
>>> +             /* Create a promoted copy of parameters.  */
>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> I experimented with get_or_create_default_def. Here  we have to have a
>> SSA_NAME_VAR (def) of promoted type.
>>
>> In the attached patch I am doing the following and seems to work. Does
>> this looks OK?
>>
>> +         }
>> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>> +         {
>> +           tree var = copy_node (SSA_NAME_VAR (def));
>> +           TREE_TYPE (var) = promoted_type;
>> +           TREE_TYPE (def) = promoted_type;
>> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>> +         }
>
> I believe this will wreck the SSA default-def map so you should do
>
>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>   tree var = create_tmp_reg (promoted_type);
>   TREE_TYPE (def) = promoted_type;
>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>   set_ssa_default_def (cfun, var, def);
>
> instead.
>
>> I prefer to promote the def, as otherwise iterating over the uses and
>> promoting can get complicated (we have to look at all the different types
>> of stmts again and do the right thing, as it was in the earlier version
>> of this before we moved to this approach).
>>
>> 3)
>>> you can also transparently handle constants for the cases where promoting
>>> is required.  At the moment their handling is intertwined with the def
>> promotion
>>> code.  That makes the whole thing hard to follow.
>>
>>
>> I have updated the comments with:
>>
>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>> +   promote only the constants in conditions part of the COND_EXPR.
>> +
>> +   We promote the constants when the associated operands are promoted.
>> +   This usually means that we promote the constants when we promote the
>> +   defining stmts (as part of promote_ssa). However for COND_EXPR, we
>> +   can promote only when we promote the other operand. Therefore, this
>> +   is done during fixup_use.  */
>>
>>
>> 4)
>> I am handling gimple_debug separately to avoid any code difference with
>> and without -g option. I have updated the comments for this.
>>
>> 5)
>> I also noticed that tree-ssa-uninit sometimes gives false positives due
>> to the assumptions
>> it makes. Is it OK to move this pass before type promotion? I can do the
>> testing and post a separate patch with this if that is OK.
>
> Hmm, no, this needs more explanation (like a testcase).
>
>> 6)
>> I also removed the optimization that prevents some of the redundant
>> truncation/extensions from the type promotion pass, as it doesn't do much as
>> of now. I can send a proper follow up patch. Is that OK?
>
> Yeah, that sounds fine.
>
>> I also did a simple test with coremark for the latest patch. I compared
>> the code size for coremark for linux-gcc with -Os. Results are as
>> reported by the "size" utility. I know this doesn't mean much but can
>> give some indication.
>>         Base            with pass       Percentage improvement
>> ==============================================================
>> arm     10476           10372           0.9927453226
>> aarch64 9545            9521            0.2514405448
>> ppc64   12236           12052           1.5037593985
>>
>>
>> After resolving the above issues, I would like to propose that we commit
>> the pass as not enabled by default (even though the patch as it stands is
>> enabled by default - I am doing that for testing purposes).
>
> Hmm, we don't like to have passes that are not enabled by default with any
> optimization level or for any target.  Those tend to bitrot quickly :(
>
> Did you do any performance measurements yet?
>
> Looking over the pass in detail now (again).

Ok, so still looking at the basic operation scheme.

      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
        {
          use = USE_FROM_PTR (op);
          if (TREE_CODE (use) == SSA_NAME
              && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
            promote_ssa (use, &gsi);
          fixup_use (stmt, &gsi, op, use);
        }

      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
        promote_ssa (def, &gsi);

the GIMPLE_NOP handling in promote_ssa when processing uses looks
backwards.  As those names are implicitly defined in the entry block you may
as well just iterate over all default defs before the dominator walk, like so:

  unsigned n = num_ssa_names;
  for (i = 1; i < n; ++i)
    {
      tree name = ssa_name (i);
      if (name
          && SSA_NAME_IS_DEFAULT_DEF (name)
          && ! has_zero_uses (name))
       promote_default_def (name);
    }

I see promote_cst_in_stmt in both promote_ssa and fixup_use.  Logically
it belongs to use processing, but on a stmt granularity.  Thus between
iterating over all uses and iterating over all defs, call promote_cst_in_stmt
on all stmts.  It's a bit awkward as it expects to be called from context
that knows whether promotion is necessary or not.

/* Create an ssa with TYPE to copy ssa VAR.  */
static tree
make_promoted_copy (tree var, gimple *def_stmt, tree type)
{
  tree new_lhs = make_ssa_name (type, def_stmt);
  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
  return new_lhs;
}

as you are generating a copy statement I don't see why you need to copy
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case will new_lhs
be used in a PHI node directly, AFAICS).  Merging make_promoted_copy
and the usually following extension stmt generation plus insertion into
a single helper would make that obvious.

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

You should use ->get (), not ->get_or_insert here.

      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
                                               use, NULL_TREE);

you can avoid the trailing NULL_TREE here.

        gimple *copy_stmt =
          zero_sign_extend_stmt (temp, use,
                                 TYPE_UNSIGNED (old_type),
                                 TYPE_PRECISION (old_type));

coding style says the '=' goes to the next line, thus

    gimple *copy_stmt
       = zero_sign_extend_stmt ...

/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
   Assign the zero/sign extended value in NEW_VAR.  gimple statement
   that performs the zero/sign extension is returned.  */
static gimple *
zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
{

looks like instead of unsigned_p/width you can just pass in a type.

    /* Sign extend.  */
    stmt = gimple_build_assign (new_var,
                                SEXT_EXPR,
                                var, build_int_cst (TREE_TYPE (var), width));

use size_int (width) instead.

/* Convert constant CST to TYPE.  */
static tree
convert_int_cst (tree type, tree cst, signop sign = SIGNED)

no need for a default argument

{
  wide_int wi_cons = fold_convert (type, cst);
  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
  return wide_int_to_tree (type, wi_cons);
}

I wonder why this function is needed at all and you don't just call
fold_convert (type, cst)?

/* Return true if the tree CODE needs the promoted operand to be
   truncated (when stray bits are set beyond the original type in
   promoted mode) to preserve the semantics.  */
static bool
truncate_use_p (enum tree_code code)
{

a conservatively correct predicate would implement the inversion,
not_truncated_use_p, because then if you miss any tree code the
result is an unnecessary truncation rather than a missed one.

static bool
type_precision_ok (tree type)
{
  return (TYPE_PRECISION (type)
          == GET_MODE_PRECISION (TYPE_MODE (type)));
}

/* Return the promoted type for TYPE.  */
static tree
get_promoted_type (tree type)
{
  tree promoted_type;
  enum machine_mode mode;
  int uns;

  if (POINTER_TYPE_P (type)
      || !INTEGRAL_TYPE_P (type)
      || !type_precision_ok (type))

the type_precision_ok check is because SEXT doesn't work
properly for bitfield types?  I think we want to promote those
to their mode precision anyway.  We just need to use
sth different than SEXT here (the bitwise-and works of course)
or expand SEXT from non-mode precision differently (see
expr.c REDUCE_BIT_FIELD which expands it as a
lshift/rshift combo).  Eventually this can be left for a followup
though it might get you some extra testing coverage on
non-promote-mode targets.

/* Return true if ssa NAME is already considered for promotion.  */
static bool
ssa_promoted_p (tree name)
{
  if (TREE_CODE (name) == SSA_NAME)
    {
      unsigned int index = SSA_NAME_VERSION (name);
      if (index < n_ssa_val)
        return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
    }
  return true;

rather than this default, better assert that you were passed an SSA name.

isn't the bitmap somewhat redundant with the hash-map?
And you could combine both by using a vec<ssa_name_info *> indexed
by SSA_NAME_VERSION ()?

         if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
               == tcc_comparison)
              || truncate_use_p (gimple_assign_rhs_code (stmt)))

you always check for tcc_comparison when checking for truncate_use_p
so just handle it there (well, as said above, implement conservative
predicates).

  switch (gimple_code (stmt))
    {
    case GIMPLE_ASSIGN:
      if (promote_cond
          && gimple_assign_rhs_code (stmt) == COND_EXPR)
        {

looking at all callers this condition is never true.

          tree new_op = build2 (TREE_CODE (op), type, op0, op1);

as tcc_comparison class trees are not shareable you don't
need to build2 but can directly set TREE_OPERAND (op, ..) to the
promoted value.  Note that rhs1 may still just be an SSA name
and not a comparison.

    case GIMPLE_PHI:
        {
          /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
          gphi *phi = as_a <gphi *> (stmt);
          FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
            {
              op = USE_FROM_PTR (oprnd);
              index = PHI_ARG_INDEX_FROM_USE (oprnd);
              if (TREE_CODE (op) == INTEGER_CST)
                SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
            }

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

  tree promoted_type = info->promoted_type;
  tree old_type = info->type;
  bool do_not_promote = false;

  switch (gimple_code (stmt))
    {
 ....
    default:
      break;
    }

do_not_promote = false is not conservative.  Please place a
gcc_unreachable () in the default case.

I see you handle debug stmts here but that case cannot be reached.

/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
   a different sequence with and without -g.  This can happen when promoting
   SSA names that are defined with GIMPLE_NOP.  */

but that's only because you choose to unconditionally handle GIMPLE_NOP uses...

Richard.


> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-14  1:15                     ` Kugan
@ 2015-11-18 14:04                       ` Richard Biener
  2015-11-18 15:06                         ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-11-18 14:04 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sat, Nov 14, 2015 at 2:15 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
> Attached is the latest version of the patch. With the patches
> 0001-Add-new-SEXT_EXPR-tree-code.patch,
> 0002-Add-type-promotion-pass.patch and
> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>
> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
> x86-64-linux-gnu and regression testing on ppc64-linux-gnu,
> aarch64-linux-gnu, arm64-linux-gnu and x86-64-linux-gnu. I ran into three
> issues in ppc64-linux-gnu regression testing. There are some other test
> cases which need adjustment, scanning for some patterns that are no longer
> valid.
>
> 1. rtl fwprop was going into an infinite loop. It works with the following patch:
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index 16c7981..9cf4f43 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
> new_rtx, rtx_insn *def_insn,
>    int old_cost = 0;
>    bool ok;
>
> +  /* Value to be substituted is the same, nothing to do.  */
> +  if (rtx_equal_p (*loc, new_rtx))
> +    return false;
> +
>    update_df_init (def_insn, insn);
>
>    /* forward_propagate_subreg may be operating on an instruction with

Which testcase was this on?

> 2. gcc.dg/torture/ftrapv-1.c fails
> This is because we are checking for SImode trapping. With the
> promotion of the operation to a wider mode, this is I think expected. I
> think the testcase needs updating.

No, it is not expected.  As said earlier you need to refrain from promoting
integer operations that trap.  You can use ! operation_no_trapping_overflow
for this.

> 3. gcc.dg/sms-3.c fails
> It fails with -fmodulo-sched-allow-regmoves and is OK when I remove it. I
> am looking into it.
>
>
> I also have the following issues based on the previous review (as posted
> in the previous patch). Copying again for the review purpose.
>
> 1.
>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>> on DEFs and fixup_uses on USEs.
>
> I am doing this to promote SSA names that are defined with GIMPLE_NOP. Is
> there any way to iterate over these? I have added a gcc_assert to make sure
> that promote_ssa is called only once.

  gcc_assert (!ssa_name_info_map->get_or_insert (def));

with --disable-checking this will be compiled away so you need to do
the assert in a separate statement.

> 2.
>> Instead of this you should, in promote_all_stmts, walk over all uses
> doing what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +    case GIMPLE_NOP:
>> +       {
>> +         if (SSA_NAME_VAR (def) == NULL)
>> +           {
>> +             /* Promote def by fixing its type for anonymous def.  */
>> +             TREE_TYPE (def) = promoted_type;
>> +           }
>> +         else
>> +           {
>> +             /* Create a promoted copy of parameters.  */
>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> I experimented with get_or_create_default_def. Here we have to have a
> SSA_NAME_VAR (def) of promoted type.
>
> In the attached patch I am doing the following and it seems to work. Does
> this look OK?
>
> +         }
> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
> +         {
> +           tree var = copy_node (SSA_NAME_VAR (def));
> +           TREE_TYPE (var) = promoted_type;
> +           TREE_TYPE (def) = promoted_type;
> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
> +         }

I believe this will wreck the SSA default-def map so you should do

  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
  tree var = create_tmp_reg (promoted_type);
  TREE_TYPE (def) = promoted_type;
  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
  set_ssa_default_def (cfun, var, def);

instead.

> I prefer to promote the def, as otherwise iterating over the uses and
> promoting can get complicated (we have to look at all the different types
> of stmts again and do the right thing, as it was in the earlier version
> of this before we moved to this approach).
>
> 3)
>> you can also transparently handle constants for the cases where promoting
>> is required.  At the moment their handling is intertwined with the def
> promotion
>> code.  That makes the whole thing hard to follow.
>
>
> I have updated the comments with:
>
> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
> +   promote only the constants in conditions part of the COND_EXPR.
> +
> +   We promote the constants when the associated operands are promoted.
> +   This usually means that we promote the constants when we promote the
> +   defining stmts (as part of promote_ssa). However for COND_EXPR, we
> +   can promote only when we promote the other operand. Therefore, this
> +   is done during fixup_use.  */
>
>
> 4)
> I am handling gimple_debug separately to avoid any code difference with
> and without -g option. I have updated the comments for this.
>
> 5)
> I also noticed that tree-ssa-uninit sometimes gives false positives due
> to the assumptions
> it makes. Is it OK to move this pass before type promotion? I can do the
> testing and post a separate patch with this if that is OK.

Hmm, no, this needs more explanation (like a testcase).

> 6)
> I also removed the optimization that prevents some of the redundant
> truncation/extensions from the type promotion pass, as it doesn't do much as
> of now. I can send a proper follow up patch. Is that OK?

Yeah, that sounds fine.

> I also did a simple test with coremark for the latest patch. I compared
> the code size for coremark for linux-gcc with -Os. Results are as
> reported by the "size" utility. I know this doesn't mean much but can
> give some indication.
>         Base            with pass       Percentage improvement
> ==============================================================
> arm     10476           10372           0.9927453226
> aarch64 9545            9521            0.2514405448
> ppc64   12236           12052           1.5037593985
>
>
> After resolving the above issues, I would like to propose that we commit
> the pass as not enabled by default (even though the patch as it stands is
> enabled by default - I am doing that for testing purposes).

Hmm, we don't like to have passes that are not enabled by default with any
optimization level or for any target.  Those tend to bitrot quickly :(

Did you do any performance measurements yet?

Looking over the pass in detail now (again).

Thanks,
Richard.

> Thanks,
> Kugan
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-10 14:13                   ` Richard Biener
  2015-11-12  6:08                     ` Kugan
@ 2015-11-14  1:15                     ` Kugan
  2015-11-18 14:04                       ` Richard Biener
  1 sibling, 1 reply; 28+ messages in thread
From: Kugan @ 2015-11-14  1:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5293 bytes --]


Attached is the latest version of the patch. With the patches
0001-Add-new-SEXT_EXPR-tree-code.patch,
0002-Add-type-promotion-pass.patch and
0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.

I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
x86-64-linux-gnu and regression testing on ppc64-linux-gnu,
aarch64-linux-gnu, arm64-linux-gnu and x86-64-linux-gnu. I ran into three
issues in ppc64-linux-gnu regression testing. There are some other test
cases which need adjustment, scanning for some patterns that are no longer
valid.

1. rtl fwprop was going into an infinite loop. It works with the following patch:
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 16c7981..9cf4f43 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
new_rtx, rtx_insn *def_insn,
   int old_cost = 0;
   bool ok;

+  /* Value to be substituted is the same, nothing to do.  */
+  if (rtx_equal_p (*loc, new_rtx))
+    return false;
+
   update_df_init (def_insn, insn);

   /* forward_propagate_subreg may be operating on an instruction with

2. gcc.dg/torture/ftrapv-1.c fails
This is because we are checking for SImode trapping. With the
promotion of the operation to a wider mode, this is I think expected. I
think the testcase needs updating.
3. gcc.dg/sms-3.c fails
It fails with -fmodulo-sched-allow-regmoves and is OK when I remove it. I
am looking into it.


I also have the following issues based on the previous review (as posted
in the previous patch). Copying again for the review purpose.

1.
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA names that are defined with GIMPLE_NOP. Is
there any way to iterate over these? I have added a gcc_assert to make sure
that promote_ssa is called only once.

2.
> Instead of this you should, in promote_all_stmts, walk over all uses
doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>
> +    case GIMPLE_NOP:
> +       {
> +         if (SSA_NAME_VAR (def) == NULL)
> +           {
> +             /* Promote def by fixing its type for anonymous def.  */
> +             TREE_TYPE (def) = promoted_type;
> +           }
> +         else
> +           {
> +             /* Create a promoted copy of parameters.  */
> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

I experimented with get_or_create_default_def. Here we have to have a
SSA_NAME_VAR (def) of promoted type.

In the attached patch I am doing the following and it seems to work. Does
this look OK?

+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }

I prefer to promote the def, as otherwise iterating over the uses and
promoting can get complicated (we have to look at all the different types
of stmts again and do the right thing, as it was in the earlier version
of this before we moved to this approach).

3)
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is intertwined with the def
promotion
> code.  That makes the whole thing hard to follow.


I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa). However for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */


4)
I am handling gimple_debug separately to avoid any code difference with
and without -g option. I have updated the comments for this.

5)
I also noticed that tree-ssa-uninit sometimes gives false positives due
to the assumptions
it makes. Is it OK to move this pass before type promotion? I can do the
testing and post a separate patch with this if that is OK.

6)
I also removed the optimization that prevents some of the redundant
truncation/extensions from the type promotion pass, as it doesn't do much as
of now. I can send a proper follow up patch. Is that OK?

I also did a simple test with coremark for the latest patch. I compared
the code size for coremark for linux-gcc with -Os. Results are as
reported by the "size" utility. I know this doesn't mean much but can
give some indication.
	Base   		with pass	Percentage improvement
==============================================================
arm	10476		10372		0.9927453226
aarch64	9545		9521		0.2514405448
ppc64	12236		12052		1.5037593985


After resolving the above issues, I would like to propose that we commit
the pass as not enabled by default (even though the patch as it stands is
enabled by default - I am doing that for testing purposes).

Thanks,
Kugan



[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3609 bytes --]

From 8e71ea17eaf6f282325076f588dbdf4f53c8b865 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..024c8ef 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      type_min = wi::sext (type_min, prec);
+      type_max = wi::sext (type_max, prec);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 31165 bytes --]

From 42128668393c32c3860d346ead7b3118a090ffa4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/5] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 867 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 887 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..735e7ee
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,867 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of
+   subregs in RTL, which in turn results in the removal of redundant
+   zero/sign extensions.  This pass will run prior to VRP and DOM so
+   that they will be able to optimise the redundant truncations and
+   extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* Name of the SSA_NAME.  */
+  tree type;		/* Original type of ssa.  */
+  tree promoted_type;	/* Promoted type of ssa.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Return true if the tree CODE needs the promoted operand to be
+   truncated (when stray bits are set beyond the original type in
+   promoted mode) to preserve the semantics.  */
+static bool
+truncate_use_p (enum tree_code code)
+{
+  if (code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR
+      || code == MAX_EXPR
+      || code == MIN_EXPR)
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CSTs that are operands of the tcc_comparison condition.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	       == tcc_comparison)
+	      || truncate_use_p (gimple_assign_rhs_code (stmt)))
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  sign = TYPE_SIGN (type);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an SSA name of TYPE as a promoted copy of SSA name VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (unsigned_p)
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines DEF
+   is DEF_STMT, make the type of DEF PROMOTED_TYPE.  If the stmt is such
+   that the result of DEF_STMT cannot be of PROMOTED_TYPE, create a NEW_DEF
+   of the ORIGINAL_TYPE and make DEF_STMT assign its value to NEW_DEF.
+   Then, create a NOP_EXPR to convert NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+							 sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  gcc_assert (!ssa_name_info_map->get_or_insert (def));
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						 new_def, NULL_TREE);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	if (SSA_NAME_VAR (def) == NULL)
+	  {
+	    /* Promote def by fixing its type for anonymous def.  */
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }
+	else
+	  {
+	    /* Create a promoted copy of parameters.  */
+	    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	    gcc_assert (bb);
+	    gsi2 = gsi_after_labels (bb);
+	    /* Create new_def of the original type and set that to be the
+	       parameter.  */
+	    new_def = copy_ssa_name (def);
+	    set_ssa_promoted (new_def);
+	    set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	    copy_default_ssa (new_def, def);
+
+	    /* Now promote the def and copy the value from parameter.  */
+	    TREE_TYPE (def) = promoted_type;
+	    copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					     new_def, NULL_TREE);
+	    SSA_NAME_DEF_STMT (def) = copy_stmt;
+	    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	  }
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	tree rhs = gimple_assign_rhs1 (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_MIN_EXPR
+	    || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (!type_precision_ok (TREE_TYPE (rhs)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we traverse statements in dominator order, the
+		   arguments of DEF_STMT will be visited before DEF.  If RHS
+		   is already promoted and its type is compatible, we can
+		   convert this into a zero/sign-extend stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted, or their types are not
+		   compatible, create a NOP_EXPR that converts
+		   RHS to the promoted DEF type and perform a
+		   zero/sign extend to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (new_def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code,
+								 new_def, NULL_TREE);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and make def anonymous.  */
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (TREE_CODE_CLASS (code) == tcc_comparison
+		 || truncate_use_p (code))
+	  {
+	    /* Promote the constant in a comparison when the other
+	       comparison operand is promoted.  All other constants are
+	       promoted as part of promoting the definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	    /* In some stmts, the value in USE has to be zero/sign
+	       extended based on the original type for a correct
+	       result.  */
+	    tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	    gimple *copy_stmt =
+	      zero_sign_extend_stmt (temp, use,
+				     TYPE_UNSIGNED (old_type),
+				     TYPE_PRECISION (old_type));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (lhs, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (!tobe_promoted_p (lhs)
+		     || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		     || (TYPE_UNSIGNED (TREE_TYPE (use)) != TYPE_UNSIGNED (TREE_TYPE (lhs))))
+	      {
+		tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (temp, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+		SET_USE (op, temp);
+		update_stmt (stmt);
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, the value in USE has to be zero/sign
+	   extended based on the original type for a correct
+	   result.  */
+	tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	gimple *copy_stmt =
+	  zero_sign_extend_stmt (temp, use,
+				 TYPE_UNSIGNED (old_type),
+				 TYPE_PRECISION (old_type));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create a copy of the
+	 original type.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+/* Promote uses in GIMPLE_DEBUG stmts.  Do this separately to avoid
+   generating a different sequence with and without -g.  This can happen
+   when promoting SSA names that are defined with GIMPLE_NOP.  */
+static void
+promote_debug_stmts ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_use (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/5] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign-extends the first operand from the
+   sign bit specified by the second operand.  The type of the result
+   is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-10 14:13                   ` Richard Biener
@ 2015-11-12  6:08                     ` Kugan
  2015-11-14  1:15                     ` Kugan
  1 sibling, 0 replies; 28+ messages in thread
From: Kugan @ 2015-11-12  6:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 9053 bytes --]

Hi Richard,

Thanks for the review.

>>>
>>> The basic "structure" thing still remains.  You walk over all uses and
>>> defs in all stmts
>>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
>>> uses and defs which in turn promotes (the "def") and then fixes up all
>>> uses in all stmts.
>>
>> Done.
> 
> Not exactly.  I still see
> 
> /* Promote all the stmts in the basic block.  */
> static void
> promote_all_stmts (basic_block bb)
> {
>   gimple_stmt_iterator gsi;
>   ssa_op_iter iter;
>   tree def, use;
>   use_operand_p op;
> 
>   for (gphi_iterator gpi = gsi_start_phis (bb);
>        !gsi_end_p (gpi); gsi_next (&gpi))
>     {
>       gphi *phi = gpi.phi ();
>       def = PHI_RESULT (phi);
>       promote_ssa (def, &gsi);
> 
>       FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
>         {
>           use = USE_FROM_PTR (op);
>           if (TREE_CODE (use) == SSA_NAME
>               && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
>             promote_ssa (use, &gsi);
>           fixup_uses (phi, &gsi, op, use);
>         }
> 
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA names that are defined with GIMPLE_NOP.
Is there any way to iterate over these? I have added a gcc_assert to
make sure that promote_ssa is called only once.

> 
> Any reason you do not promote debug stmts during the DOM walk?
> 
> So for each DEF you record in ssa_name_info
> 
> struct ssa_name_info
> {
>   tree ssa;
>   tree type;
>   tree promoted_type;
> };
> 
> (the fields need documenting).  Add a tree promoted_def to it which you
> can replace any use of the DEF with.

In this version of the patch, I am promoting the def in place. If we
decide to change that, I will add it. If I understand you correctly,
this is to be used when iterating over the uses and fixing them up.

> 
> Currently as you call promote_ssa for DEFs and USEs you repeatedly
> overwrite the entry in ssa_name_info_map with a new copy.  So you
> should assert it wasn't already there.
> 
>   switch (gimple_code (def_stmt))
>     {
>     case GIMPLE_PHI:
>         {
> 
> the last { is indented too much it should be indented 2 spaces
> relative to the 'case'

Done.

> 
> 
>   SSA_NAME_RANGE_INFO (def) = NULL;
> 
> only needed in the case 'def' was promoted itself.  Please use
> reset_flow_sensitive_info (def).

We are promoting all the defs. In some cases, however, we can reuse the
value ranges on the SSA name just by promoting it to the new type (as
the values will be the same). Shall I do that as a follow-up?
> 
>>>
>>> Instead of this you should, in promote_all_stmts, walk over all uses doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +    case GIMPLE_NOP:
>>> +       {
>>> +         if (SSA_NAME_VAR (def) == NULL)
>>> +           {
>>> +             /* Promote def by fixing its type for anonymous def.  */
>>> +             TREE_TYPE (def) = promoted_type;
>>> +           }
>>> +         else
>>> +           {
>>> +             /* Create a promoted copy of parameters.  */
>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> Done. I also had to do some changes to in couple of other places to
>> reflect this.
>> They are:
>> --- a/gcc/tree-ssa-reassoc.c
>> +++ b/gcc/tree-ssa-reassoc.c
>> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>>      {
>>        tree arg = gimple_phi_arg_def (stmt, i);
>>        if (TREE_CODE (arg) == SSA_NAME
>> +         && SSA_NAME_VAR (arg)
>>           && !SSA_NAME_IS_DEFAULT_DEF (arg))
>>         {
>>           gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
>> @@ -434,7 +435,8 @@ get_rank (tree e)
>>        if (gimple_code (stmt) == GIMPLE_PHI)
>>         return phi_rank (stmt);
>>
>> -      if (!is_gimple_assign (stmt))
>> +      if (!is_gimple_assign (stmt)
>> +         && !gimple_nop_p (stmt))
>>         return bb_rank[gimple_bb (stmt)->index];
>>
>> and
>>
>> --- a/gcc/tree-ssa.c
>> +++ b/gcc/tree-ssa.c
>> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
>> use_operand_p use_p,
>>    TREE_VISITED (ssa_name) = 1;
>>
>>    if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
>> -      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
>> +      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
>> +         || SSA_NAME_VAR (ssa_name) == NULL))
>>      ; /* Default definitions have empty statements.  Nothing to do.  */
>>    else if (!def_bb)
>>      {
>>
>> Does this look OK?
> 
> Hmm, no, this looks bogus.

I have removed all the above.

> 
> I think the best thing to do is not promoting default defs at all and instead
> promote at the uses.
> 
>               /* Create a promoted copy of parameters.  */
>               bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>               gcc_assert (bb);
>               gsi2 = gsi_after_labels (bb);
>               new_def = copy_ssa_name (def);
>               set_ssa_promoted (new_def);
>               set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
>               duplicate_default_ssa (new_def, def);
>               TREE_TYPE (def) = promoted_type;
> 
> AFAIK this is just an awkward way of replacing all uses by a new DEF, sth
> that should be supported by the machinery so that other default defs can just
> do
> 
>              new_def = get_or_create_default_def (create_tmp_reg
> (promoted_type));
> 
> and have all uses ('def') replaced by new_def.

I experimented with get_or_create_default_def. Here we have to have an
SSA_NAME_VAR (def) of the promoted type.

In the attached patch I am doing the following, and it seems to work.
Does this look OK?

+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }

I prefer to promote the def, as otherwise iterating over the uses and
promoting them can get complicated (we would have to look at all the
different kinds of stmts again and do the right thing for each, as in
the earlier version of this patch before we moved to this approach).

>>>
>>> Note that as followup things like the rotates should be "expanded" like
>>> we'd do on RTL (open-coding the thing).  And we'd need a way to
>>> specify zero-/sign-extended loads.
>>>
>>> +/* Return true if it is safe to promote the use in the STMT.  */
>>> +static bool
>>> +safe_to_promote_use_p (gimple *stmt)
>>> +{
>>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>>> +  tree lhs = gimple_assign_lhs (stmt);
>>> +
>>> +  if (gimple_vuse (stmt) != NULL_TREE
>>> +      || gimple_vdef (stmt) != NULL_TREE
>>>
>>> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
>>> _2 = a[i_3];
>>>
>> When I remove this, I see errors in stmts like:
>>
>> unsigned char
>> unsigned int
>> # .MEM_197 = VDEF <.MEM_187>
>> fs_9(D)->fde_encoding = _154;
> 
> Yeah, as said a stmt based check is really bogus without context.  As the
> predicate is only used in a single place it's better to inline it
> there.  In this
> case you want to handle loads/stores differently.  From this context it
> looks like not iterating over uses in the caller but rather iterating over
> uses here makes most sense as you then can do
> 
>    if (gimple_store_p (stmt))
>      {
>         promote all uses that are not gimple_assign_rhs1 ()
>      }
> 
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is interwinded with the def promotion
> code.  That makes the whole thing hard to follow.


I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */


I am handling gimple_debug stmts separately to avoid any code
differences between compiling with and without -g. I have updated the
comments to reflect this.

Tested the attached patch on ppc64, aarch64 and x86-none-linux-gnu;
regression testing for ppc64 is still in progress. I also noticed that
tree-ssa-uninit sometimes gives false positives due to the assumptions
it makes. Is it OK to move that pass before type promotion? I can do
the testing and post a separate patch for this if that is OK.

I also removed the optimization that prevents some of the redundant
truncations/extensions in the type promotion pass, as it doesn't do
much as of now. I can send a proper follow-up patch. Is that OK?

Thanks,
Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-patch, Size: 3609 bytes --]

From 0eb41ec18322484cf0ae8ca6631ac9dc913576fb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..024c8ef 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      type_min = wi::sext (type_min, prec);
+      type_max = wi::sext (type_max, prec);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-patch, Size: 30437 bytes --]

From 31c9caf7b239827ed6ac7ad7f4fe05e0ba4197e2 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/5] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 845 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 865 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea
+of this pass is to promote operations in such a way that we minimise
+the generation of subregs in RTL, which in turn results in the
+removal of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..6a8cc06
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,845 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they can
+   optimise away redundant truncations and extensions.  This is based on
+   the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* The SSA_NAME itself.  */
+  tree type;		/* Original type of the SSA_NAME.  */
+  tree promoted_type;	/* Promoted type of the SSA_NAME.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  sign = TYPE_SIGN (type);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create a new SSA name of TYPE to serve as a promoted copy of VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (unsigned_p)
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines DEF
+   is DEF_STMT, make the type of DEF PROMOTED_TYPE.  If the stmt is such
+   that the result of DEF_STMT cannot be of PROMOTED_TYPE, create a NEW_DEF
+   of the ORIGINAL_TYPE and make DEF_STMT assign its value to NEW_DEF.
+   Then, create a NOP_EXPR to convert NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+					  sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  gcc_assert (!ssa_name_info_map->get_or_insert (def));
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						 new_def, NULL_TREE);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	if (SSA_NAME_VAR (def) == NULL)
+	  {
+	    /* Promote def by fixing its type for anonymous def.  */
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }
+	else
+	  {
+	    /* Create a promoted copy of parameters.  */
+	    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	    gcc_assert (bb);
+	    gsi2 = gsi_after_labels (bb);
+	    /* Create new_def of the original type and set that to be the
+	       parameter.  */
+	    new_def = copy_ssa_name (def);
+	    set_ssa_promoted (new_def);
+	    set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	    copy_default_ssa (new_def, def);
+
+	    /* Now promote the def and copy the value from parameter.  */
+	    TREE_TYPE (def) = promoted_type;
+	    copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					     new_def, NULL_TREE);
+	    SSA_NAME_DEF_STMT (def) = copy_stmt;
+	    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	  }
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MIN_EXPR)
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    tree rhs = gimple_assign_rhs1 (def_stmt);
+	    if (!type_precision_ok (TREE_TYPE (rhs))
+		|| !INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+		|| (TYPE_UNSIGNED (TREE_TYPE (rhs)) != TYPE_UNSIGNED (promoted_type)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we traverse statements in dominator order, the
+		   arguments of def_stmt will be visited before def.  If RHS
+		   is already promoted and its type is compatible, we can
+		   convert the conversion into a ZERO/SIGN EXTEND stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted OR their types are not
+		   compatible, create a NOP_EXPR that converts
+		   RHS to the promoted DEF type and perform a
+		   ZERO/SIGN EXTEND to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (new_def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code,
+								 new_def, NULL_TREE);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and making it anonymous.  */
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (TREE_CODE_CLASS (code) == tcc_comparison
+		 || code == TRUNC_DIV_EXPR
+		 || code == CEIL_DIV_EXPR
+		 || code == FLOOR_DIV_EXPR
+		 || code == ROUND_DIV_EXPR
+		 || code == TRUNC_MOD_EXPR
+		 || code == CEIL_MOD_EXPR
+		 || code == FLOOR_MOD_EXPR
+		 || code == ROUND_MOD_EXPR
+		 || code == LSHIFT_EXPR
+		 || code == RSHIFT_EXPR
+		 || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    /* Promote the constant in comparison when other comparison
+	       operand is promoted.  All other constants are promoted as
+	       part of promoting definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	    /* In some stmts, the value in USE has to be ZERO/SIGN
+	       extended based on the original type for a correct
+	       result.  */
+	    tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	    gimple *copy_stmt =
+	      zero_sign_extend_stmt (temp, use,
+				     TYPE_UNSIGNED (old_type),
+				     TYPE_PRECISION (old_type));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (lhs, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (tobe_promoted_p (lhs))
+	      /* LHS will be promoted when its own definition is
+		 processed; nothing to do here.  */
+	      ;
+	    else
+	      {
+		do_not_promote = true;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, the value in USE has to be ZERO/SIGN
+	   extended based on the original type for a correct
+	   result.  */
+	tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	gimple *copy_stmt =
+	  zero_sign_extend_stmt (temp, use,
+				 TYPE_UNSIGNED (old_type),
+				 TYPE_PRECISION (old_type));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create a
+	 copy in the original type.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+/* Promote uses in GIMPLE_DEBUG stmts.  Do this separately to avoid generating
+   different sequences with and without -g.  This can happen when promoting
+   SSA names that are defined with GIMPLE_NOP.  */
+static void
+promote_debug_stmts ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_use (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-patch, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/5] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign-extends the first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-08  9:43                 ` Kugan
@ 2015-11-10 14:13                   ` Richard Biener
  2015-11-12  6:08                     ` Kugan
  2015-11-14  1:15                     ` Kugan
  0 siblings, 2 replies; 28+ messages in thread
From: Richard Biener @ 2015-11-10 14:13 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sun, Nov 8, 2015 at 10:43 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
> Thanks Richard for the comments.  Please find the attached patches which
> now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu  and
> ppc64-linux-gnu. Regression testing is ongoing. Please find the comments
> for your questions/suggestions below.
>
>>
>> I notice
>>
>> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
>> index 82fd4a1..80fcf70 100644
>> --- a/gcc/tree-ssanames.c
>> +++ b/gcc/tree-ssanames.c
>> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
>>    unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
>>
>>    /* Allocate if not available.  */
>> -  if (ri == NULL)
>> +  if (ri == NULL
>> +      || (precision != ri->get_min ().get_precision ()))
>>
>> and I think you need to clear range info on promoted SSA vars in the
>> promotion pass.
>
> Done.
>
>>
>> The basic "structure" thing still remains.  You walk over all uses and
>> defs in all stmts
>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
>> uses and defs which in turn promotes (the "def") and then fixes up all
>> uses in all stmts.
>
> Done.

Not exactly.  I still see

/* Promote all the stmts in the basic block.  */
static void
promote_all_stmts (basic_block bb)
{
  gimple_stmt_iterator gsi;
  ssa_op_iter iter;
  tree def, use;
  use_operand_p op;

  for (gphi_iterator gpi = gsi_start_phis (bb);
       !gsi_end_p (gpi); gsi_next (&gpi))
    {
      gphi *phi = gpi.phi ();
      def = PHI_RESULT (phi);
      promote_ssa (def, &gsi);

      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
        {
          use = USE_FROM_PTR (op);
          if (TREE_CODE (use) == SSA_NAME
              && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
            promote_ssa (use, &gsi);
          fixup_uses (phi, &gsi, op, use);
        }

you still call promote_ssa on both DEFs and USEs and promote_ssa looks
at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
on DEFs and fixup_uses on USEs.

Any reason you do not promote debug stmts during the DOM walk?

So for each DEF you record in ssa_name_info

struct ssa_name_info
{
  tree ssa;
  tree type;
  tree promoted_type;
};

(the fields need documenting).  Add a tree promoted_def to it which you
can replace any use of the DEF with.

Currently as you call promote_ssa for DEFs and USEs you repeatedly
overwrite the entry in ssa_name_info_map with a new copy.  So you
should assert it wasn't already there.

  switch (gimple_code (def_stmt))
    {
    case GIMPLE_PHI:
        {

the last { is indented too much it should be indented 2 spaces
relative to the 'case'


  SSA_NAME_RANGE_INFO (def) = NULL;

only needed in the case 'def' was promoted itself.  Please use
reset_flow_sensitive_info (def).

>>
>> Instead of this you should, in promote_all_stmts, walk over all uses doing what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +    case GIMPLE_NOP:
>> +       {
>> +         if (SSA_NAME_VAR (def) == NULL)
>> +           {
>> +             /* Promote def by fixing its type for anonymous def.  */
>> +             TREE_TYPE (def) = promoted_type;
>> +           }
>> +         else
>> +           {
>> +             /* Create a promoted copy of parameters.  */
>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> Done. I also had to do some changes to in couple of other places to
> reflect this.
> They are:
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>      {
>        tree arg = gimple_phi_arg_def (stmt, i);
>        if (TREE_CODE (arg) == SSA_NAME
> +         && SSA_NAME_VAR (arg)
>           && !SSA_NAME_IS_DEFAULT_DEF (arg))
>         {
>           gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
> @@ -434,7 +435,8 @@ get_rank (tree e)
>        if (gimple_code (stmt) == GIMPLE_PHI)
>         return phi_rank (stmt);
>
> -      if (!is_gimple_assign (stmt))
> +      if (!is_gimple_assign (stmt)
> +         && !gimple_nop_p (stmt))
>         return bb_rank[gimple_bb (stmt)->index];
>
> and
>
> --- a/gcc/tree-ssa.c
> +++ b/gcc/tree-ssa.c
> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
> use_operand_p use_p,
>    TREE_VISITED (ssa_name) = 1;
>
>    if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
> -      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
> +      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
> +         || SSA_NAME_VAR (ssa_name) == NULL))
>      ; /* Default definitions have empty statements.  Nothing to do.  */
>    else if (!def_bb)
>      {
>
> Does this look OK?

Hmm, no, this looks bogus.

I think the best thing to do is not promoting default defs at all and instead
promote at the uses.

              /* Create a promoted copy of parameters.  */
              bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
              gcc_assert (bb);
              gsi2 = gsi_after_labels (bb);
              new_def = copy_ssa_name (def);
              set_ssa_promoted (new_def);
              set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
              duplicate_default_ssa (new_def, def);
              TREE_TYPE (def) = promoted_type;

AFAIK this is just an awkward way of replacing all uses by a new DEF, sth
that should be supported by the machinery so that other default defs can just
do

             new_def = get_or_create_default_def (create_tmp_reg
(promoted_type));

and have all uses ('def') replaced by new_def.

>>
>> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT
>> +   itself.  */
>> +static bool
>> +safe_to_promote_def_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>> +  if (gimple_vuse (stmt) != NULL_TREE
>> +      || gimple_vdef (stmt) != NULL_TREE
>> +      || code == ARRAY_REF
>> +      || code == LROTATE_EXPR
>> +      || code == RROTATE_EXPR
>> +      || code == VIEW_CONVERT_EXPR
>> +      || code == BIT_FIELD_REF
>> +      || code == REALPART_EXPR
>> +      || code == IMAGPART_EXPR
>> +      || code == REDUC_MAX_EXPR
>> +      || code == REDUC_PLUS_EXPR
>> +      || code == REDUC_MIN_EXPR)
>> +    return false;
>> +  return true;
>>
>> huh, I think this function has an odd name, maybe
>> can_promote_operation ()?  Please
>> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.
>
> Done.
>
>>
>> Note that as followup things like the rotates should be "expanded" like
>> we'd do on RTL (open-coding the thing).  And we'd need a way to
>> specify zero-/sign-extended loads.
>>
>> +/* Return true if it is safe to promote the use in the STMT.  */
>> +static bool
>> +safe_to_promote_use_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>> +  tree lhs = gimple_assign_lhs (stmt);
>> +
>> +  if (gimple_vuse (stmt) != NULL_TREE
>> +      || gimple_vdef (stmt) != NULL_TREE
>>
>> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
>> _2 = a[i_3];
>>
> When I remove this, I see errors in stmts like:
>
> unsigned char
> unsigned int
> # .MEM_197 = VDEF <.MEM_187>
> fs_9(D)->fde_encoding = _154;

Yeah, as said a stmt based check is really bogus without context.  As the
predicate is only used in a single place it's better to inline it
there.  In this
case you want to handle loads/stores differently.  From this context it
looks like not iterating over uses in the caller but rather iterating over
uses here makes most sense as you then can do

   if (gimple_store_p (stmt))
     {
        promote all uses that are not gimple_assign_rhs1 ()
     }

you can also transparently handle constants for the cases where promoting
is required.  At the moment their handling is intertwined with the def promotion
code.  That makes the whole thing hard to follow.

Thanks,
Richard.

>
>> +      || code == VIEW_CONVERT_EXPR
>> +      || code == LROTATE_EXPR
>> +      || code == RROTATE_EXPR
>> +      || code == CONSTRUCTOR
>> +      || code == BIT_FIELD_REF
>> +      || code == COMPLEX_EXPR
>> +      || code == ASM_EXPR
>> +      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
>> +    return false;
>> +  return true;
>>
>> ASM_EXPR can never appear here.  I think PROMOTE_MODE never
>> promotes vector types - what cases did you need to add VECTOR_TYPE_P for?
>
> Done
>>
>> +/* Return true if the SSA_NAME has to be truncated to preserve the
>> +   semantics.  */
>> +static bool
>> +truncate_use_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>>
>> I think the description can be improved.  This is about stray bits set
>> beyond the original type, correct?
>>
>> Please use NOP_EXPR wherever you use CONVERT_EXPR right how.
>>
>> +                 if (TREE_CODE_CLASS (code)
>> +                     == tcc_comparison)
>> +                   promote_cst_in_stmt (stmt, promoted_type, true);
>>
>> don't you always need to promote constant operands?
>
> I am promoting all the constants. Here, I am promoting the constants
> that are part of the conditions.
>
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-03 14:40               ` Richard Biener
@ 2015-11-08  9:43                 ` Kugan
  2015-11-10 14:13                   ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-11-08  9:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5762 bytes --]


Thanks Richard for the comments.  Please find the attached patches which
now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu  and
ppc64-linux-gnu. Regression testing is ongoing. Please find the comments
for your questions/suggestions below.

> 
> I notice
> 
> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
> index 82fd4a1..80fcf70 100644
> --- a/gcc/tree-ssanames.c
> +++ b/gcc/tree-ssanames.c
> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
>    unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
> 
>    /* Allocate if not available.  */
> -  if (ri == NULL)
> +  if (ri == NULL
> +      || (precision != ri->get_min ().get_precision ()))
> 
> and I think you need to clear range info on promoted SSA vars in the
> promotion pass.

Done.

> 
> The basic "structure" thing still remains.  You walk over all uses and
> defs in all stmts
> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
> uses and defs which in turn promotes (the "def") and then fixes up all
> uses in all stmts.

Done.

> 
> Instead of this you should, in promote_all_stmts, walk over all uses doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
> 
> +    case GIMPLE_NOP:
> +       {
> +         if (SSA_NAME_VAR (def) == NULL)
> +           {
> +             /* Promote def by fixing its type for anonymous def.  */
> +             TREE_TYPE (def) = promoted_type;
> +           }
> +         else
> +           {
> +             /* Create a promoted copy of parameters.  */
> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> 
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

Done. I also had to do some changes to in couple of other places to
reflect this.
They are:
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
     {
       tree arg = gimple_phi_arg_def (stmt, i);
       if (TREE_CODE (arg) == SSA_NAME
+	  && SSA_NAME_VAR (arg)
 	  && !SSA_NAME_IS_DEFAULT_DEF (arg))
 	{
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
@@ -434,7 +435,8 @@ get_rank (tree e)
       if (gimple_code (stmt) == GIMPLE_PHI)
 	return phi_rank (stmt);

-      if (!is_gimple_assign (stmt))
+      if (!is_gimple_assign (stmt)
+	  && !gimple_nop_p (stmt))
 	return bb_rank[gimple_bb (stmt)->index];

and

--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
use_operand_p use_p,
   TREE_VISITED (ssa_name) = 1;

   if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
-      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
+      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
+	  || SSA_NAME_VAR (ssa_name) == NULL))
     ; /* Default definitions have empty statements.  Nothing to do.  */
   else if (!def_bb)
     {

Does this look OK?

> 
> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT
> +   itself.  */
> +static bool
> +safe_to_promote_def_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  if (gimple_vuse (stmt) != NULL_TREE
> +      || gimple_vdef (stmt) != NULL_TREE
> +      || code == ARRAY_REF
> +      || code == LROTATE_EXPR
> +      || code == RROTATE_EXPR
> +      || code == VIEW_CONVERT_EXPR
> +      || code == BIT_FIELD_REF
> +      || code == REALPART_EXPR
> +      || code == IMAGPART_EXPR
> +      || code == REDUC_MAX_EXPR
> +      || code == REDUC_PLUS_EXPR
> +      || code == REDUC_MIN_EXPR)
> +    return false;
> +  return true;
> 
> huh, I think this function has an odd name, maybe
> can_promote_operation ()?  Please
> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.

Done.

> 
> Note that as followup things like the rotates should be "expanded" like
> we'd do on RTL (open-coding the thing).  And we'd need a way to
> specify zero-/sign-extended loads.
> 
> +/* Return true if it is safe to promote the use in the STMT.  */
> +static bool
> +safe_to_promote_use_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  tree lhs = gimple_assign_lhs (stmt);
> +
> +  if (gimple_vuse (stmt) != NULL_TREE
> +      || gimple_vdef (stmt) != NULL_TREE
> 
> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
> _2 = a[i_3];
> 
When I remove this, I see errors in stmts like:

unsigned char
unsigned int
# .MEM_197 = VDEF <.MEM_187>
fs_9(D)->fde_encoding = _154;


> +      || code == VIEW_CONVERT_EXPR
> +      || code == LROTATE_EXPR
> +      || code == RROTATE_EXPR
> +      || code == CONSTRUCTOR
> +      || code == BIT_FIELD_REF
> +      || code == COMPLEX_EXPR
> +      || code == ASM_EXPR
> +      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
> +    return false;
> +  return true;
> 
> ASM_EXPR can never appear here.  I think PROMOTE_MODE never
> promotes vector types - what cases did you need to add VECTOR_TYPE_P for?

Done
> 
> +/* Return true if the SSA_NAME has to be truncated to preserve the
> +   semantics.  */
> +static bool
> +truncate_use_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> 
> I think the description can be improved.  This is about stray bits set
> beyond the original type, correct?
> 
> Please use NOP_EXPR wherever you use CONVERT_EXPR right how.
> 
> +                 if (TREE_CODE_CLASS (code)
> +                     == tcc_comparison)
> +                   promote_cst_in_stmt (stmt, promoted_type, true);
> 
> don't you always need to promote constant operands?

I am promoting all the constants. Here, I am promoting the constants
that are part of the conditions.


Thanks,
Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From a25f711713778cd3ed3d0976cc3f37d541479afb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..671a388 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 37083 bytes --]

From f1b226443b63eda75f38f204a0befa5578e6df0f Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/4] Add type promotion pass

---
 gcc/Makefile.in               |    1 +
 gcc/auto-profile.c            |    2 +-
 gcc/common.opt                |    4 +
 gcc/doc/invoke.texi           |   10 +
 gcc/gimple-ssa-type-promote.c | 1026 +++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |    1 +
 gcc/timevar.def               |    1 +
 gcc/tree-pass.h               |    1 +
 gcc/tree-ssa-reassoc.c        |    4 +-
 gcc/tree-ssa-uninit.c         |   23 +-
 gcc/tree-ssa.c                |    3 +-
 libiberty/cp-demangle.c       |    2 +-
 12 files changed, 1064 insertions(+), 14 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that generation of
+subregs in RTL is minimised, which in turn results in removal of
+redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..1d24566
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,1026 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that generation of subregs in RTL is minimised,
+   which in turn results in removal of redundant zero/sign extensions.  This
+   pass runs prior to VRP and DOM so that they can optimise redundant
+   truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+struct ssa_name_info
+{
+  tree ssa;
+  tree type;
+  tree promoted_type;
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Set ssa NAME will have higher bits if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+      if (gimple_code (def_stmt) == GIMPLE_NOP
+	  && SSA_NAME_VAR (name)
+	  && TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL)
+	return true;
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return true;
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple *stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple *stmt)
+{
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    default:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+    }
+  return changed;
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple *> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *phi = gsi_stmt (gsi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple *stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_assert (false);
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple *use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+/* Return true if the operation performed by STMT can be done directly in
+   the promoted type.  */
+static bool
+can_promote_operation_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || TREE_CODE_CLASS (code) == tcc_reference
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the promoted SSA_NAME used in STMT has to be truncated,
+   i.e. if stray bits set beyond the original type in the promoted mode
+   would change the semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines DEF
+   can produce a result of PROMOTED_TYPE, change the type of DEF in place.
+   Otherwise, create NEW_DEF of the original type, make the defining stmt
+   assign its value to NEW_DEF, and then create a CONVERT_EXPR that
+   converts NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+  if (!tobe_promoted_p (def))
+    return;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+  ssa_name_info *info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+						       sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						   new_def, NULL_TREE);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi2 = gsi_for_stmt (def_stmt);
+		  gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL
+	      || TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      if (SSA_NAME_VAR (def))
+		{
+		  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
+		  SSA_NAME_IS_DEFAULT_DEF (def) = 0;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		}
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi2 = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!can_promote_operation_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before def itself.  If RHS
+		     is already promoted and its type is compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		  tree type;
+		  if (info == NULL)
+		    type = TREE_TYPE (rhs);
+		  else
+		    type = info->type;
+		  if ((TYPE_PRECISION (original_type)
+		       > TYPE_PRECISION (type))
+		      || (TYPE_UNSIGNED (original_type)
+			  != TYPE_UNSIGNED (type)))
+		    {
+		      if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+			type = original_type;
+		      gcc_assert (type != NULL_TREE);
+		      TREE_TYPE (def) = promoted_type;
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi_replace (gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    }
+		}
+	      else
+		{
+		  /* If RHS is not promoted OR their types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi2 = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0
+		      || (gimple_code (def_stmt) == GIMPLE_CALL
+			  && gimple_call_ctrl_altering_p (def_stmt)))
+		    gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+					copy_stmt);
+		  else
+		    gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+	      }
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+
+  SSA_NAME_RANGE_INFO (def) = NULL;
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (gimple *stmt, gimple_stmt_iterator *gsi,
+	    use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+	{
+	  SET_USE (op, fold_convert (old_type, use));
+	  update_stmt (stmt);
+	}
+      break;
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+	{
+	  /* USE cannot be promoted here.  */
+	  do_not_promote = true;
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (stmt);
+	  tree lhs = gimple_assign_lhs (stmt);
+	  if (!safe_to_promote_use_p (stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (truncate_use_p (stmt)
+		   || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+	    {
+	      /* Promote the constant in a comparison when the other
+		 comparison operand is promoted.  All other constants are
+		 promoted as part of promoting the definition in promote_ssa.  */
+	      if (TREE_CODE_CLASS (code) == tcc_comparison)
+		promote_cst_in_stmt (stmt, promoted_type, true);
+	      if (!ssa_overflows_p (use))
+		break;
+	      /* In some stmts, value in USE has to be ZERO/SIGN
+		 Extended based on the original type for correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	      SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		{
+		  /* Type of LHS and promoted RHS are compatible, we can
+		     convert this into ZERO/SIGN EXTEND stmt.  */
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (lhs, use,
+					   TYPE_PRECISION (old_type));
+		  set_ssa_promoted (lhs);
+		  gsi_replace (gsi, copy_stmt, false);
+		}
+	      else if (tobe_promoted_p (lhs))
+		; /* LHS will be promoted later.  Nothing to do.  */
+	      else
+		{
+		  do_not_promote = true;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_COND:
+      if (ssa_overflows_p (use))
+	{
+	  /* In GIMPLE_COND, the value in USE has to be zero/sign
+	     extended based on the original type for a correct
+	     result.  */
+	  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple *copy_stmt =
+	    zero_sign_extend_stmt (temp, use,
+				   TYPE_PRECISION (old_type));
+	  gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	  SET_USE (op, temp);
+	}
+      promote_cst_in_stmt (stmt, promoted_type, true);
+      update_stmt (stmt);
+      break;
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create a copy in
+	 the original type.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_uses (phi, &gsi, op, use);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	    && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_uses (stmt, &gsi, op, use);
+	}
+    }
+}
+
+/* Fix up the uses in debug stmts, which are skipped by the main
+   promotion walk.  */
+static void
+promote_debug_stmts (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_uses (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
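As an illustration of the pass above: zero_sign_extend_stmt (defined earlier in the series, not shown in this hunk) builds a statement that re-extends a value from the original type's precision inside the wider promoted type. A standalone sketch of that operation in plain C; the helper names and the 32-bit promoted width are our assumptions, not GCC's code:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: sign-extend the low PREC bits of X to a full 32-bit value.
   Valid for 0 < prec <= 32.  */
static int32_t
sext_from (uint32_t x, unsigned prec)
{
  uint32_t sign = 1u << (prec - 1);
  if (prec < 32)
    x &= (1u << prec) - 1u;              /* keep only the low PREC bits */
  return (int32_t) ((x ^ sign) - sign);  /* propagate the sign bit upwards */
}

/* Sketch: zero-extend the low PREC bits of X.  */
static uint32_t
zext_from (uint32_t x, unsigned prec)
{
  return prec < 32 ? x & ((1u << prec) - 1u) : x;
}
```

This is the semantic contract the pass relies on when it inserts an extension before a use that needs the original-width value.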
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 45b8d46..07845e3 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
     {
       tree arg = gimple_phi_arg_def (stmt, i);
       if (TREE_CODE (arg) == SSA_NAME
+	  && SSA_NAME_VAR (arg)
 	  && !SSA_NAME_IS_DEFAULT_DEF (arg))
 	{
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
@@ -434,7 +435,8 @@ get_rank (tree e)
       if (gimple_code (stmt) == GIMPLE_PHI)
 	return phi_rank (stmt);
 
-      if (!is_gimple_assign (stmt))
+      if (!is_gimple_assign (stmt)
+	  && !gimple_nop_p (stmt))
 	return bb_rank[gimple_bb (stmt)->index];
 
       /* If we already have a rank for this expression, use that.  */
diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 3f7dbcf..93422ac 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -201,16 +201,19 @@ warn_uninitialized_vars (bool warn_possibly_uninitialized)
 	  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, op_iter, SSA_OP_USE)
 	    {
 	      use = USE_FROM_PTR (use_p);
-	      if (always_executed)
-		warn_uninit (OPT_Wuninitialized, use,
-			     SSA_NAME_VAR (use), SSA_NAME_VAR (use),
-			     "%qD is used uninitialized in this function",
-			     stmt, UNKNOWN_LOCATION);
-	      else if (warn_possibly_uninitialized)
-		warn_uninit (OPT_Wmaybe_uninitialized, use,
-			     SSA_NAME_VAR (use), SSA_NAME_VAR (use),
-			     "%qD may be used uninitialized in this function",
-			     stmt, UNKNOWN_LOCATION);
+	      if (SSA_NAME_VAR (use))
+		{
+		  if (always_executed)
+		    warn_uninit (OPT_Wuninitialized, use,
+				 SSA_NAME_VAR (use), SSA_NAME_VAR (use),
+				 "%qD is used uninitialized in this function",
+				 stmt, UNKNOWN_LOCATION);
+		  else if (warn_possibly_uninitialized)
+		    warn_uninit (OPT_Wmaybe_uninitialized, use,
+				 SSA_NAME_VAR (use), SSA_NAME_VAR (use),
+				 "%qD may be used uninitialized in this function",
+				 stmt, UNKNOWN_LOCATION);
+		}
 	    }
 
 	  /* For memory the only cheap thing we can do is see if we
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 4b869be..3e520fc 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, use_operand_p use_p,
   TREE_VISITED (ssa_name) = 1;
 
   if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
-      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
+      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
+	  || SSA_NAME_VAR (ssa_name) == NULL))
     ; /* Default definitions have empty statements.  Nothing to do.  */
   else if (!def_bb)
     {
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/4] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  Sign-extends the first operand from the
+   bit position specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1



* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-02  9:17             ` Kugan
@ 2015-11-03 14:40               ` Richard Biener
  2015-11-08  9:43                 ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-11-03 14:40 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Mon, Nov 2, 2015 at 10:17 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 29/10/15 02:45, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 1:50 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 23/10/15 01:23, Richard Biener wrote:
>>>>
>>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>>>
>>>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> This is a new version of the patch posted in
>>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>>>> more testing and split the patch to make it easier to review.
>>>>>>>> There are still a couple of issues to be addressed and I am working on
>>>>>>>> them.
>>>>>>>>
>>>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>>>> mis-compiled
>>>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>>>> latent
>>>>>>>> issue which gets exposed by my patch. I can also reproduce this in x86_64
>>>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>>>> time being, I am using  patch
>>>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>>>> committed.
>>>>>>>>
>>>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>>>> works
>>>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>>>> well.
>>>>>>>
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>>>> accepted for stage1. I will try my best to address your review comments
>>>>>>> ASAP.
>>>>>>
>>>>>>
>>>>>> Ok, can you make the whole patch series available so I can poke at the
>>>>>> implementation a bit?  Please state the revision it was rebased on
>>>>>> (or point me to a git/svn branch the work resides on).
>>>>>>
>>>>>
>>>>> Thanks. Please find the patches rebased against trunk@229156. I have
>>>>> skipped the test-case readjustment patches.
>>>>
>>>>
>>>> Some quick observations.  On x86_64 when building
>>>
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>>>
>>>> short bar (short y);
>>>> int foo (short x)
>>>> {
>>>>    short y = bar (x) + 15;
>>>>    return y;
>>>> }
>>>>
>>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>>>> I get
>>>>
>>>>    <bb 2>:
>>>>    _1 = (int) x_10(D);
>>>>    _2 = (_1) sext (16);
>>>>    _11 = bar (_2);
>>>>    _5 = (int) _11;
>>>>    _12 = (unsigned int) _5;
>>>>    _6 = _12 & 65535;
>>>>    _7 = _6 + 15;
>>>>    _13 = (int) _7;
>>>>    _8 = (_13) sext (16);
>>>>    _9 = (_8) sext (16);
>>>>    return _9;
>>>>
>>>> which looks fine but the VRP optimization doesn't trigger for the
>>>> redundant sext
>>>> (ranges are computed correctly but the 2nd extension is not removed).
>
> Thanks for the comments. Please find the attached patches with which I
> am now getting
> cat .192t.optimized
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
> symbol_order=0)
>
> foo (short int x)
> {
>   signed int _1;
>   int _2;
>   signed int _5;
>   unsigned int _6;
>   unsigned int _7;
>   signed int _8;
>   int _9;
>   short int _11;
>   unsigned int _12;
>   signed int _13;
>
>   <bb 2>:
>   _1 = (signed int) x_10(D);
>   _2 = _1;
>   _11 = bar (_2);
>   _5 = (signed int) _11;
>   _12 = (unsigned int) _11;
>   _6 = _12 & 65535;
>   _7 = _6 + 15;
>   _13 = (signed int) _7;
>   _8 = (_13) sext (16);
>   _9 = _8;
>   return _9;
>
> }
>
>
> There are still some redundancies. The asm difference after RTL
> optimizations is
>
> -       addl    $15, %eax
> +       addw    $15, %ax
>
>
>>>>
>>>> This also makes me notice trivial match.pd patterns are missing, like
>>>> for example
>>>>
>>>> (simplify
>>>>   (sext (sext@2 @0 @1) @3)
>>>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>>>    @2
>>>>    (sext @0 @3)))
>>>>
>>>> as VRP doesn't run at -O1 we must rely on those to remove redundant
>>>> extensions,
>>>> otherwise generated code might get worse compared to without the pass(?)
>>>
>>>
>>> Do you think that we should enable this pass only when VRP is enabled?
>>> Otherwise, even when we do the simple optimizations you mentioned below, we
>>> might not be able to remove all the redundancies.
>>>
>>>>
>>>> I also notice that the 'short' argument does not get its sign-extension
>>>> removed
>>>> as redundant either even though we have
>>>>
>>>> _1 = (int) x_8(D);
>>>> Found new range for _1: [-32768, 32767]
>>>>
>>>
>>> I am looking into it.
>>>
>>>> In the end I suspect that keeping track of the "simple" cases in the
>>>> promotion
>>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>>>> to do
>>>> its work).  In some way whether the ABI guarantees promoted argument
>>>> registers might need some other target hook queries.
>
> I tried adding it in the attached patch with record_visit_stmt to track
> whether an SSA name's value would overflow or be properly zero/sign-extended
> in promoted mode. We can use this to eliminate some of the zero/sign
> extension at gimple level. As it is, it doesn't do much. If this is what
> you had in mind, I will extend it based on your feedback.
>
>
>>>>
>>>> Now onto the 0002 patch.
>>>>
>>>> +static bool
>>>> +type_precision_ok (tree type)
>>>> +{
>>>> +  return (TYPE_PRECISION (type)  == 8
>>>> +         || TYPE_PRECISION (type) == 16
>>>> +         || TYPE_PRECISION (type) == 32);
>>>> +}
>>>>
>>>> that's a weird function to me.  You probably want
>>>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>>>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>>>
>>>
>>> I will change this. (I have a patch which I am testing with other changes
>>> you have asked for)
>>>
>>>
>>>> +/* Return the promoted type for TYPE.  */
>>>> +static tree
>>>> +get_promoted_type (tree type)
>>>> +{
>>>> +  tree promoted_type;
>>>> +  enum machine_mode mode;
>>>> +  int uns;
>>>> +  if (POINTER_TYPE_P (type)
>>>> +      || !INTEGRAL_TYPE_P (type)
>>>> +      || !type_precision_ok (type))
>>>> +    return type;
>>>> +
>>>> +  mode = TYPE_MODE (type);
>>>> +#ifdef PROMOTE_MODE
>>>> +  uns = TYPE_SIGN (type);
>>>> +  PROMOTE_MODE (mode, uns, type);
>>>> +#endif
>>>> +  uns = TYPE_SIGN (type);
>>>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>>>> +  if (promoted_type
>>>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>>>> +    type = promoted_type;
>>>>
>>>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>>>> == GET_MODE_PRECISION (mode).
>>>> And to not even bother with this simply use
>>>>
>>>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>>>> uns);
>>>>
>>>
>>> I am changing this too.
>>>
>>>> You use a domwalk but also might create new basic-blocks during it
>>>> (insert_on_edge_immediate), that's a
>>>> no-no, commit edge inserts after the domwalk.
>>>
>>>
>>> I am sorry, I don't understand "commit edge inserts after the domwalk".  Is
>>> there a way to do this in the current implementation?
>>
>> Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
>> gsi_commit_edge_inserts ().
>>
>>>> ssa_sets_higher_bits_bitmap looks unused and
>>>> we generally don't free dominance info, so please don't do that.
>>>>
>>>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>>>> with
>>>>
>>>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>>>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>>>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>>>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>>>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>>>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>>>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>>>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>>>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>>>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>>>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>>>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>>>> -I../../../trunk/libgcc/../libdecnumber/dpd
>>>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>>>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>>>> ../../../trunk/libgcc/libgcc2.c \
>>>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>>>> -DHIDE_EXPORTS
>>>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>>>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>>>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>>>> expand_debug_locations, at cfgexpand.c:5277
>>>>
>
> With the attached patch, now I am running into Bootstrap comparison
> failure. I am looking into it. Please review this version so that I can
> address them while fixing this issue.

I notice

diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));

   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))

and I think you need to clear range info on promoted SSA vars in the
promotion pass.

The basic "structure" thing still remains.  You walk over all uses and
defs in all stmts
in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
uses and defs which in turn promotes (the "def") and then fixes up all
uses in all stmts.

Instead of this you should, in promote_all_stmts, walk over all uses doing what
fixup_uses does and then walk over all defs, doing what promote_ssa does.

+    case GIMPLE_NOP:
+       {
+         if (SSA_NAME_VAR (def) == NULL)
+           {
+             /* Promote def by fixing its type for anonymous def.  */
+             TREE_TYPE (def) = promoted_type;
+           }
+         else
+           {
+             /* Create a promoted copy of parameters.  */
+             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));

I think the uninitialized vars are somewhat tricky and it would be best
to create a new uninit anonymous SSA name for them.  You can
have SSA_NAME_VAR != NULL and def _not_ being a parameter
btw.

+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;

huh, I think this function has an odd name, maybe
can_promote_operation ()?  Please
use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.

Note that as followup things like the rotates should be "expanded" like
we'd do on RTL (open-coding the thing).  And we'd need a way to
specify zero-/sign-extended loads.
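To illustrate the open-coding suggestion: a narrow rotate cannot simply run in the promoted width, because bits rotated out of the top of the narrow type would land in the promoted type's upper half instead of wrapping around. Expanded with shifts in the original width, it would look roughly like this (a sketch of the semantics only, not the GIMPLE the pass would emit):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: a 16-bit rotate-left carried out in a 32-bit register.
   The masks keep the computation confined to the original 16 bits.  */
static uint32_t
rotl16_promoted (uint32_t x, unsigned n)
{
  x &= 0xFFFFu;                          /* original 16-bit value */
  n &= 15u;
  return ((x << n) | (x >> ((16u - n) & 15u))) & 0xFFFFu;
}
```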

+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE

I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
_2 = a[i_3];

+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;

ASM_EXPR can never appear here.  I think PROMOTE_MODE never
promotes vector types - what cases did you need to add VECTOR_TYPE_P for?

+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);

I think the description can be improved.  This is about stray bits set
beyond the original type, correct?
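An example of those stray bits, under the assumption of 16-bit arithmetic promoted into 32-bit registers: a carry out of bit 15 that the original type would have discarded survives in the promoted register, so any use that observes the full value needs an explicit truncation first.

```c
#include <assert.h>
#include <stdint.h>

/* In the original 16-bit type the add wraps...  */
static uint16_t
add16 (uint16_t a, uint16_t b)
{
  return (uint16_t) (a + b);
}

/* ...but performed in a promoted 32-bit register the carry is left
   in bit 16; a later use must mask it away to match add16.  */
static uint32_t
add16_promoted (uint32_t a, uint32_t b)
{
  return a + b;                          /* stray high bits possible */
}
```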

Please use NOP_EXPR wherever you use CONVERT_EXPR right now.

+                 if (TREE_CODE_CLASS (code)
+                     == tcc_comparison)
+                   promote_cst_in_stmt (stmt, promoted_type, true);

don't you always need to promote constant operands?
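The signed-comparison case shows why this matters: once one operand is promoted, a constant left at its narrow bit pattern compares against the wrong value. A minimal illustration in plain C; the helper names are ours, and the promoted operand is assumed to be properly sign-extended:

```c
#include <assert.h>
#include <stdint.h>

/* X models a 'signed char' promoted into a 32-bit register, i.e. it
   is already sign-extended.  Comparing against the raw 8-bit pattern
   of -1 (0xFF) then gives the wrong answer...  */
static int
cmp_raw_constant (int32_t x)
{
  return x == 0xFF;
}

/* ...so the constant must be promoted (sign-extended) as well.  */
static int
cmp_promoted_constant (int32_t x)
{
  return x == -1;
}
```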

Richard.


* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-28 15:51           ` Richard Biener
@ 2015-11-02  9:17             ` Kugan
  2015-11-03 14:40               ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-11-02  9:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 12098 bytes --]



On 29/10/15 02:45, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 1:50 AM, kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 23/10/15 01:23, Richard Biener wrote:
>>>
>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>>
>>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>>
>>>>>>>
>>>>>>> This is a new version of the patch posted in
>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>>> more testing and split the patch to make it easier to review.
>>>>>>> There are still a couple of issues to be addressed and I am working on
>>>>>>> them.
>>>>>>>
>>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>>> mis-compiled
>>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>>> latent
>>>>>>> issue which gets exposed by my patch. I can also reproduce this in x86_64
>>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>>> time being, I am using  patch
>>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>>> committed.
>>>>>>>
>>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>>> works
>>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>>> well.
>>>>>>
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>>> accepted for stage1. I will try my best to address your review comments
>>>>>> ASAP.
>>>>>
>>>>>
>>>>> Ok, can you make the whole patch series available so I can poke at the
>>>>> implementation a bit?  Please state the revision it was rebased on
>>>>> (or point me to a git/svn branch the work resides on).
>>>>>
>>>>
>>>> Thanks. Please find the patches rebased against trunk@229156. I have
>>>> skipped the test-case readjustment patches.
>>>
>>>
>>> Some quick observations.  On x86_64 when building
>>
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>>>
>>> short bar (short y);
>>> int foo (short x)
>>> {
>>>    short y = bar (x) + 15;
>>>    return y;
>>> }
>>>
>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>>> I get
>>>
>>>    <bb 2>:
>>>    _1 = (int) x_10(D);
>>>    _2 = (_1) sext (16);
>>>    _11 = bar (_2);
>>>    _5 = (int) _11;
>>>    _12 = (unsigned int) _5;
>>>    _6 = _12 & 65535;
>>>    _7 = _6 + 15;
>>>    _13 = (int) _7;
>>>    _8 = (_13) sext (16);
>>>    _9 = (_8) sext (16);
>>>    return _9;
>>>
>>> which looks fine but the VRP optimization doesn't trigger for the
>>> redundant sext
>>> (ranges are computed correctly but the 2nd extension is not removed).

Thanks for the comments. Please find the attached patches with which I
am now getting
cat .192t.optimized

;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
symbol_order=0)

foo (short int x)
{
  signed int _1;
  int _2;
  signed int _5;
  unsigned int _6;
  unsigned int _7;
  signed int _8;
  int _9;
  short int _11;
  unsigned int _12;
  signed int _13;

  <bb 2>:
  _1 = (signed int) x_10(D);
  _2 = _1;
  _11 = bar (_2);
  _5 = (signed int) _11;
  _12 = (unsigned int) _11;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (signed int) _7;
  _8 = (_13) sext (16);
  _9 = _8;
  return _9;

}


There are still some redundancies.  The asm difference after the RTL
optimizations is:

-	addl	$15, %eax
+	addw	$15, %ax


>>>
>>> This also makes me notice trivial match.pd patterns are missing, like
>>> for example
>>>
>>> (simplify
>>>   (sext (sext@2 @0 @1) @3)
>>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>>    @2
>>>    (sext @0 @3)))
>>>
>>> as VRP doesn't run at -O1 we must rely on those to remove redundant
>>> extensions,
>>> otherwise generated code might get worse compared to without the pass(?)
>>
>>
>> Do you think that we should enable this pass only when VRP is enabled?
>> Otherwise, even when we do the simple optimizations you mentioned below, we
>> might not be able to remove all the redundancies.
>>
>>>
>>> I also notice that the 'short' argument does not get its sign-extension
>>> removed
>>> as redundant either, even though we have
>>>
>>> _1 = (int) x_8(D);
>>> Found new range for _1: [-32768, 32767]
>>>
>>
>> I am looking into it.
>>
>>> In the end I suspect that keeping track of the "simple" cases in the
>>> promotion
>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>>> to do
>>> its work).  In some way whether the ABI guarantees promoted argument
>>> registers might need some other target hook queries.

I tried adding it in the attached patch, with record_visit_stmt to track
whether an SSA name would have its value overflow or be properly zero/sign
extended in the promoted mode. We can use this to eliminate some of the
zero/sign extensions at the GIMPLE level. As it is, it doesn't do much. If
this is what you had in mind, I will extend it based on your feedback.


>>>
>>> Now onto the 0002 patch.
>>>
>>> +static bool
>>> +type_precision_ok (tree type)
>>> +{
>>> +  return (TYPE_PRECISION (type)  == 8
>>> +         || TYPE_PRECISION (type) == 16
>>> +         || TYPE_PRECISION (type) == 32);
>>> +}
>>>
>>> that's a weird function to me.  You probably want
>>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>>
>>
>> I will change this. (I have a patch which I am testing with other changes
>> you have asked for)
>>
>>
>>> +/* Return the promoted type for TYPE.  */
>>> +static tree
>>> +get_promoted_type (tree type)
>>> +{
>>> +  tree promoted_type;
>>> +  enum machine_mode mode;
>>> +  int uns;
>>> +  if (POINTER_TYPE_P (type)
>>> +      || !INTEGRAL_TYPE_P (type)
>>> +      || !type_precision_ok (type))
>>> +    return type;
>>> +
>>> +  mode = TYPE_MODE (type);
>>> +#ifdef PROMOTE_MODE
>>> +  uns = TYPE_SIGN (type);
>>> +  PROMOTE_MODE (mode, uns, type);
>>> +#endif
>>> +  uns = TYPE_SIGN (type);
>>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>>> +  if (promoted_type
>>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>>> +    type = promoted_type;
>>>
>>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>>> == GET_MODE_PRECISION (mode).
>>> And to not even bother with this simply use
>>>
>>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>>> uns);
>>>
>>
>> I am changing this too.
>>
>>> You use a domwalk but also might create new basic-blocks during it
>>> (insert_on_edge_immediate), that's a
>>> no-no, commit edge inserts after the domwalk.
>>
>>
>> I am sorry, I don't understand "commit edge inserts after the domwalk".  Is
>> there a way to do this in the current implementation?
> 
> Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
> gsi_commit_edge_inserts ().
> 
>>> ssa_sets_higher_bits_bitmap looks unused and
>>> we generally don't free dominance info, so please don't do that.
>>>
>>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>>> with
>>>
>>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>>> -I../../../trunk/libgcc/../libdecnumber/dpd
>>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>>> ../../../trunk/libgcc/libgcc2.c \
>>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>>> -DHIDE_EXPORTS
>>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>>> expand_debug_locations, at cfgexpand.c:5277
>>>

With the attached patch, I am now running into a bootstrap comparison
failure, which I am looking into. Please review this version so that I can
address your comments while fixing this issue.

Thanks,
Kugan

>>
>> I am testing on the GCC compile farm. I will get it to bootstrap and will do
>> the regression testing before posting the next version.
>>
>>> as hinted at above a bootstrap on i?86 (yes, 32bit) with
>>> --with-tune=pentiumpro might be another good testing candidate.
>>>
>>> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE |
>>> SSA_OP_DEF)
>>> +       promote_def_and_uses (def);
>>>
>>> it looks like you are doing some redundant work by walking both defs
>>> and uses of each stmt.  I'd say you should separate
>>> def and use processing and use
>>>
>>>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>>>      promote_use (use);
>>>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>>>      promote_def (def);
>>>
>>
>> The name promote_def_and_uses in my implementation is a bit confusing.  It
>> is promoting the SSA_NAMEs; we only have to do that for the definitions,
>> plus the SSA_NAMEs defined by parameters.
>>
>> I also have a bitmap to see if we have promoted a variable and avoid doing
>> it again. I will try to improve this.
>>
>>
>>
>>> this should make processing more efficient (memory local) compared to
>>> doing the split handling
>>> in promote_def_and_uses.
>>>
>>> I think it will be convenient to have a SSA name info structure where
>>> you can remember the original
>>> type a name was promoted from as well as whether it was promoted or
>>> not.  This way adjusting
>>> debug uses should be "trivial":
>>>
>>> +static unsigned int
>>> +fixup_uses (tree use, tree promoted_type, tree old_type)
>>> +{
>>> +  gimple *stmt;
>>> +  imm_use_iterator ui;
>>> +  gimple_stmt_iterator gsi;
>>> +  use_operand_p op;
>>> +
>>> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
>>> +    {
>>> +      bool do_not_promote = false;
>>> +      switch (gimple_code (stmt))
>>> +       {
>>> +       case GIMPLE_DEBUG:
>>> +           {
>>> +             gsi = gsi_for_stmt (stmt);
>>> +             gsi_remove (&gsi, true);
>>>
>>> rather than doing the above you'd do sth like
>>>
>>>    SET_USE (use, fold_convert (old_type, new_def));
>>>    update_stmt (stmt);
>>>
>>
>> We do have this information (the original type a name was promoted from, as
>> well as whether it was promoted or not).  To make it easy to review, in the
>> patch that adds the pass I am removing these debug stmts.  But in patch 4,
>> I am trying to handle this properly.  Maybe I should combine them.
> 
> Yeah, it's a bit confusing otherwise.
> 
>>> note that while you may not be able to use promoted regs at all uses
>>> (like calls or asms) you can promote all defs, if only with a compensation
>>> statement after the original def.  The SSA name info struct can be used
>>> to note down the actual SSA name holding the promoted def.
>>>
>>> The pass looks a lot better than last time (it's way smaller!) but
>>> still needs some
>>> improvements.  There are some more fishy details with respect to how you
>>> allocate/change SSA names but I think those can be dealt with once the
>>> basic structure looks how I like it to be.
>>>
>>
>> I will post an updated patch in a day or two.
> 
> Thanks,
> Richard.
> 
>> Thanks again,
>> Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From 355a6ebe7cc2548417e2e4976b842fbbf5e93224 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/3] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..671a388 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 33011 bytes --]

From 8b2256e4787adb05ac9c439ef54d5befe035915d Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/3] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 997 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 1017 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea
+of this pass is to promote operations in such a way that we can
+minimise the generation of subregs in RTL, which in turn results in
+the removal of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..2831fec
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,997 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they can optimise
+   redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Set ssa NAME will have higher bits if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return true;
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple *stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple *stmt)
+{
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    default:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+    }
+  return changed;
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple *> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *phi = gsi_stmt (gsi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple *stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_assert (false);
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple *use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that, result of the def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to newdef.
+   Then, create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def,
+	     tree promoted_type)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before visiting def.  If RHS
+		     is already promoted and the types are compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if ((TYPE_PRECISION (original_type) > TYPE_PRECISION (type))
+		      || (TYPE_UNSIGNED (original_type) != TYPE_UNSIGNED (type)))
+		    {
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      if (type == NULL_TREE)
+			type = TREE_TYPE (rhs);
+		      if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+			type = original_type;
+		      gcc_assert (type != NULL_TREE);
+		      TREE_TYPE (def) = promoted_type;
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    }
+		}
+	      else
+		{
+		  /* If RHS is not promoted or their types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0
+		      || (gimple_code (def_stmt) == GIMPLE_CALL
+			  && gimple_call_ctrl_altering_p (def_stmt)))
+		    gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+						  copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	      }
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+				      copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We can be intelligent in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, fold_convert (old_type, use));
+	      update_stmt (stmt);
+	    }
+	  break;
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt)
+			 || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+		{
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  if (!ssa_overflows_p (use))
+		    break;
+		  /* In some stmts, the value in USE has to be
+		     zero/sign-extended based on the original type
+		     for a correct result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  if (!type_precision_ok (TREE_TYPE (rhs)))
+		    {
+		      do_not_promote = true;
+		    }
+		  else if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = TREE_TYPE (old_type);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	  if (ssa_overflows_p (use))
+	    {
+	      /* In GIMPLE_COND, the value in USE has to be
+		 zero/sign-extended based on the original type
+		 for a correct result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	    }
+	  promote_cst_in_stmt (stmt, promoted_type, true);
+	  update_stmt (stmt);
+	  break;
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create an
+	     original type copy.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+void debug_tree (tree);
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_ssa_if_not_promoted (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_ssa (name, type);
+      set_ssa_promoted (name);
+      fixup_uses (name, type, old_type);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_ssa_if_not_promoted (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_ssa_if_not_promoted (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_ssa_if_not_promoted (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_commit_edge_inserts ();
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/3] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign-extends the first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-27  1:48         ` kugan
@ 2015-10-28 15:51           ` Richard Biener
  2015-11-02  9:17             ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-10-28 15:51 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Tue, Oct 27, 2015 at 1:50 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 23/10/15 01:23, Richard Biener wrote:
>>
>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>>
>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>
>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>
>>>>>>
>>>>>> This a new version of the patch posted in
>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>> more testing and split the patch to make it easier to review.
>>>>>> There are still a couple of issues to be addressed and I am working on
>>>>>> them.
>>>>>>
>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>> mis-compiled
>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>> latent
>>>>>> issue which gets exposed by my patch. I can also reproduce this in x86_64
>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>> time being, I am using  patch
>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>> committed.
>>>>>>
>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>> works
>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>> well.
>>>>>
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>> accepted for stage1. I will try my best to address your review comments
>>>>> ASAP.
>>>>
>>>>
>>>> Ok, can you make the whole patch series available so I can poke at the
>>>> implementation a bit?  Please state the revision it was rebased on
>>>> (or point me to a git/svn branch the work resides on).
>>>>
>>>
>>> Thanks. Please find the patches rebased against trunk@229156. I have
>>> skipped the test-case readjustment patches.
>>
>>
>> Some quick observations.  On x86_64 when building
>
>
> Hi Richard,
>
> Thanks for the review.
>
>>
>> short bar (short y);
>> int foo (short x)
>> {
>>    short y = bar (x) + 15;
>>    return y;
>> }
>>
>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>> I get
>>
>>    <bb 2>:
>>    _1 = (int) x_10(D);
>>    _2 = (_1) sext (16);
>>    _11 = bar (_2);
>>    _5 = (int) _11;
>>    _12 = (unsigned int) _5;
>>    _6 = _12 & 65535;
>>    _7 = _6 + 15;
>>    _13 = (int) _7;
>>    _8 = (_13) sext (16);
>>    _9 = (_8) sext (16);
>>    return _9;
>>
>> which looks fine but the VRP optimization doesn't trigger for the
>> redundant sext
>> (ranges are computed correctly but the 2nd extension is not removed).
>>
>> This also makes me notice trivial match.pd patterns are missing, like
>> for example
>>
>> (simplify
>>   (sext (sext@2 @0 @1) @3)
>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>    @2
>>    (sext @0 @3)))
>>
>> as VRP doesn't run at -O1 we must rely on those to remove redundant
>> extensions,
>> otherwise generated code might get worse compared to without the pass(?)
>
>
> Do you think that we should enable this pass only when VRP is enabled?
> Otherwise, even when we do the simple optimizations you mentioned below, we
> might not be able to remove all the redundancies.
>
>>
>> I also notice that the 'short' argument does not get its sign-extension
>> removed
>> as redundant either even though we have
>>
>> _1 = (int) x_8(D);
>> Found new range for _1: [-32768, 32767]
>>
>
> I am looking into it.
>
>> In the end I suspect that keeping track of the "simple" cases in the
>> promotion
>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>> to do
>> its work).  In some way whether the ABI guarantees promoted argument
>> registers might need some other target hook queries.
>>
>> Now onto the 0002 patch.
>>
>> +static bool
>> +type_precision_ok (tree type)
>> +{
>> +  return (TYPE_PRECISION (type)  == 8
>> +         || TYPE_PRECISION (type) == 16
>> +         || TYPE_PRECISION (type) == 32);
>> +}
>>
>> that's a weird function to me.  You probably want
>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>
>
> I will change this. (I have a patch which I am testing with other changes
> you have asked for)
>
>
>> +/* Return the promoted type for TYPE.  */
>> +static tree
>> +get_promoted_type (tree type)
>> +{
>> +  tree promoted_type;
>> +  enum machine_mode mode;
>> +  int uns;
>> +  if (POINTER_TYPE_P (type)
>> +      || !INTEGRAL_TYPE_P (type)
>> +      || !type_precision_ok (type))
>> +    return type;
>> +
>> +  mode = TYPE_MODE (type);
>> +#ifdef PROMOTE_MODE
>> +  uns = TYPE_SIGN (type);
>> +  PROMOTE_MODE (mode, uns, type);
>> +#endif
>> +  uns = TYPE_SIGN (type);
>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>> +  if (promoted_type
>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>> +    type = promoted_type;
>>
>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>> == GET_MODE_PRECISION (mode).
>> And to not even bother with this simply use
>>
>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>> uns);
>>
>
> I am changing this too.
>
>> You use a domwalk but also might create new basic-blocks during it
>> (insert_on_edge_immediate), that's a
>> no-no, commit edge inserts after the domwalk.
>
>
> I am sorry, I don't understand "commit edge inserts after the domwalk".  Is
> there a way to do this in the current implementation?

Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
gsi_commit_edge_inserts ().
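For reference, the queue-then-commit pattern being described here, which the patch's execute_type_promotion already follows, looks roughly like this (GCC-internals pseudocode, not standalone-compilable):

```c
/* Inside the dominator walk: queue the compensation copy on the edge
   instead of inserting (and possibly splitting the edge) immediately.  */
gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), copy_stmt);

/* After the whole dominator walk has finished:  */
type_promotion_dom_walker (CDI_DOMINATORS)
  .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
gsi_commit_edge_inserts ();	/* edges are split / blocks created here */
```

This keeps the CFG (and dominance info) stable for the duration of the walk; any edge splitting happens only in the commit step.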

>> ssa_sets_higher_bits_bitmap looks unused and
>> we generally don't free dominance info, so please don't do that.
>>
>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>> with
>>
>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>> -I../../../trunk/libgcc/../libdecnumber/dpd
>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>> ../../../trunk/libgcc/libgcc2.c \
>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>> -DHIDE_EXPORTS
>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>> expand_debug_locations, at cfgexpand.c:5277
>>
>
> I am testing on gcc computefarm. I will get it to bootstrap and will do the
> regression testing before posting the next version.
>
>> as hinted at above a bootstrap on i?86 (yes, 32bit) with
>> --with-tune=pentiumpro might be another good testing candidate.
>>
>> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE |
>> SSA_OP_DEF)
>> +       promote_def_and_uses (def);
>>
>> it looks like you are doing some redundant work by walking both defs
>> and uses of each stmt.  I'd say you should separate
>> def and use processing and use
>>
>>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>>      promote_use (use);
>>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>>      promote_def (def);
>>
>
> Name promote_def_and_uses in my implementation is a bit confusing. It is
> promoting the SSA_NAMEs. We only have to do that for the definitions if we
> can do the SSA_NAMEs defined by parameters.
>
> I also have a bitmap to see if we have promoted a variable and avoid doing
> it again. I will try to improve this.
>
>
>
>> this should make processing more efficient (memory local) compared to
>> doing the split handling
>> in promote_def_and_uses.
>>
>> I think it will be convenient to have a SSA name info structure where
>> you can remember the original
>> type a name was promoted from as well as whether it was promoted or
>> not.  This way adjusting
>> debug uses should be "trivial":
>>
>> +static unsigned int
>> +fixup_uses (tree use, tree promoted_type, tree old_type)
>> +{
>> +  gimple *stmt;
>> +  imm_use_iterator ui;
>> +  gimple_stmt_iterator gsi;
>> +  use_operand_p op;
>> +
>> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
>> +    {
>> +      bool do_not_promote = false;
>> +      switch (gimple_code (stmt))
>> +       {
>> +       case GIMPLE_DEBUG:
>> +           {
>> +             gsi = gsi_for_stmt (stmt);
>> +             gsi_remove (&gsi, true);
>>
>> rather than doing the above you'd do sth like
>>
>>    SET_USE (use, fold_convert (old_type, new_def));
>>    update_stmt (stmt);
>>
>
> We do have this information (the original type a name was promoted from, as
> well as whether it was promoted or not). To make it easy to review, in the
> patch that adds the pass, I am removing these debug stmts. But in patch 4, I
> am trying to handle this properly. Maybe I should combine them.

Yeah, it's a bit confusing otherwise.

>> note that while you may not be able to use promoted regs at all uses
>> (like calls or asms) you can promote all defs, if only with a compensation
>> statement after the original def.  The SSA name info struct can be used
>> to note down the actual SSA name holding the promoted def.
>>
>> The pass looks a lot better than last time (it's way smaller!) but
>> still needs some
>> improvements.  There are some more fishy details with respect to how you
>> allocate/change SSA names but I think those can be dealt with once the
>> basic structure looks how I like it to be.
>>
>
> I will post an updated patch in a day or two.

Thanks,
Richard.

> Thanks again,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 14:24       ` Richard Biener
@ 2015-10-27  1:48         ` kugan
  2015-10-28 15:51           ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: kugan @ 2015-10-27  1:48 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches



On 23/10/15 01:23, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 21/10/15 23:45, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>> This a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and split the patch to make it easier to review.
>>>>> There are still a couple of issues to be addressed and I am working on them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>>> issue which gets exposed by my patch. I can also reproduce this in x86_64
>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>> time being, I am using  patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This needs to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage1. I will try my best to address your review comments
>>>> ASAP.
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit?  Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>
>> Thanks. Please find the patches rebased against trunk@229156. I have
>> skipped the test-case readjustment patches.
>
> Some quick observations.  On x86_64 when building

Hi Richard,

Thanks for the review.
>
> short bar (short y);
> int foo (short x)
> {
>    short y = bar (x) + 15;
>    return y;
> }
>
> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
> I get
>
>    <bb 2>:
>    _1 = (int) x_10(D);
>    _2 = (_1) sext (16);
>    _11 = bar (_2);
>    _5 = (int) _11;
>    _12 = (unsigned int) _5;
>    _6 = _12 & 65535;
>    _7 = _6 + 15;
>    _13 = (int) _7;
>    _8 = (_13) sext (16);
>    _9 = (_8) sext (16);
>    return _9;
>
> which looks fine but the VRP optimization doesn't trigger for the redundant sext
> (ranges are computed correctly but the 2nd extension is not removed).
>
> This also makes me notice trivial match.pd patterns are missing, like
> for example
>
> (simplify
>   (sext (sext@2 @0 @1) @3)
>   (if (tree_int_cst_compare (@1, @3) <= 0)
>    @2
>    (sext @0 @3)))
>
> as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,
> otherwise generated code might get worse compared to without the pass(?)

Do you think we should enable this pass only when VRP is enabled? 
Otherwise, even when we do the simple optimizations you mention, 
we might not be able to remove all the redundancies.

>
> I also notice that the 'short' argument does not get it's sign-extension removed
> as redundand either even though we have
>
> _1 = (int) x_8(D);
> Found new range for _1: [-32768, 32767]
>

I am looking into it.

> In the end I suspect that keeping track of the "simple" cases in the promotion
> pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
> its work).  In some way whether the ABI guarantees promoted argument
> registers might need some other target hook queries.
>
> Now onto the 0002 patch.
>
> +static bool
> +type_precision_ok (tree type)
> +{
> +  return (TYPE_PRECISION (type)  == 8
> +         || TYPE_PRECISION (type) == 16
> +         || TYPE_PRECISION (type) == 32);
> +}
>
> that's a weird function to me.  You probably want
> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>

I will change this (I have a patch which I am testing together with the 
other changes you have asked for).

> +/* Return the promoted type for TYPE.  */
> +static tree
> +get_promoted_type (tree type)
> +{
> +  tree promoted_type;
> +  enum machine_mode mode;
> +  int uns;
> +  if (POINTER_TYPE_P (type)
> +      || !INTEGRAL_TYPE_P (type)
> +      || !type_precision_ok (type))
> +    return type;
> +
> +  mode = TYPE_MODE (type);
> +#ifdef PROMOTE_MODE
> +  uns = TYPE_SIGN (type);
> +  PROMOTE_MODE (mode, uns, type);
> +#endif
> +  uns = TYPE_SIGN (type);
> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
> +  if (promoted_type
> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
> +    type = promoted_type;
>
> I think what you want to verify is that TYPE_PRECISION (promoted_type)
> == GET_MODE_PRECISION (mode).
> And to not even bother with this simply use
>
> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);
>

I am changing this too.

> You use a domwalk but also might create new basic-blocks during it
> (insert_on_edge_immediate), that's a
> no-no, commit edge inserts after the domwalk.

I am sorry, I don't understand "commit edge inserts after the domwalk". Is 
there a way to do this in the current implementation?

> ssa_sets_higher_bits_bitmap looks unused and
> we generally don't free dominance info, so please don't do that.
>
> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with
>
> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
> -I../../../trunk/libgcc/../libdecnumber/dpd
> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
> ../../../trunk/libgcc/libgcc2.c \
>            -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
> expand_debug_locations, at cfgexpand.c:5277
>

I am testing on the GCC compile farm. I will get it to bootstrap and do 
the regression testing before posting the next version.

> as hinted at above a bootstrap on i?86 (yes, 32bit) with
> --with-tune=pentiumpro might be another good testing candidate.
>
> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
> +       promote_def_and_uses (def);
>
> it looks like you are doing some redundant work by walking both defs
> and uses of each stmt.  I'd say you should separate
> def and use processing and use
>
>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>      promote_use (use);
>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>      promote_def (def);
>

The name promote_def_and_uses in my implementation is a bit confusing: it 
promotes the SSA_NAMEs.  We only have to do that for the definitions, 
provided we can also handle the SSA_NAMEs defined by parameters.

I also keep a bitmap recording whether an SSA_NAME has already been 
promoted, to avoid doing it again.  I will try to improve this.


> this should make processing more efficient (memory local) compared to
> doing the split handling
> in promote_def_and_uses.
>
> I think it will be convenient to have a SSA name info structure where
> you can remember the original
> type a name was promoted from as well as whether it was promoted or
> not.  This way adjusting
> debug uses should be "trivial":
>
> +static unsigned int
> +fixup_uses (tree use, tree promoted_type, tree old_type)
> +{
> +  gimple *stmt;
> +  imm_use_iterator ui;
> +  gimple_stmt_iterator gsi;
> +  use_operand_p op;
> +
> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
> +    {
> +      bool do_not_promote = false;
> +      switch (gimple_code (stmt))
> +       {
> +       case GIMPLE_DEBUG:
> +           {
> +             gsi = gsi_for_stmt (stmt);
> +             gsi_remove (&gsi, true);
>
> rather than doing the above you'd do sth like
>
>    SET_USE (use, fold_convert (old_type, new_def));
>    update_stmt (stmt);
>

We do have this information (the original type a name was promoted from, as 
well as whether it was promoted or not).  To make the series easier to 
review, the patch that adds the pass simply removes these debug stmts, and 
patch 4 then tries to handle them properly.  Maybe I should combine them.

> note that while you may not be able to use promoted regs at all uses
> (like calls or asms) you can promote all defs, if only with a compensation
> statement after the original def.  The SSA name info struct can be used
> to note down the actual SSA name holding the promoted def.
>
> The pass looks a lot better than last time (it's way smaller!) but
> still needs some
> improvements.  There are some more fishy details with respect to how you
> allocate/change SSA names but I think those can be dealt with once the
> basic structure looks how I like it to be.
>

I will post an updated patch in a day or two.

Thanks again,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 11:01     ` Kugan
@ 2015-10-22 14:24       ` Richard Biener
  2015-10-27  1:48         ` kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-10-22 14:24 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Oct 22, 2015 at 12:50 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and spitted the patch to make it more easier to review.
>>>> There are still couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>> time being, I am using  patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patched rebated against trunk@229156. I have
> skipped the test-case readjustment patches.

Some quick observations.  On x86_64 when building

short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

  <bb 2>:
  _1 = (int) x_10(D);
  _2 = (_1) sext (16);
  _11 = bar (_2);
  _5 = (int) _11;
  _12 = (unsigned int) _5;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (int) _7;
  _8 = (_13) sext (16);
  _9 = (_8) sext (16);
  return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
 (sext (sext@2 @0 @1) @3)
 (if (tree_int_cst_compare (@1, @3) <= 0)
  @2
  (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove redundant extensions,
otherwise generated code might get worse compared to without the pass(?)

I also notice that the 'short' argument does not get its sign-extension removed
as redundant either, even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]

In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+         || TYPE_PRECISION (type) == 16
+         || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);

You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.
ssa_sets_higher_bits_bitmap looks unused and
we generally don't free dominance info, so please don't do that.

I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

/abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
-B/usr/local/powerpc64-unknown-linux-gnu/bin/
-B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
/usr/local/powerpc64-unknown-linux-gnu/include -isystem
/usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
-O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wno-format -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
-mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
-I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
-I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
-I../../../trunk/libgcc/../libdecnumber/dpd
-I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
-MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
../../../trunk/libgcc/libgcc2.c \
          -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
expand_debug_locations, at cfgexpand.c:5277

as hinted at above, a bootstrap on i?86 (yes, 32-bit) with
--with-tune=pentiumpro might be another good testing candidate.

+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+       promote_def_and_uses (def);

it looks like you are doing some redundant work by walking both defs
and uses of each stmt.  I'd say you should separate
def and use processing and use

  FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
    promote_use (use);
  FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
    promote_def (def);

this should make processing more efficient (memory local) compared to
doing the split handling
in promote_def_and_uses.

I think it will be convenient to have a SSA name info structure where
you can remember the original
type a name was promoted from as well as whether it was promoted or
not.  This way adjusting
debug uses should be "trivial":

+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+       {
+       case GIMPLE_DEBUG:
+           {
+             gsi = gsi_for_stmt (stmt);
+             gsi_remove (&gsi, true);

rather than doing the above you'd do sth like

  SET_USE (use, fold_convert (old_type, new_def));
  update_stmt (stmt);

note that while you may not be able to use promoted regs at all uses
(like calls or asms) you can promote all defs, if only with a compensation
statement after the original def.  The SSA name info struct can be used
to note down the actual SSA name holding the promoted def.

The pass looks a lot better than last time (it's way smaller!) but
still needs some
improvements.  There are some more fishy details with respect to how you
allocate/change SSA names but I think those can be dealt with once the
basic structure looks how I like it to be.

Thanks,
Richard.



>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 18:11       ` Richard Henderson
@ 2015-10-22 12:48         ` Richard Biener
  0 siblings, 0 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-22 12:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Kugan, gcc-patches

On Wed, Oct 21, 2015 at 7:55 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/21/2015 03:56 AM, Richard Biener wrote:
>>
>> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>>
>>>>> This a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and spitted the patch to make it more easier to review.
>>>>> There are still couple of issues to be addressed and I am working on
>>>>> them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>> mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>> latent
>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>> time being, I am using  patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>> works
>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>> well.
>>>>
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage1. I will try my best to address your review comments
>>>> ASAP.
>>>
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit?  Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>>>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>>>
>>>> * Issue 2 is also reported as known issue
>>>>
>>>> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
>>>> match.pd for SEXT_EXPR, I would like to propose them as a follow up
>>>> patch once this is accepted.
>>>
>>>
>>> I thought more about this and don't think it can be made work without a
>>> lot of
>>> hassle.  Instead to get rid of the remaining "badly" typed registers in
>>> the
>>> function we can key different type requirements on a pass property
>>> (PROP_promoted_regs), thus simply change the expectation of the
>>> types of function parameters / results according to their promotion.
>>
>>
>> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
>> details from the start (gimplification).  Note that this does not only
>> involve
>> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
>> "lower" passing / returning in registers (whee, and then we have
>> things like targetm.calls.split_complex_arg ... not to mention passing
>> GIMPLE memory in registers).
>>
>> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
>> to the target (to expose those redundant extensions on GIMPLE) and
>> we'll end up with a bigger mess than with not doing this?
>
>
> I'm leary of building this in as early as gimplification, lest we get into
> trouble with splitting out bits of the current function for off-loading.
> What happens when the cpu and gpu have different promotion rules?

Ah, of course.  I tend to forget these issues.

Richard.

>
> r~

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-22 11:01     ` Kugan
  2015-10-22 14:24       ` Richard Biener
  1 sibling, 1 reply; 28+ messages in thread
From: Kugan @ 2015-10-22 11:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1751 bytes --]



On 21/10/15 23:45, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
> 
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
> 

Thanks. Please find the patches rebased against trunk@229156. I have
skipped the test-case readjustment patches.


Thanks,
Kugan

[-- Attachment #2: 0004-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 3166 bytes --]

From 2dc1cccfc59ae6967928b52396227b52a50803d9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:54:31 +1100
Subject: [PATCH 4/4] debug stmt in widen mode

---
 gcc/gimple-ssa-type-promote.c | 82 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e62a7c6..c0b6aa1 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
 	{
 	case GIMPLE_DEBUG:
 	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
+	      /* Change the GIMPLE_DEBUG stmt such that the value bound is
+		 computed in promoted_type and then converted to required
+		 type.  */
+	      tree op, new_op = NULL_TREE;
+	      gdebug *copy = NULL, *gs = as_a <gdebug *> (stmt);
+	      enum tree_code code;
+
+	      /* Get the value that is bound in debug stmt.  */
+	      switch (gs->subcode)
+		{
+		case GIMPLE_DEBUG_BIND:
+		  op = gimple_debug_bind_get_value (gs);
+		  break;
+		case GIMPLE_DEBUG_SOURCE_BIND:
+		  op = gimple_debug_source_bind_get_value (gs);
+		  break;
+		default:
+		  gcc_unreachable ();
+		}
+
+	      code = TREE_CODE (op);
+	      /* Convert the value computed in promoted_type to
+		 old_type.  */
+	      if (code == SSA_NAME && use == op)
+		new_op = build1 (NOP_EXPR, old_type, use);
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary
+		       && code != NOP_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  if (op0 == use)
+		    {
+		      tree temp = build1 (code, promoted_type, op0);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary
+		       /* Skip codes that are rejected in safe_to_promote_use_p.  */
+		       && code != LROTATE_EXPR
+		       && code != RROTATE_EXPR
+		       && code != COMPLEX_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  tree op1 = TREE_OPERAND (op, 1);
+		  if (op0 == use || op1 == use)
+		    {
+		      if (TREE_CODE (op0) == INTEGER_CST)
+			op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		      if (TREE_CODE (op1) == INTEGER_CST)
+			op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		      tree temp = build2 (code, promoted_type, op0, op1);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+
+	      /* Create new GIMPLE_DEBUG stmt with the new value (new_op) to
+		 be bound, if new value has been calculated */
+	      if (new_op)
+		{
+		  if (gimple_debug_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_bind
+			(gimple_debug_bind_get_var (stmt),
+			 new_op,
+			 stmt);
+		    }
+		  if (gimple_debug_source_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_source_bind
+			(gimple_debug_source_bind_get_var (stmt), new_op,
+			 stmt);
+		    }
+
+		  if (copy)
+		    {
+		      gsi = gsi_for_stmt (stmt);
+		      gsi_replace (&gsi, copy, false);
+		    }
+		}
 	    }
+	  break;
 
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
-- 
1.9.1


[-- Attachment #3: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From 1044b1b5ebf8ad696a942207b031e3668ab2a0de Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..cdff9c0 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,28 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int sign_bit
+	    = wi::set_bit_in_zero (prec - 1,
+				   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  wide_int mask = wi::mask (prec, true,
+				    TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9937,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #4: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 29013 bytes --]

From 0cd8d75c4130639f4a3fe8294bcbfdf4f2d3e4eb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/4] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 831 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 851 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea
+is to promote operations in such a way that the generation of subregs
+in RTL is minimised, which in turn results in the removal of redundant
+zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..e62a7c6
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,831 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to
+   promote operations in such a way that the generation of subregs in
+   RTL is minimised, which in turn results in the removal of redundant
+   zero/sign extensions.  This pass runs prior to VRP and DOM so that
+   they can optimise redundant truncations and extensions.  This is
+   based on the discussion at
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type) == 8
+	  || TYPE_PRECISION (type) == 16
+	  || TYPE_PRECISION (type) == 32);
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple *stmt, gimple *copy_stmt)
+{
+  edge_iterator ei;
+  edge e, out_edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (out_edge == NULL);
+	out_edge = e;
+      }
+
+  gcc_assert (out_edge);
+  gsi_insert_on_edge_immediate (out_edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  return (TREE_CODE (lhs) == SSA_NAME
+	  && !POINTER_TYPE_P (TREE_TYPE (lhs))
+	  && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	  && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+	  && !ssa_promoted_p (lhs)
+	  && (get_promoted_type (TREE_TYPE (lhs))
+	      != TREE_TYPE (lhs)));
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of a COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create a new SSA name of type TYPE as a copy of SSA name VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR from WIDTH bits and assign
+   the zero/sign-extended value to NEW_VAR.  The gimple statement that
+   performs the extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gcc_assert (width != 1);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+/* Make TO a default definition, transferring FROM's underlying
+   variable and default-def status to it.  */
+static void
+duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the result of the stmt
+   that defines def cannot be of promoted_type, create a new_def of
+   the original_type, make the def_stmt assign its value to new_def,
+   and then create a CONVERT_EXPR that converts new_def to def of the
+   promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we walk statements in dominator order, the
+		     arguments of def_stmt are visited before def itself.
+		     If RHS is already promoted and its type is compatible,
+		     the conversion becomes a zero/sign-extend stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else
+		{
+		  /* If RHS is not promoted or their types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted type of DEF and perform a
+		     zero/sign extend to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We could be smarter in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, the value in USE has to be zero/sign
+		     extended based on the original type for the result
+		     to be correct.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  if (!type_precision_ok (TREE_TYPE (rhs)))
+		    {
+		      do_not_promote = true;
+		    }
+		  else if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can emit the zero/sign
+			 extension when LHS is promoted.  */
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = old_type;
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, the value in USE has to be zero/sign
+		 extended based on the original type for the result
+		 to be correct.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create a
+	     copy with the original type.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


[-- Attachment #5: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/4] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
@ 2015-10-21 18:11       ` Richard Henderson
  2015-10-22 12:48         ` Richard Biener
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Henderson @ 2015-10-21 18:11 UTC (permalink / raw)
  To: Richard Biener, Kugan; +Cc: gcc-patches

On 10/21/2015 03:56 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This is a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and split the patch to make it easier to review.
>>>> There are still a couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>>> time being, I am using patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This needs to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it, and it needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>>
>>> * Issue 2 is also reported as a known issue.
>>>
>>> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
>>> in match.pd for SEXT_EXPR: I would like to propose these as a follow-up
>>> patch once this is accepted.
>>
>> I thought more about this and don't think it can be made to work without a
>> lot of hassle.  Instead, to get rid of the remaining "badly" typed registers
>> in the function, we can key different type requirements on a pass property
>> (PROP_promoted_regs), thus simply changing the expectation of the
>> types of function parameters / results according to their promotion.
>
> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification).  Note that this does not only involve
> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
>
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I'm leery of building this in as early as gimplification, lest we get into 
trouble with splitting out bits of the current function for off-loading.  What 
happens when the CPU and GPU have different promotion rules?


r~

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  1 sibling, 0 replies; 28+ messages in thread
From: Joseph Myers @ 2015-10-21 17:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kugan, Richard Henderson, gcc-patches

On Wed, 21 Oct 2015, Richard Biener wrote:

> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification).  Note that this does not only involve
> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
> 
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I don't know at what point target-specific promotion should appear, but 
right now it's visible before then (front ends use 
targetm.calls.promote_prototypes), which is definitely too early.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
@ 2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  2015-10-22 11:01     ` Kugan
  1 sibling, 2 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-21 13:57 UTC (permalink / raw)
  To: Kugan, Richard Henderson; +Cc: gcc-patches

On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This is a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and split the patch to make it easier to review.
>>> There are still a couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>> time being, I am using patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This needs to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it, and it needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>
>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>
>> * Issue 2 is also reported as a known issue.
>>
>> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
>> in match.pd for SEXT_EXPR: I would like to propose these as a follow-up
>> patch once this is accepted.
>
> I thought more about this and don't think it can be made to work without a
> lot of hassle.  Instead, to get rid of the remaining "badly" typed registers
> in the function, we can key different type requirements on a pass property
> (PROP_promoted_regs), thus simply changing the expectation of the
> types of function parameters / results according to their promotion.

Or maybe we should simply make GIMPLE _always_ adhere to the ABI
details from the start (gimplification).  Note that this does not only involve
PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
"lower" passing / returning in registers (whee, and then we have
things like targetm.calls.split_complex_arg ... not to mention passing
GIMPLE memory in registers).

Maybe I'm shooting too far here in the attempt to make GIMPLE closer
to the target (to expose those redundant extensions on GIMPLE) and
we'll end up with a bigger mess than with not doing this?

Richard.

> The promotion pass would set PROP_promoted_regs then.
>
> I will look over the patch(es) this week but as said I'd like to play with
> some code examples myself and thus like to have the current patchset
> in a more easily accessible form (and sure to apply to some rev.).
>
> Thanks,
> Richard.
>
>> * I am happy to turn this pass off by default till IPA and match.pd
>> changes are accepted. I can do regular testing to make sure that this
>> pass works properly till we enable it by default.
>>
>>
>> Please let me know what you think,
>>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-20 20:13 ` Kugan
@ 2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
  2015-10-22 11:01     ` Kugan
  0 siblings, 2 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-21 12:56 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 07/09/15 12:53, Kugan wrote:
>>
>> This is a new version of the patch posted in
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>> more testing and split the patch to make it easier to review.
>> There are still a couple of issues to be addressed and I am working on them.
>>
>> 1. AARCH64 bootstrap now fails with the commit
>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>> time being, I am using patch
>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>> workaround. This needs to be fixed before the patches are ready to be
>> committed.
>>
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it, and it needs to be fixed as well.
>
> Hi Richard,
>
> Now that stage 1 is going to close, I would like to get these patches
> accepted for stage1. I will try my best to address your review comments
> ASAP.

Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).

> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>
> * Issue 2 is also reported as a known issue.
>
> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
> in match.pd for SEXT_EXPR: I would like to propose these as a follow-up
> patch once this is accepted.

I thought more about this and don't think it can be made to work without a
lot of hassle.  Instead, to get rid of the remaining "badly" typed registers
in the function, we can key different type requirements on a pass property
(PROP_promoted_regs), thus simply changing the expectation of the
types of function parameters / results according to their promotion.

The promotion pass would set PROP_promoted_regs then.

I will look over the patch(es) this week but as said I'd like to play with
some code examples myself and thus like to have the current patchset
in a more easily accessible form (and sure to apply to some rev.).

Thanks,
Richard.

> * I am happy to turn this pass off by default till IPA and match.pd
> changes are accepted. I can do regular testing to make sure that this
> pass works properly till we enable it by default.
>
>
> Please let me know what you think,
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07  2:55 Kugan
@ 2015-10-20 20:13 ` Kugan
  2015-10-21 12:56   ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-10-20 20:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener



On 07/09/15 12:53, Kugan wrote:
> 
> This is a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and split the patch to make it easier to review.
> There are still a couple of issues to be addressed and I am working on them.
>
> 1. AARCH64 bootstrap now fails with the commit
> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
> in stage2 and fwprop.c is failing. It looks to me that there is a latent
> issue which gets exposed by my patch. I can also reproduce this on x86_64
> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
> time being, I am using patch
> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
> workaround. This needs to be fixed before the patches are ready to be
> committed.
>
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it, and it needs to be fixed as well.

Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.

* Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
longer present as it is fixed in trunk. Patch-6 is no longer needed.

* Issue 2 is also reported as a known issue.

* Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
in match.pd for SEXT_EXPR: I would like to propose these as a follow-up
patch once this is accepted.

* I am happy to turn this pass off by default till IPA and match.pd
changes are accepted. I can do regular testing to make sure that this
pass works properly till we enable it by default.


Please let me know what you think,

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [0/7] Type promotion pass and elimination of zext/sext
@ 2015-09-07  2:55 Kugan
  2015-10-20 20:13 ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-09-07  2:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener


This is a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and split the patch to make it easier to review.
There are still a couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed by my patch. I can also reproduce this on x86_64
if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
time being, I am using patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This needs to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g: Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it, and it needs to be fixed as well.

In the meantime, I would appreciate if you take some time to review this.

I have bootstrapped on x86_64-linux-gnu, arm-linux-gnu and
aarch64-linux-gnu (with the workaround) and regression tested.

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-12-16 13:18 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
2015-09-07 10:51 ` [0/7] Type promotion pass and elimination of zext/sext Wilco Dijkstra
2015-09-07 11:31   ` Kugan
2015-09-07 12:17     ` pinskia
2015-09-07 12:49       ` Wilco Dijkstra
2015-09-08  8:03       ` Renlin Li
2015-09-08 12:37         ` Wilco Dijkstra
2015-09-07  2:55 Kugan
2015-10-20 20:13 ` Kugan
2015-10-21 12:56   ` Richard Biener
2015-10-21 13:57     ` Richard Biener
2015-10-21 17:17       ` Joseph Myers
2015-10-21 18:11       ` Richard Henderson
2015-10-22 12:48         ` Richard Biener
2015-10-22 11:01     ` Kugan
2015-10-22 14:24       ` Richard Biener
2015-10-27  1:48         ` kugan
2015-10-28 15:51           ` Richard Biener
2015-11-02  9:17             ` Kugan
2015-11-03 14:40               ` Richard Biener
2015-11-08  9:43                 ` Kugan
2015-11-10 14:13                   ` Richard Biener
2015-11-12  6:08                     ` Kugan
2015-11-14  1:15                     ` Kugan
2015-11-18 14:04                       ` Richard Biener
2015-11-18 15:06                         ` Richard Biener
2015-11-24  2:52                           ` Kugan
2015-12-10  0:27                             ` Kugan
2015-12-16 13:18                               ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).