public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH 2/2] Enable elimination of zext/sext
@ 2014-08-27 10:01 Uros Bizjak
  2014-08-27 10:07 ` Richard Biener
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
  0 siblings, 2 replies; 29+ messages in thread
From: Uros Bizjak @ 2014-08-27 10:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Kugan, Jakub Jelinek, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 5703 bytes --]

Hello!

> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
> * calls.c (precompute_arguments): Check
> promoted_for_signed_and_unsigned_p and set the promoted mode.
> (promoted_for_signed_and_unsigned_p): New function.
> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
> and set the promoted mode.
> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

This patch regresses:

Running target unix
FAIL: libgomp.fortran/simd7.f90   -O2  execution test
FAIL: libgomp.fortran/simd7.f90   -Os  execution test

on alphaev6-linux-gnu.

The problem can be illustrated with the attached testcase, using a
cross-compiler to alphaev68-linux-gnu (-O2 -fopenmp). It is a missing
SImode extension after a DImode shift of SImode subregs in this
part:

--cut here--
  # test.23_12 = PHI <0(37), 1(36)>
  _242 = ivtmp.181_73 + 2147483645;
  _240 = _242 * 2;
  _63 = (integer(kind=4)) _240;
  if (ubound.6_99 <= 2)
    goto <bb 39>;
  else
    goto <bb 40>;
;;    succ:       39
;;                40

;;   basic block 39, loop depth 1
;;    pred:       38
  pretmp_337 = test.23_12 | l_76;
  goto <bb 45>;
;;    succ:       45

;;   basic block 40, loop depth 1
;;    pred:       38
  _11 = *c_208[0];
  if (_11 != _63)
    goto <bb 45>;
  else
    goto <bb 42>;
--cut here--

this expands to:

(code_label 592 591 593 35 "" [0 uses])

(note 593 592 0 NOTE_INSN_BASIC_BLOCK)

;; _63 = (integer(kind=4)) _240;

(insn 594 593 595 (set (reg:SI 538)
        (const_int 1073741824 [0x40000000])) -1
     (nil))

(insn 595 594 596 (set (reg:SI 539)
        (plus:SI (reg:SI 538)
            (const_int 1073741824 [0x40000000]))) -1
     (nil))

(insn 596 595 597 (set (reg:SI 537)
        (plus:SI (reg:SI 539)
            (const_int -3 [0xfffffffffffffffd]))) -1
     (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
        (nil)))

(insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
        (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
            (reg:SI 537))) -1
     (nil))

(insn 598 597 599 (set (reg:DI 540)
        (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
            (const_int 1 [0x1]))) -1
     (nil))

(insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
        (reg:DI 540)) -1
     (nil))

...

(note 610 609 0 NOTE_INSN_BASIC_BLOCK)

;; _11 = *c_208[0];

(insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
        (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
A128]))) simd7.f90:12 -1
     (nil))

;; if (_11 != _63)

(insn 612 611 613 40 (set (reg:DI 545)
        (eq:DI (reg:DI 120 [ D.1694 ])
            (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
     (nil))

(jump_insn 613 612 616 40 (set (pc)
        (if_then_else (eq (reg:DI 545)
                (const_int 0 [0]))
            (label_ref 0)
            (pc))) simd7.f90:12 -1
     (int_list:REG_BR_PROB 450 (nil)))

which results in the following asm:

$L35:
    addl $25,$7,$2     # 597    addsi3/1    [length = 4]
    addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
    bne $24,$L145     # 601    *bcc_normal    [length = 4]
    lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
    ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
    lda $3,3($31)     # 74    *movdi/2    [length = 4]
    cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
    bne $2,$L40     # 613    *bcc_normal    [length = 4]
    br $31,$L88     # 2403    jump    [length = 4]
    .align 4
...

Tracking the values with the debugger shows the wrong calculation:

   0x000000012000108c <+1788>:  addl    t10,t12,t1
   0x0000000120001090 <+1792>:  addq    t1,t1,t1
   ...
   0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
   0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>

(gdb) si
0x000000012000108c      17          l = l .or. any (b /= 7 + i)
(gdb) i r t10 t12
t10            0x7      7
t12            0x7ffffffd       2147483645

(gdb) si
0x0000000120001090      17          l = l .or. any (b /= 7 + i)
(gdb) i r t1
t1             0xffffffff80000004       -2147483644

(gdb) si
18          l = l .or. any (c /= 8 + 2 * i)
(gdb) i r t1
t1             0xffffffff00000008       -4294967288

At this point, the calculation should zero-extend the SImode value to the
full DImode width, since the compare operates on DImode values. The
problematic insn is (insn 599), which is now a plain DImode assignment
instead of a zero-extend, due to:

--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
   GET_MODE (target), temp, unsignedp);
   }

- convert_move (SUBREG_REG (target), temp, unsignedp);
+ if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+    && (GET_CODE (temp) == SUBREG)
+    && (GET_MODE (target) == GET_MODE (temp))
+    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+ else
+  convert_move (SUBREG_REG (target), temp, unsignedp);
       }
     else if (nontemporal && emit_storent_insn (target, temp))
       ;

When compiling this code, we have:

lhs = _63
target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
temp = (subreg:SI (reg:DI 540) 0)

So, the code assumes that it is possible to copy (reg:DI 540) directly
to (reg:DI 145). However, this is not the case, since we still have
garbage in the top 32 bits.
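
The failure mode can be sketched with plain integer arithmetic. This is a
model of the insn sequence above (not GCC code; the helper names are made
up), assuming Alpha's convention that SImode values are kept sign-extended
in 64-bit registers:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def sext32(x):
    # Sign-extend the low 32 bits, as Alpha's addl leaves its result.
    x &= MASK32
    return x - (1 << 32) if x & (1 << 31) else x

# First iteration: ivtmp.181 holds 7, the addend is 2147483645 (0x7ffffffd).
t1 = sext32(7 + 2147483645)      # insn 597 (addl)
assert t1 & MASK64 == 0xffffffff80000004

t1 = (t1 + t1) & MASK64          # insn 598 (addq, the DImode shift by 1)
assert t1 == 0xffffffff00000008  # upper 32 bits are garbage, not zero

# insn 611 sign-extends c[0] == 8 to DImode; the DImode compare then fails:
assert 8 != t1                   # wrong branch taken
# With the zero_extend restored, the compared value is correct:
assert (t1 & MASK32) == 8
```

This reproduces the gdb trace below: t1 is 0xffffffff80000004 after the
addl and 0xffffffff00000008 after the addq.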

Reverting the part above fixes the runtime failure, since (insn 599) is now:

(insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
        (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
     (nil))

It looks to me that we also have to check temp with SUBREG_PROMOTED_*.

Uros.

[-- Attachment #2: simd7.f90 --]
[-- Type: text/x-csrc, Size: 1050 bytes --]

subroutine foo (d, e, f, g, m, n)
  integer :: i, j, b(2:9), c(3:n), d(:), e(2:n), f(2:,3:), n
  integer, allocatable :: g(:), h(:), k, m
  logical :: l
  l = .false.
  allocate (h(2:7))
  i = 4; j = 4; b = 7; c = 8; d = 9; e = 10; f = 11; g = 12; h = 13; k = 14; m = 15
!$omp simd linear(b)linear(c:2)linear(d:3)linear(e:4)linear(f:5)linear(g:6) &
!$omp & linear(h:7)linear(k:8)linear(m:9) reduction(.or.:l)
  do i = 0, 63 
    l = l .or. any (b /= 7 + i)
    l = l .or. any (c /= 8 + 2 * i)
    b = b + 1; c = c + 2
    d = d + 3; e = e + 4; f = f + 5; g = g + 6
    h = h + 7; k = k + 8; m = m + 9
  end do
  if (l .or. i /= 64) call abort
  if (any (b /= 7 + 64) .or. any (c /= 8 + 2 * 64)) call abort
end subroutine

  interface
    subroutine foo (d, e, f, g, m, n)
      integer :: d(:), e(2:n), f(2:,3:), n
      integer, allocatable :: g(:), m
    end subroutine
  end interface
  integer, parameter :: n = 8
  integer :: d(2:18), e(3:n+1), f(5:6,7:9)
  integer, allocatable :: g(:), m
  allocate (g(7:10))
  call foo (d, e, f, g, m, n)
end


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
@ 2014-08-27 10:07 ` Richard Biener
  2014-08-27 10:32   ` Uros Bizjak
  2014-08-28  7:50   ` Kugan
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
  1 sibling, 2 replies; 29+ messages in thread
From: Richard Biener @ 2014-08-27 10:07 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> * calls.c (precompute_arguments): Check
>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>> (promoted_for_signed_and_unsigned_p): New function.
>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>> and set the promoted mode.
>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
> This patch regresses:
>
> Running target unix
> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>
> on alphaev6-linux-gnu.
>
> The problem can be illustrated with attached testcase with a
> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
> missing SImode extension after DImode shift of SImode subregs for this
> part:
>
> --cut here--
>   # test.23_12 = PHI <0(37), 1(36)>
>   _242 = ivtmp.181_73 + 2147483645;
>   _240 = _242 * 2;
>   _63 = (integer(kind=4)) _240;
>   if (ubound.6_99 <= 2)
>     goto <bb 39>;
>   else
>     goto <bb 40>;
> ;;    succ:       39
> ;;                40
>
> ;;   basic block 39, loop depth 1
> ;;    pred:       38
>   pretmp_337 = test.23_12 | l_76;
>   goto <bb 45>;
> ;;    succ:       45
>
> ;;   basic block 40, loop depth 1
> ;;    pred:       38
>   _11 = *c_208[0];
>   if (_11 != _63)
>     goto <bb 45>;
>   else
>     goto <bb 42>;
> --cut here--
>
> this expands to:
>
> (code_label 592 591 593 35 "" [0 uses])
>
> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _63 = (integer(kind=4)) _240;
>
> (insn 594 593 595 (set (reg:SI 538)
>         (const_int 1073741824 [0x40000000])) -1
>      (nil))
>
> (insn 595 594 596 (set (reg:SI 539)
>         (plus:SI (reg:SI 538)
>             (const_int 1073741824 [0x40000000]))) -1
>      (nil))
>
> (insn 596 595 597 (set (reg:SI 537)
>         (plus:SI (reg:SI 539)
>             (const_int -3 [0xfffffffffffffffd]))) -1
>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>         (nil)))
>
> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>             (reg:SI 537))) -1
>      (nil))
>
> (insn 598 597 599 (set (reg:DI 540)
>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>             (const_int 1 [0x1]))) -1
>      (nil))
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (reg:DI 540)) -1
>      (nil))
>
> ...
>
> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _11 = *c_208[0];
>
> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
> A128]))) simd7.f90:12 -1
>      (nil))
>
> ;; if (_11 != _63)
>
> (insn 612 611 613 40 (set (reg:DI 545)
>         (eq:DI (reg:DI 120 [ D.1694 ])
>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>      (nil))
>
> (jump_insn 613 612 616 40 (set (pc)
>         (if_then_else (eq (reg:DI 545)
>                 (const_int 0 [0]))
>             (label_ref 0)
>             (pc))) simd7.f90:12 -1
>      (int_list:REG_BR_PROB 450 (nil)))
>
> which results in following asm:
>
> $L35:
>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>     br $31,$L88     # 2403    jump    [length = 4]
>     .align 4
> ...
>
> Tracking the values with the debugger shows wrong calculation:
>
>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>    ...
>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>
> (gdb) si
> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t10 t12
> t10            0x7      7
> t12            0x7ffffffd       2147483645
>
> (gdb) si
> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t1
> t1             0xffffffff80000004       -2147483644
>
> (gdb) si
> 18          l = l .or. any (c /= 8 + 2 * i)
> (gdb) i r t1
> t1             0xffffffff00000008       -4294967288
>
> At this point, the calculation should zero-extend SImode value to full
> DImode, since compare operates on DImode values. The problematic insn
> is (insn 599), which is now a DImode assignment instead of
> zero-extend, due to:
>
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>    GET_MODE (target), temp, unsignedp);
>    }
>
> - convert_move (SUBREG_REG (target), temp, unsignedp);
> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
> +    && (GET_CODE (temp) == SUBREG)
> +    && (GET_MODE (target) == GET_MODE (temp))
> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> + else
> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>        }
>      else if (nontemporal && emit_storent_insn (target, temp))
>        ;
>
> When compiling this code, we have:
>
> lhs = _63
> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
> temp = (subreg:SI (reg:DI 540) 0)
>
> So, the code assumes that it is possible to copy (reg:DI 540) directly
> to (reg:DI 145). However, this is not the case, since we still have
> garbage in the top 32bits.
>
> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>      (nil))
>
> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.

Yeah, that makes sense.

Richard.

> Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:07 ` Richard Biener
@ 2014-08-27 10:32   ` Uros Bizjak
  2014-08-27 10:32     ` Richard Biener
  2014-09-01  8:48     ` Jakub Jelinek
  2014-08-28  7:50   ` Kugan
  1 sibling, 2 replies; 29+ messages in thread
From: Uros Bizjak @ 2014-08-27 10:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
>> on alphaev6-linux-gnu.
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
>> garbage in the top 32bits.
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>
> Yeah, that makes sense.

Something like the following (untested) patch, which also fixes the testcase, perhaps?

-- cut here--
Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 214445)
+++ cfgexpand.c (working copy)
@@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)

                if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
                    && (GET_CODE (temp) == SUBREG)
+                   && SUBREG_PROMOTED_VAR_P (temp)
                    && (GET_MODE (target) == GET_MODE (temp))
                   && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
                  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
-- cut here

Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:32   ` Uros Bizjak
@ 2014-08-27 10:32     ` Richard Biener
  2014-09-01  8:48     ` Jakub Jelinek
  1 sibling, 0 replies; 29+ messages in thread
From: Richard Biener @ 2014-08-27 10:32 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:25 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>> * calls.c (precompute_arguments): Check
>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>> and set the promoted mode.
>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>
>>> This patch regresses:
>>>
>>> Running target unix
>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>
>>> on alphaev6-linux-gnu.
>>>
>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>> to (reg:DI 145). However, this is not the case, since we still have
>>> garbage in the top 32bits.
>>>
>>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>      (nil))
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Yeah, that makes sense.
>
> Something like following (untested) patch that also fixes the testcase perhaps?

Yes (though I'm not really familiar with the RTL side here, and the
comment before SUBREG_PROMOTED_VAR_P looks odd).

Richard.

> -- cut here--
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 214445)
> +++ cfgexpand.c (working copy)
> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>
>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>                     && (GET_CODE (temp) == SUBREG)
> +                   && SUBREG_PROMOTED_VAR_P (temp)
>                     && (GET_MODE (target) == GET_MODE (temp))
>                     && (GET_MODE (SUBREG_REG (target)) == GET_MODE
> (SUBREG_REG (temp))))
>                   emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> -- cut here
>
> Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
  2014-08-27 10:07 ` Richard Biener
@ 2014-08-27 13:02 ` Kugan
  2014-08-28  3:46   ` Kugan
  1 sibling, 1 reply; 29+ messages in thread
From: Kugan @ 2014-08-27 13:02 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches; +Cc: Jakub Jelinek, Richard Biener

On 27/08/14 20:01, Uros Bizjak wrote:
> Hello!
> 
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> * calls.c (precompute_arguments): Check
>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>> (promoted_for_signed_and_unsigned_p): New function.
>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>> and set the promoted mode.
>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
> 
> This patch regresses:
> 
> Running target unix
> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
> 

[snip]

> When compiling this code, we have:
> 
> lhs = _63
> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
> temp = (subreg:SI (reg:DI 540) 0)
> 
> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
> garbage in the top 32bits.
> 
> Reverting the part above fixes the runtime failure, since (insn 599) is now:
> 
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>      (nil))
> 
> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.

Sorry for the breakage. I am looking into this now and I can reproduce
it on qemu-alpha.

I have noticed the following VRP data, which is used in deciding this
erroneous removal. It seems suspicious to me.

_343: [2147483652, 2147483715]
_344: [8, 134]
_345: [8, 134]

_343 = ivtmp.179_52 + 2147483645;
_344 = _343 * 2;
_345 = (integer(kind=4)) _344;

The error comes from the third statement.

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
@ 2014-08-28  3:46   ` Kugan
  2014-08-28  6:44     ` Marc Glisse
  0 siblings, 1 reply; 29+ messages in thread
From: Kugan @ 2014-08-28  3:46 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches, Richard Biener; +Cc: Jakub Jelinek


On 27/08/14 23:02, Kugan wrote:
> On 27/08/14 20:01, Uros Bizjak wrote:
>> Hello!
>>
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
> 
> [snip]
> 
>> When compiling this code, we have:
>>
>> lhs = _63
>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>> temp = (subreg:SI (reg:DI 540) 0)
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
>> garbage in the top 32bits.
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
> 
> Sorry for the breakage. I am looking into this now and I can reproduce
> it on qemu-alpha.
> 
> I have noticed the following VRP data which is used in deciding this
> erroneous removal. It seems suspicious to me.
> 
> _343: [2147483652, 2147483715]
> _344: [8, 134]
> _345: [8, 134]
> 
> _343 = ivtmp.179_52 + 2147483645;
> _344 = _343 * 2;
> _345 = (integer(kind=4)) _344;
> 
> Error comes from the third statement.

In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
precision and the value_range is truncated. For the test-case provided
by Uros, it is

_344 = _343 * 2;
[...,0x100000008], precision = 384
[...,0x100000086], precision = 384

and it is converted to following when it goes from wide_int to tree.
[8, 134]

How about doing something like this to fix it?

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d16fd8a..c0fb902 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2625,6 +2625,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
              <wi::extended_tree <WIDE_INT_MAX_PRECISION * 2> > vrp_int_cst;
 	  vrp_int sizem1 = wi::mask <vrp_int> (prec, false);
 	  vrp_int size = sizem1 + 1;
+	  vrp_int type_min = vrp_int_cst (TYPE_MIN_VALUE (expr_type));
+	  vrp_int type_max = vrp_int_cst (TYPE_MAX_VALUE (expr_type));

 	  /* Extend the values using the sign of the result to PREC2.
 	     From here on out, everthing is just signed math no matter
@@ -2688,7 +2690,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,

 	  /* diff = max - min.  */
 	  prod2 = prod3 - prod0;
-	  if (wi::geu_p (prod2, sizem1))
+	  if (wi::geu_p (prod2, sizem1)
+	      || wi::lts_p (prod0, type_min)
+	      || wi::gts_p (prod3, type_max))
 	    {
 	      /* the range covers all values.  */
 	      set_value_range_to_varying (vr);


If this looks reasonable, I will do proper testing and post the results
with the ChangeLog.

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  3:46   ` Kugan
@ 2014-08-28  6:44     ` Marc Glisse
  2014-08-28  7:29       ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Marc Glisse @ 2014-08-28  6:44 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Richard Biener, Jakub Jelinek

On Thu, 28 Aug 2014, Kugan wrote:

> On 27/08/14 23:02, Kugan wrote:
>> On 27/08/14 20:01, Uros Bizjak wrote:
>>> Hello!
>>>
>>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>> * calls.c (precompute_arguments): Check
>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>> and set the promoted mode.
>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>
>>> This patch regresses:
>>>
>>> Running target unix
>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>
>>
>> [snip]
>>
>>> When compiling this code, we have:
>>>
>>> lhs = _63
>>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>>> temp = (subreg:SI (reg:DI 540) 0)
>>>
>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>> to (reg:DI 145). However, this is not the case, since we still have
>>> garbage in the top 32bits.
>>>
>>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>      (nil))
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Sorry for the breakage. I am looking into this now and I can reproduce
>> it on qemu-alpha.
>>
>> I have noticed the following VRP data which is used in deciding this
>> erroneous removal. It seems suspicious to me.
>>
>> _343: [2147483652, 2147483715]
>> _344: [8, 134]
>> _345: [8, 134]
>>
>> _343 = ivtmp.179_52 + 2147483645;
>> _344 = _343 * 2;
>> _345 = (integer(kind=4)) _344;
>>
>> Error comes from the third statement.
>
> In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
> precision and the value_range is truncated. For the test-case provided
> by Uros, it is
>
> _344 = _343 * 2;
> [...,0x100000008], precision = 384
> [...,0x100000086], precision = 384
>
> and it is converted to following when it goes from wide_int to tree.
> [8, 134]

Why do you believe that is wrong? Assuming _344 has a 32-bit type with
wrapping overflow, this is just doing the wrapping modulo 2^32.
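
Concretely, the reduction can be sketched like this (a model in plain
Python integers, not the wide_int code itself):

```python
PREC = 32
SIZE = 1 << PREC

lo, hi = 2147483652, 2147483715      # range of _343
prod_lo, prod_hi = 2 * lo, 2 * hi    # product range in wider precision
assert (prod_lo, prod_hi) == (0x100000008, 0x100000086)

# The range only degenerates to VARYING when it covers >= 2**32 values;
# here it spans 127 values, so both ends are simply wrapped modulo 2**32:
assert prod_hi - prod_lo < SIZE
assert (prod_lo % SIZE, prod_hi % SIZE) == (8, 134)
```

So [8, 134] is exactly the set of values _344 can take in a wrapping
32-bit type.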

-- 
Marc Glisse


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  6:44     ` Marc Glisse
@ 2014-08-28  7:29       ` Kugan
  0 siblings, 0 replies; 29+ messages in thread
From: Kugan @ 2014-08-28  7:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Richard Biener, Jakub Jelinek



On 28/08/14 16:44, Marc Glisse wrote:
> On Thu, 28 Aug 2014, Kugan wrote:
> 
>> On 27/08/14 23:02, Kugan wrote:
>>> On 27/08/14 20:01, Uros Bizjak wrote:
>>>> Hello!
>>>>
>>>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>>
>>>>> * calls.c (precompute_arguments): Check
>>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>>> and set the promoted mode.
>>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function
>>>>> definition.
>>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>>
>>>> This patch regresses:
>>>>
>>>> Running target unix
>>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>>
>>>
>>> [snip]
>>>
>>>> When compiling this code, we have:
>>>>
>>>> lhs = _63
>>>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>>>> temp = (subreg:SI (reg:DI 540) 0)
>>>>
>>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>>> to (reg:DI 145). However, this is not the case, since we still have
>>>> garbage in the top 32bits.
>>>>
>>>> Reverting the part above fixes the runtime failure, since (insn 599)
>>>> is now:
>>>>
>>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>>      (nil))
>>>>
>>>> It looks to me that we have also to check the temp with
>>>> SUBREG_PROMOTED_*.
>>>
>>> Sorry for the breakage. I am looking into this now and I can reproduce
>>> it on qemu-alpha.
>>>
>>> I have noticed the following VRP data which is used in deciding this
>>> erroneous removal. It seems suspicious to me.
>>>
>>> _343: [2147483652, 2147483715]
>>> _344: [8, 134]
>>> _345: [8, 134]
>>>
>>> _343 = ivtmp.179_52 + 2147483645;
>>> _344 = _343 * 2;
>>> _345 = (integer(kind=4)) _344;
>>>
>>> Error comes from the third statement.
>>
>> In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
>> precision and the value_range is truncated. For the test-case provided
>> by Uros, it is
>>
>> _344 = _343 * 2;
>> [...,0x100000008], precision = 384
>> [...,0x100000086], precision = 384
>>
>> and it is converted to following when it goes from wide_int to tree.
>> [8, 134]
> 
> Why do you believe that is wrong? Assuming _344 has a 32 bit type with
> wrapping overflow, this is just doing the wrapping modulo 2^32.
> 

Indeed. I missed the TYPE_OVERFLOW_WRAPS check earlier. Thanks for
pointing that out.

Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:07 ` Richard Biener
  2014-08-27 10:32   ` Uros Bizjak
@ 2014-08-28  7:50   ` Kugan
  2014-08-28  8:57     ` Richard Biener
  1 sibling, 1 reply; 29+ messages in thread
From: Kugan @ 2014-08-28  7:50 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak; +Cc: gcc-patches, Jakub Jelinek



On 27/08/14 20:07, Richard Biener wrote:
> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> Hello!
>>
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
>> on alphaev6-linux-gnu.
>>
>> The problem can be illustrated with attached testcase with a
>> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
>> missing SImode extension after DImode shift of SImode subregs for this
>> part:
>>
>> --cut here--
>>   # test.23_12 = PHI <0(37), 1(36)>
>>   _242 = ivtmp.181_73 + 2147483645;
>>   _240 = _242 * 2;
>>   _63 = (integer(kind=4)) _240;
>>   if (ubound.6_99 <= 2)
>>     goto <bb 39>;
>>   else
>>     goto <bb 40>;
>> ;;    succ:       39
>> ;;                40
>>
>> ;;   basic block 39, loop depth 1
>> ;;    pred:       38
>>   pretmp_337 = test.23_12 | l_76;
>>   goto <bb 45>;
>> ;;    succ:       45
>>
>> ;;   basic block 40, loop depth 1
>> ;;    pred:       38
>>   _11 = *c_208[0];
>>   if (_11 != _63)
>>     goto <bb 45>;
>>   else
>>     goto <bb 42>;
>> --cut here--
>>
>> this expands to:
>>
>> (code_label 592 591 593 35 "" [0 uses])
>>
>> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>>
>> ;; _63 = (integer(kind=4)) _240;
>>
>> (insn 594 593 595 (set (reg:SI 538)
>>         (const_int 1073741824 [0x40000000])) -1
>>      (nil))
>>
>> (insn 595 594 596 (set (reg:SI 539)
>>         (plus:SI (reg:SI 538)
>>             (const_int 1073741824 [0x40000000]))) -1
>>      (nil))
>>
>> (insn 596 595 597 (set (reg:SI 537)
>>         (plus:SI (reg:SI 539)
>>             (const_int -3 [0xfffffffffffffffd]))) -1
>>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>>         (nil)))
>>
>> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>>             (reg:SI 537))) -1
>>      (nil))
>>
>> (insn 598 597 599 (set (reg:DI 540)
>>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>>             (const_int 1 [0x1]))) -1
>>      (nil))
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (reg:DI 540)) -1
>>      (nil))
>>
>> ...
>>
>> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>>
>> ;; _11 = *c_208[0];
>>
>> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
>> A128]))) simd7.f90:12 -1
>>      (nil))
>>
>> ;; if (_11 != _63)
>>
>> (insn 612 611 613 40 (set (reg:DI 545)
>>         (eq:DI (reg:DI 120 [ D.1694 ])
>>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>>      (nil))
>>
>> (jump_insn 613 612 616 40 (set (pc)
>>         (if_then_else (eq (reg:DI 545)
>>                 (const_int 0 [0]))
>>             (label_ref 0)
>>             (pc))) simd7.f90:12 -1
>>      (int_list:REG_BR_PROB 450 (nil)))
>>
>> which results in following asm:
>>
>> $L35:
>>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>>     br $31,$L88     # 2403    jump    [length = 4]
>>     .align 4
>> ...
>>
>> Tracking the values with the debugger shows wrong calculation:
>>
>>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>>    ...
>>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>>
>> (gdb) si
>> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
>> (gdb) i r t10 t12
>> t10            0x7      7
>> t12            0x7ffffffd       2147483645
>>
>> (gdb) si
>> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
>> (gdb) i r t1
>> t1             0xffffffff80000004       -2147483644
>>
>> (gdb) si
>> 18          l = l .or. any (c /= 8 + 2 * i)
>> (gdb) i r t1
>> t1             0xffffffff00000008       -4294967288
>>
>> At this point, the calculation should zero-extend SImode value to full
>> DImode, since compare operates on DImode values. The problematic insn
>> is (insn 599), which is now a DImode assignment instead of
>> zero-extend, due to:
>>
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>>    GET_MODE (target), temp, unsignedp);
>>    }
>>
>> - convert_move (SUBREG_REG (target), temp, unsignedp);
>> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>> +    && (GET_CODE (temp) == SUBREG)
>> +    && (GET_MODE (target) == GET_MODE (temp))
>> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
>> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
>> + else
>> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>>        }
>>      else if (nontemporal && emit_storent_insn (target, temp))
>>        ;
>>
>> When compiling this code, we have:
>>
>> lhs = _63
>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>> temp = (subreg:SI (reg:DI 540) 0)
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
>> garbage in the top 32 bits.
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
> 
> Yeah, that makes sense.
> 

Thanks Richard for your comments.

I added this part of the code (in cfgexpand.c) to handle binary/unary/..
gimple operations and used the LHS value range to infer the assigned
value range. I will revert this part of the code as this is wrong.

I don't think checking promoted_mode for temp will be necessary here, as
convert_move will handle it correctly if promoted_mode is set for temp.

Thus, I will reimplement setting promoted_mode on temp (in
expand_expr_real_2) based on the content of the RHS of the gimple
statement, i.e. by looking at the RHS operands and their value ranges
and calculating the resulting value range. Does this sound OK to you?

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  7:50   ` Kugan
@ 2014-08-28  8:57     ` Richard Biener
  2014-09-04  3:41       ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2014-08-28  8:57 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Thu, Aug 28, 2014 at 9:50 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 27/08/14 20:07, Richard Biener wrote:
>> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> [... full quote of the alphaev6 simd7.f90 analysis trimmed; it repeats the message above verbatim ...]
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Yeah, that makes sense.
>>
>
> Thanks Richard for your comments.
>
> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
> gimple operations and used the LHS value range to infer the assigned
> value range. I will revert this part of the code as this is wrong.
>
> I don't think checking promoted_mode for temp will be necessary here, as
> convert_move will handle it correctly if promoted_mode is set for temp.
>
> Thus, I will reimplement setting promoted_mode on temp (in
> expand_expr_real_2) based on the content of the RHS of the gimple
> statement, i.e. by looking at the RHS operands and their value ranges
> and calculating the resulting value range. Does this sound OK to you?

No, this sounds backward again and won't work because those operands
again could be just truncated - thus you can't rely on their value-range.

What you would need is VRP computing value-ranges in the promoted
mode from the start (and it doesn't do that).

Richard.

> Thanks,
> Kugan
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:32   ` Uros Bizjak
  2014-08-27 10:32     ` Richard Biener
@ 2014-09-01  8:48     ` Jakub Jelinek
  2014-09-01  8:54       ` Uros Bizjak
  1 sibling, 1 reply; 29+ messages in thread
From: Jakub Jelinek @ 2014-09-01  8:48 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Richard Biener, gcc-patches, Kugan

On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
> Something like following (untested) patch that also fixes the testcase perhaps?
> 
> -- cut here--
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 214445)
> +++ cfgexpand.c (working copy)
> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
> 
>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>                     && (GET_CODE (temp) == SUBREG)
> +                   && SUBREG_PROMOTED_VAR_P (temp)
>                     && (GET_MODE (target) == GET_MODE (temp))
>                     && (GET_MODE (SUBREG_REG (target)) == GET_MODE
> (SUBREG_REG (temp))))

Looks like a wrong order of the predicates in any case: first you should
check whether it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then
SUBREG_PROMOTED_GET.  Also, the extra ()s around single-line conditions
are unnecessary.

>                   emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> -- cut here
> 
> Uros.

	Jakub


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-01  8:48     ` Jakub Jelinek
@ 2014-09-01  8:54       ` Uros Bizjak
  0 siblings, 0 replies; 29+ messages in thread
From: Uros Bizjak @ 2014-09-01  8:54 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches, Kugan

On Mon, Sep 1, 2014 at 10:47 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
>> Something like following (untested) patch that also fixes the testcase perhaps?
>>
>> -- cut here--
>> Index: cfgexpand.c
>> ===================================================================
>> --- cfgexpand.c (revision 214445)
>> +++ cfgexpand.c (working copy)
>> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>>
>>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>>                     && (GET_CODE (temp) == SUBREG)
>> +                   && SUBREG_PROMOTED_VAR_P (temp)
>>                     && (GET_MODE (target) == GET_MODE (temp))
>>                     && (GET_MODE (SUBREG_REG (target)) == GET_MODE
>> (SUBREG_REG (temp))))
>
> Looks like a wrong order of the predicates in any case, first you should
> check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then
> SUBREG_PROMOTED_GET.  Also, the extra ()s around single line conditions
> are unnecessary.

This comment applies to the original code, not the patched line, I guess.

Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  8:57     ` Richard Biener
@ 2014-09-04  3:41       ` Kugan
  2014-09-04 13:00         ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Kugan @ 2014-09-04  3:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2196 bytes --]

>> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
>> gimple operations and used the LHS value range to infer the assigned
>> value range. I will revert this part of the code as this is wrong.
>>
>> I don't think checking promoted_mode for temp will be necessary here, as
>> convert_move will handle it correctly if promoted_mode is set for temp.
>>
>> Thus, I will reimplement setting promoted_mode on temp (in
>> expand_expr_real_2) based on the content of the RHS of the gimple
>> statement, i.e. by looking at the RHS operands and their value ranges
>> and calculating the resulting value range. Does this sound OK to you?
> 
> No, this sounds backward again and won't work because those operands
> again could be just truncated - thus you can't rely on their value-range.
> 
> What you would need is VRP computing value-ranges in the promoted
> mode from the start (and it doesn't do that).


Hi Richard,

Here is an attempt to do the value range computation in promoted_mode's
type when it is overflowing. Bootstrapped on x86-64.

Based on your feedback, I will do more testing on this.

Thanks for your time,
Kugan

gcc/ChangeLog:

2014-09-04  Kugan Vivekanandarajah <kuganv@linaro.org>

	* tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to
	the type.
	(evaluate_stmt): Likewise.
	* tree-ssanames.c (set_range_info): Adjust if the precision of stored
	value range is different.
	* tree-vrp.c (normalize_int_cst_precision): New function.
	(set_value_range): Add assert to check precision.
	(set_and_canonicalize_value_range): Call normalize_int_cst_precision
	on min and max.
	(promoted_type): New function.
	(promote_unary_vr): Likewise.
	(promote_binary_vr): Likewise.
	(extract_range_from_binary_expr_1): Adjust type to match value range.
	Store value ranges in promoted type if they overflow.
	(extract_range_from_unary_expr_1): Likewise.
	(adjust_range_with_scev): Call normalize_int_cst_precision
	on min and max.
	(vrp_visit_assignment_or_call): Likewise.
	(simplify_bit_ops_using_ranges): Adjust the value range precision.
	(test_for_singularity): Likewise.
	(simplify_stmt_for_jump_threading): Likewise.
	(extract_range_from_assert): Likewise.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 13120 bytes --]

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index a90f708..1733073 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -916,7 +916,11 @@ ccp_finalize (void)
 	  unsigned int precision = TYPE_PRECISION (TREE_TYPE (val->value));
 	  wide_int nonzero_bits = wide_int::from (val->mask, precision,
 						  UNSIGNED) | val->value;
-	  nonzero_bits &= get_nonzero_bits (name);
+	  wide_int nonzero_bits_name = get_nonzero_bits (name);
+	  if (precision != nonzero_bits_name.get_precision ())
+	    nonzero_bits = wi::shwi (*nonzero_bits.get_val (),
+				     nonzero_bits_name.get_precision ());
+	  nonzero_bits &= nonzero_bits_name;
 	  set_nonzero_bits (name, nonzero_bits);
 	}
     }
@@ -1852,6 +1856,8 @@ evaluate_stmt (gimple stmt)
     {
       tree lhs = gimple_get_lhs (stmt);
       wide_int nonzero_bits = get_nonzero_bits (lhs);
+      if (TYPE_PRECISION (TREE_TYPE (lhs)) != nonzero_bits.get_precision ())
+	  nonzero_bits = wide_int_to_tree (TREE_TYPE (lhs), nonzero_bits);
       if (nonzero_bits != -1)
 	{
 	  if (!is_constant)
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 3af80a0..459c669 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -192,7 +192,7 @@ set_range_info (tree name, enum value_range_type range_type,
   gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
   gcc_assert (range_type == VR_RANGE || range_type == VR_ANTI_RANGE);
   range_info_def *ri = SSA_NAME_RANGE_INFO (name);
-  unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
+  unsigned int precision = min.get_precision ();
 
   /* Allocate if not available.  */
   if (ri == NULL)
@@ -204,6 +204,15 @@ set_range_info (tree name, enum value_range_type range_type,
       SSA_NAME_RANGE_INFO (name) = ri;
       ri->set_nonzero_bits (wi::shwi (-1, precision));
     }
+  else if (ri->get_min ().get_precision () != precision)
+    {
+      size_t size = (sizeof (range_info_def)
+		     + trailing_wide_ints <3>::extra_size (precision));
+      ri = static_cast<range_info_def *> (ggc_realloc (ri, size));
+      ri->ints.set_precision (precision);
+      SSA_NAME_RANGE_INFO (name) = ri;
+      ri->set_nonzero_bits (wi::shwi (-1, precision));
+    }
 
   /* Record the range type.  */
   if (SSA_NAME_RANGE_TYPE (name) != range_type)
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d16fd8a..772676a 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "tree-ssa-threadedge.h"
 #include "wide-int.h"
+#include "langhooks.h"
 
 
 
@@ -424,6 +425,23 @@ set_value_range_to_varying (value_range_t *vr)
     bitmap_clear (vr->equiv);
 }
 
+/* Normalize min and max to promoted_type if their precision differs.  */
+
+static void
+normalize_int_cst_precision (tree *min, tree *max)
+{
+  if (TREE_CODE (*min) != INTEGER_CST
+      || TREE_CODE (*max) != INTEGER_CST)
+    return;
+  if (TYPE_PRECISION (TREE_TYPE (*min)) != TYPE_PRECISION (TREE_TYPE (*max)))
+    {
+      tree type = TREE_TYPE (*min);
+      if (TYPE_PRECISION (TREE_TYPE (*min)) < TYPE_PRECISION (TREE_TYPE (*max)))
+	type = TREE_TYPE (*max);
+      *min = wide_int_to_tree (type, *min);
+      *max = wide_int_to_tree (type, *max);
+    }
+}
 
 /* Set value range VR to {T, MIN, MAX, EQUIV}.  */
 
@@ -438,6 +456,8 @@ set_value_range (value_range_t *vr, enum value_range_type t, tree min,
       int cmp;
 
       gcc_assert (min && max);
+      gcc_assert (TYPE_PRECISION (TREE_TYPE (min))
+		  == TYPE_PRECISION (TREE_TYPE (max)));
 
       gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
 		  && (!TREE_OVERFLOW_P (max) || is_overflow_infinity (max)));
@@ -597,6 +617,8 @@ set_and_canonicalize_value_range (value_range_t *vr, enum value_range_type t,
       return;
     }
 
+  if (min != NULL_TREE && max != NULL_TREE)
+    normalize_int_cst_precision (&min, &max);
   set_value_range (vr, t, min, max, equiv);
 }
 
@@ -951,6 +973,66 @@ usable_range_p (value_range_t *vr, bool *strict_overflow_p)
   return true;
 }
 
+/* Return the promoted type as defined by PROMOTE_MODE of the target.  */
+
+static tree
+promoted_type (tree type)
+{
+#ifdef PROMOTE_MODE
+  tree new_type;
+  if (!POINTER_TYPE_P (type)
+      && (TREE_CODE (type) != ENUMERAL_TYPE)
+      && INTEGRAL_TYPE_P (type))
+    {
+      enum machine_mode mode = TYPE_MODE (type);
+      int uns = TYPE_SIGN (type);
+      PROMOTE_MODE (mode, uns, type);
+      uns = TYPE_SIGN (type);
+      new_type = lang_hooks.types.type_for_mode (mode, uns);
+      if (TYPE_PRECISION (new_type) > TYPE_PRECISION (type))
+	type = new_type;
+    }
+#endif
+  return type;
+}
+
+/* Promote VRO to promoted_type if their precision differ and
+   return the new type.  */
+
+static tree
+promote_unary_vr (tree type, value_range_t *vr0)
+{
+  tree expr_type = type;
+
+  if (!range_int_cst_p (vr0))
+    return expr_type;
+  if ((TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (vr0->min)))
+      || (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (vr0->max))))
+    {
+      expr_type = promoted_type (type);
+      vr0->min = wide_int_to_tree (expr_type, vr0->min);
+      vr0->max = wide_int_to_tree (expr_type, vr0->max);
+    }
+  return expr_type;
+}
+
+/* Promote VRO and VR1 to promoted_type if their precision differ and
+   return the new type.  */
+
+static tree
+promote_binary_vr (tree type, value_range_t *vr0, value_range_t *vr1)
+{
+  tree expr_type0 = promote_unary_vr (type, vr0);
+  tree expr_type1 = promote_unary_vr (type, vr1);
+
+  if (TYPE_PRECISION (expr_type0) == TYPE_PRECISION (expr_type1))
+    return expr_type0;
+  if (TYPE_PRECISION (expr_type0) < TYPE_PRECISION (expr_type1))
+    return promote_unary_vr (expr_type1, vr0);
+  else
+    return promote_unary_vr (expr_type0, vr1);
+}
+
 
 /* Return true if the result of assignment STMT is know to be non-negative.
    If the return value is based on the assumption that signed overflow is
@@ -1741,6 +1823,7 @@ extract_range_from_assert (value_range_t *vr_p, tree expr)
 		TREE_NO_WARNING (max) = 1;
 	    }
 
+	  normalize_int_cst_precision (&min, &max);
 	  set_value_range (vr_p, VR_RANGE, min, max, vr_p->equiv);
 	}
     }
@@ -1781,6 +1864,7 @@ extract_range_from_assert (value_range_t *vr_p, tree expr)
 		TREE_NO_WARNING (min) = 1;
 	    }
 
+	  normalize_int_cst_precision (&min, &max);
 	  set_value_range (vr_p, VR_RANGE, min, max, vr_p->equiv);
 	}
     }
@@ -2376,6 +2460,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
      range and see what we end up with.  */
   if (code == PLUS_EXPR || code == MINUS_EXPR)
     {
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       /* If we have a PLUS_EXPR with two VR_RANGE integer constant
          ranges compute the precise range for such case if possible.  */
       if (range_int_cst_p (&vr0)
@@ -2562,6 +2649,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   else if (code == MIN_EXPR
 	   || code == MAX_EXPR)
     {
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       if (vr0.type == VR_RANGE
 	  && !symbolic_range_p (&vr0))
 	{
@@ -2625,6 +2715,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
              <wi::extended_tree <WIDE_INT_MAX_PRECISION * 2> > vrp_int_cst;
 	  vrp_int sizem1 = wi::mask <vrp_int> (prec, false);
 	  vrp_int size = sizem1 + 1;
+	  vrp_int type_min = vrp_int_cst (TYPE_MIN_VALUE (expr_type));
+	  vrp_int type_max = vrp_int_cst (TYPE_MAX_VALUE (expr_type));
 
 	  /* Extend the values using the sign of the result to PREC2.
 	     From here on out, everthing is just signed math no matter
@@ -2697,8 +2789,17 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
 
 	  /* The following should handle the wrapping and selecting
 	     VR_ANTI_RANGE for us.  */
-	  min = wide_int_to_tree (expr_type, prod0);
-	  max = wide_int_to_tree (expr_type, prod3);
+	  if (wi::lts_p (prod0, type_min)
+	      || wi::gts_p (prod3, type_max))
+	    {
+	      min = wide_int_to_tree (promoted_type (expr_type), prod0);
+	      max = wide_int_to_tree (promoted_type (expr_type), prod3);
+	    }
+	  else
+	    {
+	      min = wide_int_to_tree (expr_type, prod0);
+	      max = wide_int_to_tree (expr_type, prod3);
+	    }
 	  set_and_canonicalize_value_range (vr, VR_RANGE, min, max, NULL);
 	  return;
 	}
@@ -2724,6 +2825,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
+      /* If value range is in promoted type, promote the type as well.  */
+      expr_type = promote_unary_vr (expr_type, &vr0);
       /* If we have a RSHIFT_EXPR with any shift values outside [0..prec-1],
 	 then drop to VR_VARYING.  Outside of this range we get undefined
 	 behavior from the shift operation.  We cannot even trust
@@ -2946,6 +3049,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       wide_int may_be_nonzero0, may_be_nonzero1;
       wide_int must_be_nonzero0, must_be_nonzero1;
 
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       int_cst_range0 = zero_nonzero_bits_from_vr (expr_type, &vr0,
 						  &may_be_nonzero0,
 						  &must_be_nonzero0);
@@ -3224,14 +3330,22 @@ extract_range_from_unary_expr_1 (value_range_t *vr,
 	  tree new_min, new_max;
 	  if (is_overflow_infinity (vr0.min))
 	    new_min = negative_overflow_infinity (outer_type);
-	  else
+	  else if (int_fits_type_p (vr0.min, outer_type))
 	    new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
 				      0, false);
+	  else
+	    new_min = force_fit_type (promoted_type (outer_type),
+				      wi::to_widest (vr0.min),
+				      0, false);
 	  if (is_overflow_infinity (vr0.max))
 	    new_max = positive_overflow_infinity (outer_type);
-	  else
+	  else if (int_fits_type_p (vr0.max, outer_type))
 	    new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
 				      0, false);
+	  else
+	    new_max = force_fit_type (promoted_type (outer_type),
+				      wi::to_widest (vr0.max),
+				      0, false);
 	  set_and_canonicalize_value_range (vr, vr0.type,
 					    new_min, new_max, NULL);
 	  return;
@@ -3940,6 +4054,8 @@ adjust_range_with_scev (value_range_t *vr, struct loop *loop,
 	  && is_positive_overflow_infinity (max)))
     return;
 
+  if (min != NULL_TREE && max != NULL_TREE)
+    normalize_int_cst_precision (&min, &max);
   set_value_range (vr, VR_RANGE, min, max, vr->equiv);
 }
 
@@ -6668,6 +6784,8 @@ vrp_visit_assignment_or_call (gimple stmt, tree *output_p)
       else
 	extract_range_from_assignment (&new_vr, stmt);
 
+      if (range_int_cst_p (&new_vr))
+	normalize_int_cst_precision (&new_vr.min, &new_vr.max);
       if (update_value_range (lhs, &new_vr))
 	{
 	  *output_p = lhs;
@@ -8399,6 +8517,8 @@ vrp_visit_phi_node (gimple phi)
   /* If the new range is different than the previous value, keep
      iterating.  */
 update_range:
+  if (range_int_cst_p (&vr_result))
+    normalize_int_cst_precision (&vr_result.min, &vr_result.max);
   if (update_value_range (lhs, &vr_result))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
@@ -8655,9 +8775,19 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
   if (!zero_nonzero_bits_from_vr (TREE_TYPE (op0), &vr0, &may_be_nonzero0,
 				  &must_be_nonzero0))
     return false;
-  if (!zero_nonzero_bits_from_vr (TREE_TYPE (op1), &vr1, &may_be_nonzero1,
+  if (!zero_nonzero_bits_from_vr (TREE_TYPE (op0), &vr1, &may_be_nonzero1,
 				  &must_be_nonzero1))
     return false;
+  if (TYPE_PRECISION (TREE_TYPE (op0)) != may_be_nonzero0.get_precision ())
+    {
+      may_be_nonzero0 = wide_int_to_tree (TREE_TYPE (op0), may_be_nonzero0);
+      must_be_nonzero0 = wide_int_to_tree (TREE_TYPE (op0), must_be_nonzero0);
+    }
+  if (TYPE_PRECISION (TREE_TYPE (op1)) != may_be_nonzero1.get_precision ())
+    {
+      may_be_nonzero1 = wide_int_to_tree (TREE_TYPE (op1), may_be_nonzero1);
+      must_be_nonzero1 = wide_int_to_tree (TREE_TYPE (op1), must_be_nonzero1);
+    }
 
   switch (gimple_assign_rhs_code (stmt))
     {
@@ -8752,9 +8882,9 @@ test_for_singularity (enum tree_code cond_code, tree op0,
   if (min && max)
     {
       if (compare_values (vr->min, min) == 1)
-	min = vr->min;
+	min = wide_int_to_tree (TREE_TYPE (op0), vr->min);
       if (compare_values (vr->max, max) == -1)
-	max = vr->max;
+	max = wide_int_to_tree (TREE_TYPE (op0), vr->max);
 
       /* If the new min/max values have converged to a single value,
 	 then there is only one value which can satisfy the condition,
@@ -9474,7 +9604,7 @@ simplify_stmt_for_jump_threading (gimple stmt, gimple within_stmt)
 	{
 	  extract_range_from_assignment (&new_vr, stmt);
 	  if (range_int_cst_singleton_p (&new_vr))
-	    return new_vr.min;
+	    return wide_int_to_tree (TREE_TYPE (lhs), new_vr.min);
 	}
     }
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-04  3:41       ` Kugan
@ 2014-09-04 13:00         ` Richard Biener
  2014-09-05  1:33           ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2014-09-04 13:00 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Thu, Sep 4, 2014 at 5:41 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
>>> gimple operations and used the LHS value range to infer the assigned
>>> value range. I will revert this part of the code as this is wrong.
>>>
>>> I dont think checking promoted_mode for temp will be necessary here as
>>> convert_move will handle it correctly if promoted_mode is set for temp.
>>>
>>> Thus, I will reimplement setting promoted_mode to temp (in
>>> expand_expr_real_2) based on the gimple statement content on RHS. i.e.
>>> by looking at the RHS operands and its value ranges and by calculating
>>> the resulting value range. Does this sound OK to you.
>>
>> No, this sounds backward again and won't work because those operands
>> again could be just truncated - thus you can't rely on their value-range.
>>
>> What you would need is VRP computing value-ranges in the promoted
>> mode from the start (and it doesn't do that).
>
>
> Hi Richard,
>
> Here is an attempt to do the value range computation in promoted_mode's
> type when it is overflowing. Bootstrapped on x86-64.

Err - I think you misunderstood this as a suggestion to do this ;)
value-ranges should be computed according to the type not according
to the (promoted) mode.  Otherwise we will miss optimization
opportunities.

Richard.

> Based on your feedback, I will do more testing on this.
>
> Thanks for your time,
> Kugan
>
> gcc/ChangeLog:
>
> 2014-09-04  Kugan Vivekanandarajah <kuganv@linaro.org>
>
>         * tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to
>         the type.
>         (evaluate_stmt): Likewise.
>         * tree-ssanames.c (set_range_info): Adjust if the precision of stored
>         value range is different.
>         * tree-vrp.c (normalize_int_cst_precision): New function.
>         (set_value_range): Add assert to check precision.
>         (set_and_canonicalize_value_range): Call normalize_int_cst_precision
>         on min and max.
>         (promoted_type): New function.
>         (promote_unary_vr): Likewise.
>         (promote_binary_vr): Likewise.
>         (extract_range_from_binary_expr_1): Adjust type to match value range.
>         Store value ranges in promoted type if they overflow.
>         (extract_range_from_unary_expr_1): Likewise.
>         (adjust_range_with_scev): Call normalize_int_cst_precision
>         on min and max.
>         (vrp_visit_assignment_or_call): Likewise.
>         (simplify_bit_ops_using_ranges): Adjust the value range precision.
>         (test_for_singularity): Likewise.
>         (simplify_stmt_for_jump_threading): Likewise.
>         (extract_range_from_assert): Likewise.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-04 13:00         ` Richard Biener
@ 2014-09-05  1:33           ` Kugan
  2014-09-05  9:51             ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Kugan @ 2014-09-05  1:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

>> Here is an attempt to do the value range computation in promoted_mode's
>> type when it is overflowing. Bootstrapped on x86-64.
> 
> Err - I think you misunderstood this as a suggestion to do this ;)
> value-ranges should be computed according to the type not according
> to the (promoted) mode.  Otherwise we will miss optimization
> opportunities.

Oops, sorry, I had my doubts about making trees aware of back-end stuff.

Coming back to the original problem, what would be the best approach to
handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR
and LSHIFT_EXPR truncate values this way. All other operations set the
range to type_min, type_max. Can we rely on this?

Is the fact that this error does not show up when PROMOTED_MODE <= word_mode
(and the precision of the register we take the SUBREG from is <= word_mode
precision) just a coincidence? Can we rely on this?

Is there any way we can fix this?

Thanks again,
Kugan



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-05  1:33           ` Kugan
@ 2014-09-05  9:51             ` Richard Biener
  2014-09-07  9:51               ` Kugan
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
  0 siblings, 2 replies; 29+ messages in thread
From: Richard Biener @ 2014-09-05  9:51 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Fri, Sep 5, 2014 at 3:33 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>> Here is an attempt to do the value range computation in promoted_mode's
>>> type when it is overflowing. Bootstrapped on x86-64.
>>
>> Err - I think you misunderstood this as a suggestion to do this ;)
>> value-ranges should be computed according to the type not according
>> to the (promoted) mode.  Otherwise we will miss optimization
>> opportunities.
>
> Oops, sorry, I had my doubts about making trees aware of back-end stuff.
>
> Coming back to the original problem, what would be the best approach to
> handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR
> and LSHIFT_EXPR truncate values this way. All other operations set the
> range to type_min, type_max. Can we rely on this?

No, that doesn't sound like a good thing to do.

> Is the fact that this error does not show up when PROMOTED_MODE <= word_mode
> (and the precision of the register we take the SUBREG from is <= word_mode
> precision) just a coincidence? Can we rely on this?

Sounds like a coincidence to me.

> Is there any way we can fix this?

Well - the best way would be to expose the target specifics to GIMPLE
at some point in the optimization pipeline.  My guess would be that it's
appropriate after loop optimizations (but maybe before induction variable
optimization).

That is, have a pass that applies register promotion to all SSA names
in the function, inserting appropriate truncations and extensions.  That
way you'd never see (set (subreg...) on RTL.  The VRP and DOM
passes running after that pass would then be able to aggressively
optimize redundant truncations and extensions.

Effects on debug information are to be considered.  You can change
the type of SSA names in-place but you don't want to do that for
user DECLs (and we can't have the SSA name type and its DECL
type differ - and not sure if we might want to lift that restriction).

Richard.

> Thanks again,
> Kugan
>
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-05  9:51             ` Richard Biener
@ 2014-09-07  9:51               ` Kugan
  2014-09-08  9:48                 ` Richard Biener
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
  1 sibling, 1 reply; 29+ messages in thread
From: Kugan @ 2014-09-07  9:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On 05/09/14 19:50, Richard Biener wrote:

> Well - the best way would be to expose the target specifics to GIMPLE
> at some point in the optimization pipeline.  My guess would be that it's
> appropriate after loop optimizations (but maybe before induction variable
> optimization).
> 
> That is, have a pass that applies register promotion to all SSA names
> in the function, inserting appropriate truncations and extensions.  That
> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
> passes running after that pass would then be able to aggressively
> optimize redundant truncations and extensions.
> 
> Effects on debug information are to be considered.  You can change
> the type of SSA names in-place but you don't want to do that for
> user DECLs (and we can't have the SSA name type and its DECL
> type differ - and not sure if we might want to lift that restriction).

Thanks. I will try to implement this.

I still would like to keep the VRP based approach as there are some
cases that I think can only be done with range info. For example:

short foo(unsigned char c)
{
  c = c & (unsigned char)0x0F;
  if( c > 7 )
    return((short)(c - 5));
  else
    return(( short )c);
}


So, how about adding and setting an overflow/wrap-around flag in
range_info? We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
back to the max + 1, min - 1 representation for VR_ANTI_RANGE, we can use
this static_flag to encode overflow/wrap-around. Would that be
acceptable?

Thanks again,
Kugan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-07  9:51               ` Kugan
@ 2014-09-08  9:48                 ` Richard Biener
  2014-09-09 10:06                   ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2014-09-08  9:48 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Sun, Sep 7, 2014 at 11:50 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> On 05/09/14 19:50, Richard Biener wrote:
>
>> Well - the best way would be to expose the target specifics to GIMPLE
>> at some point in the optimization pipeline.  My guess would be that it's
>> appropriate after loop optimizations (but maybe before induction variable
>> optimization).
>>
>> That is, have a pass that applies register promotion to all SSA names
>> in the function, inserting appropriate truncations and extensions.  That
>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>> passes running after that pass would then be able to aggressively
>> optimize redundant truncations and extensions.
>>
>> Effects on debug information are to be considered.  You can change
>> the type of SSA names in-place but you don't want to do that for
>> user DECLs (and we can't have the SSA name type and its DECL
>> type differ - and not sure if we might want to lift that restriction).
>
> Thanks. I will try to implement this.
>
> I still would like to keep the VRP based approach as there are some
> cases that I think can only be done with range info. For example:
>
> short foo(unsigned char c)
> {
>   c = c & (unsigned char)0x0F;
>   if( c > 7 )
>     return((short)(c - 5));
>   else
>     return(( short )c);
> }
>
>
> So, how about adding and setting the overflow/wrap around flag to
>> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
> static_flag to encode overflow/wrap around. Will that be something
> acceptable?

You mean tracking in the VRP lattice whether a value wrapped around
(or was assumed not to due to undefined behavior)?  I'm not sure this
is easy to do correctly (VRP is large).

Note that I don't think we'll lose the testcase you quoted if the promotion
pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
promote mode would promote to SImode)

  SImode tem_2 = (unsigned int)c_1(D);
  tem_3 = tem_2 & 0xF;
  if (tem_3 > 7)
    {
      tem_4 = tem_3 - 5;
      short _5 = (short)tem_4;
      tem_5 = (unsigned int)_5;
     return tem_5;
   }
else
   {
     short _6 = (short)tem_3;
     return _6;
   }

VRP should be able to remove the (unsigned int)(short) sign-extension
of tem_4.

note that both incoming registers and return registers are "interesting".
For simplicity I suggest to not promote them on GIMPLE.

What you'd lose in VRP2 is the smaller value-ranges you'd get from
(undefined) wrapping.  You could recover the undefinedness by
looking at SSA names recorded value-range and transfering that
in the promotion pass (but I'm not sure if you want to open the
can of latent signed overflow bugs in programs even more for
PROMOTE_MODE targets...)

Richard.

>
> Thanks again,
> Kugan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-08  9:48                 ` Richard Biener
@ 2014-09-09 10:06                   ` Kugan
  2014-09-09 10:28                     ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Kugan @ 2014-09-09 10:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 3667 bytes --]



On 08/09/14 19:48, Richard Biener wrote:
> On Sun, Sep 7, 2014 at 11:50 AM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>> On 05/09/14 19:50, Richard Biener wrote:
>>
>>> Well - the best way would be to expose the target specifics to GIMPLE
>>> at some point in the optimization pipeline.  My guess would be that it's
>>> appropriate after loop optimizations (but maybe before induction variable
>>> optimization).
>>>
>>> That is, have a pass that applies register promotion to all SSA names
>>> in the function, inserting appropriate truncations and extensions.  That
>>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>>> passes running after that pass would then be able to aggressively
>>> optimize redundant truncations and extensions.
>>>
>>> Effects on debug information are to be considered.  You can change
>>> the type of SSA names in-place but you don't want to do that for
>>> user DECLs (and we can't have the SSA name type and its DECL
>>> type differ - and not sure if we might want to lift that restriction).
>>
>> Thanks. I will try to implement this.
>>
>> I still would like to keep the VRP based approach as there are some
>> cases that I think can only be done with range info. For example:
>>
>> short foo(unsigned char c)
>> {
>>   c = c & (unsigned char)0x0F;
>>   if( c > 7 )
>>     return((short)(c - 5));
>>   else
>>     return(( short )c);
>> }
>>
>>
>> So, how about adding and setting the overflow/wrap around flag to
>> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
>> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
>> static_flag to encode overflow/wrap around. Will that be something
>> acceptable?
> 
> You mean tracking in the VRP lattice whether a value wrapped around
> (or was assumed not to due to undefined behavior)?  I'm not sure this
> is easy to do correctly (VRP is large).
> 
> Note that I don't think we'll lose the testcase you quoted if the promotion
> pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
> promote mode would promote to SImode)
> 
>   SImode tem_2 = (unsigned int)c_1(D);
>>   tem_3 = tem_2 & 0xF;
>   if (tem_3 > 7)
>     {
>       tem_4 = tem_3 - 5;
>>       short _5 = (short)tem_4;
>       tem_5 = (unsigned int)_5;
>      return tem_5;
>    }
> else
>    {
>>      short _6 = (short)tem_3;
>      return _6;
>    }
> 
> VRP should be able to remove the (unsigned int)(short) sign-extension
> of tem_4.
> 
> note that both incoming registers and return registers are "interesting".
> For simplicity I suggest to not promote them on GIMPLE.
> 
> What you'd lose in VRP2 is the smaller value-ranges you'd get from
> (undefined) wrapping.  You could recover the undefinedness by
> looking at SSA names recorded value-range and transfering that
> in the promotion pass (but I'm not sure if you want to open the
> can of latent signed overflow bugs in programs even more for
> PROMOTE_MODE targets...)
> 

Thanks. In the meantime I would like to revert the patch that enables
zero/sign extension. I have bootstrapped it on x86_64 and regression
testing is ongoing. Is this OK?

Thanks,
Kugan

gcc/ChangeLog:

2014-09-09  Kugan Vivekanandarajah  <kuganv@linaro.org>

	Revert r213751:
	* calls.c (precompute_arguments): Check
	 promoted_for_signed_and_unsigned_p and set the promoted mode.
	(promoted_for_signed_and_unsigned_p): New function.
	(expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
	and set the promoted mode.
	* expr.h (promoted_for_signed_and_unsigned_p): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 3886 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index 03ed9c8..345331f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1486,10 +1486,7 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      if (promoted_for_signed_and_unsigned_p (args[i].tree_value, mode))
-		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
-	      else
-		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index db76897..8916305 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3325,13 +3325,7 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
-		    && (GET_CODE (temp) == SUBREG)
-		    && (GET_MODE (target) == GET_MODE (temp))
-		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
-		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
-		else
-		  convert_move (SUBREG_REG (target), temp, unsignedp);
+		convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index 7e69955..5039734 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -68,7 +68,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-address.h"
 #include "cfgexpand.h"
 #include "builtins.h"
-#include "tree-ssa.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -9250,35 +9249,6 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
-/* Return TRUE if value in SSA is zero and sign extended for wider mode MODE
-   using value range information stored.  Return FALSE otherwise.
-
-   This is used to check if SUBREG is zero and sign extended and to set
-   promoted mode SRP_SIGNED_AND_UNSIGNED to SUBREG.  */
-
-bool
-promoted_for_signed_and_unsigned_p (tree ssa, enum machine_mode mode)
-{
-  wide_int min, max;
-
-  if (ssa == NULL_TREE
-      || TREE_CODE (ssa) != SSA_NAME
-      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
-      || (TYPE_PRECISION (TREE_TYPE (ssa)) != GET_MODE_PRECISION (mode)))
-    return false;
-
-  /* Return FALSE if value_range is not recorded for SSA.  */
-  if (get_range_info (ssa, &min, &max) != VR_RANGE)
-    return false;
-
-  /* Return true (to set SRP_SIGNED_AND_UNSIGNED to SUBREG) if MSB of the
-     smaller mode is not set (i.e.  MSB of ssa is not set).  */
-  if (!wi::neg_p (min, SIGNED) && !wi::neg_p(max, SIGNED))
-    return true;
-  else
-    return false;
-
-}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9582,10 +9552,7 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  if (promoted_for_signed_and_unsigned_p (ssa_name, mode))
-	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
-	  else
-	    SUBREG_PROMOTED_SET (temp, unsignedp);
+	  SUBREG_PROMOTED_SET (temp, unsignedp);
 	  return temp;
 	}
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 756c894..2e66329 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,7 +440,6 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
-extern bool promoted_for_signed_and_unsigned_p (tree, enum machine_mode);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-09 10:06                   ` Kugan
@ 2014-09-09 10:28                     ` Richard Biener
  0 siblings, 0 replies; 29+ messages in thread
From: Richard Biener @ 2014-09-09 10:28 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Tue, Sep 9, 2014 at 12:06 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 08/09/14 19:48, Richard Biener wrote:
>> On Sun, Sep 7, 2014 at 11:50 AM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>> On 05/09/14 19:50, Richard Biener wrote:
>>>
>>>> Well - the best way would be to expose the target specifics to GIMPLE
>>>> at some point in the optimization pipeline.  My guess would be that it's
>>>> appropriate after loop optimizations (but maybe before induction variable
>>>> optimization).
>>>>
>>>> That is, have a pass that applies register promotion to all SSA names
>>>> in the function, inserting appropriate truncations and extensions.  That
>>>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>>>> passes running after that pass would then be able to aggressively
>>>> optimize redundant truncations and extensions.
>>>>
>>>> Effects on debug information are to be considered.  You can change
>>>> the type of SSA names in-place but you don't want to do that for
>>>> user DECLs (and we can't have the SSA name type and its DECL
>>>> type differ - and not sure if we might want to lift that restriction).
>>>
>>> Thanks. I will try to implement this.
>>>
>>> I still would like to keep the VRP based approach as there are some
>>> cases that I think can only be done with range info. For example:
>>>
>>> short foo(unsigned char c)
>>> {
>>>   c = c & (unsigned char)0x0F;
>>>   if( c > 7 )
>>>     return((short)(c - 5));
>>>   else
>>>     return(( short )c);
>>> }
>>>
>>>
>>> So, how about adding and setting the overflow/wrap around flag to
>>> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
>>> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
>>> static_flag to encode overflow/wrap around. Will that be something
>>> acceptable?
>>
>> You mean tracking in the VRP lattice whether a value wrapped around
>> (or was assumed not to due to undefined behavior)?  I'm not sure this
>> is easy to do correctly (VRP is large).
>>
>> Note that I don't think we'll lose the testcase you quoted if the promotion
>> pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
>> promote mode would promote to SImode)
>>
>>   SImode tem_2 = (unsigned int)c_1(D);
>>   tem_3 = tem_2 & 0xF;
>>   if (tem_3 > 7)
>>     {
>>       tem_4 = tem_3 - 5;
>>       short _5 = (short)tem_4;
>>       tem_5 = (unsigned int)_5;
>>      return tem_5;
>>    }
>> else
>>    {
>>      short _6 = (short)tem_3;
>>      return _6;
>>    }
>>
>> VRP should be able to remove the (unsigned int)(short) sign-extension
>> of tem_4.
>>
>> note that both incoming registers and return registers are "interesting".
>> For simplicity I suggest to not promote them on GIMPLE.
>>
>> What you'd lose in VRP2 is the smaller value-ranges you'd get from
>> (undefined) wrapping.  You could recover the undefinedness by
>> looking at SSA names recorded value-range and transfering that
>> in the promotion pass (but I'm not sure if you want to open the
>> can of latent signed overflow bugs in programs even more for
>> PROMOTE_MODE targets...)
>>
>
> Thanks. In the meantime I would like to revert the patch that enables
> zero/sign extension. I have bootstrapped it on x86_64 and regression
> testing is ongoing. Is this OK?

Ok.

Thanks,
Richard.

> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2014-09-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         Revert r213751:
>         * calls.c (precompute_arguments): Check
>          promoted_for_signed_and_unsigned_p and set the promoted mode.
>         (promoted_for_signed_and_unsigned_p): New function.
>         (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>         and set the promoted mode.
>         * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>         * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>         SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [RFC] Elimination of zext/sext - type promotion pass
  2014-09-05  9:51             ` Richard Biener
  2014-09-07  9:51               ` Kugan
@ 2014-11-09 23:30               ` Kugan
  2014-11-10 12:56                 ` Richard Biener
  1 sibling, 1 reply; 29+ messages in thread
From: Kugan @ 2014-11-09 23:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 3535 bytes --]


> Well - the best way would be to expose the target specifics to GIMPLE
> at some point in the optimization pipeline.  My guess would be that it's
> appropriate after loop optimizations (but maybe before induction variable
> optimization).
> 
> That is, have a pass that applies register promotion to all SSA names
> in the function, inserting appropriate truncations and extensions.  That
> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
> passes running after that pass would then be able to aggressively
> optimize redundant truncations and extensions.
> 
> Effects on debug information are to be considered.  You can change
> the type of SSA names in-place but you don't want to do that for
> user DECLs (and we can't have the SSA name type and its DECL
> type differ - and not sure if we might want to lift that restriction).


Thanks for the comments. Here is a prototype patch that implements a
type promotion pass. The pass records SSA names that would carry values
in bits above the original type precision if promoted, and uses this
information to insert appropriate truncations and extensions. It also
classifies some of the statements that set SSA names as unsafe to
promote. Here is the GIMPLE difference against the previous dump for a
test case.

 crc2 (short unsigned int crc, unsigned char data)
 {
   unsigned char carry;
   unsigned char x16;
   unsigned char i;
-  unsigned char ivtmp_5;
+  unsigned int _2;
+  unsigned char _3;
+  unsigned int _4;
+  unsigned int _5;
   unsigned char _9;
-  unsigned char _10;
-  unsigned char ivtmp_18;
+  unsigned int _10;
+  unsigned int _11;
+  unsigned int _12;
+  unsigned int _13;
+  unsigned int _15;
+  unsigned int _16;
+  unsigned int _18;
+  unsigned int _21;
+  unsigned int _22;
+  unsigned int _24;
+  short unsigned int _26;
+  unsigned char _27;
+  unsigned int _28;
+  unsigned int _29;
+  unsigned int _30;

   <bb 2>:
+  _12 = (unsigned int) data_8(D);
+  _2 = (unsigned int) crc_7(D);

   <bb 3>:
-  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
-  # data_29 = PHI <data_12(5), data_8(D)(2)>
-  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
-  _9 = (unsigned char) crc_28;
-  _10 = _9 ^ data_29;
-  x16_11 = _10 & 1;
-  data_12 = data_29 >> 1;
-  if (x16_11 == 1)
+  # _30 = PHI <_28(5), _2(2)>
+  # _16 = PHI <_29(5), _12(2)>
+  # _4 = PHI <_18(5), 8(2)>
+  _9 = (unsigned char) _30;
+  _5 = (unsigned int) _9;
+  _22 = _5 ^ _16;
+  _10 = _22 & 1;
+  _29 = _16 >> 1;
+  _27 = (unsigned char) _10;
+  if (_27 == 1)
     goto <bb 4>;
   else
     goto <bb 7>;

   <bb 4>:
-  crc_13 = crc_28 ^ 16386;
-  crc_24 = crc_13 >> 1;
-  crc_15 = crc_24 | 32768;
+  _11 = _30 ^ 16386;
+  _13 = _11 >> 1;
+  _24 = _13 | 32768;

   <bb 5>:
-  # crc_2 = PHI <crc_15(4), crc_21(7)>
-  ivtmp_5 = ivtmp_18 - 1;
-  if (ivtmp_5 != 0)
+  # _28 = PHI <_24(4), _15(7)>
+  _18 = _4 - 1;
+  _3 = (unsigned char) _18;
+  if (_3 != 0)
     goto <bb 3>;
   else
     goto <bb 6>;

   <bb 6>:
-  # crc_19 = PHI <crc_2(5)>
-  return crc_19;
+  # _21 = PHI <_28(5)>
+  _26 = (short unsigned int) _21;
+  return _26;

   <bb 7>:
-  crc_21 = crc_28 >> 1;
+  _15 = _30 >> 1;
   goto <bb 5>;

 }


I experimented with a few simple test cases and the results so far are
mixed. It also seems that subsequent passes do not always optimise as
expected. I haven't looked into this in detail yet, but will do so based
on the feedback.

Please also note that this pass does not yet handle debug instructions,
and there are a couple of regression failures for ARM.

Thanks,
Kugan



[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 33524 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4a51393..1e26172 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2014-11-08  Kugan Vivekanandarajah  <kuganv@linaro.org>
+
+	* Makefile.in (OBJS): Add tree-type-prmtn.o.
+	* common.opt (ftree-type-prmt): New flag.
+	* opts.c (OPT_ftree_type_prmt): New option added.
+	* passes.def: New pass included.
+	* tree-pass.h: LikeWise.
+	* timevar.def (TV_TREE_TYPE_PRMT): New timevar.
+	* tree-type-prmtn.c: New file.
+
 2014-11-08  Richard Sandiford  <richard.sandiford@arm.com>
 
 	* config/aarch64/aarch64.c: Include rtl-iter.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2c3f02e..2c257b9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1463,6 +1463,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	tree-type-prmtn.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b400636..4556e0a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2304,6 +2304,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-prmt
+Common Report Var(flag_tree_type_prmt) Init(0) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1) Optimization
 Compile whole compilation unit at a time
diff --git a/gcc/opts.c b/gcc/opts.c
index 752cc84..784c74f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ftree_type_prmt, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 2305d67..50c5f96 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -264,6 +264,7 @@ along with GCC; see the file COPYING3.  If not see
       PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop)
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
+      NEXT_PASS (pass_type_promotion);
       NEXT_PASS (pass_lower_vector_ssa);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b74cbb0..0410f9b 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -266,6 +266,7 @@ DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
 DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
+DEFTIMEVAR (TV_TREE_TYPE_PRMT        , "tree type promotion")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index a3efdd8..0acc15f 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -424,6 +424,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promotion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-type-prmtn.c b/gcc/tree-type-prmtn.c
new file mode 100644
index 0000000..1e79c8d
--- /dev/null
+++ b/gcc/tree-type-prmtn.c
@@ -0,0 +1,1103 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "machmode.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-cfg.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea is to promote operations
+   in such a way that we can minimise the generation of subregs in RTL,
+   which in turn results in the removal of redundant zero/sign extensions.
+   This pass runs prior to VRP and DOM so that they can optimise away the
+   redundant truncations and extensions.  This is based on the discussion
+   at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+static unsigned n_ssa_val;
+static sbitmap ssa_not_safe_bitmap;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE as defined by PROMOTE_MODE of the
+   target.  */
+static tree
+get_promoted_type (tree type)
+{
+#ifdef PROMOTE_MODE
+  tree promoted_type;
+  enum machine_mode mode = TYPE_MODE (type);
+  int uns = TYPE_SIGN (type);
+
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;
+
+  PROMOTE_MODE (mode, uns, type);
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+
+  if (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))
+    {
+      promoted_type = build_type_attribute_qual_variant (promoted_type,
+						    TYPE_ATTRIBUTES (type),
+						    TYPE_QUALS (type));
+      type = promoted_type;
+    }
+#endif
+  return type;
+}
+
+/* Return true if it is safe to promote the computation of ssa NAME.  */
+static bool
+promotion_safe_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      gimple stmt = SSA_NAME_DEF_STMT (name);
+      unsigned int index = SSA_NAME_VERSION (name);
+
+      if (gimple_vdef (stmt) != NULL_TREE
+	  || gimple_vuse (stmt) != NULL_TREE)
+	return false;
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_not_safe_bitmap, index);
+    }
+  return false;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Return true if ssa NAME will be considered for promotion.  */
+static bool
+ssa_tobe_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return false;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_clear_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Record that ssa NAME will have its higher bits set if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return false;
+}
+
+/* Create a new ssa name of TYPE as a promoted copy of ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Return the single successor (excluding EH edges) of basic block BB.
+   Return NULL if there is more than one successor.  */
+static basic_block
+get_next_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT after STMT when STMT can throw.  Create a new basic block
+   between basic block containing STMT and its successor.  */
+static void
+insert_next_bb (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  gimple_stmt_iterator gsi;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  basic_block new_bb = split_edge (edge);
+  gsi = gsi_after_labels (new_bb);
+  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+}
+
+
+/* Return false if rhs type cannot be promoted in the stmt.  Return true
+   otherwise.  */
+static bool
+assign_rhs_promotable_p (gimple stmt, tree promoted_type)
+{
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+
+  /* If the stmt is a comparison, its operands can be promoted only when
+     higher_bits is not set for either of them.  */
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      /* LHS and RHS can be promoted without changing the results of
+	 comparison.  */
+      if (((ssa_tobe_promoted_p (rhs1)
+	    && promotion_safe_p (rhs1)
+	    && !ssa_overflows_p (rhs1))
+	   || (TREE_CODE (rhs1) == INTEGER_CST))
+	  && ((ssa_tobe_promoted_p (rhs2)
+	       && promotion_safe_p (rhs2)
+	       && !ssa_overflows_p (rhs2))
+	      || (TREE_CODE (rhs2) == INTEGER_CST)))
+	return true;
+      /* LHS or RHS of the comparison is already promoted.  */
+      else if ((TYPE_PRECISION (TREE_TYPE (rhs1))
+		== TYPE_PRECISION (promoted_type))
+	       || (TYPE_PRECISION (TREE_TYPE (rhs2))
+		   == TYPE_PRECISION (promoted_type)))
+	return true;
+      else
+	return false;
+    }
+
+  if (TREE_CODE_CLASS (code) == tcc_reference
+      || code == VIEW_CONVERT_EXPR
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || code == OBJ_TYPE_REF
+      || gimple_vdef (stmt)
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Promote constants in STMT to TYPE.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, fold_convert (type, op));
+      op = gimple_assign_rhs2 (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, fold_convert (type, op));
+      break;
+
+    case GIMPLE_PHI:
+      FOR_EACH_PHI_ARG (oprnd, stmt, iter, SSA_OP_USE)
+	{
+	  op = USE_FROM_PTR (oprnd);
+	  index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	  if (TREE_CODE (op) == INTEGER_CST)
+	    SET_PHI_ARG_DEF (stmt, index, fold_convert (type, op));
+	}
+      break;
+
+    case GIMPLE_COND:
+      op = gimple_cond_lhs (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_cond_set_lhs (stmt, fold_convert (type, op));
+      op = gimple_cond_rhs (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_cond_set_rhs (stmt, fold_convert (type, op));
+      break;
+
+    default:
+      break;
+    }
+}
+
+/* Promote use in an assignment.  Depending on the gimple_assign_rhs_code,
+   values in NEW_USE might have to be truncated to the type of USE.  */
+static void
+promote_assign_stmt_use (gimple stmt,
+			 tree use,
+			 imm_use_iterator *ui,
+			 tree new_use,
+			 tree copy_of_use,
+			 tree promoted_type)
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  tree type;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  /* Set to true if the promoted value needs to be fixed up (truncated).  */
+  bool fix = false;
+  /* Set to true once a stmt-code-specific fix up has been performed.  */
+  bool done = false;
+
+  switch (code)
+    {
+    CASE_CONVERT:
+      /* If this is where precision is lost, just replace the use with
+	 new_use. */
+      if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (TREE_TYPE (rhs1)))
+	{
+	  done = true;
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, new_use);
+	  update_stmt (stmt);
+	}
+      break;
+
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+      if (use == rhs2
+	  && ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    default:
+      break;
+    }
+
+  if (fix && !done)
+    {
+      if (promotion_safe_p (lhs))
+	{
+	  tree temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  tree temp;
+	  if (copy_of_use)
+	    temp = copy_of_use;
+	  else
+	    {
+	      temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt
+		= gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+						new_use, NULL_TREE);
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  else if (!done)
+    {
+      if (assign_rhs_promotable_p (stmt, promoted_type)
+	  && (promotion_safe_p (lhs)
+	      || (TREE_CODE_CLASS (code) == tcc_comparison)))
+	{
+	  type = promoted_type;
+	  if (TYPE_PRECISION (TREE_TYPE (use))
+	      < TYPE_PRECISION (promoted_type))
+	    promote_cst_in_stmt (stmt, promoted_type);
+	}
+      else
+	type = TREE_TYPE (use);
+
+      if ((type != TREE_TYPE (new_use)
+	   && type != TREE_TYPE (use))
+	  || (type == TREE_TYPE (use)
+	      && !copy_of_use))
+	{
+	  tree temp = make_promoted_copy (use, NULL, type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+      else if (use != new_use)
+	{
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    if (type == TREE_TYPE (new_use))
+	      SET_USE (op, new_use);
+	    else
+	      SET_USE (op, copy_of_use);
+	  update_stmt (stmt);
+	}
+    }
+}
+
+/* Promote ssa USE in phi STMT to PROMOTED_TYPE.  */
+static void
+promote_phi_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  tree lhs = PHI_RESULT (stmt);
+  tree type;
+  tree temp;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  if (ssa_tobe_promoted_p (lhs)
+      && promotion_safe_p (lhs))
+    type = promoted_type;
+  else
+    type = TREE_TYPE (lhs);
+
+  /* Check if we need a convert stmt to get the required type.  */
+  if ((type != TREE_TYPE (new_use) && type != TREE_TYPE (use))
+      || (type == TREE_TYPE (use) && !copy_of_use))
+    {
+      temp = make_promoted_copy (use, NULL, type);
+      gimple copy_stmt
+	= gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					new_use, NULL_TREE);
+      if (gimple_code (SSA_NAME_DEF_STMT (new_use)) == GIMPLE_NOP)
+	{
+	  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+	  bb = get_next_bb (bb);
+	  gcc_assert (bb);
+	  gsi = gsi_after_labels (bb);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else if (gimple_code (SSA_NAME_DEF_STMT (new_use))
+	       != GIMPLE_PHI)
+	{
+	  gsi = gsi_for_stmt (SSA_NAME_DEF_STMT (new_use));
+	  if (lookup_stmt_eh_lp (SSA_NAME_DEF_STMT (new_use)) > 0)
+	    insert_next_bb (SSA_NAME_DEF_STMT (new_use), copy_stmt);
+	  else
+	    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else
+	{
+	  gsi = gsi_after_labels
+	    (gimple_bb (SSA_NAME_DEF_STMT (new_use)));
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+    }
+  else if (type == TREE_TYPE (new_use))
+    temp = new_use;
+  else
+    temp = copy_of_use;
+
+  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+    SET_USE (op, temp);
+  update_stmt (stmt);
+}
+
+/* Promote ssa USE in STMT to PROMOTED_TYPE.  */
+static void
+promote_cond_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  bool promote;
+  tree lhs = gimple_cond_lhs (stmt);
+  tree rhs = gimple_cond_rhs (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Check if LHS and RHS can be promoted without changing the results
+     of the comparison.  */
+  if (((ssa_tobe_promoted_p (lhs)
+	&& promotion_safe_p (lhs)
+	&& !ssa_overflows_p (lhs))
+       || (TREE_CODE (lhs) == INTEGER_CST))
+      && (( ssa_tobe_promoted_p (rhs)
+	   && promotion_safe_p (rhs)
+	   && !ssa_overflows_p (rhs))
+	  || (TREE_CODE (rhs) == INTEGER_CST)))
+    promote = true;
+  /* LHS or RHS of the comparison is already promoted.  */
+  else if ((TYPE_PRECISION (TREE_TYPE (lhs))
+	    == TYPE_PRECISION (promoted_type))
+	   || (TYPE_PRECISION (TREE_TYPE (rhs))
+	       == TYPE_PRECISION (promoted_type)))
+    promote = true;
+  else
+    promote = false;
+
+  if (promote)
+    {
+      /* Comparison will happen in the promoted type.  */
+      tree temp;
+      if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	temp = new_use;
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      promote_cst_in_stmt (stmt, promoted_type);
+    }
+  else if (TREE_TYPE (use) != TREE_TYPE (new_use))
+    {
+      /* Comparison will happen in the original type.  */
+      tree temp;
+      if (copy_of_use)
+	temp = copy_of_use;
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+}
+
+/* Promote definition DEF to NEW_TYPE.  If the DEF is replaced and has to
+   be released, set RELEASE_DEF.  Also return COPY_OF_DEF with the original
+   type for any use statement that needs truncation.  */
+static tree
+promote_definition (tree def,
+		    tree promoted_type,
+		    tree *copy_of_def,
+		    bool *release_def)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  tree new_def;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  gcc_assert (release_def);
+  *release_def = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+      *copy_of_def = NULL;
+      gimple_phi_set_result (def_stmt, new_def);
+      *release_def = true;
+      update_stmt (def_stmt);
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      break;
+
+    case GIMPLE_NOP:
+      /* Create a promoted type copy of parameters.  */
+      bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+      bb = get_next_bb (bb);
+      gcc_assert (bb);
+      gsi = gsi_after_labels (bb);
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, new_def,
+						def, NULL_TREE);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      *copy_of_def = def;
+      break;
+
+    case GIMPLE_ASSIGN:
+      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+      gimple_assign_set_lhs (def_stmt, new_def);
+      update_stmt (def_stmt);
+      if (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+	  != tcc_comparison)
+	promote_cst_in_stmt (def_stmt, promoted_type);
+      *release_def = true;
+      *copy_of_def = NULL;
+      break;
+
+    default:
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, new_def,
+						def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_next_bb (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+      update_stmt (copy_stmt);
+      *copy_of_def = def;
+      break;
+    }
+  return new_def;
+}
+
+
+/* Replace all uses of USE with NEW_USE, converting where required.  */
+static unsigned int
+promote_all_uses (tree use, tree new_use, tree copy_of_use, tree promoted_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the use with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      if (stmt == SSA_NAME_DEF_STMT (new_use))
+	continue;
+
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_ASSIGN:
+	  promote_assign_stmt_use (stmt, use, &ui, new_use,
+				   copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_PHI:
+	  promote_phi_stmt_use (stmt, use, &ui, new_use,
+				copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_COND:
+	  promote_cond_stmt_use (stmt, use, &ui, new_use,
+				 copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_DEBUG:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  break;
+	case GIMPLE_RETURN:
+	default:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      tree temp;
+	      if (copy_of_use)
+		temp = copy_of_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt
+		    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+						    new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  break;
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type, new_name, copy_of_name;
+  bool release_def = false;
+
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+
+  if (promotion_safe_p (name))
+    {
+      new_name = promote_definition (name, type, &copy_of_name,
+				     &release_def);
+      promote_all_uses (name, new_name, copy_of_name, type);
+    }
+  else
+    promote_all_uses (name, name, name, type);
+  set_ssa_promoted (name);
+
+  if (release_def)
+    release_ssa_name (name);
+  return 0;
+}
+
+/* Mark the SSA names occurring in STMT as promotion candidates.  */
+static void
+set_ssa_to_be_promoted_flag (gimple stmt)
+{
+  ssa_op_iter i;
+  tree def;
+  use_operand_p op;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_PHI:
+      def = PHI_RESULT (stmt);
+      bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+      FOR_EACH_PHI_ARG (op, stmt, i, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+
+    default:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE | SSA_OP_DEF)
+	{
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+    }
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple phi)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+      /* Conversion expressions that may need to be preserved.  */
+    CASE_CONVERT:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case NE_EXPR:
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+    case EQ_EXPR:
+    case UNLT_EXPR:
+    case UNLE_EXPR:
+    case UNGT_EXPR:
+    case UNGE_EXPR:
+    case UNEQ_EXPR:
+    case LTGT_EXPR:
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+      break;
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions which may produce results that will have higher bits if
+	 computed in promoted type.  (i.e.  results may overflow) */
+    case MULT_HIGHPART_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+    case MULT_EXPR:
+    case BIT_XOR_EXPR:
+    case BIT_NOT_EXPR:
+    case WIDEN_MULT_EXPR:
+    case WIDEN_MULT_PLUS_EXPR:
+    case WIDEN_MULT_MINUS_EXPR:
+    case WIDEN_SUM_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_AND_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions whose operation has to be performed in the original
+	 type if the promoted operands may have their higher bits set.  */
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+    case ABS_EXPR:
+    case NEGATE_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE)
+	{
+	  if (ssa_overflows_p (def))
+	    bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+	}
+      break;
+
+      /* Expressions that have to be done in the original types.  */
+    case LROTATE_EXPR:
+    case RROTATE_EXPR:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+      /* To be safe, all others have to be done in the original types.  */
+    default:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+    }
+  return changed;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+      use_operand_p op;
+
+      def = PHI_RESULT (stmt);
+      promote_def_and_uses (def);
+      FOR_EACH_PHI_ARG (op, stmt, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple phi = gsi_stmt (gsi);
+	  set_ssa_to_be_promoted_flag (phi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  set_ssa_to_be_promoted_flag (stmt);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple use_stmt;
+	  imm_use_iterator ui;
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_not_safe_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_not_safe_bitmap);
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_not_safe_bitmap);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PRMT, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_prmt != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promotion (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
@ 2014-11-10 12:56                 ` Richard Biener
  2015-05-01  4:41                   ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2014-11-10 12:56 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Mon, Nov 10, 2014 at 12:29 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>> Well - the best way would be to expose the target specifics to GIMPLE
>> at some point in the optimization pipeline.  My guess would be that it's
>> appropriate after loop optimizations (but maybe before induction variable
>> optimization).
>>
>> That is, have a pass that applies register promotion to all SSA names
>> in the function, inserting appropriate truncations and extensions.  That
>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>> passes running after that pass would then be able to aggressively
>> optimize redundant truncations and extensions.
>>
>> Effects on debug information are to be considered.  You can change
>> the type of SSA names in-place but you don't want to do that for
>> user DECLs (and we can't have the SSA name type and its DECL
>> type differ - and not sure if we might want to lift that restriction).
>
>
> Thanks for the comments. Here is a prototype patch that implements a
> type promotion pass. This pass records SSA variables that will have
> values in higher bits (than the original type precision) if promoted and
> uses this information in inserting appropriate truncations and
> extensions. This pass also classifies some of the stmts that set SSA
> names as unsafe to promote. Here is a gimple diff of the type
> promotion pass's output as compared to the previous dump for a testcase.

Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
it has no convenient way to sign-extend other than truncating to a signed
(non-promoted) type and then extending to the promoted type.  Thus
I think such pass should be accompanied with a new tree code,
SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
signed types which later optimizations may be confused about.

Not sure if that is the actual issue though.

Instead of "prmt" and "prmtn" I'd spell out "promote", and tree-type-prmtn
should be gimple-ssa-type-promote.c.  In the end all targets with
non-trivial PROMOTE_MODE should run the pass as a lowering step
so it should be enabled even at -O0 (and not disablable).

I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
end I'd like to run it before IVOPTs ... which means moving IVOPTs
later, after VRP which should be the pass optimizing away some of
the extensions).

In get_promoted_type I don't understand why you preserve qualifiers.
Also even for targets without PROMOTE_MODE it may be
beneficial to expose truncations required by expanding bit-precision
arithmetic earlier (that is, if !PROMOTE_MODE at least promote
to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
for that is for example

struct { long i : 33; long j : 33; } a;
return a.i + a.j;

where bitfields of type > int do not promote so you get a
33 bit add which we expand to a 64bit add plus a sign-extension
(and nothing optimizes that later usually).

insert_next_bb sounds like you want to use insert_on_edge
somewhere.

In assign_rhs_promotable_p you handle comparisons specially
but the ternary COND_EXPR and VEC_COND_EXPR can have
comparisons embedded in their first operand.  The comment
confuses me though - with proper sign- or zero-extensions inserted
you should be able to promote them anyway?

You seem to miss that a GIMPLE_ASSIGN can have 3 operands
in promote_cst_in_stmt as well.

In promote_assign_stmt_use I consider a default: case that ends
up doing nothing dangerous ;)  Please either use gcc_unreachable ()
or do the safe thing (fix = true;?).  You seem to be working with
a lattice of some kind - fixing up stmt uses the way you do - walking
over immediate uses - is not very cache friendly.  Why not use
a lattice for this - record promoted vars to be used for old SSA names
and walk over all stmts instead, replacing SSA uses on them?
Btw, you don't need to call update_stmt if you SET_USE and not
replace an SSA name with a constant.

You seem to "fix" with a single stmt but I don't see where you insert
zero- or sign-extensions for ssa_overflows_p cases?

Note that at least for SSA names with !SSA_NAME_VAR (thus
anonymous vars) you want to do a cheaper promotion by not
allocating a new SSA name but simply "fixing" its type by
assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
there is of course debug-info to consider and thus doing what you
do is better (but probably still will wreck debuginfo?).

GIMPLE_NOPs are not only used for parameters but also uninitialized
uses - for non-parameters you should simply adjust their type.  No
need to fixup their value.

The pass needs more comments.

It looks like you are not promoting all variables but only those
where compensation code (zero-/sign-extensions) is not necessary?

Thanks for trying to work on this.
Richard.

>  crc2 (short unsigned int crc, unsigned char data)
>  {
>    unsigned char carry;
>    unsigned char x16;
>    unsigned char i;
> -  unsigned char ivtmp_5;
> +  unsigned int _2;
> +  unsigned char _3;
> +  unsigned int _4;
> +  unsigned int _5;
>    unsigned char _9;
> -  unsigned char _10;
> -  unsigned char ivtmp_18;
> +  unsigned int _10;
> +  unsigned int _11;
> +  unsigned int _12;
> +  unsigned int _13;
> +  unsigned int _15;
> +  unsigned int _16;
> +  unsigned int _18;
> +  unsigned int _21;
> +  unsigned int _22;
> +  unsigned int _24;
> +  short unsigned int _26;
> +  unsigned char _27;
> +  unsigned int _28;
> +  unsigned int _29;
> +  unsigned int _30;
>
>    <bb 2>:
> +  _12 = (unsigned int) data_8(D);
> +  _2 = (unsigned int) crc_7(D);
>
>    <bb 3>:
> -  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
> -  # data_29 = PHI <data_12(5), data_8(D)(2)>
> -  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
> -  _9 = (unsigned char) crc_28;
> -  _10 = _9 ^ data_29;
> -  x16_11 = _10 & 1;
> -  data_12 = data_29 >> 1;
> -  if (x16_11 == 1)
> +  # _30 = PHI <_28(5), _2(2)>
> +  # _16 = PHI <_29(5), _12(2)>
> +  # _4 = PHI <_18(5), 8(2)>
> +  _9 = (unsigned char) _30;
> +  _5 = (unsigned int) _9;
> +  _22 = _5 ^ _16;
> +  _10 = _22 & 1;
> +  _29 = _16 >> 1;
> +  _27 = (unsigned char) _10;
> +  if (_27 == 1)
>      goto <bb 4>;
>    else
>      goto <bb 7>;
>
>    <bb 4>:
> -  crc_13 = crc_28 ^ 16386;
> -  crc_24 = crc_13 >> 1;
> -  crc_15 = crc_24 | 32768;
> +  _11 = _30 ^ 16386;
> +  _13 = _11 >> 1;
> +  _24 = _13 | 32768;
>
>    <bb 5>:
> -  # crc_2 = PHI <crc_15(4), crc_21(7)>
> -  ivtmp_5 = ivtmp_18 - 1;
> -  if (ivtmp_5 != 0)
> +  # _28 = PHI <_24(4), _15(7)>
> +  _18 = _4 - 1;
> +  _3 = (unsigned char) _18;
> +  if (_3 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 6>;
>
>    <bb 6>:
> -  # crc_19 = PHI <crc_2(5)>
> -  return crc_19;
> +  # _21 = PHI <_28(5)>
> +  _26 = (short unsigned int) _21;
> +  return _26;
>
>    <bb 7>:
> -  crc_21 = crc_28 >> 1;
> +  _15 = _30 >> 1;
>    goto <bb 5>;
>
>  }
>
>
> I experimented with a few simple test-cases and the results so far are
> mixed.  It also seems that subsequent passes are not always optimising
> as expected.  I haven't looked in detail yet but will, based on the
> feedback.
>
> Please also note that this pass still doesn't handle debug instructions
> and there are a couple of regression failures for ARM.
>
> Thanks,
> Kugan
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2014-11-10 12:56                 ` Richard Biener
@ 2015-05-01  4:41                   ` Kugan
  2015-05-08 12:48                     ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Kugan @ 2015-05-01  4:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]


>> Thanks for the comments. Here is a prototype patch that implements a
>> type promotion pass. This pass records SSA variables that will have
>> values in higher bits (than the original type precision) if promoted and
>> uses this information in inserting appropriate truncations and
>> extensions. This pass also classifies some of the stmts that set SSA
>> names as unsafe to promote. Here is a gimple diff of the type
>> promotion pass's output as compared to the previous dump for a testcase.
> 
> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
> it has no convenient way to sign-extend other than truncating to a signed
> (non-promoted) type and then extending to the promoted type.  Thus
> I think such pass should be accompanied with a new tree code,
> SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
> signed types which later optimizations may be confused about.
> 
> Not sure if that is the actual issue though.
> 
> Instead of "prmt" and "prmtn" I'd spell out "promote", and tree-type-prmtn
> should be gimple-ssa-type-promote.c.  In the end all targets with
> non-trivial PROMOTE_MODE should run the pass as a lowering step
> so it should be enabled even at -O0 (and not disablable).
> 
> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
> end I'd like to run it before IVOPTs ... which means moving IVOPTs
> later, after VRP which should be the pass optimizing away some of
> the extensions).
> 
> In get_promoted_type I don't understand why you preserve qualifiers.
> Also even for targets without PROMOTE_MODE it may be
> beneficial to expose truncations required by expanding bit-precision
> arithmetic earlier (that is, if !PROMOTE_MODE at least promote
> to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
> for that is for example
> 
> struct { long i : 33; long j : 33; } a;
> return a.i + a.j;
> 
> where bitfields of type > int do not promote so you get a
> 33 bit add which we expand to a 64bit add plus a sign-extension
> (and nothing optimizes that later usually).
> 
> insert_next_bb sounds like you want to use insert_on_edge
> somewhere.
> 
> In assign_rhs_promotable_p you handle comparisons specially
> but the ternary COND_EXPR and VEC_COND_EXPR can have
> comparisons embedded in their first operand.  The comment
> confuses me though - with proper sign- or zero-extensions inserted
> you should be able to promote them anyway?
> 
> You seem to miss that a GIMPLE_ASSIGN can have 3 operands
> in promote_cst_in_stmt as well.
> 
> In promote_assign_stmt_use I consider a default: case that ends
> up doing nothing dangerous ;)  Please either use gcc_unreachable ()
> or do the safe thing (fix = true;?).  You seem to be working with
> a lattice of some kind - fixing up stmt uses the way you do - walking
> over immediate uses - is not very cache friendly.  Why not use
> a lattice for this - record promoted vars to be used for old SSA names
> and walk over all stmts instead, replacing SSA uses on them?
> Btw, you don't need to call update_stmt if you SET_USE and not
> replace an SSA name with a constant.
> 
> You seem to "fix" with a single stmt but I don't see where you insert
> zero- or sign-extensions for ssa_overflows_p cases?
> 
> Note that at least for SSA names with !SSA_NAME_VAR (thus
> anonymous vars) you want to do a cheaper promotion by not
> allocating a new SSA name but simply "fixing" its type by
> assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
> there is of course debug-info to consider and thus doing what you
> do is better (but probably still will wreck debuginfo?).
> 
> GIMPLE_NOPs are not only used for parameters but also uninitialized
> uses - for non-parameters you should simply adjust their type.  No
> need to fixup their value.
> 
> The pass needs more comments.
> 
> It looks like you are not promoting all variables but only those
> where compensation code (zero-/sign-extensions) is not necessary?
> 

Thanks for the comments. Please find an updated version which addresses
your review comments above. I have yet to do full benchmarking, but have
tried a few small benchmarks; I will do proper benchmarking after getting
feedback on the implementation. I have, however, bootstrapped on
x86-64-none-linux and regression tested on x86-64, ARM and AArch64.

I am also not clear on how I should handle the gimple debug statements
when the intermediate temporary variable that maps to the original
variable is promoted.

Thanks,
Kugan


gcc/ChangeLog:

2015-05-01  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* Makefile.in: Add gimple-ssa-type-promote.o.
	* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
	* common.opt: New option -ftree-type-promote.
	* expr.c (expand_expr_real_2): Handle SEXT_EXPR.
	* fold-const.c (int_const_binop_1): Handle SEXT_EXPR.
	* gimple-ssa-type-promote.c: New file.
	* passes.def: Define new pass_type_promote.
	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
	* tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
	* tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
	* tree-pass.h (make_pass_type_promote): New.
	* tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
	(op_symbol_code): Likewise.
	* tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
	* tree.def: Define new SEXT_EXPR.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 44012 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..0318631 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1478,6 +1478,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index ca491a0..99a1d4c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4881,6 +4881,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 380848c..1dc9b1b 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2356,6 +2356,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index 530a944..f672a99 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9336,6 +9336,21 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode = smallest_mode_for_size (tree_to_shwi (treeop1),
+							    MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 3654fd6..f5f00af 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1007,6 +1007,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..a226e50c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,1311 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea is to promote operations in
+   such a way that we minimise the generation of subregs in RTL, which in
+   turn results in the removal of redundant zero/sign extensions.  This
+   pass runs prior to VRP and DOM so that they can optimise redundant
+   truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+   This pass executes as follows:
+
+   1.  Record gimple statements that may produce results that can
+   overflow (beyond the original type) and operations that always have
+   to be performed in the original type.  This is done in
+   process_all_stmts_for_unsafe_promotion.  Here, gimple statements that
+   set SSA_NAMEs are processed in a work_list to set
+   ssa_sets_higher_bits_bitmap (set_ssa_overflows) and
+   ssa_not_safe_bitmap.
+
+   2.  promote_all_stmts traverses the basic blocks in dominator order
+   and promotes, stmt by stmt, all the SSA_NAMEs that were selected as
+   safe in step 1 above.  The definition of an SSA_NAME is promoted first
+   and then all its uses are promoted according to the gimple stmt type.
+   If the SSA_NAME can overflow when promoted, the necessary fix-ups are
+   also performed to preserve the semantics of the program.
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_not_safe_bitmap;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;
+#ifdef PROMOTE_MODE
+  mode = TYPE_MODE (type);
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#else
+  mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)),
+				 MODE_INT);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Predicate that tells if promoting computation with ssa NAME is safe.  */
+static bool
+promotion_safe_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      gimple stmt = SSA_NAME_DEF_STMT (name);
+      unsigned int index = SSA_NAME_VERSION (name);
+
+      if (gimple_vdef (stmt) != NULL_TREE
+	  || gimple_vuse (stmt) != NULL_TREE)
+	return false;
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_not_safe_bitmap, index);
+    }
+  return false;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Return true if ssa NAME will be considered for promotion.  */
+static bool
+ssa_tobe_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return false;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_clear_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Set ssa NAME will have higher bits if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return false;
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Return the single successor (excluding EH edges) of basic block BB.
+   If there is more than one successor, return NULL.  */
+static basic_block
+get_single_successor_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type,
+		     bool promote_cond_expr = false, signop sign = SIGNED)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  if (promote_cond_expr)
+    {
+      /* Promote constant in COND_EXPR.  */
+      gcc_assert (gimple_assign_rhs_code (stmt) == COND_EXPR);
+      op = gimple_assign_rhs1 (stmt);
+      op0 = TREE_OPERAND (op, 0);
+      op1 = TREE_OPERAND (op, 1);
+
+      if (TREE_CODE (op0) == INTEGER_CST)
+	op0 = convert_int_cst (type, op0, sign);
+      if (TREE_CODE (op1) == INTEGER_CST)
+	op1 = convert_int_cst (type, op1, sign);
+
+      tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+      gimple_assign_set_rhs1 (stmt, new_op);
+      return;
+    }
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs2 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs3 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+    default:
+      break;
+    }
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+/* Promote use in an assignment.  Depending on the gimple_assign_rhs_code,
+   values in NEW_USE might have to be truncated to the type of USE.  */
+static void
+promote_assign_stmt_use (gimple stmt,
+			 tree use,
+			 imm_use_iterator *ui,
+			 tree new_use,
+			 tree copy_of_use,
+			 tree promoted_type)
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  tree rhs3 = gimple_assign_rhs3 (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  /* True if the use is promoted and a fix-up needs to be performed.  */
+  bool fix = false;
+
+  switch (code)
+    {
+    CASE_CONVERT:
+      if (ssa_tobe_promoted_p (lhs)
+	  && promotion_safe_p (lhs)
+	  && TREE_TYPE (new_use) == promoted_type)
+	{
+	  if (TYPE_PRECISION (TREE_TYPE (lhs)) > TYPE_PRECISION (TREE_TYPE (rhs1)))
+	    {
+	      tree temp = make_promoted_copy (lhs, NULL, promoted_type);
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (temp, new_use,
+				       TYPE_PRECISION (TREE_TYPE (use)));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, new_use);
+	      update_stmt (stmt);
+	    }
+	}
+      else
+	{
+	  if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (TREE_TYPE (rhs1)))
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, new_use);
+	      update_stmt (stmt);
+	    }
+	  else if (!copy_of_use)
+	    {
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						      new_use, NULL_TREE);
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, copy_of_use);
+	      update_stmt (stmt);
+	    }
+	}
+      return;
+
+    case COND_EXPR:
+      /* Promote COND_EXPR comparison operands.  */
+      if (use != rhs2
+	  && use != rhs3)
+	{
+	  tree temp;
+	  tree op0 = TREE_OPERAND (rhs1, 0);
+	  tree op1 = TREE_OPERAND (rhs1, 1);
+	  bool is_cst = false;
+
+	  if (TREE_CODE (op0) == INTEGER_CST
+	      || TREE_CODE (op1) == INTEGER_CST)
+	    is_cst = true;
+
+	  /* If this SSA is not promoted.  */
+	  if (use == new_use)
+	    {
+	      if (is_cst)
+		temp = new_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, promoted_type);
+		  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+							  new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		  promote_cst_in_stmt (stmt, promoted_type, true,
+				       TYPE_SIGN (TREE_TYPE (use)));
+		}
+	    }
+	  /* If this SSA is promoted.  */
+	  else
+	    {
+	      temp = make_promoted_copy (use, NULL, promoted_type);
+	      gimple copy_stmt
+		= zero_sign_extend_stmt (temp, new_use,
+					 TYPE_PRECISION (TREE_TYPE (use)));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      promote_cst_in_stmt (stmt, promoted_type, true,
+				   TYPE_SIGN (TREE_TYPE (use)));
+	    }
+
+	  if (op0 == use)
+	    op0 = temp;
+	  else
+	    op1 = temp;
+
+	  tree new_op = build2 (TREE_CODE (rhs1), promoted_type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	  update_stmt (stmt);
+	  return;
+	}
+      else
+	{
+	  promote_cst_in_stmt (stmt, promoted_type);
+	}
+      break;
+
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+      if (ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    default:
+      break;
+    }
+
+  if (fix && promotion_safe_p (lhs)
+      && TREE_TYPE (new_use) == promoted_type)
+    {
+      /* Promoted with values truncated.  */
+      tree temp = make_promoted_copy (use, NULL, promoted_type);
+      gimple copy_stmt = zero_sign_extend_stmt (temp, new_use,
+						TYPE_PRECISION (TREE_TYPE (use)));
+      gsi = gsi_for_stmt (stmt);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      return;
+    }
+  else if (!(TREE_CODE_CLASS (code) == tcc_comparison
+	     || TREE_CODE_CLASS (code) == tcc_reference
+	     || code == VIEW_CONVERT_EXPR
+	     || code == COMPLEX_EXPR
+	     || code == ASM_EXPR
+	     || code == OBJ_TYPE_REF
+	     || gimple_vdef (stmt)
+	     || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	   && (promotion_safe_p (lhs)
+	       || (TREE_CODE_CLASS (code) == tcc_comparison)))
+    {
+      /* Statement promoted.  */
+      if ((TYPE_PRECISION (TREE_TYPE (use))
+	   < TYPE_PRECISION (promoted_type))
+	  && (code != COND_EXPR))
+	promote_cst_in_stmt (stmt, promoted_type);
+
+      if (promoted_type == TREE_TYPE (new_use))
+	{
+	  /* Operand also promoted.  */
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, new_use);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  /* Operand not promoted.  */
+	  tree temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  else
+    {
+      /* Statement not promoted.  */
+      if (copy_of_use)
+	{
+	  /* Operand also not promoted.  */
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, copy_of_use);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  /* Operand promoted.  */
+	  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+				   new_use, NULL_TREE);
+
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+}
+
+/* Promote ssa USE in phi STMT to PROMOTED_TYPE.  */
+static void
+promote_phi_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  tree lhs = PHI_RESULT (stmt);
+  tree type;
+  tree temp;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  if (ssa_tobe_promoted_p (lhs)
+      && promotion_safe_p (lhs))
+    type = promoted_type;
+  else
+    type = TREE_TYPE (lhs);
+
+  /* Check if we need a convert stmt to get the required type.  */
+  if (type == TREE_TYPE (new_use))
+    temp = new_use;
+  else if (copy_of_use && (type == TREE_TYPE (copy_of_use)))
+    temp = copy_of_use;
+  else
+    {
+      temp = make_promoted_copy (use, NULL, type);
+      gimple copy_stmt
+	= gimple_build_assign (temp, CONVERT_EXPR,
+			       new_use, NULL_TREE);
+
+      if (gimple_code (SSA_NAME_DEF_STMT (new_use)) == GIMPLE_NOP)
+	{
+	  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+	  bb = get_single_successor_bb (bb);
+	  gcc_assert (bb);
+	  gsi = gsi_after_labels (bb);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else if (gimple_code (SSA_NAME_DEF_STMT (new_use))
+	       != GIMPLE_PHI)
+	{
+	  gsi = gsi_for_stmt (SSA_NAME_DEF_STMT (new_use));
+	  if (lookup_stmt_eh_lp (SSA_NAME_DEF_STMT (new_use)) > 0)
+	    insert_stmt_on_edge (SSA_NAME_DEF_STMT (new_use), copy_stmt);
+	  else
+	    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else
+	{
+	  gsi = gsi_after_labels
+	    (gimple_bb (SSA_NAME_DEF_STMT (new_use)));
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+    }
+
+  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+    SET_USE (op, temp);
+  update_stmt (stmt);
+}
+
+/* Promote ssa USE in GIMPLE_COND STMT to PROMOTED_TYPE.  */
+static void
+promote_cond_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree promoted_type)
+{
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  bool is_cst = false;
+  tree lhs = gimple_cond_lhs (stmt);
+  tree rhs = gimple_cond_rhs (stmt);
+
+  if (TREE_CODE (lhs) == INTEGER_CST
+      || TREE_CODE (rhs) == INTEGER_CST)
+    is_cst = true;
+
+  if (TREE_TYPE (new_use) == promoted_type)
+    {
+      tree temp = make_promoted_copy (use, NULL, promoted_type);
+      gimple copy_stmt = zero_sign_extend_stmt (temp, new_use,
+						TYPE_PRECISION (TREE_TYPE (use)));
+
+      gsi = gsi_for_stmt (stmt);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      promote_cst_in_stmt (stmt, promoted_type, false,
+			   TYPE_SIGN (TREE_TYPE (use)));
+    }
+  else
+    {
+      /* Comparison will happen in the promoted type.  */
+      tree temp;
+      /* TREE_TYPE (new_use) != promoted_type here (that case was handled
+	 above), so only the constant and conversion cases remain.  */
+      if (is_cst)
+	temp = new_use;
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If DEF is replaced and has to
+   be released, set RELEASE_DEF.  Also return COPY_OF_DEF with the original
+   type for any use statement that needs truncation.  */
+static tree
+promote_definition (tree def,
+		    tree promoted_type,
+		    tree *copy_of_def,
+		    bool *release_def)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  tree new_def;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  gphi *phi;
+
+  gcc_assert (release_def);
+  *release_def = false;
+  if (SSA_NAME_VAR (def) == NULL
+      && gimple_code (def_stmt) == GIMPLE_NOP)
+    {
+      TREE_TYPE (def) = promoted_type;
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      new_def = def;
+      *copy_of_def = NULL;
+      return new_def;
+    }
+
+  switch (gimple_code (def_stmt))
+    {
+
+    case GIMPLE_PHI:
+      phi = as_a <gphi *> (def_stmt);
+      new_def = make_promoted_copy (def, phi, promoted_type);
+      *copy_of_def = NULL;
+      gimple_phi_set_result (phi, new_def);
+      *release_def = true;
+      update_stmt (def_stmt);
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      break;
+
+    case GIMPLE_NOP:
+      /* Create a promoted type copy of parameters.  */
+      bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+      bb = get_single_successor_bb (bb);
+      gcc_assert (bb);
+      gsi = gsi_after_labels (bb);
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign (new_def, CONVERT_EXPR,
+				       def, NULL_TREE);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      *copy_of_def = def;
+      break;
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  tree rhs1 = gimple_assign_rhs1 (def_stmt);
+	  if (CONVERT_EXPR_CODE_P (code)
+	      && TREE_TYPE (rhs1) == promoted_type)
+	    {
+	      new_def = make_promoted_copy (def, NULL, promoted_type);
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (new_def, rhs1,
+				       TYPE_PRECISION (TREE_TYPE (def)));
+	      gsi = gsi_for_stmt (def_stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      gsi = gsi_for_stmt (def_stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  else
+	    {
+	      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+	      gimple_assign_set_lhs (def_stmt, new_def);
+	      update_stmt (def_stmt);
+	      if (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+		  != tcc_comparison)
+		promote_cst_in_stmt (def_stmt, promoted_type);
+	    }
+	  *release_def = true;
+	  *copy_of_def = NULL;
+	  break;
+	}
+
+    default:
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign (new_def, CONVERT_EXPR,
+						def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+      update_stmt (copy_stmt);
+      *copy_of_def = def;
+      break;
+    }
+
+  return new_def;
+}
+
+
+/* Replace all uses of USE with NEW_USE (or with COPY_OF_USE where the
+   original type is required), promoting the use statements.  */
+static unsigned int
+promote_all_uses (tree use, tree new_use, tree copy_of_use,
+		  tree promoted_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the uses with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      if (stmt == SSA_NAME_DEF_STMT (new_use))
+	continue;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  promote_assign_stmt_use (stmt, use, &ui, new_use,
+				   copy_of_use, promoted_type);
+	  break;
+
+	case GIMPLE_PHI:
+	  promote_phi_stmt_use (stmt, use, &ui, new_use,
+				copy_of_use, promoted_type);
+	  break;
+
+	case GIMPLE_COND:
+	  promote_cond_stmt_use (stmt, use, &ui, new_use,
+				 promoted_type);
+	  break;
+
+	case GIMPLE_DEBUG:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use)
+	      && gimple_debug_bind_p (stmt))
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  break;
+
+	default:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      tree temp;
+	      if (copy_of_use)
+		temp = copy_of_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+							  new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  break;
+	}
+    }
+
+  return 0;
+}
+
+/* Promote definition of NAME and all its uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type, new_name, copy_of_name;
+  bool release_def = false;
+
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+
+  if (promotion_safe_p (name))
+    {
+      new_name = promote_definition (name, type, &copy_of_name,
+				     &release_def);
+      promote_all_uses (name, new_name, copy_of_name, type);
+    }
+  else
+    promote_all_uses (name, name, name, type);
+  set_ssa_promoted (name);
+
+  if (release_def)
+    release_ssa_name (name);
+  return 0;
+}
+
+/* Mark the candidates for promotion.  */
+static void
+set_ssa_to_be_promoted_flag (gimple stmt)
+{
+  ssa_op_iter i;
+  tree def;
+  use_operand_p op;
+
+  switch (gimple_code (stmt))
+    {
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  def = PHI_RESULT (phi);
+	  bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+	    {
+	      def = USE_FROM_PTR (op);
+	      if (TREE_CODE (def) == SSA_NAME)
+		bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	    }
+	  break;
+	}
+
+    default:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE | SSA_OP_DEF)
+	{
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+    }
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+      /* Conversion expressions that may need to be preserved.  */
+    CASE_CONVERT:
+      /* If the precision of the LHS is greater than that of the RHS, it is
+	 not safe to convert this with a ZEXT/SEXT stmt when the signedness
+	 also changes.  */
+      if ((TYPE_PRECISION (TREE_TYPE (lhs))
+	   > TYPE_PRECISION (TREE_TYPE (rhs1)))
+	  && (TYPE_UNSIGNED (TREE_TYPE (lhs))
+	      != TYPE_UNSIGNED (TREE_TYPE (rhs1))))
+	bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      else if ((TYPE_PRECISION (TREE_TYPE (lhs))
+		<= TYPE_PRECISION (TREE_TYPE (rhs1)))
+	       && !ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case NE_EXPR:
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+    case EQ_EXPR:
+    case UNLT_EXPR:
+    case UNLE_EXPR:
+    case UNGT_EXPR:
+    case UNGE_EXPR:
+    case UNEQ_EXPR:
+    case LTGT_EXPR:
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+      break;
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions which may produce results that have higher bits set if
+	 computed in the promoted type (i.e. the result may overflow).  */
+    case MULT_HIGHPART_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+    case MULT_EXPR:
+    case BIT_XOR_EXPR:
+    case BIT_NOT_EXPR:
+    case WIDEN_MULT_EXPR:
+    case WIDEN_MULT_PLUS_EXPR:
+    case WIDEN_MULT_MINUS_EXPR:
+    case WIDEN_SUM_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_AND_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions for which operation has to be performed in original
+	 types if promoted operands may have higher bits.  */
+    case ABS_EXPR:
+    case NEGATE_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE)
+	{
+	  if (ssa_overflows_p (def))
+	    bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+	}
+      break;
+
+    case COND_EXPR:
+	{
+	  tree rhs2 = gimple_assign_rhs2 (stmt);
+	  tree rhs3 = gimple_assign_rhs3 (stmt);
+
+	  if (ssa_overflows_p (rhs2))
+	    {
+	      set_ssa_overflows (lhs);
+	      changed = true;
+	    }
+	  else if (ssa_overflows_p (rhs3))
+	    {
+	      set_ssa_overflows (lhs);
+	      changed = true;
+	    }
+	}
+      break;
+
+      /* Expressions that have to be done in the original types.  */
+    case LROTATE_EXPR:
+    case RROTATE_EXPR:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+      /* To be safe, all others have to be done in the original types.  */
+    default:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+    }
+  return changed;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+/* Seed a work list with all statements and iterate to a fixed point,
+   recording which SSA names may have their higher bits set if promoted
+   and which are unsafe to promote.  */
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple phi = gsi_stmt (gsi);
+
+	  set_ssa_to_be_promoted_flag (phi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+
+	  set_ssa_to_be_promoted_flag (stmt);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+/* Dominator walker that promotes the statements of each basic block.  */
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_not_safe_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_not_safe_bitmap);
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_not_safe_bitmap);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index ffa63b5..846ec1b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -271,6 +271,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 711bbed..f15e931 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -268,6 +268,7 @@ DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
 DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 4929629..a766ac6 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3805,6 +3805,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 42ddb9f..8c20089 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3913,6 +3913,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 172bd82..533e4a6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -428,6 +428,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index d7c049f..0045962 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1812,6 +1812,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3432,6 +3440,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e7ab23c..581b1fe 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2408,6 +2408,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2984,6 +2985,49 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi ((HOST_WIDE_INT) 1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi ((((HOST_WIDE_INT) 1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  unsigned HOST_WIDE_INT hwi_may_be_nonzero = may_be_nonzero.to_uhwi ();
+
+	  if (hwi_may_be_nonzero & ((unsigned HOST_WIDE_INT) 1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit can be one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & type_max;
+	    }
+	  else
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
diff --git a/gcc/tree.def b/gcc/tree.def
index b4b4164..f58b073 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -747,6 +747,9 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  Operand 0 is the value to be sign-extended;
+   operand 1 is an INTEGER_CST giving the bit position from which to
+   sign-extend (bits at and above it become copies of that sign bit).  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-05-01  4:41                   ` Kugan
@ 2015-05-08 12:48                     ` Richard Biener
  2015-06-01 23:20                       ` Kugan
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2015-05-08 12:48 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Fri, May 1, 2015 at 6:41 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>>> Thanks for the comments. Here is a prototype patch that implements a
>>> type promotion pass. This pass records SSA variables that will have
>>> values in higher bits (than the original type precision) if promoted and
>>> uses this information in inserting appropriate truncations and
>>> extensions. This pass also classifies some of the stmts that sets ssa's
>>> to be unsafe to promote. Here is a gimple difference for the type
>>> promotion as compared to previous dump for a testcase.
>>
>> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
>> it has no convenient way to sign-extend other than truncating to a signed
>> (non-promoted) type and then extending to the promoted type.  Thus
>> I think such pass should be accompanied with a new tree code,
>> SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
>> signed types which later optimizations may be confused about.
>>
>> Not sure if that is the actual issue though.
>>
>> Instead of "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn
>> should be gimple-ssa-type-promote.c.  In the end all targets with
>> non-trivial PROMOTE_MODE should run the pass as a lowering step
>> so it should be enabled even at -O0 (and not disablable).
>>
>> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
>> end I'd like to run it before IVOPTs ... which means moving IVOPTs
>> later, after VRP which should be the pass optimizing away some of
>> the extensions).
>>
>> In get_promoted_type I don't understand why you preserve qualifiers.
>> Also even for targets without PROMOTE_MODE it may be
>> beneficial to expose truncations required by expanding bit-precision
>> arithmetic earlier (that is, if !PROMOTE_MODE at least promote
>> to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
>> for that is for example
>>
>> struct { long i : 33; long j : 33; } a;
>> return a.i + a.j;
>>
>> where bitfields of type > int do not promote so you get a
>> 33 bit add which we expand to a 64bit add plus a sign-extension
>> (and nothing optimizes that later usually).
>>
>> insert_next_bb sounds like you want to use insert_on_edge
>> somewhere.
>>
>> in assign_rhs_promotable_p you handle comparisons special
>> but the ternary COND_EXPR and VEC_COND_EXPR can have
>> comparisons embedded in their first operand.  The comment
>> confuses me though - with proper sign- or zero-extensions inserted
>> you should be able to promote them anyway?
>>
>> You seem to miss that a GIMPLE_ASSIGN can have 3 operands
>> in promote_cst_in_stmt as well.
>>
>> In promote_assign_stmt_use I consider a default: case that ends
>> up doing nothing dangerous ;)  Please either use gcc_unreachable ()
>> or do the safe thing (fix = true;?).  You seem to be working with
>> a lattice of some kind - fixing up stmt uses the way you do - walking
>> over immediate uses - is not very cache friendly.  Why not use
>> a lattice for this - record promoted vars to be used for old SSA names
>> and walk over all stmts instead, replacing SSA uses on them?
>> Btw, you don't need to call update_stmt if you SET_USE and not
>> replace an SSA name with a constant.
>>
>> You seem to "fix" with a single stmt but I don't see where you insert
>> zero- or sign-extensions for ssa_overflows_p cases?
>>
>> Note that at least for SSA names with !SSA_NAME_VAR (thus
>> anonymous vars) you want to do a cheaper promotion by not
>> allocating a new SSA name but simply "fixing" its type by
>> assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
>> there is of course debug-info to consider and thus doing what you
>> do is better (but probably still will wreck debuginfo?).
>>
>> GIMPLE_NOPs are not only used for parameters but also uninitialized
>> uses - for non-parameters you should simply adjust their type.  No
>> need to fixup their value.
>>
>> The pass needs more comments.
>>
>> It looks like you are not promoting all variables but only those
>> where compensation code (zero-/sign-extensions) is not necessary?
>>
>
> Thanks for the comments. Please find an updated version of this which
> addresses your review comments above. I am still to do full benchmarking
> on this, but tried with few small benchmarks. I will do proper
> benchmarking after getting feedback on the implementation. I have
> however bootstrapped on x86-64-none-linux and regression tested on
> x86-64, ARM and AArch64.
>
> I am also not clear with how I should handle the gimple debug statements
> when the intermediate temporary variable that maps to the original
> variable is promoted.

A few notes.

+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)

this needs an extended comment documenting the operands.

+    case SEXT_EXPR:
+       {
+         rtx op0 = expand_normal (treeop0);
+         rtx temp;
+         if (!target)
+           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+         machine_mode inner_mode
+           = smallest_mode_for_size (tree_to_shwi (treeop1), MODE_INT);
+         temp = convert_modes (inner_mode,
+                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+         convert_move (target, temp, 0);
+         return target;
+       }

I think that if you allow arbitrary treeop1 you have to properly implement
fallbacks for the case where direct expansion to
(sign_extend:<target-mode> (subreg:<inner-mode> reg)) does not work
which is the intended operation modeled by SEXT_EXPR.

Direct expansion to that RTL would also be best I suppose.

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;

you should check for INTEGRAL_TYPE_P before looking at TYPE_PRECISION.

+#ifdef PROMOTE_MODE
+  mode = TYPE_MODE (type);
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#else
+  mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)),
+                                MODE_INT);
+#endif

That smallest_mode_for_size should be a no-op.  Just hoist out
mode = TYPE_MODE (type).

Now before I get into too much details at this point.

You compute which promotions are unsafe, like sources/sinks of memory
(I think you miss call arguments/return values and also asm operands here).
But instead of simply marking those SSA names as not to be promoted
I'd instead split their life-ranges, thus replace

  _1 = mem;

with

  _2 = mem;
  _1 = [zs]ext (_2, ...);

and promote _1 anyway.  So in the first phase I'd do that (and obviously
note that _2 isn't to be promoted in the specific example).

For promotions that apply I wouldn't bother allocating new SSA names
but just "fix" their types (assign to their TREE_TYPE).  This also means
they have to become anonymous and if they didn't have a !DECL_IGNORED_P
decl before then a debug stmt should be inserted at the point of the
promotions.  So

  bar_3 = _1 + _2;

when promoted would become

 _4 = _1 + _2;
 _3 = sext <_4, ...>;
 # DEBUG bar = (orig-type) _4;  // or _3?

so you'd basically always promote defs (you have a lot of stmt/operand
walking code I didn't look too closely at - but it looks like too much) and
the uses get promoted automagically (because you promote the original
SSA name). Promotion of constants has to remain, of course.
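
To see why promoting at defs is safe for most arithmetic, here is a plain-C
sketch (a hypothetical illustration, not GCC code): for operations whose low
bits depend only on the low bits of the operands, doing the arithmetic in the
wider type with a single truncation at the def agrees with truncating after
every operation:

```c
#include <assert.h>
#include <stdint.h>

/* Truncate after every operation, as unpromoted narrow arithmetic does.  */
static uint8_t
add3_narrow (uint8_t a, uint8_t b, uint8_t c)
{
  return (uint8_t) ((uint8_t) (a + b) + c);
}

/* Do the arithmetic in the promoted (wider) type and truncate once at the
   def; for +, -, *, &, |, ^ and left shift the low bits agree.  */
static uint8_t
add3_promoted (uint8_t a, uint8_t b, uint8_t c)
{
  unsigned int wide = (unsigned int) a + b + c;
  return (uint8_t) (wide & 0xff);
}
```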

I wouldn't promote pointers at all (are targets doing that?)

There are various existing helpers for stuff you re-invent.  I've just
spotted get_single_successor_bb for which there is single_succ_p ()
plus single_succ ().

Generally most of your stmt walking code could either use walk_stmt ()
or the low-level gimple_op (...) interface and a loop over all gimple_num_ops ()
operands.

You seem to mix optimization and lowering - you are extending
at uses, not at defs, for example.  IMHO that complicates the code.
Or do you think that a later optimization pass cannot recover from some
obviously bad decisions here?  If so then I suggest implementing sth
less ad-hoc by using an SSA lattice to track this and propagate the
info properly, still emitting the truncations at the defs where necessary.

Thanks,
Richard.

> Thanks,
> Kugan
>
>
> gcc/ChangeLog:
>
> 2015-05-01  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * Makefile.in: Add gimple-ssa-type-promote.o.
>         * cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
>         * common.opt: New option -ftree-type-promote.
>         * expr.c (expand_expr_real_2): Handle SEXT_EXPR.
>         * fold-const.c (int_const_binop_1): Handle SEXT_EXPR.
>         * gimple-ssa-type-promote.c: New file.
>         * passes.def: Define new pass_type_promote.
>         * timevar.def: Define new TV_TREE_TYPE_PROMOTE.
>         * tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
>         * tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
>         * tree-pass.h (make_pass_type_promote): New.
>         * tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
>         (op_symbol_code): Likewise.
>         * tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
>         * tree.def: Define new SEXT_EXPR.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-05-08 12:48                     ` Richard Biener
@ 2015-06-01 23:20                       ` Kugan
  2015-06-19  2:55                         ` Kugan
  2015-07-28 11:05                         ` Richard Biener
  0 siblings, 2 replies; 29+ messages in thread
From: Kugan @ 2015-06-01 23:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2712 bytes --]



On 08/05/15 22:48, Richard Biener wrote:
> You compute which promotions are unsafe, like sources/sinks of memory
> (I think you miss call arguments/return values and also asm operands here).
> But instead of simply marking those SSA names as not to be promoted
> I'd instead split their life-ranges, thus replace
> 
>   _1 = mem;
> 
> with
> 
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
> 
> and promote _1 anyway.  So in the first phase I'd do that (and obviously
> note that _2 isn't to be promoted in the specific example).
> 
> For promotions that apply I wouldn't bother allocating new SSA names
> but just "fix" their types (assign to their TREE_TYPE).  This also means
> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
> decl before then a debug stmt should be inserted at the point of the
> promotions.  So
> 
>   bar_3 = _1 + _2;
> 
> when promoted would become
> 
>  _4 = _1 + _2;
>  _3 = sext <_4, ...>;
>  # DEBUG bar = (orig-type) _4;  // or _3?
> 
> so you'd basically always promote defs (you have a lot of stmt/operand
> walking code I didn't look too closely at - but it looks like too much) and
> the uses get promoted automagically (because you promote the original
> SSA name). Promotion of constants has to remain, of course.


Thanks Richard. I experimented with this idea to understand it better.
Please see the attached prototype (I am still working on your other
comments, which are not addressed here). Please have a look and let me
know if this is along the lines of what you would expect. I have a few
questions though.

1. In the example above:
  char _1;
  _1 = mem;

when changing with

  char _2;
  int _1;
  _2 = mem;
  _1 = [zs]ext (_2, ...);

for the [zs]ext operation we now use BIT_AND_EXPR and SEXT_EXPR, which
(as of now) require that the LHS and RHS are of the same type. Are you
suggesting that we should have true ZEXT_EXPR and SEXT_EXPR tree codes
which can do the above in gimple? I am now using CONVERT_EXPR, which is
the source of many optimization issues.
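
For reference, the semantics I am assuming for the two forms (zero
extension as BIT_AND_EXPR with a mask, sign extension as
SEXT_EXPR <val, width>) can be modeled in plain C as follows; this is a
hypothetical sketch, not the GIMPLE implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low WIDTH bits of VAL, i.e. what SEXT_EXPR <val, width>
   is intended to compute.  Hypothetical model, not GCC code.  */
static int32_t
sext (int32_t val, int width)
{
  int32_t sign_bit = (int32_t) 1 << (width - 1);
  val &= ((uint32_t) 1 << width) - 1;   /* keep only the low WIDTH bits */
  return (val ^ sign_bit) - sign_bit;   /* propagate the sign bit upward */
}

/* Zero-extend the low WIDTH bits of VAL, as modeled with BIT_AND_EXPR.  */
static uint32_t
zext (uint32_t val, int width)
{
  return val & (((uint32_t) 1 << width) - 1);
}
```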

2. For inline asm (a reduced test case that might not make much sense as a
stand-alone test case, but I ran into similar cases with valid programs):

;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
symbol_order=0)

fn1 (short int p1)
{
  <bb 2>:
  __asm__("" : "=r" p1_2 : "0" p1_1(D));
  return;

}


I am generating something like the following, which ICEs. What is the
expected output?

;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
symbol_order=0)

fn1 (short int p1)
{
  int _1;
  int _2;
  short int _5;

  <bb 2>:
  _1 = (int) p1_4(D);
  _5 = (short int) _1;
  __asm__("" : "=r" p1_6 : "0" _5);
  _2 = (int) p1_6;
  return;

}

Thanks a lot for your time,
Kugan

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 29184 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b59b5d9..16b2df1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1491,6 +1491,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b190f91..bbf70ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4809,6 +4809,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 6e15af2..56231b5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2368,6 +2368,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index 5a931dc..0badfa5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9249,6 +9249,21 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode = smallest_mode_for_size (tree_to_shwi (treeop1),
+							    MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 55196b5..3c42449 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1007,6 +1007,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..6ccda0c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,719 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass will run prior to VRP and DOM so that they
+   will be able to optimise redundant truncations and extensions.  This
+   is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1)
+    return type;
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return the single successor (excluding EH edges) of basic block BB.  If
+   there is more than one such successor, return NULL.  */
+static basic_block
+get_single_successor_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Convert constant CST to TYPE, extending the value from the precision
+   of CST's original type according to SIGN.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type,
+		     bool promote_cond_expr = false, signop sign = SIGNED)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  if (promote_cond_expr)
+    {
+      /* Promote constant in COND_EXPR.  */
+      gcc_assert (gimple_assign_rhs_code (stmt) == COND_EXPR);
+      op = gimple_assign_rhs1 (stmt);
+      op0 = TREE_OPERAND (op, 0);
+      op1 = TREE_OPERAND (op, 1);
+
+      if (TREE_CODE (op0) == INTEGER_CST)
+	op0 = convert_int_cst (type, op0, sign);
+      if (TREE_CODE (op1) == INTEGER_CST)
+	op1 = convert_int_cst (type, op1, sign);
+
+      tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+      gimple_assign_set_rhs1 (stmt, new_op);
+      return;
+    }
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs2 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs3 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      break;
+    }
+}
+
+/* Create a new SSA name with TYPE as a promoted copy of VAR, defined by
+   DEF_STMT.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on the signedness of NEW_VAR's type) the
+   low WIDTH bits of VAR and assign the result to NEW_VAR.  Return the
+   gimple statement that performs the zero/sign extension.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+/* Make TO the default definition that FROM used to be, transferring
+   FROM's variable and defining statement to TO.  */
+static void
+duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+
+/* Promote definition DEF to PROMOTED_TYPE.  Where the defining statement
+   itself cannot be promoted, insert a copy with the original type after
+   it instead.  */
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  gphi *phi;
+  tree new_def;
+  bool do_not_promote = false;
+
+  if (gimple_vuse (def_stmt) != NULL_TREE
+      || gimple_vdef (def_stmt) != NULL_TREE)
+    {
+      do_not_promote = true;
+    }
+  else
+    {
+      switch (gimple_code (def_stmt))
+	{
+	case GIMPLE_PHI:
+	    {
+	      phi = as_a <gphi *> (def_stmt);
+	      TREE_TYPE (def) = promoted_type;
+	      gimple_phi_set_result (phi, def);
+	      SET_PHI_RESULT (phi, def);
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	    {
+	      gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	      for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+		{
+		  tree link = gimple_asm_output_op (asm_stmt, i);
+		  tree op = TREE_VALUE (link);
+		  if (op == def)
+		    {
+		      new_def = copy_ssa_name (def);
+		      duplicate_default_ssa (new_def, def);
+		      TREE_VALUE (link) = new_def;
+		      gimple_asm_set_output_op (asm_stmt, i, link);
+
+		      TREE_TYPE (def) = promoted_type;
+		      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						       new_def, NULL_TREE);
+		      SSA_NAME_DEF_STMT (def) = copy_stmt;
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		      break;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_NOP:
+	    {
+	      if (SSA_NAME_VAR (def) == NULL)
+		{
+		  TREE_TYPE (def) = promoted_type;
+		}
+	      else
+		{
+		  /* Create a promoted type copy of parameters.  */
+		  bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+		  bb = get_single_successor_bb (bb);
+		  gcc_assert (bb);
+		  gsi = gsi_after_labels (bb);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  SSA_NAME_DEF_STMT (def) = copy_stmt;
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	      if (code == ARRAY_REF
+		  || code == LROTATE_EXPR
+		  || code == RROTATE_EXPR
+		  || code == VIEW_CONVERT_EXPR
+		  || code == BIT_FIELD_REF
+		  || code == REALPART_EXPR
+		  || code == IMAGPART_EXPR
+		  || code == REDUC_MAX_EXPR
+		  || code == REDUC_PLUS_EXPR
+		  || code == REDUC_MIN_EXPR)
+		{
+		  do_not_promote = true;
+		  break;
+		}
+
+	      if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (def_stmt);
+		  if ((TYPE_PRECISION (TREE_TYPE (rhs)) == TYPE_PRECISION (promoted_type))
+		      && (TYPE_UNSIGNED (TREE_TYPE (rhs)) == TYPE_UNSIGNED (promoted_type)))
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      gimple copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (original_type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      else
+		{
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  promote_cst_in_stmt (def_stmt, promoted_type);
+		  TREE_TYPE (def) = promoted_type;
+		  new_def = copy_ssa_name (def);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (original_type));
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	      break;
+	    }
+
+	default:
+	  do_not_promote = true;
+	  break;
+	}
+    }
+
+  if (do_not_promote)
+    {
+      new_def = copy_ssa_name (def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+}
+
+
+/* Promote all uses of USE, which was promoted from OLD_TYPE to
+   PROMOTED_TYPE, inserting truncations where the promoted value cannot
+   be used directly.  */
+static unsigned int
+promote_all_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the use with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      if (gimple_vuse (stmt) != NULL_TREE
+	  || gimple_vdef (stmt) != NULL_TREE)
+	do_not_promote = true;
+      else
+	{
+	  switch (gimple_code (stmt))
+	    {
+
+	    case GIMPLE_DEBUG:
+		{
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_remove (&gsi, true);
+		}
+	      break;
+	    case GIMPLE_ASM:
+		{
+		  gasm *asm_stmt = as_a <gasm *> (stmt);
+		  for (unsigned int i = 0; i < gimple_asm_ninputs (asm_stmt); ++i)
+		    {
+		      tree link = gimple_asm_input_op (asm_stmt, i);
+		      tree op = TREE_VALUE (link);
+		      if (op == use)
+			{
+			  tree temp = make_promoted_copy (use, NULL, old_type);
+			  gsi = gsi_for_stmt (stmt);
+			  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+								  use, NULL_TREE);
+			  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+			  TREE_VALUE (link) = temp;
+			  gimple_asm_set_input_op (asm_stmt, i, link);
+			  break;
+			}
+		    }
+		}
+	      break;
+
+	    case GIMPLE_ASSIGN:
+		{
+		  enum tree_code code = gimple_assign_rhs_code (stmt);
+		  tree lhs = gimple_assign_lhs (stmt);
+		  if (code == VIEW_CONVERT_EXPR
+		      || code == LROTATE_EXPR
+		      || code == RROTATE_EXPR
+		      || code == CONSTRUCTOR
+		      || code == BIT_FIELD_REF
+		      || code == COMPLEX_EXPR
+		      || code == ASM_EXPR
+		      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+		    {
+		      do_not_promote = true;
+		    }
+
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    {
+		      if (TREE_TYPE (use) == promoted_type)
+			promote_cst_in_stmt (stmt, promoted_type);
+		    }
+
+		  if (CONVERT_EXPR_CODE_P (code))
+		    {
+		      tree lhs = gimple_assign_lhs (stmt);
+		      if ((TYPE_PRECISION (TREE_TYPE (lhs)) == TYPE_PRECISION (promoted_type))
+			  && (TYPE_UNSIGNED (TREE_TYPE (lhs)) == TYPE_UNSIGNED (promoted_type)))
+			{
+			  gimple copy_stmt =
+			    zero_sign_extend_stmt (lhs, use,
+						   TYPE_PRECISION (old_type));
+			  gsi = gsi_for_stmt (stmt);
+			  gsi_replace (&gsi, copy_stmt, false);
+			}
+		      else if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (old_type))
+			{
+			  /* do nothing */
+			}
+		      else
+			do_not_promote = true;
+		    }
+		}
+	      break;
+
+	    case GIMPLE_COND:
+		{
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  promote_cst_in_stmt (stmt, promoted_type, false,
+				       TYPE_SIGN (TREE_TYPE (use)));
+		  update_stmt (stmt);
+		}
+	      break;
+	    default:
+	      break;
+	    }
+	}
+
+      if (do_not_promote)
+	{
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  TREE_TYPE (temp) = old_type;
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+
+  return 0;
+}
+
+/* Promote definition of NAME and all its uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+  tree old_type = TREE_TYPE (name);
+  promote_definition (name, type);
+  promote_all_uses (name, type, old_type);
+  set_ssa_promoted (name);
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..dfa8a5b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -271,6 +271,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index cf8f37d..57afa8d 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -269,6 +269,7 @@ DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 99b27c7..3332626 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3805,6 +3805,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
@@ -5235,6 +5247,7 @@ gimple_verify_flow_info (void)
 
 	  if (found_ctrl_stmt)
 	    {
+	      dump_bb (stderr, gimple_bb (stmt), 0, 0);
 	      error ("control flow in the middle of basic block %d",
 		     bb->index);
 	      err = 1;
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 71d75d9..e19ac3d 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3912,6 +3912,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 172bd82..533e4a6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -428,6 +428,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index cf875c8..02bc101 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1812,6 +1812,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3432,6 +3440,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 9c39f65..05eef17 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -209,7 +209,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min().get_precision()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 22587d0..9ceae8d 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2413,6 +2413,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2989,6 +2990,49 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT _may_be_nonzero = may_be_nonzero.to_uhwi ();
+
+	  if (_may_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit can be one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & type_max;
+	    }
+	  else
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..57a1981 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,9 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-06-01 23:20                       ` Kugan
@ 2015-06-19  2:55                         ` Kugan
  2015-07-28 11:05                         ` Richard Biener
  1 sibling, 0 replies; 29+ messages in thread
From: Kugan @ 2015-06-19  2:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

ping?

Thanks,
Kugan

On 02/06/15 09:14, Kugan wrote:
> 
> 
> On 08/05/15 22:48, Richard Biener wrote:
>> You compute which promotions are unsafe, like sources/sinks of memory
>> (I think you miss call arguments/return values and also asm operands here).
>> But instead of simply marking those SSA names as not to be promoted
>> I'd instead split their life-ranges, thus replace
>>
>>   _1 = mem;
>>
>> with
>>
>>   _2 = mem;
>>   _1 = [zs]ext (_2, ...);
>>
>> and promote _1 anyway.  So in the first phase I'd do that (and obviously
>> note that _2 isn't to be promoted in the specific example).
>>
>> For promotions that apply I wouldn't bother allocating new SSA names
>> but just "fix" their types (assign to their TREE_TYPE).  This also means
>> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
>> decl before then a debug stmt should be inserted at the point of the
>> promotions.  So
>>
>>   bar_3 = _1 + _2;
>>
>> when promoted would become
>>
>>  _4 = _1 + _2;
>>  _3 = sext <_4, ...>;
>>  # DEBUG bar = (orig-type) _4;  // or _3?
>>
>> so you'd basically always promote defs (you have a lot of stmt/operand
>> walking code I didn't look too closely at - but it looks like too much) and
>> the uses get promoted automagically (because you promote the original
>> SSA name). Promotion of constants has to remain, of course.
> 
> 
> Thanks Richard. I experimented with this idea to understand it better.
> Please see the attached prototype (I am still working on your other
> comments, which are not addressed here). Please have a look and let me
> know if this is along the lines of what you would expect. I have a few
> questions though.
> 
> 1. In the following example above :
>   char _1;
>   _1 = mem;
> 
> when changing with
> 
>   char _2;
>   int _1;
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
> 
> for the [zs]ext operation we now use BIT_AND_EXPR and ZEXT_EXPR which
> (as of now) requires that the LHS and RHS are of the same type. Are you
> suggesting that we should have a true ZEXT_EXPR and SEXT_EXPR which can
> do the above in the gimple? I am now using CONVERT_EXPR, which is the
> source of many optimization issues.
> 
> 2. for inline asm (a reduced test case that might not make much sense as a
> stand-alone test case, but I ran into similar cases with valid programs)
> 
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
> 
> fn1 (short int p1)
> {
>   <bb 2>:
>   __asm__("" : "=r" p1_2 : "0" p1_1(D));
>   return;
> 
> }
> 
> 
> I am generating something like the following, which ICEs. What is the
> expected output?
> 
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
> 
> fn1 (short int p1)
> {
>   int _1;
>   int _2;
>   short int _5;
> 
>   <bb 2>:
>   _1 = (int) p1_4(D);
>   _5 = (short int) _1;
>   __asm__("" : "=r" p1_6 : "0" _5);
>   _2 = (int) p1_6;
>   return;
> 
> }
> 
> Thanks a lot for your time,
> Kugan
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-06-01 23:20                       ` Kugan
  2015-06-19  2:55                         ` Kugan
@ 2015-07-28 11:05                         ` Richard Biener
  2015-08-05  0:12                           ` kugan
  1 sibling, 1 reply; 29+ messages in thread
From: Richard Biener @ 2015-07-28 11:05 UTC (permalink / raw)
  To: Kugan; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

On Tue, Jun 2, 2015 at 1:14 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>

Sorry for replying so late...

> On 08/05/15 22:48, Richard Biener wrote:
>> You compute which promotions are unsafe, like sources/sinks of memory
>> (I think you miss call arguments/return values and also asm operands here).
>> But instead of simply marking those SSA names as not to be promoted
>> I'd instead split their life-ranges, thus replace
>>
>>   _1 = mem;
>>
>> with
>>
>>   _2 = mem;
>>   _1 = [zs]ext (_2, ...);
>>
>> and promote _1 anyway.  So in the first phase I'd do that (and obviously
>> note that _2 isn't to be promoted in the specific example).
>>
>> For promotions that apply I wouldn't bother allocating new SSA names
>> but just "fix" their types (assign to their TREE_TYPE).  This also means
>> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
>> decl before then a debug stmt should be inserted at the point of the
>> promotions.  So
>>
>>   bar_3 = _1 + _2;
>>
>> when promoted would become
>>
>>  _4 = _1 + _2;
>>  _3 = sext <_4, ...>;
>>  # DEBUG bar = (orig-type) _4;  // or _3?
>>
>> so you'd basically always promote defs (you have a lot of stmt/operand
>> walking code I didn't look too closely at - but it looks like too much) and
>> the uses get promoted automagically (because you promote the original
>> SSA name). Promotion of constants has to remain, of course.
>
>
> Thanks Richard. I experimented with this idea to understand it better.
> Please see the attached prototype (I am still working on your other
> comments, which are not addressed here). Please have a look and let me
> know if this is along the lines of what you would expect. I have a few
> questions though.
>
> 1. In the following example above :
>   char _1;
>   _1 = mem;
>
> when changing with
>
>   char _2;
>   int _1;
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
>
> for the [zs]ext operation we now use BIT_AND_EXPR and ZEXT_EXPR which
> (as of now) requires that the LHS and RHS are of the same type. Are you
> suggesting that we should have a true ZEXT_EXPR and SEXT_EXPR which can
> do the above in the gimple? I am now using CONVERT_EXPR, which is the
> source of many optimization issues.

You indeed need to use CONVERT_EXPR here, maybe you can elaborate
on the optimization issues.

> 2. for inline asm (a reduced test case that might not make much sense as a
> stand-alone test case, but I ran into similar cases with valid programs)
>
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
>
> fn1 (short int p1)
> {
>   <bb 2>:
>   __asm__("" : "=r" p1_2 : "0" p1_1(D));
>   return;
>
> }
>
>
> I am generating something like the following, which ICEs. What is the
> expected output?
>
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
>
> fn1 (short int p1)
> {
>   int _1;
>   int _2;
>   short int _5;
>
>   <bb 2>:
>   _1 = (int) p1_4(D);
>   _5 = (short int) _1;
>   __asm__("" : "=r" p1_6 : "0" _5);
>   _2 = (int) p1_6;
>   return;
>
> }

Parameters are indeed "interesting" to handle ;)  As we now see on ARM
the incoming parameter (the default def) and later assignments to it
can require different promotions (well, different extensions for ARM).

The only sensible way to deal with promoting parameters is to
promote them by changing the function signature.  Thus reflect the
target's ABI for parameters in the GIMPLE representation (which
includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
IMHO we should do this during gimplification of parameters / call
arguments already.

So for your example you'd end up with

fn1 (int p1)
{
  __asm__("" : "=r" p1_6 : "0" p1_4(D));
  return;
}

that is, promotions also apply to asm inputs/outputs (no?)

Richard.

> Thanks a lot for your time,
> Kugan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-07-28 11:05                         ` Richard Biener
@ 2015-08-05  0:12                           ` kugan
  2015-08-05  9:10                             ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: kugan @ 2015-08-05  0:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 9583 bytes --]


> You indeed need to use CONVERT_EXPR here, maybe you can elaborate
> on the optimization issues.
>
>> 2. for inline asm (a reduced test case that might not make much sense as a
>> stand-alone test case, but I ran into similar cases with valid programs)
>>
>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>> symbol_order=0)
>>
>> fn1 (short int p1)
>> {
>>    <bb 2>:
>>    __asm__("" : "=r" p1_2 : "0" p1_1(D));
>>    return;
>>
>> }
>>
>>
>> I am generating something like the following, which ICEs. What is the
>> expected output?
>>
>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>> symbol_order=0)
>>
>> fn1 (short int p1)
>> {
>>    int _1;
>>    int _2;
>>    short int _5;
>>
>>    <bb 2>:
>>    _1 = (int) p1_4(D);
>>    _5 = (short int) _1;
>>    __asm__("" : "=r" p1_6 : "0" _5);
>>    _2 = (int) p1_6;
>>    return;
>>
>> }
>
> Parameters are indeed "interesting" to handle ;)  As we now see on ARM
> the incoming parameter (the default def) and later assignments to it
> can require different promotions (well, different extensions for ARM).
>
> The only sensible way to deal with promoting parameters is to
> promote them by changing the function signature.  Thus reflect the
> target's ABI for parameters in the GIMPLE representation (which
> includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
> IMHO we should do this during gimplification of parameters / call
> arguments already.
>
> So for your example you'd end up with
>
> fn1 (int p1)
> {
>    __asm__("" : "=r" p1_6 : "0" p1_4(D));
>    return;
> }
>
> that is, promotions also apply to asm inputs/outputs (no?)


Thanks for the review and answers. For the time being, I am handling 
gimple_asm as a statement whose operands have to be kept in their 
original type. I will look into improving it after getting the basic 
framework right.

As it is, the attached patch bootstraps on x86_64-linux-gnu, arm-linux-gnu 
and aarch64-linux-gnu. There are a few regressions to look into (please 
see below).

There are cases where it is working well and cases where it can be 
improved. I am attaching a couple of test cases (and their results). I am 
seeing some BIT_AND_EXPRs inserted by the promotion that are not being 
optimized away even when they are redundant. This is especially the case 
when I invalidate the VRP range info from VRP1 during the type promotion. 
I am looking into it.

Please note that the attached patch still needs to address:
* Adding gimple_debug stmts.
* Addressing the review comment on the expr.c handling of SEXT_EXPR.
* Addressing the regression failures.

Based on the feedback, I will address the above and split the patch into 
a logical patch set for easy, detailed review.

Here are the outputs for the testcases.

--- c5.c.142t.veclower21	2015-08-05 08:50:11.367135339 +1000
+++ c5.c.143t.promotion	2015-08-05 08:50:11.367135339 +1000
@@ -1,34 +1,45 @@

  ;; Function unPack (unPack, funcdef_no=0, decl_uid=4145, cgraph_uid=0, 
symbol_order=0)

  unPack (unsigned char c)
  {
-  short int _1;
-  unsigned short _4;
-  unsigned short _5;
-  short int _6;
-  short int _7;
+  int _1;
+  unsigned int _2;
+  unsigned int _3;
+  unsigned int _4;
+  unsigned int _5;
+  int _6;
+  int _7;
+  unsigned int _9;
+  int _11;
+  int _12;
+  short int _13;

    <bb 2>:
-  c_3 = c_2(D) & 15;
-  if (c_3 > 7)
+  _2 = (unsigned int) c_10(D);
+  _3 = _2 & 15;
+  _9 = _3 & 255;
+  if (_9 > 7)
      goto <bb 3>;
    else
      goto <bb 4>;

    <bb 3>:
-  _4 = (unsigned short) c_3;
-  _5 = _4 + 65531;
-  _6 = (short int) _5;
+  _4 = _3 & 65535;
+  _5 = _4 + 4294967291;
+  _11 = (int) _5;
+  _6 = (_11) sext from bit (16);
    goto <bb 5>;

    <bb 4>:
-  _7 = (short int) c_3;
+  _12 = (int) _3;
+  _7 = (_12) sext from bit (16);

    <bb 5>:
    # _1 = PHI <_6(3), _7(4)>
-  return _1;
+  _13 = (short int) _1;
+  return _13;

  }


--- c5.org.s	2015-08-05 08:51:44.619133892 +1000
+++ c5.new.s	2015-08-05 08:51:29.643134124 +1000
@@ -16,16 +16,14 @@
  	.syntax divided
  	.arm
  	.type	unPack, %function
  unPack:
  	@ args = 0, pretend = 0, frame = 0
  	@ frame_needed = 0, uses_anonymous_args = 0
  	@ link register save eliminated.
  	and	r0, r0, #15
  	cmp	r0, #7
  	subhi	r0, r0, #5
-	uxth	r0, r0
-	sxth	r0, r0
  	bx	lr
  	.size	unPack, .-unPack
  	.ident	"GCC: (GNU) 6.0.0 20150724 (experimental)"
  	.section	.note.GNU-stack,"",%progbits
--- crc.c.142t.veclower21	2015-08-05 08:52:43.811132974 +1000
+++ crc.c.143t.promotion	2015-08-05 08:52:43.811132974 +1000
@@ -1,52 +1,78 @@

  ;; Function crc2 (crc2, funcdef_no=0, decl_uid=4146, cgraph_uid=0, 
symbol_order=0)

  crc2 (short unsigned int crc, unsigned char data)
  {
    unsigned char carry;
    unsigned char x16;
    unsigned char i;
-  unsigned char ivtmp_5;
-  unsigned char _9;
-  unsigned char _10;
-  unsigned char ivtmp_18;
+  unsigned int _2;
+  unsigned int _3;
+  unsigned int _5;
+  unsigned int _7;
+  unsigned int _8;
+  unsigned int _9;
+  unsigned int _10;
+  unsigned int _11;
+  unsigned int _12;
+  unsigned int _13;
+  unsigned int _15;
+  unsigned int _16;
+  unsigned int _18;
+  unsigned int _19;
+  unsigned int _21;
+  unsigned int _22;
+  unsigned int _24;
+  short unsigned int _25;
+  unsigned int _26;
+  unsigned int _27;
+  unsigned int _28;
+  unsigned int _29;

    <bb 2>:
+  _8 = (unsigned int) data_4(D);
+  _7 = (unsigned int) crc_30(D);

    <bb 3>:
-  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
-  # data_29 = PHI <data_12(5), data_8(D)(2)>
-  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
-  _9 = (unsigned char) crc_28;
-  _10 = _9 ^ data_29;
-  x16_11 = _10 & 1;
-  data_12 = data_29 >> 1;
-  if (x16_11 == 1)
+  # _28 = PHI <_2(5), _7(2)>
+  # _29 = PHI <_12(5), _8(2)>
+  # _18 = PHI <_5(5), 8(2)>
+  _9 = _28 & 255;
+  _10 = _9 ^ _29;
+  _11 = _10 & 1;
+  _3 = _29 & 255;
+  _12 = _3 >> 1;
+  _27 = _11 & 255;
+  if (_27 == 1)
      goto <bb 4>;
    else
      goto <bb 7>;

    <bb 4>:
-  crc_13 = crc_28 ^ 16386;
-  crc_24 = crc_13 >> 1;
-  crc_15 = crc_24 | 32768;
+  _13 = _28 ^ 16386;
+  _26 = _13 & 65535;
+  _24 = _26 >> 1;
+  _15 = _24 | 4294934528;

    <bb 5>:
-  # crc_2 = PHI <crc_15(4), crc_21(7)>
-  ivtmp_5 = ivtmp_18 - 1;
-  if (ivtmp_5 != 0)
+  # _2 = PHI <_15(4), _21(7)>
+  _5 = _18 - 1;
+  _22 = _5 & 255;
+  if (_22 != 0)
      goto <bb 3>;
    else
      goto <bb 6>;

    <bb 6>:
-  # crc_19 = PHI <crc_2(5)>
-  return crc_19;
+  # _19 = PHI <_2(5)>
+  _25 = (short unsigned int) _19;
+  return _25;

    <bb 7>:
-  crc_21 = crc_28 >> 1;
+  _16 = _28 & 65535;
+  _21 = _16 >> 1;
    goto <bb 5>;

  }


--- crc.org.s	2015-08-05 08:54:17.491131520 +1000
+++ crc.new.s	2015-08-05 08:53:12.183132534 +1000
@@ -15,27 +15,28 @@
  	.global	crc2
  	.syntax divided
  	.arm
  	.type	crc2, %function
  crc2:
  	@ args = 0, pretend = 0, frame = 0
  	@ frame_needed = 0, uses_anonymous_args = 0
  	mov	ip, #32768
  	movt	ip, 65535
  	str	lr, [sp, #-4]!
-	mov	r3, #8
+	mov	r2, #8
  	movw	lr, #16386
  .L3:
-	eor	r2, r1, r0
-	sub	r3, r3, #1
-	tst	r2, #1
+	uxtb	r3, r0
+	eor	r3, r3, r1
  	mov	r1, r1, lsr #1
+	tst	r3, #1
  	eorne	r0, r0, lr
-	moveq	r0, r0, lsr #1
-	orrne	r0, ip, r0, lsr #1
-	uxthne	r0, r0
-	ands	r3, r3, #255
+	ubfxeq	r0, r0, #1, #15
+	ubfxne	r0, r0, #1, #15
+	orrne	r0, r0, ip
+	subs	r2, r2, #1
  	bne	.L3
+	uxth	r0, r0
  	ldr	pc, [sp], #4
  	.size	crc2, .-crc2
  	.ident	"GCC: (GNU) 6.0.0 20150724 (experimental)"
  	.section	.note.GNU-stack,"",%progbits



Testsuite regression for x86_64-unknown-linux-gnu:
Tests that now fail, but worked before:
gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)


Testsuite regression for  arm-linux-gnu:
Tests that now fail, but worked before:
arm-sim: gcc.dg/fixed-point/convert-sat.c execution test
arm-sim: gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned
int\\)" 0
arm-sim: gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting
initializer" 0
arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized
"\\(int\\)"
arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized
"\\(unsigned char\\)" 8
arm-sim: gcc.target/arm/mla-2.c scan-assembler smlalbb
arm-sim: gcc.target/arm/unsigned-extend-2.c scan-assembler ands
arm-sim: gcc.target/arm/wmul-1.c scan-assembler-times smlabb 2
arm-sim: gcc.target/arm/wmul-2.c scan-assembler-times smulbb 1
arm-sim: gcc.target/arm/wmul-3.c scan-assembler-times smulbb 2
arm-sim: gcc.target/arm/wmul-9.c scan-assembler smlalbb
arm-sim: gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)

Tests that now work, but didn't before:
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 2
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 3" 1
arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times rev16ne\\t 1
arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times revshne\\t 1
arm-sim: gcc.target/arm/smlaltb-1.c scan-assembler smlaltb\\t
arm-sim: gcc.target/arm/smlaltt-1.c scan-assembler smlaltt\\t


Testsuite regression for  aarch64-linux-gnu:
Tests that now fail, but worked before:
c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned int\\)"
0
gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer"
0
gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized "\\(int\\)"
gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized "\\(unsigned
char\\)" 8

Thanks,
Kugan

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 34824 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 16d5582..63c9dd2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1493,6 +1493,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0b19953..6642c01 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4972,6 +4972,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 6d47e94..5afda05 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2374,6 +2374,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index d601129..7483950 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9194,6 +9194,22 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode
+	    = smallest_mode_for_size (tree_to_shwi (treeop1),
+				      MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 886922f..bac899c 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..b5b69cc 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,815 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass will run prior to VRP and DOM so that they will
+   be able to optimise redundant truncations and extensions.  This is based
+   on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) % 8 != 0)
+    return type;
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create a new SSA name of type TYPE as a copy of SSA name VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate it to WIDTH bits.
+   Assign the zero/sign-extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+/* Make TO a default-def SSA name in place of FROM: transfer FROM's
+   identifier and default-def status to TO.  */
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the statement that
+   defines DEF (DEF_STMT) allows its result to be of PROMOTED_TYPE,
+   simply change the type of DEF.  Otherwise, create a NEW_DEF of the
+   ORIGINAL_TYPE, make DEF_STMT assign its value to NEW_DEF, and then
+   create a CONVERT_EXPR that converts NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted. */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  /* TODO: If def's var is !DECL_IGNORED_P, insert a debug stmt.  */
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, the arguments
+		     of def_stmt will have been visited before def itself.  If RHS
+		     is already promoted and the types are compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else
+		{
+		  /* If RHS is not promoted OR the types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted DEF type, and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* The type is now promoted.  Due to this, some of the value ranges
+	 computed by VRP1 will be invalid.  TODO: Be intelligent in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, the value in USE has to be zero/sign
+		     extended based on the original type to get the
+		     correct result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      /* OLD_TYPE is already a type node, not a decl.  */
+		      type = old_type;
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, the value in USE has to be zero/sign
+		 extended based on the original type to get the
+		 correct result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create a
+	     copy with the original type.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 64fc4d9..254496b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index aee36e6..38b8d7d 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -276,6 +276,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 66f999e..1db888d 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3748,6 +3748,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
@@ -5192,6 +5204,7 @@ gimple_verify_flow_info (void)
 
 	  if (found_ctrl_stmt)
 	    {
+	      dump_bb (stderr, gimple_bb (stmt), 0, 0);
 	      error ("control flow in the middle of basic block %d",
 		     bb->index);
 	      err = 1;
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..272c409 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 7b66a1c..7ddb55c 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -431,6 +431,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..04f6777 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 910cb19..19aa918 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -190,7 +190,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d962683..dee8f6f 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2389,6 +2389,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2949,6 +2950,57 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+      wide_int mask = wi::shwi ((((HOST_WIDE_INT) 1 << (prec - 1)) - 1),
+				TYPE_PRECISION (TREE_TYPE (vr0.max)));
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi ((HOST_WIDE_INT) 1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi ((((HOST_WIDE_INT) 1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
+	  HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
+
+	  if (int_must_be_nonzero & ((HOST_WIDE_INT) 1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & mask;
+	    }
+	  else if ((int_may_be_nonzero & ((HOST_WIDE_INT) 1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero & mask;
+	      tmax = may_be_nonzero & mask;
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9279,6 +9331,30 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  gcc_assert (is_gimple_min_invariant (op1));
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int mask;
+	  HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
+	  HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
+	  mask = wi::shwi ((((HOST_WIDE_INT) 1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  if (must_be_nonzero & ((HOST_WIDE_INT) 1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if ((may_be_nonzero & ((HOST_WIDE_INT) 1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9980,6 +10056,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..d98c999 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,10 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  Sign extend the first operand from the
+   sign bit specified by the second operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

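For readers skimming the patch, the semantics of the new SEXT_EXPR (sign extend operand 1 from the bit position given by operand 2) and of the BIT_AND_EXPR mask that zero_sign_extend_stmt emits for the unsigned case can be modelled in plain C. This is only an illustrative sketch: `sext32` and `zext32` are made-up names, and the width is restricted to 1..31, matching the pass's assertion that the promoted precision exceeds WIDTH.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SEXT_EXPR <val, width>: treat bit WIDTH-1 of VAL as the
   sign bit and sign extend it into the upper bits.  Valid for
   1 <= width <= 31.  */
static int32_t
sext32 (uint32_t val, int width)
{
  uint32_t sign = (uint32_t) 1 << (width - 1);
  val &= (sign << 1) - 1;                  /* truncate to the low WIDTH bits */
  return (int32_t) ((val ^ sign) - sign);  /* flip-and-subtract sign trick */
}

/* Model of the zero extension the pass emits as
   BIT_AND_EXPR <val, (1 << width) - 1>.  */
static uint32_t
zext32 (uint32_t val, int width)
{
  return val & (((uint64_t) 1 << width) - 1);
}
```

For width 16 these are exactly the operations the ARM `sxth`/`uxth` instructions in the dumps later in the thread perform, which is why adjacent redundant pairs of them can be eliminated.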
[-- Attachment #3: log.txt --]
[-- Type: text/plain, Size: 825 bytes --]

gcc/ChangeLog:

2015-07-05  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* Makefile.in: Add gimple-ssa-type-promote.o.
	* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
	* common.opt: New option -ftree-type-promote.
	* expr.c (expand_expr_real_2): Handle SEXT_EXPR.
	* fold-const.c (int_const_binop_1): Likewise.
	* gimple-ssa-type-promote.c: New file.
	* passes.def: Define new pass_type_promote.
	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
	* tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
	* tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
	* tree-pass.h (make_pass_type_promote): New.
	* tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
	(op_symbol_code): Likewise.
	* tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
	(simplify_bit_ops_using_ranges): Likewise.
	* tree.def: Define new SEXT_EXPR.


[-- Attachment #4: c5.c --]
[-- Type: text/x-csrc, Size: 290 bytes --]

short unPack( unsigned char c )
{
    /* Only want lower four bit nibble */
    c = c & (unsigned char)0x0F ;

    if( c > 7 ) {
        /* Negative nibble */
        return( ( short )( c - 5 ) ) ;

    }
    else
    {
        /* positive nibble */
        return( ( short )c ) ;
    }
}
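As a sanity check that eliminating the extensions is safe for this testcase: after `c & 0x0F` the value is in [0, 15], so both return paths yield a value in [0, 10], which already fits a `short` with no further extension. The function, reproduced verbatim, can be verified on a host compiler:

```c
#include <assert.h>

/* c5.c from the attachment, reproduced verbatim for a behavioural
   check: after the mask, C is in [0, 15], so the result is always in
   [0, 10] and the zero/sign extensions are redundant.  */
short unPack (unsigned char c)
{
  /* Only want lower four bit nibble */
  c = c & (unsigned char) 0x0F;

  if (c > 7)
    /* Negative nibble */
    return (short) (c - 5);
  else
    /* positive nibble */
    return (short) c;
}
```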


[-- Attachment #5: crc.c --]
[-- Type: text/x-csrc, Size: 431 bytes --]

unsigned short
crc2(unsigned short crc, unsigned char data)
{
   unsigned char i, x16, carry;
 
   for (i = 0; i < 8; i++)
     {
       x16 = (data ^ crc) & 1;
       data >>= 1;
 
       if (x16 == 1)
         {
           crc ^= 0x4002;
           carry = 1;
         }
       else
         carry = 0;
 
       crc >>= 1;
 
       if (carry)
         crc |= 0x8000;
       else
         crc &= 0x7fff;
     }
 
   return crc;
}
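For reference, `crc2` is the standard reflected CRC-16 byte update (polynomial 0xA001): the `^= 0x4002`, shift-right, set-top-bit sequence is bitwise equivalent to shifting first and then XORing 0xA001. Reproducing the function verbatim lets its output be checked against well-known CRC-16 table values, so behaviour before and after promotion can be compared:

```c
#include <assert.h>

/* crc.c from the attachment, reproduced verbatim.  One call processes
   a single input byte, LSB first.  */
unsigned short
crc2 (unsigned short crc, unsigned char data)
{
  unsigned char i, x16, carry;

  for (i = 0; i < 8; i++)
    {
      x16 = (data ^ crc) & 1;
      data >>= 1;

      if (x16 == 1)
        {
          crc ^= 0x4002;
          carry = 1;
        }
      else
        carry = 0;

      crc >>= 1;

      if (carry)
        crc |= 0x8000;
      else
        crc &= 0x7fff;
    }

  return crc;
}
```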

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-08-05  0:12                           ` kugan
@ 2015-08-05  9:10                             ` Richard Biener
  0 siblings, 0 replies; 29+ messages in thread
From: Richard Biener @ 2015-08-05  9:10 UTC (permalink / raw)
  To: kugan; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

On Wed, Aug 5, 2015 at 2:12 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>> You indeed need to use CONVERT_EXPR here, maybe you can elaborate
>> on the optimization issues.
>>
>>> 2. for inline asm (a reduced test case that might not make much as a
>>> stand alone test-case, but I ran into similar cases with valid
>>> programmes)
>>>
>>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>>> symbol_order=0)
>>>
>>> fn1 (short int p1)
>>> {
>>>    <bb 2>:
>>>    __asm__("" : "=r" p1_2 : "0" p1_1(D));
>>>    return;
>>>
>>> }
>>>
>>>
>>> I am generating something like the following which ICEs. What is the
>>> expected out?
>>>
>>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>>> symbol_order=0)
>>>
>>> fn1 (short int p1)
>>> {
>>>    int _1;
>>>    int _2;
>>>    short int _5;
>>>
>>>    <bb 2>:
>>>    _1 = (int) p1_4(D);
>>>    _5 = (short int) _1;
>>>    __asm__("" : "=r" p1_6 : "0" _5);
>>>    _2 = (int) p1_6;
>>>    return;
>>>
>>> }
>>
>>
>> Parameters are indeed "interesting" to handle ;)  As we now see on ARM
>> the incoming parameter (the default def) and later assignments to it
>> can require different promotions (well, different extensions for ARM).
>>
>> The only sensible way to deal with promoting parameters is to
>> promote them by changing the function signature.  Thus reflect the
>> targets ABI for parameters in the GIMPLE representation (which
>> includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
>> IMHO we should do this during gimplification of parameters / call
>> arguments already.
>>
>> So for your example you'd end up with
>>
>> fn1 (int p1)
>> {
>>    __asm__("" : "=r" p1_6 : "0" p1_4(D));
>>    return;
>> }
>>
>> that is, promotions also apply to asm inputs/outputs (no?)
>
>
>
> Thanks for the review and answers. For the time being, I am handling
> gimple_asm as one that has to be handled in the original type. I will look
> into improving it after getting the basic framework right.

Yeah, that's always a possibility.  I also see from the dumps that we probably
want to promote function arguments and results on GIMPLE as well.  Possibly
very early during gimplification or as an early IPA pass (as it needs to adjust
the IL for calls as well, exposing ABI required promotions / extensions).

> As it is, the attached patch bootstraps on x86_64-linux-gnu, arm-linux-gnu
> and aarch64-linux-gnu. There are a few regressions still to look into
> (please see below).
>
> There are cases where it works well and cases where it can be improved.
> I am attaching a couple of test cases (and their results). I am seeing
> that some of the BIT_AND_EXPRs inserted by promotion are not being
> optimized away when they are redundant. This is especially the case when
> I invalidate the VRP range info from VRP1 during type promotion. I am
> looking into it.
>
> Please note that attached patch still needs to address:
> * Adding gimple_debug stmts.
> * Address review comment for expr.c handling SEXT_EXPR.
> * Address regression failures
>
> Based on the feedback, I will address the above and split the patch into
> logical patch set for easy detailed review.
>
> Here are the outputs for the testcases.
>
> --- c5.c.142t.veclower21        2015-08-05 08:50:11.367135339 +1000
> +++ c5.c.143t.promotion 2015-08-05 08:50:11.367135339 +1000
> @@ -1,34 +1,45 @@
>
>  ;; Function unPack (unPack, funcdef_no=0, decl_uid=4145, cgraph_uid=0,
> symbol_order=0)
>
>  unPack (unsigned char c)
>  {
> -  short int _1;
> -  unsigned short _4;
> -  unsigned short _5;
> -  short int _6;
> -  short int _7;
> +  int _1;
> +  unsigned int _2;
> +  unsigned int _3;
> +  unsigned int _4;
> +  unsigned int _5;
> +  int _6;
> +  int _7;
> +  unsigned int _9;
> +  int _11;
> +  int _12;
> +  short int _13;
>
>    <bb 2>:
> -  c_3 = c_2(D) & 15;
> -  if (c_3 > 7)
> +  _2 = (unsigned int) c_10(D);
> +  _3 = _2 & 15;
> +  _9 = _3 & 255;
> +  if (_9 > 7)
>      goto <bb 3>;
>    else
>      goto <bb 4>;
>
>    <bb 3>:
> -  _4 = (unsigned short) c_3;
> -  _5 = _4 + 65531;
> -  _6 = (short int) _5;
> +  _4 = _3 & 65535;
> +  _5 = _4 + 4294967291;
> +  _11 = (int) _5;
> +  _6 = (_11) sext from bit (16);

Ok, so in GIMPLE we still have sign-changing conversions.  Another
thing we might want to lower at some stage ... ;)

>    goto <bb 5>;
>
>    <bb 4>:
> -  _7 = (short int) c_3;
> +  _12 = (int) _3;
> +  _7 = (_12) sext from bit (16);
>
>    <bb 5>:
>    # _1 = PHI <_6(3), _7(4)>
> -  return _1;
> +  _13 = (short int) _1;
> +  return _13;
>
>  }

Overall this looks like what I'd have expected - also pointing out the
missing argument/return value promotion.

>
> --- c5.org.s    2015-08-05 08:51:44.619133892 +1000
> +++ c5.new.s    2015-08-05 08:51:29.643134124 +1000
> @@ -16,16 +16,14 @@
>         .syntax divided
>         .arm
>         .type   unPack, %function
>  unPack:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         and     r0, r0, #15
>         cmp     r0, #7
>         subhi   r0, r0, #5
> -       uxth    r0, r0
> -       sxth    r0, r0

Nice.

>         bx      lr
>         .size   unPack, .-unPack
>         .ident  "GCC: (GNU) 6.0.0 20150724 (experimental)"
>         .section        .note.GNU-stack,"",%progbits
> --- crc.c.142t.veclower21       2015-08-05 08:52:43.811132974 +1000
> +++ crc.c.143t.promotion        2015-08-05 08:52:43.811132974 +1000
> @@ -1,52 +1,78 @@
>
>  ;; Function crc2 (crc2, funcdef_no=0, decl_uid=4146, cgraph_uid=0,
> symbol_order=0)
>
>  crc2 (short unsigned int crc, unsigned char data)
>  {
>    unsigned char carry;
>    unsigned char x16;
>    unsigned char i;
> -  unsigned char ivtmp_5;
> -  unsigned char _9;
> -  unsigned char _10;
> -  unsigned char ivtmp_18;
> +  unsigned int _2;
> +  unsigned int _3;
> +  unsigned int _5;
> +  unsigned int _7;
> +  unsigned int _8;
> +  unsigned int _9;
> +  unsigned int _10;
> +  unsigned int _11;
> +  unsigned int _12;
> +  unsigned int _13;
> +  unsigned int _15;
> +  unsigned int _16;
> +  unsigned int _18;
> +  unsigned int _19;
> +  unsigned int _21;
> +  unsigned int _22;
> +  unsigned int _24;
> +  short unsigned int _25;
> +  unsigned int _26;
> +  unsigned int _27;
> +  unsigned int _28;
> +  unsigned int _29;
>
>    <bb 2>:
> +  _8 = (unsigned int) data_4(D);
> +  _7 = (unsigned int) crc_30(D);
>
>    <bb 3>:
> -  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
> -  # data_29 = PHI <data_12(5), data_8(D)(2)>
> -  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
> -  _9 = (unsigned char) crc_28;
> -  _10 = _9 ^ data_29;
> -  x16_11 = _10 & 1;
> -  data_12 = data_29 >> 1;
> -  if (x16_11 == 1)
> +  # _28 = PHI <_2(5), _7(2)>
> +  # _29 = PHI <_12(5), _8(2)>
> +  # _18 = PHI <_5(5), 8(2)>
> +  _9 = _28 & 255;
> +  _10 = _9 ^ _29;
> +  _11 = _10 & 1;
> +  _3 = _29 & 255;
> +  _12 = _3 >> 1;
> +  _27 = _11 & 255;
> +  if (_27 == 1)
>      goto <bb 4>;
>    else
>      goto <bb 7>;
>
>    <bb 4>:
> -  crc_13 = crc_28 ^ 16386;
> -  crc_24 = crc_13 >> 1;
> -  crc_15 = crc_24 | 32768;
> +  _13 = _28 ^ 16386;
> +  _26 = _13 & 65535;
> +  _24 = _26 >> 1;
> +  _15 = _24 | 4294934528;
>
>    <bb 5>:
> -  # crc_2 = PHI <crc_15(4), crc_21(7)>
> -  ivtmp_5 = ivtmp_18 - 1;
> -  if (ivtmp_5 != 0)
> +  # _2 = PHI <_15(4), _21(7)>
> +  _5 = _18 - 1;
> +  _22 = _5 & 255;
> +  if (_22 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 6>;
>
>    <bb 6>:
> -  # crc_19 = PHI <crc_2(5)>
> -  return crc_19;
> +  # _19 = PHI <_2(5)>
> +  _25 = (short unsigned int) _19;
> +  return _25;
>
>    <bb 7>:
> -  crc_21 = crc_28 >> 1;
> +  _16 = _28 & 65535;
> +  _21 = _16 >> 1;
>    goto <bb 5>;
>
>  }
>
>
> --- crc.org.s   2015-08-05 08:54:17.491131520 +1000
> +++ crc.new.s   2015-08-05 08:53:12.183132534 +1000
> @@ -15,27 +15,28 @@
>         .global crc2
>         .syntax divided
>         .arm
>         .type   crc2, %function
>  crc2:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         mov     ip, #32768
>         movt    ip, 65535
>         str     lr, [sp, #-4]!
> -       mov     r3, #8
> +       mov     r2, #8
>         movw    lr, #16386
>  .L3:
> -       eor     r2, r1, r0
> -       sub     r3, r3, #1
> -       tst     r2, #1
> +       uxtb    r3, r0
> +       eor     r3, r3, r1
>         mov     r1, r1, lsr #1
> +       tst     r3, #1
>         eorne   r0, r0, lr
> -       moveq   r0, r0, lsr #1
> -       orrne   r0, ip, r0, lsr #1
> -       uxthne  r0, r0
> -       ands    r3, r3, #255
> +       ubfxeq  r0, r0, #1, #15
> +       ubfxne  r0, r0, #1, #15
> +       orrne   r0, r0, ip
> +       subs    r2, r2, #1
>         bne     .L3
> +       uxth    r0, r0
>         ldr     pc, [sp], #4
>         .size   crc2, .-crc2
>         .ident  "GCC: (GNU) 6.0.0 20150724 (experimental)"
>         .section        .note.GNU-stack,"",%progbits

Can't really decipher these changes...

>
>
> Testsuite regression for x86_64-unknown-linux-gnu:
> Tests that now fail, but worked before:
> gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)

I see this on pristine trunk as well.

>
> Testsuite regression for  arm-linux-gnu:
> Tests that now fail, but worked before:
> arm-sim: gcc.dg/fixed-point/convert-sat.c execution test
> arm-sim: gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned
> int\\)" 0
> arm-sim: gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting
> initializer" 0
> arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized
> "\\(int\\)"
> arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized
> "\\(unsigned char\\)" 8
> arm-sim: gcc.target/arm/mla-2.c scan-assembler smlalbb
> arm-sim: gcc.target/arm/unsigned-extend-2.c scan-assembler ands
> arm-sim: gcc.target/arm/wmul-1.c scan-assembler-times smlabb 2
> arm-sim: gcc.target/arm/wmul-2.c scan-assembler-times smulbb 1
> arm-sim: gcc.target/arm/wmul-3.c scan-assembler-times smulbb 2
> arm-sim: gcc.target/arm/wmul-9.c scan-assembler smlalbb
> arm-sim: gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)
>
> Tests that now work, but didn't before:
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 0" 2
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 2" 1
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 3" 1
> arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times rev16ne\\t 1
> arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times revshne\\t 1
> arm-sim: gcc.target/arm/smlaltb-1.c scan-assembler smlaltb\\t
> arm-sim: gcc.target/arm/smlaltt-1.c scan-assembler smlaltt\\t
>
>
> Testsuite regression for  aarch64-linux-gnu:
> Tests that now fail, but worked before:
> c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
> c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
> gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned int\\)"
> 0
> gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer"
> 0
> gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized "\\(int\\)"
> gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized "\\(unsigned
> char\\)" 8

tree-dump scan differences are expected, of course.  Others need to be
investigated.

Thanks for continuing to work on this!  I hope to have a closer look
at the updated patch later.

Thanks,
Richard.


> Thanks,
> Kugan


Thread overview: 29+ messages
2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
2014-08-27 10:07 ` Richard Biener
2014-08-27 10:32   ` Uros Bizjak
2014-08-27 10:32     ` Richard Biener
2014-09-01  8:48     ` Jakub Jelinek
2014-09-01  8:54       ` Uros Bizjak
2014-08-28  7:50   ` Kugan
2014-08-28  8:57     ` Richard Biener
2014-09-04  3:41       ` Kugan
2014-09-04 13:00         ` Richard Biener
2014-09-05  1:33           ` Kugan
2014-09-05  9:51             ` Richard Biener
2014-09-07  9:51               ` Kugan
2014-09-08  9:48                 ` Richard Biener
2014-09-09 10:06                   ` Kugan
2014-09-09 10:28                     ` Richard Biener
2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
2014-11-10 12:56                 ` Richard Biener
2015-05-01  4:41                   ` Kugan
2015-05-08 12:48                     ` Richard Biener
2015-06-01 23:20                       ` Kugan
2015-06-19  2:55                         ` Kugan
2015-07-28 11:05                         ` Richard Biener
2015-08-05  0:12                           ` kugan
2015-08-05  9:10                             ` Richard Biener
2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
2014-08-28  3:46   ` Kugan
2014-08-28  6:44     ` Marc Glisse
2014-08-28  7:29       ` Kugan
