public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH 2/2] Enable elimination of zext/sext
@ 2014-08-27 10:01 Uros Bizjak
  2014-08-27 10:07 ` Richard Biener
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
  0 siblings, 2 replies; 51+ messages in thread
From: Uros Bizjak @ 2014-08-27 10:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Kugan, Jakub Jelinek, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 5703 bytes --]

Hello!

> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
> * calls.c (precompute_arguments): Check
> promoted_for_signed_and_unsigned_p and set the promoted mode.
> (promoted_for_signed_and_unsigned_p): New function.
> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
> and set the promoted mode.
> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

This patch regresses:

Running target unix
FAIL: libgomp.fortran/simd7.f90   -O2  execution test
FAIL: libgomp.fortran/simd7.f90   -Os  execution test

on alphaev6-linux-gnu.

The problem can be illustrated with the attached testcase using a
crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is a
missing SImode extension after a DImode shift of SImode subregs in this
part:

--cut here--
  # test.23_12 = PHI <0(37), 1(36)>
  _242 = ivtmp.181_73 + 2147483645;
  _240 = _242 * 2;
  _63 = (integer(kind=4)) _240;
  if (ubound.6_99 <= 2)
    goto <bb 39>;
  else
    goto <bb 40>;
;;    succ:       39
;;                40

;;   basic block 39, loop depth 1
;;    pred:       38
  pretmp_337 = test.23_12 | l_76;
  goto <bb 45>;
;;    succ:       45

;;   basic block 40, loop depth 1
;;    pred:       38
  _11 = *c_208[0];
  if (_11 != _63)
    goto <bb 45>;
  else
    goto <bb 42>;
--cut here--

this expands to:

(code_label 592 591 593 35 "" [0 uses])

(note 593 592 0 NOTE_INSN_BASIC_BLOCK)

;; _63 = (integer(kind=4)) _240;

(insn 594 593 595 (set (reg:SI 538)
        (const_int 1073741824 [0x40000000])) -1
     (nil))

(insn 595 594 596 (set (reg:SI 539)
        (plus:SI (reg:SI 538)
            (const_int 1073741824 [0x40000000]))) -1
     (nil))

(insn 596 595 597 (set (reg:SI 537)
        (plus:SI (reg:SI 539)
            (const_int -3 [0xfffffffffffffffd]))) -1
     (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
        (nil)))

(insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
        (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
            (reg:SI 537))) -1
     (nil))

(insn 598 597 599 (set (reg:DI 540)
        (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
            (const_int 1 [0x1]))) -1
     (nil))

(insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
        (reg:DI 540)) -1
     (nil))

...

(note 610 609 0 NOTE_INSN_BASIC_BLOCK)

;; _11 = *c_208[0];

(insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
        (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
A128]))) simd7.f90:12 -1
     (nil))

;; if (_11 != _63)

(insn 612 611 613 40 (set (reg:DI 545)
        (eq:DI (reg:DI 120 [ D.1694 ])
            (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
     (nil))

(jump_insn 613 612 616 40 (set (pc)
        (if_then_else (eq (reg:DI 545)
                (const_int 0 [0]))
            (label_ref 0)
            (pc))) simd7.f90:12 -1
     (int_list:REG_BR_PROB 450 (nil)))

which results in the following asm:

$L35:
    addl $25,$7,$2     # 597    addsi3/1    [length = 4]
    addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
    bne $24,$L145     # 601    *bcc_normal    [length = 4]
    lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
    ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
    lda $3,3($31)     # 74    *movdi/2    [length = 4]
    cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
    bne $2,$L40     # 613    *bcc_normal    [length = 4]
    br $31,$L88     # 2403    jump    [length = 4]
    .align 4
...

Tracking the values with the debugger shows a wrong calculation:

   0x000000012000108c <+1788>:  addl    t10,t12,t1
   0x0000000120001090 <+1792>:  addq    t1,t1,t1
   ...
   0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
   0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>

(gdb) si
0x000000012000108c      17          l = l .or. any (b /= 7 + i)
(gdb) i r t10 t12
t10            0x7      7
t12            0x7ffffffd       2147483645

(gdb) si
0x0000000120001090      17          l = l .or. any (b /= 7 + i)
(gdb) i r t1
t1             0xffffffff80000004       -2147483644

(gdb) si
18          l = l .or. any (c /= 8 + 2 * i)
(gdb) i r t1
t1             0xffffffff00000008       -4294967288

At this point, the calculation should zero-extend the SImode value to a
full DImode value, since the compare operates on DImode values. The
problematic insn is (insn 599), which is now a plain DImode assignment
instead of a zero-extension, due to:

--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
   GET_MODE (target), temp, unsignedp);
   }

- convert_move (SUBREG_REG (target), temp, unsignedp);
+ if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+    && (GET_CODE (temp) == SUBREG)
+    && (GET_MODE (target) == GET_MODE (temp))
+    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+ else
+  convert_move (SUBREG_REG (target), temp, unsignedp);
       }
     else if (nontemporal && emit_storent_insn (target, temp))
       ;

When compiling this code, we have:

lhs = _63
target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
temp = (subreg:SI (reg:DI 540) 0)

So, the code assumes that it is possible to copy (reg:DI 540) directly
to (reg:DI 145). However, this is not the case, since we still have
garbage in the top 32 bits.

Reverting the part above fixes the runtime failure, since (insn 599) is now:

(insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
        (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
     (nil))

It looks to me that we also have to check temp with SUBREG_PROMOTED_*.

Uros.

[-- Attachment #2: simd7.f90 --]
[-- Type: text/x-csrc, Size: 1050 bytes --]

subroutine foo (d, e, f, g, m, n)
  integer :: i, j, b(2:9), c(3:n), d(:), e(2:n), f(2:,3:), n
  integer, allocatable :: g(:), h(:), k, m
  logical :: l
  l = .false.
  allocate (h(2:7))
  i = 4; j = 4; b = 7; c = 8; d = 9; e = 10; f = 11; g = 12; h = 13; k = 14; m = 15
!$omp simd linear(b)linear(c:2)linear(d:3)linear(e:4)linear(f:5)linear(g:6) &
!$omp & linear(h:7)linear(k:8)linear(m:9) reduction(.or.:l)
  do i = 0, 63 
    l = l .or. any (b /= 7 + i)
    l = l .or. any (c /= 8 + 2 * i)
    b = b + 1; c = c + 2
    d = d + 3; e = e + 4; f = f + 5; g = g + 6
    h = h + 7; k = k + 8; m = m + 9
  end do
  if (l .or. i /= 64) call abort
  if (any (b /= 7 + 64) .or. any (c /= 8 + 2 * 64)) call abort
end subroutine

  interface
    subroutine foo (d, e, f, g, m, n)
      integer :: d(:), e(2:n), f(2:,3:), n
      integer, allocatable :: g(:), m
    end subroutine
  end interface
  integer, parameter :: n = 8
  integer :: d(2:18), e(3:n+1), f(5:6,7:9)
  integer, allocatable :: g(:), m
  allocate (g(7:10))
  call foo (d, e, f, g, m, n)
end

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
@ 2014-08-27 10:07 ` Richard Biener
  2014-08-27 10:32   ` Uros Bizjak
  2014-08-28  7:50   ` Kugan
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
  1 sibling, 2 replies; 51+ messages in thread
From: Richard Biener @ 2014-08-27 10:07 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> * calls.c (precompute_arguments): Check
>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>> (promoted_for_signed_and_unsigned_p): New function.
>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>> and set the promoted mode.
>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
> This patch regresses:
>
> Running target unix
> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>
> on alphaev6-linux-gnu.
>
> The problem can be illustrated with attached testcase with a
> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
> missing SImode extension after DImode shift of SImode subregs for this
> part:
>
> --cut here--
>   # test.23_12 = PHI <0(37), 1(36)>
>   _242 = ivtmp.181_73 + 2147483645;
>   _240 = _242 * 2;
>   _63 = (integer(kind=4)) _240;
>   if (ubound.6_99 <= 2)
>     goto <bb 39>;
>   else
>     goto <bb 40>;
> ;;    succ:       39
> ;;                40
>
> ;;   basic block 39, loop depth 1
> ;;    pred:       38
>   pretmp_337 = test.23_12 | l_76;
>   goto <bb 45>;
> ;;    succ:       45
>
> ;;   basic block 40, loop depth 1
> ;;    pred:       38
>   _11 = *c_208[0];
>   if (_11 != _63)
>     goto <bb 45>;
>   else
>     goto <bb 42>;
> --cut here--
>
> this expands to:
>
> (code_label 592 591 593 35 "" [0 uses])
>
> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _63 = (integer(kind=4)) _240;
>
> (insn 594 593 595 (set (reg:SI 538)
>         (const_int 1073741824 [0x40000000])) -1
>      (nil))
>
> (insn 595 594 596 (set (reg:SI 539)
>         (plus:SI (reg:SI 538)
>             (const_int 1073741824 [0x40000000]))) -1
>      (nil))
>
> (insn 596 595 597 (set (reg:SI 537)
>         (plus:SI (reg:SI 539)
>             (const_int -3 [0xfffffffffffffffd]))) -1
>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>         (nil)))
>
> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>             (reg:SI 537))) -1
>      (nil))
>
> (insn 598 597 599 (set (reg:DI 540)
>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>             (const_int 1 [0x1]))) -1
>      (nil))
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (reg:DI 540)) -1
>      (nil))
>
> ...
>
> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _11 = *c_208[0];
>
> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
> A128]))) simd7.f90:12 -1
>      (nil))
>
> ;; if (_11 != _63)
>
> (insn 612 611 613 40 (set (reg:DI 545)
>         (eq:DI (reg:DI 120 [ D.1694 ])
>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>      (nil))
>
> (jump_insn 613 612 616 40 (set (pc)
>         (if_then_else (eq (reg:DI 545)
>                 (const_int 0 [0]))
>             (label_ref 0)
>             (pc))) simd7.f90:12 -1
>      (int_list:REG_BR_PROB 450 (nil)))
>
> which results in following asm:
>
> $L35:
>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>     br $31,$L88     # 2403    jump    [length = 4]
>     .align 4
> ...
>
> Tracking the values with the debugger shows wrong calculation:
>
>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>    ...
>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>
> (gdb) si
> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t10 t12
> t10            0x7      7
> t12            0x7ffffffd       2147483645
>
> (gdb) si
> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t1
> t1             0xffffffff80000004       -2147483644
>
> (gdb) si
> 18          l = l .or. any (c /= 8 + 2 * i)
> (gdb) i r t1
> t1             0xffffffff00000008       -4294967288
>
> At this point, the calculation should zero-extend SImode value to full
> DImode, since compare operates on DImode values. The problematic insn
> is (insn 599), which is now a DImode assignment instead of
> zero-extend, due to:
>
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>    GET_MODE (target), temp, unsignedp);
>    }
>
> - convert_move (SUBREG_REG (target), temp, unsignedp);
> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
> +    && (GET_CODE (temp) == SUBREG)
> +    && (GET_MODE (target) == GET_MODE (temp))
> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> + else
> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>        }
>      else if (nontemporal && emit_storent_insn (target, temp))
>        ;
>
> When compiling this code, we have:
>
> lhs = _63
> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
> temp = (subreg:SI (reg:DI 540) 0)
>
> So, the code assumes that it is possible to copy (reg:DI 540) directly
> to (reg:DI 145). However, this is not the case, since we still have
> garbage in the top 32bits.
>
> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>      (nil))
>
> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.

Yeah, that makes sense.

Richard.

> Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:07 ` Richard Biener
@ 2014-08-27 10:32   ` Uros Bizjak
  2014-08-27 10:32     ` Richard Biener
  2014-09-01  8:48     ` Jakub Jelinek
  2014-08-28  7:50   ` Kugan
  1 sibling, 2 replies; 51+ messages in thread
From: Uros Bizjak @ 2014-08-27 10:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
>> on alphaev6-linux-gnu.
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
>> garbage in the top 32bits.
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>
> Yeah, that makes sense.

Something like the following (untested) patch, which also fixes the testcase, perhaps?

-- cut here--
Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 214445)
+++ cfgexpand.c (working copy)
@@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)

                if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
                    && (GET_CODE (temp) == SUBREG)
+                   && SUBREG_PROMOTED_VAR_P (temp)
                    && (GET_MODE (target) == GET_MODE (temp))
                   && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
                  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
-- cut here

Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:32   ` Uros Bizjak
@ 2014-08-27 10:32     ` Richard Biener
  2014-09-01  8:48     ` Jakub Jelinek
  1 sibling, 0 replies; 51+ messages in thread
From: Richard Biener @ 2014-08-27 10:32 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Kugan, Jakub Jelinek

On Wed, Aug 27, 2014 at 12:25 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>> * calls.c (precompute_arguments): Check
>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>> and set the promoted mode.
>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>
>>> This patch regresses:
>>>
>>> Running target unix
>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>
>>> on alphaev6-linux-gnu.
>>>
>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>> to (reg:DI 145). However, this is not the case, since we still have
>>> garbage in the top 32bits.
>>>
>>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>      (nil))
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Yeah, that makes sense.
>
> Something like following (untested) patch that also fixes the testcase perhaps?

Yes (though I'm not really familiar with the RTL side here, and the
comment before SUBREG_PROMOTED_VAR_P looks odd).

Richard.

> -- cut here--
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 214445)
> +++ cfgexpand.c (working copy)
> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>
>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>                     && (GET_CODE (temp) == SUBREG)
> +                   && SUBREG_PROMOTED_VAR_P (temp)
>                     && (GET_MODE (target) == GET_MODE (temp))
>                    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
>                   emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> -- cut here
>
> Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
  2014-08-27 10:07 ` Richard Biener
@ 2014-08-27 13:02 ` Kugan
  2014-08-28  3:46   ` Kugan
  1 sibling, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-27 13:02 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches; +Cc: Jakub Jelinek, Richard Biener

On 27/08/14 20:01, Uros Bizjak wrote:
> Hello!
> 
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> * calls.c (precompute_arguments): Check
>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>> (promoted_for_signed_and_unsigned_p): New function.
>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>> and set the promoted mode.
>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
> 
> This patch regresses:
> 
> Running target unix
> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
> 

[snip]

> When compiling this code, we have:
> 
> lhs = _63
> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
> temp = (subreg:SI (reg:DI 540) 0)
> 
> So, the code assumes that it is possible to copy (reg:DI 540) directly
> to (reg:DI 145). However, this is not the case, since we still have
> garbage in the top 32bits.
> 
> Reverting the part above fixes the runtime failure, since (insn 599) is now:
> 
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>      (nil))
> 
> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.

Sorry for the breakage. I am looking into this now and I can reproduce
it on qemu-alpha.

I have noticed the following VRP data, which is used in deciding this
erroneous removal. It seems suspicious to me.

_343: [2147483652, 2147483715]
_344: [8, 134]
_345: [8, 134]

_343 = ivtmp.179_52 + 2147483645;
_344 = _343 * 2;
_345 = (integer(kind=4)) _344;

Error comes from the third statement.

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
@ 2014-08-28  3:46   ` Kugan
  2014-08-28  6:44     ` Marc Glisse
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-28  3:46 UTC (permalink / raw)
  To: Uros Bizjak, gcc-patches, Richard Biener; +Cc: Jakub Jelinek


On 27/08/14 23:02, Kugan wrote:
> On 27/08/14 20:01, Uros Bizjak wrote:
>> Hello!
>>
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
> 
> [snip]
> 
>> When compiling this code, we have:
>>
>> lhs = _63
>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>> temp = (subreg:SI (reg:DI 540) 0)
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 145). However, this is not the case, since we still have
>> garbage in the top 32bits.
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
> 
> Sorry for the breakage. I am looking into this now and I can reproduce
> it on qemu-alpha.
> 
> I have noticed the following VRP data which is used in deciding this
> erroneous removal. It seems suspicious to me.
> 
> _343: [2147483652, 2147483715]
> _344: [8, 134]
> _345: [8, 134]
> 
> _343 = ivtmp.179_52 + 2147483645;
> _344 = _343 * 2;
> _345 = (integer(kind=4)) _344;
> 
> Error comes from the third statement.

In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
precision and the value_range is truncated. For the test-case provided
by Uros, it is

_344 = _343 * 2;
[...,0x100000008], precision = 384
[...,0x100000086], precision = 384

and it is converted to the following when it goes from wide_int to tree:
[8, 134]

How about doing something like this to fix it?

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d16fd8a..c0fb902 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2625,6 +2625,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
              <wi::extended_tree <WIDE_INT_MAX_PRECISION * 2> > vrp_int_cst;
 	  vrp_int sizem1 = wi::mask <vrp_int> (prec, false);
 	  vrp_int size = sizem1 + 1;
+	  vrp_int type_min = vrp_int_cst (TYPE_MIN_VALUE (expr_type));
+	  vrp_int type_max = vrp_int_cst (TYPE_MAX_VALUE (expr_type));

 	  /* Extend the values using the sign of the result to PREC2.
 	     From here on out, everthing is just signed math no matter
@@ -2688,7 +2690,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,

 	  /* diff = max - min.  */
 	  prod2 = prod3 - prod0;
-	  if (wi::geu_p (prod2, sizem1))
+	  if (wi::geu_p (prod2, sizem1)
+	      || wi::lts_p (prod0, type_min)
+	      || wi::gts_p (prod3, type_max))
 	    {
 	      /* the range covers all values.  */
 	      set_value_range_to_varying (vr);


If this looks reasonable, I will do proper testing and post the results
with the ChangeLog.

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  3:46   ` Kugan
@ 2014-08-28  6:44     ` Marc Glisse
  2014-08-28  7:29       ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Marc Glisse @ 2014-08-28  6:44 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Richard Biener, Jakub Jelinek

On Thu, 28 Aug 2014, Kugan wrote:

> On 27/08/14 23:02, Kugan wrote:
>> On 27/08/14 20:01, Uros Bizjak wrote:
>>> Hello!
>>>
>>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>> * calls.c (precompute_arguments): Check
>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>> and set the promoted mode.
>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>
>>> This patch regresses:
>>>
>>> Running target unix
>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>
>>
>> [snip]
>>
>>> When compiling this code, we have:
>>>
>>> lhs = _63
>>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>>> temp = (subreg:SI (reg:DI 540) 0)
>>>
>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>> to (reg:DI 154). However, this is not the case, since we still have
>>> garbage in the top 32bits.
>>>
>>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>      (nil))
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Sorry for the breakage. I am looking into this now and I can reproduce
>> it on qemu-alpha.
>>
>> I have noticed the following VRP data which is used in deciding this
>> erroneous removal. It seems suspicious to me.
>>
>> _343: [2147483652, 2147483715]
>> _344: [8, 134]
>> _345: [8, 134]
>>
>> _343 = ivtmp.179_52 + 2147483645;
>> _344 = _343 * 2;
>> _345 = (integer(kind=4)) _344;
>>
>> Error comes from the third statement.
>
> In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
> precision and the value_range is truncated. For the test-case provided
> by Uros, it is
>
> _344 = _343 * 2;
> [...,0x100000008], precision = 384
> [...,0x100000086], precision = 384
>
> and it is converted to following when it goes from wide_int to tree.
> [8, 134]

Why do you believe that is wrong? Assuming _344 has a 32-bit type with
wrapping overflow, this is just doing the wrapping modulo 2^32.

-- 
Marc Glisse


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  6:44     ` Marc Glisse
@ 2014-08-28  7:29       ` Kugan
  0 siblings, 0 replies; 51+ messages in thread
From: Kugan @ 2014-08-28  7:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Richard Biener, Jakub Jelinek



On 28/08/14 16:44, Marc Glisse wrote:
> On Thu, 28 Aug 2014, Kugan wrote:
> 
>> On 27/08/14 23:02, Kugan wrote:
>>> On 27/08/14 20:01, Uros Bizjak wrote:
>>>> Hello!
>>>>
>>>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>>
>>>>> * calls.c (precompute_arguments): Check
>>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>>> and set the promoted mode.
>>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function
>>>>> definition.
>>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>>
>>>> This patch regresses:
>>>>
>>>> Running target unix
>>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>>
>>>
>>> [snip]
>>>
>>>> When compiling this code, we have:
>>>>
>>>> lhs = _63
>>>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>>>> temp = (subreg:SI (reg:DI 540) 0)
>>>>
>>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>>> to (reg:DI 145). However, this is not the case, since we still have
>>>> garbage in the top 32bits.
>>>>
>>>> Reverting the part above fixes the runtime failure, since (insn 599)
>>>> is now:
>>>>
>>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>>      (nil))
>>>>
>>>> It looks to me that we have also to check the temp with
>>>> SUBREG_PROMOTED_*.
>>>
>>> Sorry for the breakage. I am looking into this now and I can reproduce
>>> it on qemu-alpha.
>>>
>>> I have noticed the following VRP data which is used in deciding this
>>> erroneous removal. It seems suspicious to me.
>>>
>>> _343: [2147483652, 2147483715]
>>> _344: [8, 134]
>>> _345: [8, 134]
>>>
>>> _343 = ivtmp.179_52 + 2147483645;
>>> _344 = _343 * 2;
>>> _345 = (integer(kind=4)) _344;
>>>
>>> Error comes from the third statement.
>>
>> In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of
>> precision and the value_range is truncated. For the test-case provided
>> by Uros, it is
>>
>> _344 = _343 * 2;
>> [...,0x100000008], precision = 384
>> [...,0x100000086], precision = 384
>>
>> and it is converted to following when it goes from wide_int to tree.
>> [8, 134]
> 
> Why do you believe that is wrong? Assuming _344 has a 32 bit type with
> wrapping overflow, this is just doing the wrapping modulo 2^32.
> 

Indeed. I missed the TYPE_OVERFLOW_WRAPS check earlier. Thanks for
pointing me to that.

Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:07 ` Richard Biener
  2014-08-27 10:32   ` Uros Bizjak
@ 2014-08-28  7:50   ` Kugan
  2014-08-28  8:57     ` Richard Biener
  1 sibling, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-28  7:50 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak; +Cc: gcc-patches, Jakub Jelinek



On 27/08/14 20:07, Richard Biener wrote:
> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> Hello!
>>
>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>> * calls.c (precompute_arguments): Check
>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>> (promoted_for_signed_and_unsigned_p): New function.
>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>> and set the promoted mode.
>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>
>> This patch regresses:
>>
>> Running target unix
>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>
>> on alphaev6-linux-gnu.
>>
>> The problem can be illustrated with attached testcase with a
>> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
>> missing SImode extension after DImode shift of SImode subregs for this
>> part:
>>
>> --cut here--
>>   # test.23_12 = PHI <0(37), 1(36)>
>>   _242 = ivtmp.181_73 + 2147483645;
>>   _240 = _242 * 2;
>>   _63 = (integer(kind=4)) _240;
>>   if (ubound.6_99 <= 2)
>>     goto <bb 39>;
>>   else
>>     goto <bb 40>;
>> ;;    succ:       39
>> ;;                40
>>
>> ;;   basic block 39, loop depth 1
>> ;;    pred:       38
>>   pretmp_337 = test.23_12 | l_76;
>>   goto <bb 45>;
>> ;;    succ:       45
>>
>> ;;   basic block 40, loop depth 1
>> ;;    pred:       38
>>   _11 = *c_208[0];
>>   if (_11 != _63)
>>     goto <bb 45>;
>>   else
>>     goto <bb 42>;
>> --cut here--
>>
>> this expands to:
>>
>> (code_label 592 591 593 35 "" [0 uses])
>>
>> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>>
>> ;; _63 = (integer(kind=4)) _240;
>>
>> (insn 594 593 595 (set (reg:SI 538)
>>         (const_int 1073741824 [0x40000000])) -1
>>      (nil))
>>
>> (insn 595 594 596 (set (reg:SI 539)
>>         (plus:SI (reg:SI 538)
>>             (const_int 1073741824 [0x40000000]))) -1
>>      (nil))
>>
>> (insn 596 595 597 (set (reg:SI 537)
>>         (plus:SI (reg:SI 539)
>>             (const_int -3 [0xfffffffffffffffd]))) -1
>>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>>         (nil)))
>>
>> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>>             (reg:SI 537))) -1
>>      (nil))
>>
>> (insn 598 597 599 (set (reg:DI 540)
>>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>>             (const_int 1 [0x1]))) -1
>>      (nil))
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (reg:DI 540)) -1
>>      (nil))
>>
>> ...
>>
>> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>>
>> ;; _11 = *c_208[0];
>>
>> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
>> A128]))) simd7.f90:12 -1
>>      (nil))
>>
>> ;; if (_11 != _63)
>>
>> (insn 612 611 613 40 (set (reg:DI 545)
>>         (eq:DI (reg:DI 120 [ D.1694 ])
>>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>>      (nil))
>>
>> (jump_insn 613 612 616 40 (set (pc)
>>         (if_then_else (eq (reg:DI 545)
>>                 (const_int 0 [0]))
>>             (label_ref 0)
>>             (pc))) simd7.f90:12 -1
>>      (int_list:REG_BR_PROB 450 (nil)))
>>
>> which results in following asm:
>>
>> $L35:
>>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>>     br $31,$L88     # 2403    jump    [length = 4]
>>     .align 4
>> ...
>>
>> Tracking the values with the debugger shows wrong calculation:
>>
>>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>>    ...
>>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>>
>> (gdb) si
>> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
>> (gdb) i r t10 t12
>> t10            0x7      7
>> t12            0x7ffffffd       2147483645
>>
>> (gdb) si
>> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
>> (gdb) i r t1
>> t1             0xffffffff80000004       -2147483644
>>
>> (gdb) si
>> 18          l = l .or. any (c /= 8 + 2 * i)
>> (gdb) i r t1
>> t1             0xffffffff00000008       -4294967288
>>
>> At this point, the calculation should zero-extend SImode value to full
>> DImode, since compare operates on DImode values. The problematic insn
>> is (insn 599), which is now a DImode assignment instead of
>> zero-extend, due to:
>>
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>>    GET_MODE (target), temp, unsignedp);
>>    }
>>
>> - convert_move (SUBREG_REG (target), temp, unsignedp);
>> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>> +    && (GET_CODE (temp) == SUBREG)
>> +    && (GET_MODE (target) == GET_MODE (temp))
>> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
>> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
>> + else
>> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>>        }
>>      else if (nontemporal && emit_storent_insn (target, temp))
>>        ;
>>
>> When compiling this code, we have:
>>
>> lhs = _63
>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>> temp = (subreg:SI (reg:DI 540) 0)
>>
>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>> to (reg:DI 154). However, this is not the case, since we still have
>> garbage in the top 32bits.
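
[The failure mode traced above can be sketched in plain C -- a hedged
illustration, not compiler code; the constant is the register value from the
gdb trace, and `copy_di`/`zext_subreg_si` are illustrative names:]

```c
#include <stdint.h>

/* Hedged sketch: after the DImode shift, the 64-bit register holds the
   correct low 32 bits but garbage above them.  Copying the full
   register propagates the garbage; an explicit zero-extension of the
   SImode subreg does not.  */
static int64_t
copy_di (int64_t reg)
{
  return reg;                        /* plain DImode move: keeps garbage */
}

static int64_t
zext_subreg_si (int64_t reg)
{
  return (int64_t) (uint32_t) reg;   /* zero_extend:DI of the SImode subreg */
}
```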
>>
>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>
>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>      (nil))
>>
>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
> 
> Yeah, that makes sense.
> 

Thanks Richard for your comments.

I added this part of the code (in cfgexpand.c) to handle binary/unary/...
gimple operations and used the LHS value range to infer the assigned
value range. I will revert this part of the code, as it is wrong.

I don't think checking promoted_mode for temp will be necessary here, as
convert_move will handle it correctly if promoted_mode is set for temp.

Thus, I will reimplement setting promoted_mode on temp (in
expand_expr_real_2) based on the content of the gimple statement's RHS,
i.e. by looking at the RHS operands and their value ranges and computing
the resulting value range. Does this sound OK to you?


Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  7:50   ` Kugan
@ 2014-08-28  8:57     ` Richard Biener
  2014-09-04  3:41       ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-08-28  8:57 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Thu, Aug 28, 2014 at 9:50 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 27/08/14 20:07, Richard Biener wrote:
>> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> Hello!
>>>
>>>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>> * calls.c (precompute_arguments): Check
>>>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>>>> (promoted_for_signed_and_unsigned_p): New function.
>>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>>>> and set the promoted mode.
>>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>>>
>>> This patch regresses:
>>>
>>> Running target unix
>>> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
>>> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>>>
>>> on alphaev6-linux-gnu.
>>>
>>> The problem can be illustrated with attached testcase with a
>>> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
>>> missing SImode extension after DImode shift of SImode subregs for this
>>> part:
>>>
>>> --cut here--
>>>   # test.23_12 = PHI <0(37), 1(36)>
>>>   _242 = ivtmp.181_73 + 2147483645;
>>>   _240 = _242 * 2;
>>>   _63 = (integer(kind=4)) _240;
>>>   if (ubound.6_99 <= 2)
>>>     goto <bb 39>;
>>>   else
>>>     goto <bb 40>;
>>> ;;    succ:       39
>>> ;;                40
>>>
>>> ;;   basic block 39, loop depth 1
>>> ;;    pred:       38
>>>   pretmp_337 = test.23_12 | l_76;
>>>   goto <bb 45>;
>>> ;;    succ:       45
>>>
>>> ;;   basic block 40, loop depth 1
>>> ;;    pred:       38
>>>   _11 = *c_208[0];
>>>   if (_11 != _63)
>>>     goto <bb 45>;
>>>   else
>>>     goto <bb 42>;
>>> --cut here--
>>>
>>> this expands to:
>>>
>>> (code_label 592 591 593 35 "" [0 uses])
>>>
>>> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>>>
>>> ;; _63 = (integer(kind=4)) _240;
>>>
>>> (insn 594 593 595 (set (reg:SI 538)
>>>         (const_int 1073741824 [0x40000000])) -1
>>>      (nil))
>>>
>>> (insn 595 594 596 (set (reg:SI 539)
>>>         (plus:SI (reg:SI 538)
>>>             (const_int 1073741824 [0x40000000]))) -1
>>>      (nil))
>>>
>>> (insn 596 595 597 (set (reg:SI 537)
>>>         (plus:SI (reg:SI 539)
>>>             (const_int -3 [0xfffffffffffffffd]))) -1
>>>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>>>         (nil)))
>>>
>>> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>>>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>>>             (reg:SI 537))) -1
>>>      (nil))
>>>
>>> (insn 598 597 599 (set (reg:DI 540)
>>>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>>>             (const_int 1 [0x1]))) -1
>>>      (nil))
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (reg:DI 540)) -1
>>>      (nil))
>>>
>>> ...
>>>
>>> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>>>
>>> ;; _11 = *c_208[0];
>>>
>>> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>>>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
>>> A128]))) simd7.f90:12 -1
>>>      (nil))
>>>
>>> ;; if (_11 != _63)
>>>
>>> (insn 612 611 613 40 (set (reg:DI 545)
>>>         (eq:DI (reg:DI 120 [ D.1694 ])
>>>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>>>      (nil))
>>>
>>> (jump_insn 613 612 616 40 (set (pc)
>>>         (if_then_else (eq (reg:DI 545)
>>>                 (const_int 0 [0]))
>>>             (label_ref 0)
>>>             (pc))) simd7.f90:12 -1
>>>      (int_list:REG_BR_PROB 450 (nil)))
>>>
>>> which results in following asm:
>>>
>>> $L35:
>>>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>>>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>>>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>>>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>>>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>>>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>>>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>>>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>>>     br $31,$L88     # 2403    jump    [length = 4]
>>>     .align 4
>>> ...
>>>
>>> Tracking the values with the debugger shows wrong calculation:
>>>
>>>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>>>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>>>    ...
>>>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>>>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>>>
>>> (gdb) si
>>> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
>>> (gdb) i r t10 t12
>>> t10            0x7      7
>>> t12            0x7ffffffd       2147483645
>>>
>>> (gdb) si
>>> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
>>> (gdb) i r t1
>>> t1             0xffffffff80000004       -2147483644
>>>
>>> (gdb) si
>>> 18          l = l .or. any (c /= 8 + 2 * i)
>>> (gdb) i r t1
>>> t1             0xffffffff00000008       -4294967288
>>>
>>> At this point, the calculation should zero-extend SImode value to full
>>> DImode, since compare operates on DImode values. The problematic insn
>>> is (insn 599), which is now a DImode assignment instead of
>>> zero-extend, due to:
>>>
>>> --- a/gcc/cfgexpand.c
>>> +++ b/gcc/cfgexpand.c
>>> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>>>    GET_MODE (target), temp, unsignedp);
>>>    }
>>>
>>> - convert_move (SUBREG_REG (target), temp, unsignedp);
>>> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>>> +    && (GET_CODE (temp) == SUBREG)
>>> +    && (GET_MODE (target) == GET_MODE (temp))
>>> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
>>> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
>>> + else
>>> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>>>        }
>>>      else if (nontemporal && emit_storent_insn (target, temp))
>>>        ;
>>>
>>> When compiling this code, we have:
>>>
>>> lhs = _63
>>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
>>> temp = (subreg:SI (reg:DI 540) 0)
>>>
>>> So, the code assumes that it is possible to copy (reg:DI 540) directly
>>> to (reg:DI 154). However, this is not the case, since we still have
>>> garbage in the top 32bits.
>>>
>>> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>>>
>>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>>>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>>>      (nil))
>>>
>>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.
>>
>> Yeah, that makes sense.
>>
>
> Thanks Richard for your comments.
>
> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
> gimple operations and used the LHS value range to infer the assigned
> value range. I will revert this part of the code as this is wrong.
>
> I dont think checking promoted_mode for temp will be necessary here as
> convert_move will handle it correctly if promoted_mode is set for temp.
>
> Thus, I will reimplement setting promoted_mode to temp (in
> expand_expr_real_2) based on the gimple statement content on RHS. i.e.
> by looking at the RHS operands and its value ranges and by calculating
> the resulting value range. Does this sound OK to you.

No, this sounds backward again and won't work because those operands
again could be just truncated - thus you can't rely on their value-range.

What you would need is VRP computing value-ranges in the promoted
mode from the start (and it doesn't do that).
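
Richard's objection can be sketched in plain C (a hedged illustration with
made-up helper names, not VRP code): a range recorded for the truncated
32-bit view of an operand says nothing about the bits the promoted 64-bit
register actually carries.

```c
#include <stdint.h>

/* Hedged sketch: suppose VRP recorded the range [8, 134] for the 32-bit
   view of an operand.  That range holds for the SImode subreg, yet
   tells us nothing about the full promoted DImode register.  */
static int
narrow_in_range (uint64_t reg)   /* range check on the SImode view */
{
  uint32_t v = (uint32_t) reg;
  return v >= 8 && v <= 134;
}

static int
wide_in_range (uint64_t reg)     /* same range applied to the DImode reg */
{
  return reg >= 8 && reg <= 134;
}
```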

Richard.

> Thanks,
> Kugan
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-27 10:32   ` Uros Bizjak
  2014-08-27 10:32     ` Richard Biener
@ 2014-09-01  8:48     ` Jakub Jelinek
  2014-09-01  8:54       ` Uros Bizjak
  1 sibling, 1 reply; 51+ messages in thread
From: Jakub Jelinek @ 2014-09-01  8:48 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Richard Biener, gcc-patches, Kugan

On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
> Something like following (untested) patch that also fixes the testcase perhaps?
> 
> -- cut here--
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 214445)
> +++ cfgexpand.c (working copy)
> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
> 
>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>                     && (GET_CODE (temp) == SUBREG)
> +                   && SUBREG_PROMOTED_VAR_P (temp)
>                     && (GET_MODE (target) == GET_MODE (temp))
>                     && (GET_MODE (SUBREG_REG (target)) == GET_MODE
> (SUBREG_REG (temp))))

Looks like a wrong order of the predicates in any case: first you should
check that it is a SUBREG, then SUBREG_PROMOTED_VAR_P, and only then
SUBREG_PROMOTED_GET.  Also, the extra ()s around single-line conditions
are unnecessary.
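
[The ordering rule can be sketched like this -- a hedged model with
illustrative names, not the real rtl.h API: an accessor that is only
meaningful for a particular shape must sit after the structural check,
relying on && short-circuit evaluation.]

```c
#include <stddef.h>

/* Hedged sketch (illustrative names, not GCC's RTL accessors).  */
enum sketch_code { NOT_SUBREG, IS_SUBREG };

struct rtx_sketch
{
  enum sketch_code code;  /* GET_CODE analogue */
  int promoted_var_p;     /* meaningful only when code == IS_SUBREG */
  int promoted_get;       /* meaningful only when promoted_var_p != 0 */
};

static int
promoted_both_ways_p (const struct rtx_sketch *x)
{
  return x != NULL
         && x->code == IS_SUBREG   /* structural check first */
         && x->promoted_var_p      /* then: is the promotion info valid? */
         && x->promoted_get == 3;  /* only now read the finer flag */
}
```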

>                   emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> -- cut here
> 
> Uros.

	Jakub


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-01  8:48     ` Jakub Jelinek
@ 2014-09-01  8:54       ` Uros Bizjak
  0 siblings, 0 replies; 51+ messages in thread
From: Uros Bizjak @ 2014-09-01  8:54 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches, Kugan

On Mon, Sep 1, 2014 at 10:47 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote:
>> Something like following (untested) patch that also fixes the testcase perhaps?
>>
>> -- cut here--
>> Index: cfgexpand.c
>> ===================================================================
>> --- cfgexpand.c (revision 214445)
>> +++ cfgexpand.c (working copy)
>> @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt)
>>
>>                 if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
>>                     && (GET_CODE (temp) == SUBREG)
>> +                   && SUBREG_PROMOTED_VAR_P (temp)
>>                     && (GET_MODE (target) == GET_MODE (temp))
>>                     && (GET_MODE (SUBREG_REG (target)) == GET_MODE
>> (SUBREG_REG (temp))))
>
> Looks like a wrong order of the predicates in any case, first you should
> check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then
> SUBREG_PROMOTED_GET.  Also, the extra ()s around single line conditions
> are unnecessary.

This comment applies to the original code, not the patched line, I guess.

Uros.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-28  8:57     ` Richard Biener
@ 2014-09-04  3:41       ` Kugan
  2014-09-04 13:00         ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-09-04  3:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2196 bytes --]

>> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
>> gimple operations and used the LHS value range to infer the assigned
>> value range. I will revert this part of the code as this is wrong.
>>
>> I dont think checking promoted_mode for temp will be necessary here as
>> convert_move will handle it correctly if promoted_mode is set for temp.
>>
>> Thus, I will reimplement setting promoted_mode to temp (in
>> expand_expr_real_2) based on the gimple statement content on RHS. i.e.
>> by looking at the RHS operands and its value ranges and by calculating
>> the resulting value range. Does this sound OK to you.
> 
> No, this sounds backward again and won't work because those operands
> again could be just truncated - thus you can't rely on their value-range.
> 
> What you would need is VRP computing value-ranges in the promoted
> mode from the start (and it doesn't do that).


Hi Richard,

Here is an attempt to do the value-range computation in the promoted
mode's type when the range overflows. Bootstrapped on x86_64.

Based on your feedback, I will do more testing on this.

Thanks for your time,
Kugan

gcc/ChangeLog:

2014-09-04  Kugan Vivekanandarajah <kuganv@linaro.org>

	* tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to
	the type.
	(evaluate_stmt): Likewise.
	* tree-ssanames.c (set_range_info): Adjust if the precision of stored
	value range is different.
	* tree-vrp.c (normalize_int_cst_precision): New function.
	(set_value_range): Add assert to check precision.
	(set_and_canonicalize_value_range): Call normalize_int_cst_precision
	on min and max.
	(promoted_type): New function.
	(promote_unary_vr): Likewise.
	(promote_binary_vr): Likewise.
	(extract_range_from_binary_expr_1): Adjust type to match value range.
	Store value ranges in promoted type if they overflow.
	(extract_range_from_unary_expr_1): Likewise.
	(adjust_range_with_scev): Call normalize_int_cst_precision
	on min and max.
	(vrp_visit_assignment_or_call): Likewise.
	(simplify_bit_ops_using_ranges): Adjust the value range precision.
	(test_for_singularity): Likewise.
	(simplify_stmt_for_jump_threading): Likewise.
	(extract_range_from_assert): Likewise.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 13120 bytes --]

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index a90f708..1733073 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -916,7 +916,11 @@ ccp_finalize (void)
 	  unsigned int precision = TYPE_PRECISION (TREE_TYPE (val->value));
 	  wide_int nonzero_bits = wide_int::from (val->mask, precision,
 						  UNSIGNED) | val->value;
-	  nonzero_bits &= get_nonzero_bits (name);
+	  wide_int nonzero_bits_name = get_nonzero_bits (name);
+	  if (precision != nonzero_bits_name.get_precision ())
+	    nonzero_bits = wi::shwi (*nonzero_bits.get_val (),
+				     nonzero_bits_name.get_precision ());
+	  nonzero_bits &= nonzero_bits_name;
 	  set_nonzero_bits (name, nonzero_bits);
 	}
     }
@@ -1852,6 +1856,8 @@ evaluate_stmt (gimple stmt)
     {
       tree lhs = gimple_get_lhs (stmt);
       wide_int nonzero_bits = get_nonzero_bits (lhs);
+      if (TYPE_PRECISION (TREE_TYPE (lhs)) != nonzero_bits.get_precision ())
+	  nonzero_bits = wide_int_to_tree (TREE_TYPE (lhs), nonzero_bits);
       if (nonzero_bits != -1)
 	{
 	  if (!is_constant)
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 3af80a0..459c669 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -192,7 +192,7 @@ set_range_info (tree name, enum value_range_type range_type,
   gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
   gcc_assert (range_type == VR_RANGE || range_type == VR_ANTI_RANGE);
   range_info_def *ri = SSA_NAME_RANGE_INFO (name);
-  unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
+  unsigned int precision = min.get_precision ();
 
   /* Allocate if not available.  */
   if (ri == NULL)
@@ -204,6 +204,15 @@ set_range_info (tree name, enum value_range_type range_type,
       SSA_NAME_RANGE_INFO (name) = ri;
       ri->set_nonzero_bits (wi::shwi (-1, precision));
     }
+  else if (ri->get_min ().get_precision () != precision)
+    {
+      size_t size = (sizeof (range_info_def)
+		     + trailing_wide_ints <3>::extra_size (precision));
+      ri = static_cast<range_info_def *> (ggc_realloc (ri, size));
+      ri->ints.set_precision (precision);
+      SSA_NAME_RANGE_INFO (name) = ri;
+      ri->set_nonzero_bits (wi::shwi (-1, precision));
+    }
 
   /* Record the range type.  */
   if (SSA_NAME_RANGE_TYPE (name) != range_type)
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d16fd8a..772676a 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "tree-ssa-threadedge.h"
 #include "wide-int.h"
+#include "langhooks.h"
 
 
 
@@ -424,6 +425,23 @@ set_value_range_to_varying (value_range_t *vr)
     bitmap_clear (vr->equiv);
 }
 
+/* Normalize min and max to promoted_type if their precision differs.  */
+
+static void
+normalize_int_cst_precision (tree *min, tree *max)
+{
+  if (TREE_CODE (*min) != INTEGER_CST
+	      || TREE_CODE (*max) != INTEGER_CST)
+    return;
+  if (TYPE_PRECISION (TREE_TYPE (*min)) != TYPE_PRECISION (TREE_TYPE (*max)))
+    {
+      tree type = TREE_TYPE (*min);
+      if (TYPE_PRECISION (TREE_TYPE (*min)) < TYPE_PRECISION (TREE_TYPE (*max)))
+	type = TREE_TYPE (*max);
+      *min = wide_int_to_tree (type, *min);
+      *max = wide_int_to_tree (type, *max);
+    }
+}
 
 /* Set value range VR to {T, MIN, MAX, EQUIV}.  */
 
@@ -438,6 +456,8 @@ set_value_range (value_range_t *vr, enum value_range_type t, tree min,
       int cmp;
 
       gcc_assert (min && max);
+      gcc_assert (TYPE_PRECISION (TREE_TYPE (min))
+		  == TYPE_PRECISION (TREE_TYPE (max)));
 
       gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
 		  && (!TREE_OVERFLOW_P (max) || is_overflow_infinity (max)));
@@ -597,6 +617,8 @@ set_and_canonicalize_value_range (value_range_t *vr, enum value_range_type t,
       return;
     }
 
+  if (min != NULL_TREE && max != NULL_TREE)
+    normalize_int_cst_precision (&min, &max);
   set_value_range (vr, t, min, max, equiv);
 }
 
@@ -951,6 +973,66 @@ usable_range_p (value_range_t *vr, bool *strict_overflow_p)
   return true;
 }
 
+/* Return the promoted type as defined by PROMOTE_MODE of the target.  */
+
+static tree
+promoted_type (tree type)
+{
+#ifdef PROMOTE_MODE
+  tree new_type;
+  if (!POINTER_TYPE_P (type)
+      && (TREE_CODE (type) != ENUMERAL_TYPE)
+      && INTEGRAL_TYPE_P (type))
+    {
+      enum machine_mode mode = TYPE_MODE (type);
+      int uns = TYPE_SIGN (type);
+      PROMOTE_MODE (mode, uns, type);
+      uns = TYPE_SIGN (type);
+      new_type = lang_hooks.types.type_for_mode (mode, uns);
+      if (TYPE_PRECISION (new_type) > TYPE_PRECISION (type))
+	type = new_type;
+    }
+#endif
+  return type;
+}
+
+/* Promote VR0 to promoted_type if its precision differs, and
+   return the new type.  */
+
+static tree
+promote_unary_vr (tree type, value_range_t *vr0)
+{
+  tree expr_type = type;
+
+  if (!range_int_cst_p (vr0))
+    return expr_type;
+  if ((TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (vr0->min)))
+      || (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (vr0->max))))
+    {
+      expr_type = promoted_type (type);
+      vr0->min = wide_int_to_tree (expr_type, vr0->min);
+      vr0->max = wide_int_to_tree (expr_type, vr0->max);
+    }
+  return expr_type;
+}
+
+/* Promote VR0 and VR1 to promoted_type if their precisions differ, and
+   return the new type.  */
+
+static tree
+promote_binary_vr (tree type, value_range_t *vr0, value_range_t *vr1)
+{
+  tree expr_type0 = promote_unary_vr (type, vr0);
+  tree expr_type1 = promote_unary_vr (type, vr1);
+
+  if (TYPE_PRECISION (expr_type0) == TYPE_PRECISION (expr_type1))
+    return expr_type0;
+  if (TYPE_PRECISION (expr_type0) < TYPE_PRECISION (expr_type1))
+    return promote_unary_vr (expr_type1, vr0);
+  else
+    return promote_unary_vr (expr_type0, vr1);
+}
+
 
 /* Return true if the result of assignment STMT is know to be non-negative.
    If the return value is based on the assumption that signed overflow is
@@ -1741,6 +1823,7 @@ extract_range_from_assert (value_range_t *vr_p, tree expr)
 		TREE_NO_WARNING (max) = 1;
 	    }
 
+	  normalize_int_cst_precision (&min, &max);
 	  set_value_range (vr_p, VR_RANGE, min, max, vr_p->equiv);
 	}
     }
@@ -1781,6 +1864,7 @@ extract_range_from_assert (value_range_t *vr_p, tree expr)
 		TREE_NO_WARNING (min) = 1;
 	    }
 
+	  normalize_int_cst_precision (&min, &max);
 	  set_value_range (vr_p, VR_RANGE, min, max, vr_p->equiv);
 	}
     }
@@ -2376,6 +2460,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
      range and see what we end up with.  */
   if (code == PLUS_EXPR || code == MINUS_EXPR)
     {
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       /* If we have a PLUS_EXPR with two VR_RANGE integer constant
          ranges compute the precise range for such case if possible.  */
       if (range_int_cst_p (&vr0)
@@ -2562,6 +2649,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   else if (code == MIN_EXPR
 	   || code == MAX_EXPR)
     {
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       if (vr0.type == VR_RANGE
 	  && !symbolic_range_p (&vr0))
 	{
@@ -2625,6 +2715,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
              <wi::extended_tree <WIDE_INT_MAX_PRECISION * 2> > vrp_int_cst;
 	  vrp_int sizem1 = wi::mask <vrp_int> (prec, false);
 	  vrp_int size = sizem1 + 1;
+	  vrp_int type_min = vrp_int_cst (TYPE_MIN_VALUE (expr_type));
+	  vrp_int type_max = vrp_int_cst (TYPE_MAX_VALUE (expr_type));
 
 	  /* Extend the values using the sign of the result to PREC2.
 	     From here on out, everthing is just signed math no matter
@@ -2697,8 +2789,17 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
 
 	  /* The following should handle the wrapping and selecting
 	     VR_ANTI_RANGE for us.  */
-	  min = wide_int_to_tree (expr_type, prod0);
-	  max = wide_int_to_tree (expr_type, prod3);
+	  if (wi::lts_p (prod0, type_min)
+	      || wi::gts_p (prod3, type_max))
+	    {
+	      min = wide_int_to_tree (promoted_type (expr_type), prod0);
+	      max = wide_int_to_tree (promoted_type (expr_type), prod3);
+	    }
+	  else
+	    {
+	      min = wide_int_to_tree (expr_type, prod0);
+	      max = wide_int_to_tree (expr_type, prod3);
+	    }
 	  set_and_canonicalize_value_range (vr, VR_RANGE, min, max, NULL);
 	  return;
 	}
@@ -2724,6 +2825,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
+      /* If value range is in promoted type, promote the type as well.  */
+      expr_type = promote_unary_vr (expr_type, &vr0);
       /* If we have a RSHIFT_EXPR with any shift values outside [0..prec-1],
 	 then drop to VR_VARYING.  Outside of this range we get undefined
 	 behavior from the shift operation.  We cannot even trust
@@ -2946,6 +3049,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       wide_int may_be_nonzero0, may_be_nonzero1;
       wide_int must_be_nonzero0, must_be_nonzero1;
 
+      /* If any of the value range is in promoted type, promote them all
+	 including the type.  */
+      expr_type = promote_binary_vr (expr_type, &vr0, &vr1);
       int_cst_range0 = zero_nonzero_bits_from_vr (expr_type, &vr0,
 						  &may_be_nonzero0,
 						  &must_be_nonzero0);
@@ -3224,14 +3330,22 @@ extract_range_from_unary_expr_1 (value_range_t *vr,
 	  tree new_min, new_max;
 	  if (is_overflow_infinity (vr0.min))
 	    new_min = negative_overflow_infinity (outer_type);
-	  else
+	  else if (int_fits_type_p (vr0.min, outer_type))
 	    new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
 				      0, false);
+	  else
+	    new_min = force_fit_type (promoted_type (outer_type),
+				      wi::to_widest (vr0.min),
+				      0, false);
 	  if (is_overflow_infinity (vr0.max))
 	    new_max = positive_overflow_infinity (outer_type);
-	  else
+	  else if (int_fits_type_p (vr0.min, outer_type))
 	    new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
 				      0, false);
+	  else
+	    new_max = force_fit_type (promoted_type (outer_type),
+				      wi::to_widest (vr0.max),
+				      0, false);
 	  set_and_canonicalize_value_range (vr, vr0.type,
 					    new_min, new_max, NULL);
 	  return;
@@ -3940,6 +4054,8 @@ adjust_range_with_scev (value_range_t *vr, struct loop *loop,
 	  && is_positive_overflow_infinity (max)))
     return;
 
+  if (min != NULL_TREE && max != NULL_TREE)
+    normalize_int_cst_precision (&min, &max);
   set_value_range (vr, VR_RANGE, min, max, vr->equiv);
 }
 
@@ -6668,6 +6784,8 @@ vrp_visit_assignment_or_call (gimple stmt, tree *output_p)
       else
 	extract_range_from_assignment (&new_vr, stmt);
 
+      if (range_int_cst_p (&new_vr))
+	normalize_int_cst_precision (&new_vr.min, &new_vr.max);
       if (update_value_range (lhs, &new_vr))
 	{
 	  *output_p = lhs;
@@ -8399,6 +8517,8 @@ vrp_visit_phi_node (gimple phi)
   /* If the new range is different than the previous value, keep
      iterating.  */
 update_range:
+  if (range_int_cst_p (&vr_result))
+    normalize_int_cst_precision (&vr_result.min, &vr_result.max);
   if (update_value_range (lhs, &vr_result))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
@@ -8655,9 +8775,19 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
   if (!zero_nonzero_bits_from_vr (TREE_TYPE (op0), &vr0, &may_be_nonzero0,
 				  &must_be_nonzero0))
     return false;
-  if (!zero_nonzero_bits_from_vr (TREE_TYPE (op1), &vr1, &may_be_nonzero1,
+  if (!zero_nonzero_bits_from_vr (TREE_TYPE (op0), &vr1, &may_be_nonzero1,
 				  &must_be_nonzero1))
     return false;
+  if (TYPE_PRECISION (TREE_TYPE (op0)) != may_be_nonzero0.get_precision ())
+    {
+      may_be_nonzero0 = wide_int_to_tree (TREE_TYPE (op0), may_be_nonzero0);
+      must_be_nonzero0 = wide_int_to_tree (TREE_TYPE (op0), must_be_nonzero0);
+    }
+  if (TYPE_PRECISION (TREE_TYPE (op0)) != may_be_nonzero1.get_precision ())
+    {
+      may_be_nonzero1 = wide_int_to_tree (TREE_TYPE (op1), may_be_nonzero1);
+      must_be_nonzero1 = wide_int_to_tree (TREE_TYPE (op1), must_be_nonzero1);
+    }
 
   switch (gimple_assign_rhs_code (stmt))
     {
@@ -8752,9 +8882,9 @@ test_for_singularity (enum tree_code cond_code, tree op0,
   if (min && max)
     {
       if (compare_values (vr->min, min) == 1)
-	min = vr->min;
+	min = wide_int_to_tree (TREE_TYPE (op0), vr->min);
       if (compare_values (vr->max, max) == -1)
-	max = vr->max;
+	max = wide_int_to_tree (TREE_TYPE (op0), vr->max);
 
       /* If the new min/max values have converged to a single value,
 	 then there is only one value which can satisfy the condition,
@@ -9474,7 +9604,7 @@ simplify_stmt_for_jump_threading (gimple stmt, gimple within_stmt)
 	{
 	  extract_range_from_assignment (&new_vr, stmt);
 	  if (range_int_cst_singleton_p (&new_vr))
-	    return new_vr.min;
+	    return wide_int_to_tree (TREE_TYPE (lhs), new_vr.min);
 	}
     }
 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-04  3:41       ` Kugan
@ 2014-09-04 13:00         ` Richard Biener
  2014-09-05  1:33           ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-09-04 13:00 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Thu, Sep 4, 2014 at 5:41 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>> I added this part of the code (in cfgexpand.c) to handle binary/unary/..
>>> gimple operations and used the LHS value range to infer the assigned
>>> value range. I will revert this part of the code as this is wrong.
>>>
>>> I dont think checking promoted_mode for temp will be necessary here as
>>> convert_move will handle it correctly if promoted_mode is set for temp.
>>>
>>> Thus, I will reimplement setting promoted_mode to temp (in
>>> expand_expr_real_2) based on the gimple statement content on RHS. i.e.
>>> by looking at the RHS operands and its value ranges and by calculating
>>> the resulting value range. Does this sound OK to you.
>>
>> No, this sounds backward again and won't work because those operands
>> again could be just truncated - thus you can't rely on their value-range.
>>
>> What you would need is VRP computing value-ranges in the promoted
>> mode from the start (and it doesn't do that).
>
>
> Hi Richard,
>
> Here is an attempt to do the value range computation in promoted_mode's
> type when it is overflowing. Bootstrapped on x86-64.

Err - I think you misunderstood this as a suggestion to do this ;)
value-ranges should be computed according to the type not according
to the (promoted) mode.  Otherwise we will miss optimization
opportunities.

Richard.

> Based on your feedback, I will do more testing on this.
>
> Thanks for your time,
> Kugan
>
> gcc/ChangeLog:
>
> 2014-09-04  Kugan Vivekanandarajah <kuganv@linaro.org>
>
>         * tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to
>         the type.
>         (evaluate_stmt): Likewise.
>         * tree-ssanames.c (set_range_info): Adjust if the precision of stored
>         value range is different.
>         * tree-vrp.c (normalize_int_cst_precision): New function.
>         (set_value_range): Add assert to check precision.
>         (set_and_canonicalize_value_range): Call normalize_int_cst_precision
>         on min and max.
>         (promoted_type): New function.
>         (promote_unary_vr): Likewise.
>         (promote_binary_vr): Likewise.
>         (extract_range_from_binary_expr_1): Adjust type to match value range.
>         Store value ranges in promoted type if they overflow.
>         (extract_range_from_unary_expr_1): Likewise.
>         (adjust_range_with_scev): Call normalize_int_cst_precision
>         on min and max.
>         (vrp_visit_assignment_or_call): Likewise.
>         (simplify_bit_ops_using_ranges): Adjust the value range precision.
>         (test_for_singularity): Likewise.
>         (simplify_stmt_for_jump_threading): Likewise.
>         (extract_range_from_assert): Likewise.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-04 13:00         ` Richard Biener
@ 2014-09-05  1:33           ` Kugan
  2014-09-05  9:51             ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-09-05  1:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

>> Here is an attempt to do the value range computation in promoted_mode's
>> type when it is overflowing. Bootstrapped on x86-64.
> 
> Err - I think you misunderstood this as a suggestion to do this ;)
> value-ranges should be computed according to the type not according
> to the (promoted) mode.  Otherwise we will miss optimization
> opportunities.

Oops, sorry, I had my doubts about making trees aware of back-end stuff.

Coming back to the original problem, what would be the best approach to
handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR
and LSHIFT_EXPR are truncating values this way. All other operations set
it to type_min, type_max. Can we rely on this?

Is it just a coincidence that this error does not show up when
PROMOTED_MODE <= word_mode (and the mode precision of the register we
SUBREG from is <= word_mode precision)? Can we rely on this?

Is there any way we can fix this?

Thanks again,
Kugan




* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-05  1:33           ` Kugan
@ 2014-09-05  9:51             ` Richard Biener
  2014-09-07  9:51               ` Kugan
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
  0 siblings, 2 replies; 51+ messages in thread
From: Richard Biener @ 2014-09-05  9:51 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Fri, Sep 5, 2014 at 3:33 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>> Here is an attempt to do the value range computation in promoted_mode's
>>> type when it is overflowing. Bootstrapped on x86-64.
>>
>> Err - I think you misunderstood this as a suggestion to do this ;)
>> value-ranges should be computed according to the type not according
>> to the (promoted) mode.  Otherwise we will miss optimization
>> opportunities.
>
> Oops, sorry, I had my doubts about making trees aware of back-end stuff.
>
> Coming back to the original problem, what would be the best approach to
> handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR
> and LSHIFT_EXPR are truncating values this way. All other operations set
> it to type_min, type_max. Can we rely on this?

No, that doesn't sound like a good thing to do.

> Is it just a coincidence that this error does not show up when
> PROMOTED_MODE <= word_mode (and the mode precision of the register we
> SUBREG from is <= word_mode precision)? Can we rely on this?

Sounds like a coincidence to me.

> Is there any way we can fix this?

Well - the best way would be to expose the target specifics to GIMPLE
at some point in the optimization pipeline.  My guess would be that it's
appropriate after loop optimizations (but maybe before induction variable
optimization).

That is, have a pass that applies register promotion to all SSA names
in the function, inserting appropriate truncations and extensions.  That
way you'd never see (set (subreg...) on RTL.  The VRP and DOM
passes running after that pass would then be able to aggressively
optimize redundant truncations and extensions.

Effects on debug information are to be considered.  You can change
the type of SSA names in-place but you don't want to do that for
user DECLs (and we can't have the SSA name type and its DECL
type differ - and not sure if we might want to lift that restriction).

Richard.

> Thanks again,
> Kugan
>
>
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-05  9:51             ` Richard Biener
@ 2014-09-07  9:51               ` Kugan
  2014-09-08  9:48                 ` Richard Biener
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
  1 sibling, 1 reply; 51+ messages in thread
From: Kugan @ 2014-09-07  9:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On 05/09/14 19:50, Richard Biener wrote:

> Well - the best way would be to expose the target specifics to GIMPLE
> at some point in the optimization pipeline.  My guess would be that it's
> appropriate after loop optimizations (but maybe before induction variable
> optimization).
> 
> That is, have a pass that applies register promotion to all SSA names
> in the function, inserting appropriate truncations and extensions.  That
> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
> passes running after that pass would then be able to aggressively
> optimize redundant truncations and extensions.
> 
> Effects on debug information are to be considered.  You can change
> the type of SSA names in-place but you don't want to do that for
> user DECLs (and we can't have the SSA name type and its DECL
> type differ - and not sure if we might want to lift that restriction).

Thanks. I will try to implement this.

I still would like to keep the VRP based approach as there are some
cases that I think can only be done with range info. For example:

short foo(unsigned char c)
{
  c = c & (unsigned char)0x0F;
  if( c > 7 )
    return((short)(c - 5));
  else
    return(( short )c);
}


So, how about adding and setting the overflow/wrap around flag to
range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
static_flag to encode overflow/wrap around. Will that be something
acceptable?

Thanks again,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-07  9:51               ` Kugan
@ 2014-09-08  9:48                 ` Richard Biener
  2014-09-09 10:06                   ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-09-08  9:48 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Sun, Sep 7, 2014 at 11:50 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> On 05/09/14 19:50, Richard Biener wrote:
>
>> Well - the best way would be to expose the target specifics to GIMPLE
>> at some point in the optimization pipeline.  My guess would be that it's
>> appropriate after loop optimizations (but maybe before induction variable
>> optimization).
>>
>> That is, have a pass that applies register promotion to all SSA names
>> in the function, inserting appropriate truncations and extensions.  That
>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>> passes running after that pass would then be able to aggressively
>> optimize redundant truncations and extensions.
>>
>> Effects on debug information are to be considered.  You can change
>> the type of SSA names in-place but you don't want to do that for
>> user DECLs (and we can't have the SSA name type and its DECL
>> type differ - and not sure if we might want to lift that restriction).
>
> Thanks. I will try to implement this.
>
> I still would like to keep the VRP based approach as there are some
> cases that I think can only be done with range info. For example:
>
> short foo(unsigned char c)
> {
>   c = c & (unsigned char)0x0F;
>   if( c > 7 )
>     return((short)(c - 5));
>   else
>     return(( short )c);
> }
>
>
> So, how about adding and setting the overflow/wrap around flag to
> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
> static_flag to encode overflow/wrap around. Will that be something
> acceptable?

You mean tracking in the VRP lattice whether a value wrapped around
(or was assumed not to due to undefined behavior)?  I'm not sure this
is easy to do correctly (VRP is large).

Note that I don't think we'll lose the testcase you quoted if the promotion
pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
promote mode would promote to SImode)

  SImode tem_2 = (unsigned int)c_1(D);
  tem_3 = tem_2 & 0xF;
  if (tem_3 > 7)
    {
      tem_4 = tem_3 - 5;
      short _5 = (short)tem_4;
      tem_5 = (unsigned int)_5;
     return tem_5;
   }
else
   {
     short _6 = (short)tem_3;
     return _6;
   }

VRP should be able to remove the (unsigned int)(short) sign-extension
of tem_4.

note that both incoming registers and return registers are "interesting".
For simplicity I suggest to not promote them on GIMPLE.

What you'd lose in VRP2 is the smaller value-ranges you'd get from
(undefined) wrapping.  You could recover the undefinedness by
looking at SSA names recorded value-range and transfering that
in the promotion pass (but I'm not sure if you want to open the
can of latent signed overflow bugs in programs even more for
PROMOTE_MODE targets...)

Richard.

>
> Thanks again,
> Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-08  9:48                 ` Richard Biener
@ 2014-09-09 10:06                   ` Kugan
  2014-09-09 10:28                     ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-09-09 10:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 3667 bytes --]



On 08/09/14 19:48, Richard Biener wrote:
> On Sun, Sep 7, 2014 at 11:50 AM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>> On 05/09/14 19:50, Richard Biener wrote:
>>
>>> Well - the best way would be to expose the target specifics to GIMPLE
>>> at some point in the optimization pipeline.  My guess would be that it's
>>> appropriate after loop optimizations (but maybe before induction variable
>>> optimization).
>>>
>>> That is, have a pass that applies register promotion to all SSA names
>>> in the function, inserting appropriate truncations and extensions.  That
>>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>>> passes running after that pass would then be able to aggressively
>>> optimize redundant truncations and extensions.
>>>
>>> Effects on debug information are to be considered.  You can change
>>> the type of SSA names in-place but you don't want to do that for
>>> user DECLs (and we can't have the SSA name type and its DECL
>>> type differ - and not sure if we might want to lift that restriction).
>>
>> Thanks. I will try to implement this.
>>
>> I still would like to keep the VRP based approach as there are some
>> cases that I think can only be done with range info. For example:
>>
>> short foo(unsigned char c)
>> {
>>   c = c & (unsigned char)0x0F;
>>   if( c > 7 )
>>     return((short)(c - 5));
>>   else
>>     return(( short )c);
>> }
>>
>>
>> So, how about adding and setting the overflow/wrap around flag to
>> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
>> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
>> static_flag to encode overflow/wrap around. Will that be something
>> acceptable?
> 
> You mean tracking in the VRP lattice whether a value wrapped around
> (or was assumed not to due to undefined behavior)?  I'm not sure this
> is easy to do correctly (VRP is large).
> 
> Note that I don't think we'll lose the testcase you quoted if the promotion
> pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
> promote mode would promote to SImode)
> 
>   SImode tem_2 = (unsigned int)c_1(D);
>   tem_3 = tem_2 & 0xF;
>   if (tem_3 > 7)
>     {
>       tem_4 = tem_3 - 5;
>       short _5 = (short)tem_4;
>       tem_5 = (unsigned int)_5;
>      return tem_5;
>    }
> else
>    {
>      short _6 = (short)tem_3;
>      return _6;
>    }
> 
> VRP should be able to remove the (unsigned int)(short) sign-extension
> of tem_4.
> 
> note that both incoming registers and return registers are "interesting".
> For simplicity I suggest to not promote them on GIMPLE.
> 
> What you'd lose in VRP2 is the smaller value-ranges you'd get from
> (undefined) wrapping.  You could recover the undefinedness by
> looking at SSA names recorded value-range and transfering that
> in the promotion pass (but I'm not sure if you want to open the
> can of latent signed overflow bugs in programs even more for
> PROMOTE_MODE targets...)
> 

Thanks. In the meantime I would like to revert the patch that enables
zero/sign extension. I have bootstrapped it on x86_64 and regression
testing is ongoing. Is this OK?

Thanks,
Kugan

gcc/ChangeLog:

2014-09-09  Kugan Vivekanandarajah  <kuganv@linaro.org>

	Revert r213751:
	* calls.c (precompute_arguments): Check
	 promoted_for_signed_and_unsigned_p and set the promoted mode.
	(promoted_for_signed_and_unsigned_p): New function.
	(expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
	and set the promoted mode.
	* expr.h (promoted_for_signed_and_unsigned_p): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 3886 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index 03ed9c8..345331f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1486,10 +1486,7 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      if (promoted_for_signed_and_unsigned_p (args[i].tree_value, mode))
-		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
-	      else
-		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index db76897..8916305 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3325,13 +3325,7 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
-		    && (GET_CODE (temp) == SUBREG)
-		    && (GET_MODE (target) == GET_MODE (temp))
-		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
-		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
-		else
-		  convert_move (SUBREG_REG (target), temp, unsignedp);
+		convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index 7e69955..5039734 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -68,7 +68,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-address.h"
 #include "cfgexpand.h"
 #include "builtins.h"
-#include "tree-ssa.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -9250,35 +9249,6 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
-/* Return TRUE if value in SSA is zero and sign extended for wider mode MODE
-   using value range information stored.  Return FALSE otherwise.
-
-   This is used to check if SUBREG is zero and sign extended and to set
-   promoted mode SRP_SIGNED_AND_UNSIGNED to SUBREG.  */
-
-bool
-promoted_for_signed_and_unsigned_p (tree ssa, enum machine_mode mode)
-{
-  wide_int min, max;
-
-  if (ssa == NULL_TREE
-      || TREE_CODE (ssa) != SSA_NAME
-      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
-      || (TYPE_PRECISION (TREE_TYPE (ssa)) != GET_MODE_PRECISION (mode)))
-    return false;
-
-  /* Return FALSE if value_range is not recorded for SSA.  */
-  if (get_range_info (ssa, &min, &max) != VR_RANGE)
-    return false;
-
-  /* Return true (to set SRP_SIGNED_AND_UNSIGNED to SUBREG) if MSB of the
-     smaller mode is not set (i.e.  MSB of ssa is not set).  */
-  if (!wi::neg_p (min, SIGNED) && !wi::neg_p(max, SIGNED))
-    return true;
-  else
-    return false;
-
-}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9582,10 +9552,7 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  if (promoted_for_signed_and_unsigned_p (ssa_name, mode))
-	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
-	  else
-	    SUBREG_PROMOTED_SET (temp, unsignedp);
+	  SUBREG_PROMOTED_SET (temp, unsignedp);
 	  return temp;
 	}
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 756c894..2e66329 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,7 +440,6 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
-extern bool promoted_for_signed_and_unsigned_p (tree, enum machine_mode);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-09-09 10:06                   ` Kugan
@ 2014-09-09 10:28                     ` Richard Biener
  0 siblings, 0 replies; 51+ messages in thread
From: Richard Biener @ 2014-09-09 10:28 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Tue, Sep 9, 2014 at 12:06 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 08/09/14 19:48, Richard Biener wrote:
>> On Sun, Sep 7, 2014 at 11:50 AM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>> On 05/09/14 19:50, Richard Biener wrote:
>>>
>>>> Well - the best way would be to expose the target specifics to GIMPLE
>>>> at some point in the optimization pipeline.  My guess would be that it's
>>>> appropriate after loop optimizations (but maybe before induction variable
>>>> optimization).
>>>>
>>>> That is, have a pass that applies register promotion to all SSA names
>>>> in the function, inserting appropriate truncations and extensions.  That
>>>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>>>> passes running after that pass would then be able to aggressively
>>>> optimize redundant truncations and extensions.
>>>>
>>>> Effects on debug information are to be considered.  You can change
>>>> the type of SSA names in-place but you don't want to do that for
>>>> user DECLs (and we can't have the SSA name type and its DECL
>>>> type differ - and not sure if we might want to lift that restriction).
>>>
>>> Thanks. I will try to implement this.
>>>
>>> I still would like to keep the VRP based approach as there are some
>>> cases that I think can only be done with range info. For example:
>>>
>>> short foo(unsigned char c)
>>> {
>>>   c = c & (unsigned char)0x0F;
>>>   if( c > 7 )
>>>     return((short)(c - 5));
>>>   else
>>>     return(( short )c);
>>> }
>>>
>>>
>>> So, how about adding and setting the overflow/wrap around flag to
>>> range_info. We now set static_flag for VR_RANGE/VR_ANTI_RANGE. If we go
>>> back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this
>>> static_flag to encode overflow/wrap around. Will that be something
>>> acceptable?
>>
>> You mean tracking in the VRP lattice whether a value wrapped around
>> (or was assumed not to due to undefined behavior)?  I'm not sure this
>> is easy to do correctly (VRP is large).
>>
>> Note that I don't think we'll lose the testcase you quoted if the promotion
>> pass runs before VRP2.   We'd have as input to VRP2 sth like (assuming
>> promote mode would promote to SImode)
>>
>>   SImode tem_2 = (unsigned int)c_1(D);
>>   tem_3 = tem_2 & 0xF;
>>   if (tem_3 > 7)
>>     {
>>       tem_4 = tem_3 - 5;
>>       short _5 = (short)tem_4;
>>       tem_5 = (unsigned int)_5;
>>      return tem_5;
>>    }
>> else
>>    {
>>      short _6 = (short)tem_3;
>>      return _6;
>>    }
>>
>> VRP should be able to remove the (unsigned int)(short) sign-extension
>> of tem_4.
>>
>> note that both incoming registers and return registers are "interesting".
>> For simplicity I suggest to not promote them on GIMPLE.
>>
>> What you'd lose in VRP2 is the smaller value-ranges you'd get from
>> (undefined) wrapping.  You could recover the undefinedness by
>> looking at SSA names recorded value-range and transfering that
>> in the promotion pass (but I'm not sure if you want to open the
>> can of latent signed overflow bugs in programs even more for
>> PROMOTE_MODE targets...)
>>
>
> Thanks. In the meantime I would like to revert the patch that enables
> zero/sign extension. I have bootstrapped it on x86_64 and regression
> testing is ongoing. Is this OK?

Ok.

Thanks,
Richard.

> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2014-09-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         Revert r213751:
>         * calls.c (precompute_arguments): Check
>          promoted_for_signed_and_unsigned_p and set the promoted mode.
>         (promoted_for_signed_and_unsigned_p): New function.
>         (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>         and set the promoted mode.
>         * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>         * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>         SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.


* [RFC] Elimination of zext/sext - type promotion pass
  2014-09-05  9:51             ` Richard Biener
  2014-09-07  9:51               ` Kugan
@ 2014-11-09 23:30               ` Kugan
  2014-11-10 12:56                 ` Richard Biener
  1 sibling, 1 reply; 51+ messages in thread
From: Kugan @ 2014-11-09 23:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 3535 bytes --]


> Well - the best way would be to expose the target specifics to GIMPLE
> at some point in the optimization pipeline.  My guess would be that it's
> appropriate after loop optimizations (but maybe before induction variable
> optimization).
> 
> That is, have a pass that applies register promotion to all SSA names
> in the function, inserting appropriate truncations and extensions.  That
> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
> passes running after that pass would then be able to aggressively
> optimize redundant truncations and extensions.
> 
> Effects on debug information are to be considered.  You can change
> the type of SSA names in-place but you don't want to do that for
> user DECLs (and we can't have the SSA name type and its DECL
> type differ - and not sure if we might want to lift that restriction).


Thanks for the comments. Here is a prototype patch that implements a
type promotion pass. This pass records SSA variables that will have
values in higher bits (than the original type precision) if promoted and
uses this information in inserting appropriate truncations and
extensions. This pass also marks some of the stmts that set SSA names
as unsafe to promote. Here is a gimple difference for the type
promotion as compared to the previous dump for a testcase.

 crc2 (short unsigned int crc, unsigned char data)
 {
   unsigned char carry;
   unsigned char x16;
   unsigned char i;
-  unsigned char ivtmp_5;
+  unsigned int _2;
+  unsigned char _3;
+  unsigned int _4;
+  unsigned int _5;
   unsigned char _9;
-  unsigned char _10;
-  unsigned char ivtmp_18;
+  unsigned int _10;
+  unsigned int _11;
+  unsigned int _12;
+  unsigned int _13;
+  unsigned int _15;
+  unsigned int _16;
+  unsigned int _18;
+  unsigned int _21;
+  unsigned int _22;
+  unsigned int _24;
+  short unsigned int _26;
+  unsigned char _27;
+  unsigned int _28;
+  unsigned int _29;
+  unsigned int _30;

   <bb 2>:
+  _12 = (unsigned int) data_8(D);
+  _2 = (unsigned int) crc_7(D);

   <bb 3>:
-  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
-  # data_29 = PHI <data_12(5), data_8(D)(2)>
-  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
-  _9 = (unsigned char) crc_28;
-  _10 = _9 ^ data_29;
-  x16_11 = _10 & 1;
-  data_12 = data_29 >> 1;
-  if (x16_11 == 1)
+  # _30 = PHI <_28(5), _2(2)>
+  # _16 = PHI <_29(5), _12(2)>
+  # _4 = PHI <_18(5), 8(2)>
+  _9 = (unsigned char) _30;
+  _5 = (unsigned int) _9;
+  _22 = _5 ^ _16;
+  _10 = _22 & 1;
+  _29 = _16 >> 1;
+  _27 = (unsigned char) _10;
+  if (_27 == 1)
     goto <bb 4>;
   else
     goto <bb 7>;

   <bb 4>:
-  crc_13 = crc_28 ^ 16386;
-  crc_24 = crc_13 >> 1;
-  crc_15 = crc_24 | 32768;
+  _11 = _30 ^ 16386;
+  _13 = _11 >> 1;
+  _24 = _13 | 32768;

   <bb 5>:
-  # crc_2 = PHI <crc_15(4), crc_21(7)>
-  ivtmp_5 = ivtmp_18 - 1;
-  if (ivtmp_5 != 0)
+  # _28 = PHI <_24(4), _15(7)>
+  _18 = _4 - 1;
+  _3 = (unsigned char) _18;
+  if (_3 != 0)
     goto <bb 3>;
   else
     goto <bb 6>;

   <bb 6>:
-  # crc_19 = PHI <crc_2(5)>
-  return crc_19;
+  # _21 = PHI <_28(5)>
+  _26 = (short unsigned int) _21;
+  return _26;

   <bb 7>:
-  crc_21 = crc_28 >> 1;
+  _15 = _30 >> 1;
   goto <bb 5>;

 }


I experimented with a few simple test-cases and the results so far are
mixed. It also seems that subsequent passes are not always optimising as
expected. I haven't looked in detail but will look into it based on the
feedback.

Please also note that this pass still doesn't handle debug instructions,
and there are a couple of regression failures for ARM.

Thanks,
Kugan



[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 33524 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4a51393..1e26172 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2014-11-08  Kugan Vivekanandarajah  <kuganv@linaro.org>
+
+	* Makefile.in (OBJS): Add tree-type-prmtn.o.
+	* common.opt (ftree-type-prmt): New flag.
+	* opts.c (OPT_ftree_type_prmt): New option added.
+	* passes.def: New pass included.
+	* tree-pass.h: LikeWise.
+	* timevar.def (TV_TREE_TYPE_PRMT): New timevar.
+	* tree-type-prmtn.c: New file.
+
 2014-11-08  Richard Sandiford  <richard.sandiford@arm.com>
 
 	* config/aarch64/aarch64.c: Include rtl-iter.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2c3f02e..2c257b9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1463,6 +1463,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	tree-type-prmtn.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b400636..4556e0a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2304,6 +2304,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-prmt
+Common Report Var(flag_tree_type_prmt) Init(0) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1) Optimization
 Compile whole compilation unit at a time
diff --git a/gcc/opts.c b/gcc/opts.c
index 752cc84..784c74f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ftree_type_prmt, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
diff --git a/gcc/passes.def b/gcc/passes.def
index 2305d67..50c5f96 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -264,6 +264,7 @@ along with GCC; see the file COPYING3.  If not see
       PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop)
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
+      NEXT_PASS (pass_type_promotion);
       NEXT_PASS (pass_lower_vector_ssa);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b74cbb0..0410f9b 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -266,6 +266,7 @@ DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
 DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
+DEFTIMEVAR (TV_TREE_TYPE_PRMT        , "tree type promotion")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index a3efdd8..0acc15f 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -424,6 +424,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promotion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-type-prmtn.c b/gcc/tree-type-prmtn.c
new file mode 100644
index 0000000..1e79c8d
--- /dev/null
+++ b/gcc/tree-type-prmtn.c
@@ -0,0 +1,1103 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "machmode.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-cfg.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-ssa.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea is to promote operations in
+   such a way that we minimise the generation of subregs in RTL, which in
+   turn results in the removal of redundant zero/sign extensions.  This
+   pass runs prior to VRP and DOM so that they are able to optimise
+   redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+static unsigned n_ssa_val;
+static sbitmap ssa_not_safe_bitmap;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE as defined by PROMOTE_MODE of the
+   target.  */
+static tree
+get_promoted_type (tree type)
+{
+#ifdef PROMOTE_MODE
+  tree promoted_type;
+  enum machine_mode mode = TYPE_MODE (type);
+  int uns = TYPE_SIGN (type);
+
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;
+
+  PROMOTE_MODE (mode, uns, type);
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+
+  if (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))
+    {
+      promoted_type = build_type_attribute_qual_variant (promoted_type,
+						    TYPE_ATTRIBUTES (type),
+						    TYPE_QUALS (type));
+      type = promoted_type;
+    }
+#endif
+  return type;
+}
+
+/* Return true if promoting the computation of ssa NAME is safe.  */
+static bool
+promotion_safe_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      gimple stmt = SSA_NAME_DEF_STMT (name);
+      unsigned int index = SSA_NAME_VERSION (name);
+
+      if (gimple_vdef (stmt) != NULL_TREE
+	  || gimple_vuse (stmt) != NULL_TREE)
+	return false;
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_not_safe_bitmap, index);
+    }
+  return false;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Return true if ssa NAME will be considered for promotion.  */
+static bool
+ssa_tobe_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return false;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_clear_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Record that ssa NAME will have higher bits set if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+/* Return true if ssa NAME will have higher bits set if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return false;
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Return the single successor basic block, excluding EH edges.  If there
+   is more than one successor, return NULL.  */
+static basic_block
+get_next_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT after STMT when STMT can throw, creating a new basic
+   block between the basic block containing STMT and its successor.  */
+static void
+insert_next_bb (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  gimple_stmt_iterator gsi;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  basic_block new_bb = split_edge (edge);
+  gsi = gsi_after_labels (new_bb);
+  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+}
+
+
+/* Return false if rhs type cannot be promoted in the stmt.  Return true
+   otherwise.  */
+static bool
+assign_rhs_promotable_p (gimple stmt, tree promoted_type)
+{
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+
+  /* If the OP is part of a condition in a COND_EXPR, it can be promoted
+     only if higher_bits is not set for either of the operands.  */
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      /* LHS and RHS can be promoted without changing the results of
+	 comparison.  */
+      if (((ssa_tobe_promoted_p (rhs1)
+	    && promotion_safe_p (rhs1)
+	    && !ssa_overflows_p (rhs1))
+	   || (TREE_CODE (rhs1) == INTEGER_CST))
+	  && ((ssa_tobe_promoted_p (rhs2)
+	       && promotion_safe_p (rhs2)
+	       && !ssa_overflows_p (rhs2))
+	      || (TREE_CODE (rhs2) == INTEGER_CST)))
+	return true;
+      /* LHS or RHS of the comparison is already promoted.  */
+      else if ((TYPE_PRECISION (TREE_TYPE (rhs1))
+		== TYPE_PRECISION (promoted_type))
+	       || (TYPE_PRECISION (TREE_TYPE (rhs2))
+		   == TYPE_PRECISION (promoted_type)))
+	return true;
+      else
+	return false;
+    }
+
+  if (TREE_CODE_CLASS (code) == tcc_reference
+      || code == VIEW_CONVERT_EXPR
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || code == OBJ_TYPE_REF
+      || gimple_vdef (stmt)
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Promote constants in STMT to TYPE.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, fold_convert (type, op));
+      op = gimple_assign_rhs2 (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, fold_convert (type, op));
+      break;
+
+    case GIMPLE_PHI:
+      FOR_EACH_PHI_ARG (oprnd, stmt, iter, SSA_OP_USE)
+	{
+	  op = USE_FROM_PTR (oprnd);
+	  index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	  if (TREE_CODE (op) == INTEGER_CST)
+	    SET_PHI_ARG_DEF (stmt, index, fold_convert (type, op));
+	}
+      break;
+
+    case GIMPLE_COND:
+      op = gimple_cond_lhs (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_cond_set_lhs (stmt, fold_convert (type, op));
+      op = gimple_cond_rhs (stmt);
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_cond_set_rhs (stmt, fold_convert (type, op));
+      break;
+    default:
+      break;
+    }
+}
+
+/* Promote use in an assignment.  Depending on the gimple_assign_rhs_code,
+   values in NEW_USE might have to be truncated to the type of USE.  */
+static void
+promote_assign_stmt_use (gimple stmt,
+			 tree use,
+			 imm_use_iterator *ui,
+			 tree new_use,
+			 tree copy_of_use,
+			 tree promoted_type)
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  tree type;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  /* True if the use is promoted and a fix-up still has to be performed.  */
+  bool fix = false;
+  /* True if a stmt-code-specific fix-up has already been performed.  */
+  bool done = false;
+
+  switch (code)
+    {
+    CASE_CONVERT:
+      /* If this is where precision is lost, just replace the use with
+	 new_use. */
+      if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (TREE_TYPE (rhs1)))
+	{
+	  done = true;
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, new_use);
+	  update_stmt (stmt);
+	}
+      break;
+
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+      if (use == rhs2
+	  && ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    default:
+      break;
+    }
+
+  if (fix && !done)
+    {
+      if (promotion_safe_p (lhs))
+	{
+	  tree temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  tree temp;
+	  if (copy_of_use)
+	    temp = copy_of_use;
+	  else
+	    {
+	      temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt
+		= gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+						new_use, NULL_TREE);
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  else if (!done)
+    {
+      if (assign_rhs_promotable_p (stmt, promoted_type)
+	  && (promotion_safe_p (lhs)
+	      || (TREE_CODE_CLASS (code) == tcc_comparison)))
+	{
+	  type = promoted_type;
+	  if (TYPE_PRECISION (TREE_TYPE (use))
+	      < TYPE_PRECISION (promoted_type))
+	    promote_cst_in_stmt (stmt, promoted_type);
+	}
+      else
+	type = TREE_TYPE (use);
+
+      if ((type != TREE_TYPE (new_use)
+	   && type != TREE_TYPE (use))
+	  || (type == TREE_TYPE (use)
+	      && !copy_of_use))
+	{
+	  tree temp = make_promoted_copy (use, NULL, type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+      else if (use != new_use)
+	{
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    if (type == TREE_TYPE (new_use))
+	      SET_USE (op, new_use);
+	    else
+	      SET_USE (op, copy_of_use);
+	  update_stmt (stmt);
+	}
+    }
+}
+
+/* Promote ssa USE in phi STMT to PROMOTED_TYPE.  */
+static void
+promote_phi_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  tree lhs = PHI_RESULT (stmt);
+  tree type;
+  tree temp;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  if (ssa_tobe_promoted_p (lhs)
+      && promotion_safe_p (lhs))
+    type = promoted_type;
+  else
+    type = TREE_TYPE (lhs);
+
+  /* Check if we need a convert stmt to get the required type.  */
+  if ((type != TREE_TYPE (new_use) && type != TREE_TYPE (use))
+      || (type == TREE_TYPE (use) && !copy_of_use))
+    {
+      temp = make_promoted_copy (use, NULL, type);
+      gimple copy_stmt
+	= gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					new_use, NULL_TREE);
+      if (gimple_code (SSA_NAME_DEF_STMT (new_use)) == GIMPLE_NOP)
+	{
+	  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+	  bb = get_next_bb (bb);
+	  gcc_assert (bb);
+	  gsi = gsi_after_labels (bb);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else if (gimple_code (SSA_NAME_DEF_STMT (new_use))
+	       != GIMPLE_PHI)
+	{
+	  gsi = gsi_for_stmt (SSA_NAME_DEF_STMT (new_use));
+	  if (lookup_stmt_eh_lp (SSA_NAME_DEF_STMT (new_use)) > 0)
+	    insert_next_bb (SSA_NAME_DEF_STMT (new_use), copy_stmt);
+	  else
+	    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else
+	{
+	  gsi = gsi_after_labels
+	    (gimple_bb (SSA_NAME_DEF_STMT (new_use)));
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+    }
+  else if (type == TREE_TYPE (new_use))
+    temp = new_use;
+  else
+    temp = copy_of_use;
+
+  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+    SET_USE (op, temp);
+  update_stmt (stmt);
+}
+
+/* Promote ssa USE in STMT to PROMOTED_TYPE.  */
+static void
+promote_cond_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  bool promote;
+  tree lhs = gimple_cond_lhs (stmt);
+  tree rhs = gimple_cond_rhs (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Check if LHS and RHS can be promoted without changing the results
+     of the comparison.  */
+  if (((ssa_tobe_promoted_p (lhs)
+	&& promotion_safe_p (lhs)
+	&& !ssa_overflows_p (lhs))
+       || (TREE_CODE (lhs) == INTEGER_CST))
+      && (( ssa_tobe_promoted_p (rhs)
+	   && promotion_safe_p (rhs)
+	   && !ssa_overflows_p (rhs))
+	  || (TREE_CODE (rhs) == INTEGER_CST)))
+    promote = true;
+  /* LHS or RHS of the comparison is already promoted.  */
+  else if ((TYPE_PRECISION (TREE_TYPE (lhs))
+	    == TYPE_PRECISION (promoted_type))
+	   || (TYPE_PRECISION (TREE_TYPE (rhs))
+	       == TYPE_PRECISION (promoted_type)))
+    promote = true;
+  else
+    promote = false;
+
+  if (promote)
+    {
+      /* Comparison will happen in the promoted type.  */
+      tree temp;
+      if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	temp = new_use;
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      promote_cst_in_stmt (stmt, promoted_type);
+    }
+  else if (TREE_TYPE (use) != TREE_TYPE (new_use))
+    {
+      /* Comparison will happen in the original type.  */
+      tree temp;
+      if (copy_of_use)
+	temp = copy_of_use;
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple copy_stmt
+	    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+					    new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the DEF is replaced and has to
+   be released, set RELEASE_DEF.  Also return COPY_OF_DEF with the original
+   type for any use statement that needs truncation.  */
+static tree
+promote_definition (tree def,
+		    tree promoted_type,
+		    tree *copy_of_def,
+		    bool *release_def)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  tree new_def;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  gcc_assert (release_def);
+  *release_def = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+      *copy_of_def = NULL;
+      /* Replace the PHI result with the promoted new_def.  */
+      gimple_phi_set_result (def_stmt, new_def);
+      *release_def = true;
+      update_stmt (def_stmt);
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      break;
+
+    case GIMPLE_NOP:
+      /* Create a promoted type copy of parameters.  */
+      bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+      bb = get_next_bb (bb);
+      gcc_assert (bb);
+      gsi = gsi_after_labels (bb);
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, new_def,
+						def, NULL_TREE);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      *copy_of_def = def;
+      break;
+
+    case GIMPLE_ASSIGN:
+      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+      gimple_assign_set_lhs (def_stmt, new_def);
+      update_stmt (def_stmt);
+      if (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+	  != tcc_comparison)
+	promote_cst_in_stmt (def_stmt, promoted_type);
+      *release_def = true;
+      *copy_of_def = NULL;
+      break;
+
+    default:
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, new_def,
+						def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_next_bb (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+      update_stmt (copy_stmt);
+      *copy_of_def = def;
+      break;
+    }
+  return new_def;
+}
+
+
+/* Replace all uses of USE with NEW_USE, inserting conversions if needed.  */
+static unsigned int
+promote_all_uses (tree use, tree new_use, tree copy_of_use, tree promoted_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the use with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      if (stmt == SSA_NAME_DEF_STMT (new_use))
+	continue;
+
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_ASSIGN:
+	  promote_assign_stmt_use (stmt, use, &ui, new_use,
+				   copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_PHI:
+	  promote_phi_stmt_use (stmt, use, &ui, new_use,
+				copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_COND:
+	  promote_cond_stmt_use (stmt, use, &ui, new_use,
+				 copy_of_use, promoted_type);
+	  break;
+	case GIMPLE_DEBUG:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  break;
+	case GIMPLE_RETURN:
+	default:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      tree temp;
+	      if (copy_of_use)
+		temp = copy_of_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt
+		    = gimple_build_assign_with_ops (CONVERT_EXPR, temp,
+						    new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  break;
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type, new_name, copy_of_name;
+  bool release_def = false;
+
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+
+  if (promotion_safe_p (name))
+    {
+      new_name = promote_definition (name, type, &copy_of_name,
+				     &release_def);
+      promote_all_uses (name, new_name, copy_of_name, type);
+    }
+  else
+    promote_all_uses (name, name, name, type);
+  set_ssa_promoted (name);
+
+  if (release_def)
+    release_ssa_name (name);
+  return 0;
+}
+
+/* Mark the candidates.  */
+static void
+set_ssa_to_be_promoted_flag (gimple stmt)
+{
+  ssa_op_iter i;
+  tree def;
+  use_operand_p op;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_PHI:
+      def = PHI_RESULT (stmt);
+      bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+      FOR_EACH_PHI_ARG (op, stmt, i, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+
+    default:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE | SSA_OP_DEF)
+	{
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+    }
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple phi)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+      /* Conversion expressions that may need to be preserved.  */
+    CASE_CONVERT:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case NE_EXPR:
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+    case EQ_EXPR:
+    case UNLT_EXPR:
+    case UNLE_EXPR:
+    case UNGT_EXPR:
+    case UNGE_EXPR:
+    case UNEQ_EXPR:
+    case LTGT_EXPR:
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+      break;
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions which may produce results that have higher bits set
+	 if computed in the promoted type (i.e. the result may overflow).  */
+    case MULT_HIGHPART_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+    case MULT_EXPR:
+    case BIT_XOR_EXPR:
+    case BIT_NOT_EXPR:
+    case WIDEN_MULT_EXPR:
+    case WIDEN_MULT_PLUS_EXPR:
+    case WIDEN_MULT_MINUS_EXPR:
+    case WIDEN_SUM_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_AND_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions for which the operation has to be performed in the
+	 original types if the promoted operands may have higher bits set.  */
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+    case ABS_EXPR:
+    case NEGATE_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE)
+	{
+	  if (ssa_overflows_p (def))
+	    bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+	}
+      break;
+
+      /* Expressions that have to be done in the original types.  */
+    case LROTATE_EXPR:
+    case RROTATE_EXPR:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+      /* To be safe, all others have to be done in the original types.  */
+    default:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+    }
+  return changed;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+      use_operand_p op;
+
+      def = PHI_RESULT (stmt);
+      promote_def_and_uses (def);
+      FOR_EACH_PHI_ARG (op, stmt, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+/* Mark candidate ssa names and compute which are not safe to promote.  */
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple phi = gsi_stmt (gsi);
+	  set_ssa_to_be_promoted_flag (phi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  set_ssa_to_be_promoted_flag (stmt);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple use_stmt;
+	  imm_use_iterator ui;
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_not_safe_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_not_safe_bitmap);
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_not_safe_bitmap);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PRMT, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_prmt != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promotion (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
@ 2014-11-10 12:56                 ` Richard Biener
  2015-05-01  4:41                   ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-11-10 12:56 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Mon, Nov 10, 2014 at 12:29 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>> Well - the best way would be to expose the target specifics to GIMPLE
>> at some point in the optimization pipeline.  My guess would be that it's
>> appropriate after loop optimizations (but maybe before induction variable
>> optimization).
>>
>> That is, have a pass that applies register promotion to all SSA names
>> in the function, inserting appropriate truncations and extensions.  That
>> way you'd never see (set (subreg...) on RTL.  The VRP and DOM
>> passes running after that pass would then be able to aggressively
>> optimize redundant truncations and extensions.
>>
>> Effects on debug information are to be considered.  You can change
>> the type of SSA names in-place but you don't want to do that for
>> user DECLs (and we can't have the SSA name type and its DECL
>> type differ - and not sure if we might want to lift that restriction).
>
>
> Thanks for the comments. Here is a prototype patch that implements a
> type promotion pass. This pass records SSA variables that will have
> values in higher bits (than the original type precision) if promoted, and
> uses this information to insert appropriate truncations and
> extensions. This pass also classifies some of the stmts that set SSA
> names as unsafe to promote. Here is a gimple diff of the type
> promotion pass against the previous dump for a testcase.

Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
it has no convenient way to sign-extend other than truncating to a signed
(non-promoted) type and then extending to the promoted type.  Thus
I think such pass should be accompanied with a new tree code,
SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
signed types which later optimizations may be confused about.

Not sure if that is the actual issue though.

Instead of "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn
should be gimple-ssa-type-promote.c.  In the end all targets with
non-trivial PROMOTE_MODE should run the pass as a lowering step
so it should be enabled even at -O0 (and not disablable).

I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
end I'd like to run it before IVOPTs ... which means moving IVOPTs
later, after VRP which should be the pass optimizing away some of
the extensions).

In get_promoted_type I don't understand why you preserve qualifiers.
Also even for targets without PROMOTE_MODE it may be
beneficial to expose truncations required by expanding bit-precision
arithmetic earlier (that is, if !PROMOTE_MODE at least promote
to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
for that is for example

struct { long i : 33; long j : 33; } a;
return a.i + a.j;

where bitfields of type > int do not promote so you get a
33 bit add which we expand to a 64bit add plus a sign-extension
(and nothing optimizes that later usually).

insert_next_bb sounds like you want to use insert_on_edge
somewhere.

in assign_rhs_promotable_p you handle comparisons special
but the ternary COND_EXPR and VEC_COND_EXPR can have
comparisons embedded in their first operand.  The comment
confuses me though - with proper sign- or zero-extensions inserted
you should be able to promote them anyway?

You seem to miss that a GIMPLE_ASSIGN can have 3 operands
in promote_cst_in_stmt as well.

In promote_assign_stmt_use I consider a default: case that ends
up doing nothing dangerous ;)  Please either use gcc_unreachable ()
or do the safe thing (fix = true;?).  You seem to be working with
a lattice of some kind - fixing up stmt uses the way you do - walking
over immediate uses - is not very cache friendly.  Why not use
a lattice for this - record promoted vars to be used for old SSA names
and walk over all stmts instead, replacing SSA uses on them?
Btw, you don't need to call update_stmt if you SET_USE and not
replace an SSA name with a constant.

You seem to "fix" with a single stmt but I don't see where you insert
zero- or sign-extensions for ssa_overflows_p cases?

Note that at least for SSA names with !SSA_NAME_VAR (thus
anonymous vars) you want to do a cheaper promotion by not
allocating a new SSA name but simply "fixing" its type by
assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
there is of course debug-info to consider and thus doing what you
do is better (but probably still will wreck debuginfo?).

GIMPLE_NOPs are not only used for parameters but also uninitialized
uses - for non-parameters you should simply adjust their type.  No
need to fixup their value.

The pass needs more comments.

It looks like you are not promoting all variables but only those
where compensation code (zero-/sign-extensions) is not necessary?

Thanks for trying to work on this.
Richard.

>  crc2 (short unsigned int crc, unsigned char data)
>  {
>    unsigned char carry;
>    unsigned char x16;
>    unsigned char i;
> -  unsigned char ivtmp_5;
> +  unsigned int _2;
> +  unsigned char _3;
> +  unsigned int _4;
> +  unsigned int _5;
>    unsigned char _9;
> -  unsigned char _10;
> -  unsigned char ivtmp_18;
> +  unsigned int _10;
> +  unsigned int _11;
> +  unsigned int _12;
> +  unsigned int _13;
> +  unsigned int _15;
> +  unsigned int _16;
> +  unsigned int _18;
> +  unsigned int _21;
> +  unsigned int _22;
> +  unsigned int _24;
> +  short unsigned int _26;
> +  unsigned char _27;
> +  unsigned int _28;
> +  unsigned int _29;
> +  unsigned int _30;
>
>    <bb 2>:
> +  _12 = (unsigned int) data_8(D);
> +  _2 = (unsigned int) crc_7(D);
>
>    <bb 3>:
> -  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
> -  # data_29 = PHI <data_12(5), data_8(D)(2)>
> -  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
> -  _9 = (unsigned char) crc_28;
> -  _10 = _9 ^ data_29;
> -  x16_11 = _10 & 1;
> -  data_12 = data_29 >> 1;
> -  if (x16_11 == 1)
> +  # _30 = PHI <_28(5), _2(2)>
> +  # _16 = PHI <_29(5), _12(2)>
> +  # _4 = PHI <_18(5), 8(2)>
> +  _9 = (unsigned char) _30;
> +  _5 = (unsigned int) _9;
> +  _22 = _5 ^ _16;
> +  _10 = _22 & 1;
> +  _29 = _16 >> 1;
> +  _27 = (unsigned char) _10;
> +  if (_27 == 1)
>      goto <bb 4>;
>    else
>      goto <bb 7>;
>
>    <bb 4>:
> -  crc_13 = crc_28 ^ 16386;
> -  crc_24 = crc_13 >> 1;
> -  crc_15 = crc_24 | 32768;
> +  _11 = _30 ^ 16386;
> +  _13 = _11 >> 1;
> +  _24 = _13 | 32768;
>
>    <bb 5>:
> -  # crc_2 = PHI <crc_15(4), crc_21(7)>
> -  ivtmp_5 = ivtmp_18 - 1;
> -  if (ivtmp_5 != 0)
> +  # _28 = PHI <_24(4), _15(7)>
> +  _18 = _4 - 1;
> +  _3 = (unsigned char) _18;
> +  if (_3 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 6>;
>
>    <bb 6>:
> -  # crc_19 = PHI <crc_2(5)>
> -  return crc_19;
> +  # _21 = PHI <_28(5)>
> +  _26 = (short unsigned int) _21;
> +  return _26;
>
>    <bb 7>:
> -  crc_21 = crc_28 >> 1;
> +  _15 = _30 >> 1;
>    goto <bb 5>;
>
>  }
>
>
> I experimented with a few simple test cases and the results so far are
> mixed. It also seems that subsequent passes are not always optimising
> as expected. I haven't looked in detail but will look into it based on
> the feedback.
>
> Please also note that this pass still doesn't handle debug instructions
> and there are a couple of regression failures for ARM.
>
> Thanks,
> Kugan
>
>


* Re: [RFC] Elimination of zext/sext - type promotion pass
  2014-11-10 12:56                 ` Richard Biener
@ 2015-05-01  4:41                   ` Kugan
  2015-05-08 12:48                     ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2015-05-01  4:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]


>> Thanks for the comments. Here is a prototype patch that implements a
>> type promotion pass. This pass records SSA variables that will have
>> values in higher bits (than the original type precision) if promoted, and
>> uses this information to insert appropriate truncations and
>> extensions. This pass also classifies some of the stmts that set SSA
>> names as unsafe to promote. Here is a gimple diff of the type
>> promotion pass against the previous dump for a testcase.
> 
> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
> it has no convenient way to sign-extend other than truncating to a signed
> (non-promoted) type and then extending to the promoted type.  Thus
> I think such pass should be accompanied with a new tree code,
> SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
> signed types which later optimizations may be confused about.
> 
> Not sure if that is the actual issue though.
> 
> Instead of "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn
> should be gimple-ssa-type-promote.c.  In the end all targets with
> non-trivial PROMOTE_MODE should run the pass as a lowering step
> so it should be enabled even at -O0 (and not disablable).
> 
> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
> end I'd like to run it before IVOPTs ... which means moving IVOPTs
> later, after VRP which should be the pass optimizing away some of
> the extensions).
> 
> In get_promoted_type I don't understand why you preserve qualifiers.
> Also even for targets without PROMOTE_MODE it may be
> beneficial to expose truncations required by expanding bit-precision
> arithmetic earlier (that is, if !PROMOTE_MODE at least promote
> to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
> for that is for example
> 
> struct { long i : 33; long j : 33; } a;
> return a.i + a.j;
> 
> where bitfields of type > int do not promote so you get a
> 33 bit add which we expand to a 64bit add plus a sign-extension
> (and nothing optimizes that later usually).
> 
> insert_next_bb sounds like you want to use insert_on_edge
> somewhere.
> 
> in assign_rhs_promotable_p you handle comparisons special
> but the ternary COND_EXPR and VEC_COND_EXPR can have
> comparisons embedded in their first operand.  The comment
> confuses me though - with proper sign- or zero-extensions inserted
> you should be able to promote them anyway?
> 
> You seem to miss that a GIMPLE_ASSIGN can have 3 operands
> in promote_cst_in_stmt as well.
> 
> In promote_assign_stmt_use I consider a default: case that ends
> up doing nothing dangerous ;)  Please either use gcc_unreachable ()
> or do the safe thing (fix = true;?).  You seem to be working with
> a lattice of some kind - fixing up stmt uses the way you do - walking
> over immediate uses - is not very cache friendly.  Why not use
> a lattice for this - record promoted vars to be used for old SSA names
> and walk over all stmts instead, replacing SSA uses on them?
> Btw, you don't need to call update_stmt if you SET_USE and not
> replace an SSA name with a constant.
> 
> You seem to "fix" with a single stmt but I don't see where you insert
> zero- or sign-extensions for ssa_overflows_p cases?
> 
> Note that at least for SSA names with !SSA_NAME_VAR (thus
> anonymous vars) you want to do a cheaper promotion by not
> allocating a new SSA name but simply "fixing" its type by
> assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
> there is of course debug-info to consider and thus doing what you
> do is better (but probably still will wreck debuginfo?).
> 
> GIMPLE_NOPs are not only used for parameters but also uninitialized
> uses - for non-parameters you should simply adjust their type.  No
> need to fixup their value.
> 
> The pass needs more comments.
> 
> It looks like you are not promoting all variables but only those
> where compensation code (zero-/sign-extensions) is not necessary?
> 

Thanks for the comments. Please find an updated version which addresses
your review comments above. I have yet to do full benchmarking on this,
but have tried a few small benchmarks. I will do proper benchmarking
after getting feedback on the implementation. I have, however,
bootstrapped on x86-64-none-linux and regression tested on x86-64, ARM
and AArch64.

I am also not clear on how I should handle the gimple debug statements
when the intermediate temporary variable that maps to the original
variable is promoted.

Thanks,
Kugan


gcc/ChangeLog:

2015-05-01  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* Makefile.in: Add gimple-ssa-type-promote.o.
	* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
	* common.opt: New option -ftree-type-promote.
	* expr.c (expand_expr_real_2): Handle SEXT_EXPR.
	* fold-const.c (int_const_binop_1): Handle SEXT_EXPR.
	* gimple-ssa-type-promote.c: New file.
	* passes.def: Define new pass_type_promote.
	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
	* tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
	* tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
	* tree-pass.h (make_pass_type_promote): New.
	* tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
	(op_symbol_code): Likewise.
	* tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
	* tree.def: Define new SEXT_EXPR.

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 44012 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 80c91f0..0318631 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1478,6 +1478,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index ca491a0..99a1d4c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4881,6 +4881,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 380848c..1dc9b1b 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2356,6 +2356,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index 530a944..f672a99 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9336,6 +9336,21 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode = smallest_mode_for_size (tree_to_shwi (treeop1),
+							    MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 3654fd6..f5f00af 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1007,6 +1007,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..a226e50c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,1311 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass will run prior to VRP and DOM so that they
+   will be able to optimise redundant truncations and extensions.  This
+   is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+   This pass executes as follows:
+
+   1.  This pass records gimple statements that may produce results that
+   can overflow (beyond the original type) and operations that always
+   have to be performed in the original type.  This is done in
+   process_all_stmts_for_unsafe_promotion.  Here, gimple stmts which set
+   SSA_NAMEs are processed in a work_list to set ssa_sets_higher_bits_bitmap
+   (set_ssa_overflows) and ssa_not_safe_bitmap.
+
+   2.  promote_all_stmts traverses the basic blocks in dominator order
+   and promotes all the SSA_NAMEs that were selected as safe in step 1
+   above, doing the register promotion stmt by stmt.  The definition of
+   the SSA_NAME is promoted first and then all the uses are promoted
+   according to the gimple stmt type.  If the SSA_NAME can overflow when
+   promoted, the necessary fix-ups are also performed to preserve the
+   semantics of the program.  */
+
+static unsigned n_ssa_val;
+static sbitmap ssa_not_safe_bitmap;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;
+#ifdef PROMOTE_MODE
+  mode = TYPE_MODE (type);
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#else
+  mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)),
+				 MODE_INT);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Predicate that tells if promoting computation with ssa NAME is safe.  */
+static bool
+promotion_safe_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      gimple stmt = SSA_NAME_DEF_STMT (name);
+      unsigned int index = SSA_NAME_VERSION (name);
+
+      if (gimple_vdef (stmt) != NULL_TREE
+	  || gimple_vuse (stmt) != NULL_TREE)
+	return false;
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_not_safe_bitmap, index);
+    }
+  return false;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return !bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Return true if ssa NAME will be considered for promotion.  */
+static bool
+ssa_tobe_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return false;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_clear_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Set ssa NAME will have higher bits if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return false;
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Return the single non-EH successor of basic block BB.  If there is
+   more than one such successor, or none, return NULL.  */
+static basic_block
+get_single_successor_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type,
+		     bool promote_cond_expr = false, signop sign = SIGNED)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  if (promote_cond_expr)
+    {
+      /* Promote constant in COND_EXPR.  */
+      gcc_assert (gimple_assign_rhs_code (stmt) == COND_EXPR);
+      op = gimple_assign_rhs1 (stmt);
+      op0 = TREE_OPERAND (op, 0);
+      op1 = TREE_OPERAND (op, 1);
+
+      if (TREE_CODE (op0) == INTEGER_CST)
+	op0 = convert_int_cst (type, op0, sign);
+      if (TREE_CODE (op1) == INTEGER_CST)
+	op1 = convert_int_cst (type, op1, sign);
+
+      tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+      gimple_assign_set_rhs1 (stmt, new_op);
+      return;
+    }
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs2 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs3 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+    default:
+      break;
+    }
+}
+
+/* Zero/sign extend (depending on the signedness of NEW_VAR's type) VAR
+   from WIDTH bits and assign the extended value to NEW_VAR.  The gimple
+   statement that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+/* Promote use in an assignment.  Depending on the gimple_assign_rhs_code,
+   values in NEW_USE might have to be truncated to the type of USE.  */
+static void
+promote_assign_stmt_use (gimple stmt,
+			 tree use,
+			 imm_use_iterator *ui,
+			 tree new_use,
+			 tree copy_of_use,
+			 tree promoted_type)
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  tree rhs3 = gimple_assign_rhs3 (stmt);
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  /* Set to true if the use is promoted and a fix-up has to be performed.  */
+  bool fix = false;
+
+  switch (code)
+    {
+    CASE_CONVERT:
+      if (ssa_tobe_promoted_p (lhs)
+	  && promotion_safe_p (lhs)
+	  && TREE_TYPE (new_use) == promoted_type)
+	{
+	  if (TYPE_PRECISION (TREE_TYPE (lhs)) > TYPE_PRECISION (TREE_TYPE (rhs1)))
+	    {
+	      tree temp = make_promoted_copy (lhs, NULL, promoted_type);
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (temp, new_use,
+				       TYPE_PRECISION (TREE_TYPE (use)));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, new_use);
+	      update_stmt (stmt);
+	    }
+	}
+      else
+	{
+	  if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (TREE_TYPE (rhs1)))
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, new_use);
+	      update_stmt (stmt);
+	    }
+	  else if (!copy_of_use)
+	    {
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						      new_use, NULL_TREE);
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+		SET_USE (op, copy_of_use);
+	      update_stmt (stmt);
+	    }
+	}
+      return;
+
+    case COND_EXPR:
+      /* Promote COND_EXPR comparison operands.  */
+      if (use != rhs2
+	  && use != rhs3)
+	{
+	  tree temp;
+	  tree op0 = TREE_OPERAND (rhs1, 0);
+	  tree op1 = TREE_OPERAND (rhs1, 1);
+	  bool is_cst = false;
+
+	  if (TREE_CODE (op0) == INTEGER_CST
+	      || TREE_CODE (op1) == INTEGER_CST)
+	    is_cst = true;
+
+	  /* If this SSA is not promoted.  */
+	  if (use == new_use)
+	    {
+	      if (is_cst)
+		temp = new_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, promoted_type);
+		  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+							  new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		  promote_cst_in_stmt (stmt, promoted_type, true,
+				       TYPE_SIGN (TREE_TYPE (use)));
+		}
+	    }
+	  /* If this SSA is promoted.  */
+	  else
+	    {
+	      temp = make_promoted_copy (use, NULL, promoted_type);
+	      gimple copy_stmt
+		= zero_sign_extend_stmt (temp, new_use,
+					 TYPE_PRECISION (TREE_TYPE (use)));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      promote_cst_in_stmt (stmt, promoted_type, true,
+				   TYPE_SIGN (TREE_TYPE (use)));
+	    }
+
+	  if (op0 == use)
+	    op0 = temp;
+	  else
+	    op1 = temp;
+
+	  tree new_op = build2 (TREE_CODE (rhs1), promoted_type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	  update_stmt (stmt);
+	  return;
+	}
+      else
+	{
+	  promote_cst_in_stmt (stmt, promoted_type);
+	}
+      break;
+
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+      if (ssa_overflows_p (use))
+	fix = true;
+      break;
+
+    default:
+      break;
+    }
+
+  if (fix && promotion_safe_p (lhs)
+      && TREE_TYPE (new_use) == promoted_type)
+    {
+      /* Promoted with values truncated.  */
+      tree temp = make_promoted_copy (use, NULL, promoted_type);
+      gimple copy_stmt = zero_sign_extend_stmt (temp, new_use,
+						TYPE_PRECISION (TREE_TYPE (use)));
+      gsi = gsi_for_stmt (stmt);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      return;
+    }
+  else if (!(TREE_CODE_CLASS (code) == tcc_comparison
+	     || TREE_CODE_CLASS (code) == tcc_reference
+	     || code == VIEW_CONVERT_EXPR
+	     || code == COMPLEX_EXPR
+	     || code == ASM_EXPR
+	     || code == OBJ_TYPE_REF
+	     || gimple_vdef (stmt)
+	     || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	   && (promotion_safe_p (lhs)
+	       || (TREE_CODE_CLASS (code) == tcc_comparison)))
+    {
+      /* Statement promoted.  */
+      if ((TYPE_PRECISION (TREE_TYPE (use))
+	   < TYPE_PRECISION (promoted_type))
+	  && (code != COND_EXPR))
+	promote_cst_in_stmt (stmt, promoted_type);
+
+      if (promoted_type == TREE_TYPE (new_use))
+	{
+	  /* Operand also promoted.  */
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, new_use);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  /* Operand not promoted.  */
+	  tree temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  else
+    {
+      /* Statement not promoted.  */
+      if (copy_of_use)
+	{
+	  /* Operand also not promoted.  */
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, copy_of_use);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  /* Operand promoted.  */
+	  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+				   new_use, NULL_TREE);
+
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+}
+
+/* Promote ssa USE in phi STMT to PROMOTED_TYPE.  */
+static void
+promote_phi_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree copy_of_use,
+		       tree promoted_type)
+{
+  tree lhs = PHI_RESULT (stmt);
+  tree type;
+  tree temp;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  if (ssa_tobe_promoted_p (lhs)
+      && promotion_safe_p (lhs))
+    type = promoted_type;
+  else
+    type = TREE_TYPE (lhs);
+
+  /* Check if we need a convert stmt to get the required type.  */
+  if (type == TREE_TYPE (new_use))
+    temp = new_use;
+  else if (copy_of_use && (type == TREE_TYPE (copy_of_use)))
+    temp = copy_of_use;
+  else
+    {
+      temp = make_promoted_copy (use, NULL, type);
+      gimple copy_stmt
+	= gimple_build_assign (temp, CONVERT_EXPR,
+			       new_use, NULL_TREE);
+
+      if (gimple_code (SSA_NAME_DEF_STMT (new_use)) == GIMPLE_NOP)
+	{
+	  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+	  bb = get_single_successor_bb (bb);
+	  gcc_assert (bb);
+	  gsi = gsi_after_labels (bb);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else if (gimple_code (SSA_NAME_DEF_STMT (new_use))
+	       != GIMPLE_PHI)
+	{
+	  gsi = gsi_for_stmt (SSA_NAME_DEF_STMT (new_use));
+	  if (lookup_stmt_eh_lp (SSA_NAME_DEF_STMT (new_use)) > 0)
+	    insert_stmt_on_edge (SSA_NAME_DEF_STMT (new_use), copy_stmt);
+	  else
+	    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+      else
+	{
+	  gsi = gsi_after_labels
+	    (gimple_bb (SSA_NAME_DEF_STMT (new_use)));
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+    }
+
+  FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+    SET_USE (op, temp);
+  update_stmt (stmt);
+}
+
+/* Promote ssa USE in STMT to PROMOTED_TYPE.  */
+static void
+promote_cond_stmt_use (gimple stmt,
+		       tree use,
+		       imm_use_iterator *ui,
+		       tree new_use,
+		       tree promoted_type)
+{
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+  bool is_cst = false;
+  tree lhs = gimple_cond_lhs (stmt);
+  tree rhs = gimple_cond_rhs (stmt);
+
+  if (TREE_CODE (lhs) == INTEGER_CST
+      || TREE_CODE (rhs) == INTEGER_CST)
+    is_cst = true;
+
+  if (TREE_TYPE (new_use) == promoted_type)
+    {
+      tree temp = make_promoted_copy (use, NULL, promoted_type);
+      gimple copy_stmt = zero_sign_extend_stmt (temp, new_use,
+						TYPE_PRECISION (TREE_TYPE (use)));
+
+      gsi = gsi_for_stmt (stmt);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+      promote_cst_in_stmt (stmt, promoted_type, false,
+			   TYPE_SIGN (TREE_TYPE (use)));
+    }
+  else
+    {
+      /* Comparison will happen in the promoted type.  */
+      tree temp;
+      if (TREE_TYPE (new_use) == promoted_type)
+	{
+	  temp = new_use;
+	  promote_cst_in_stmt (stmt, promoted_type, false,
+			       TYPE_SIGN (TREE_TYPE (use)));
+	}
+      else if (is_cst)
+	{
+	  temp = new_use;
+	}
+      else
+	{
+	  temp = make_promoted_copy (use, NULL, promoted_type);
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  new_use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	}
+
+      FOR_EACH_IMM_USE_ON_STMT (op, *ui)
+	SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+}
+
+/* Promote definition DEF to NEW_TYPE.  If the DEF is replaced and has to
+   be released, set RELEASE_DEF.  Also return COPY_OF_DEF with the original
+   type for any use statement that needs truncation.  */
+static tree
+promote_definition (tree def,
+		    tree promoted_type,
+		    tree *copy_of_def,
+		    bool *release_def)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  tree new_def;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  gphi *phi;
+
+  gcc_assert (release_def);
+  *release_def = false;
+  if (SSA_NAME_VAR (def) == NULL
+      && gimple_code (def_stmt) == GIMPLE_NOP)
+    {
+      TREE_TYPE (def) = promoted_type;
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      new_def = def;
+      *copy_of_def = NULL;
+      return new_def;
+    }
+
+  switch (gimple_code (def_stmt))
+    {
+
+    case GIMPLE_PHI:
+      phi = as_a <gphi *> (def_stmt);
+      new_def = make_promoted_copy (def, phi, promoted_type);
+      *copy_of_def = NULL;
+      gimple_phi_set_result (phi, new_def);
+      *release_def = true;
+      update_stmt (def_stmt);
+      promote_cst_in_stmt (def_stmt, promoted_type);
+      break;
+
+    case GIMPLE_NOP:
+      /* Create a promoted type copy of parameters.  */
+      bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+      bb = get_single_successor_bb (bb);
+      gcc_assert (bb);
+      gsi = gsi_after_labels (bb);
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign (new_def, CONVERT_EXPR,
+				       def, NULL_TREE);
+      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+      *copy_of_def = def;
+      break;
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  tree rhs1 = gimple_assign_rhs1 (def_stmt);
+	  if (CONVERT_EXPR_CODE_P (code)
+	      && TREE_TYPE (rhs1) == promoted_type)
+	    {
+	      new_def = make_promoted_copy (def, NULL, promoted_type);
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (new_def, rhs1,
+				       TYPE_PRECISION (TREE_TYPE (def)));
+	      gsi = gsi_for_stmt (def_stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	      gsi = gsi_for_stmt (def_stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  else
+	    {
+	      new_def = make_promoted_copy (def, def_stmt, promoted_type);
+	      gimple_assign_set_lhs (def_stmt, new_def);
+	      update_stmt (def_stmt);
+	      if (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+		  != tcc_comparison)
+		promote_cst_in_stmt (def_stmt, promoted_type);
+	    }
+	  *release_def = true;
+	  *copy_of_def = NULL;
+	  break;
+	}
+
+    default:
+      new_def = make_promoted_copy (def, NULL, promoted_type);
+      copy_stmt = gimple_build_assign (new_def, CONVERT_EXPR,
+				       def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+      update_stmt (copy_stmt);
+      *copy_of_def = def;
+      break;
+    }
+
+  return new_def;
+}
+
+
+/* Promote all uses of USE, replacing them with NEW_USE (or with
+   COPY_OF_USE where the use must remain in the original type).  */
+static unsigned int
+promote_all_uses (tree use, tree new_use, tree copy_of_use,
+		  tree promoted_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the use with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      if (stmt == SSA_NAME_DEF_STMT (new_use))
+	continue;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  promote_assign_stmt_use (stmt, use, &ui, new_use,
+				   copy_of_use, promoted_type);
+	  break;
+
+	case GIMPLE_PHI:
+	  promote_phi_stmt_use (stmt, use, &ui, new_use,
+				copy_of_use, promoted_type);
+	  break;
+
+	case GIMPLE_COND:
+	  promote_cond_stmt_use (stmt, use, &ui, new_use,
+				 promoted_type);
+	  break;
+
+	case GIMPLE_DEBUG:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use)
+	      && gimple_debug_bind_p (stmt))
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	    }
+	  break;
+
+	default:
+	  if (TREE_TYPE (use) != TREE_TYPE (new_use))
+	    {
+	      tree temp;
+	      if (copy_of_use)
+		temp = copy_of_use;
+	      else
+		{
+		  temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+							  new_use, NULL_TREE);
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  break;
+	}
+    }
+
+  return 0;
+}
+
+/* Promote definition of NAME and all its uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type, new_name, copy_of_name;
+  bool release_def = false;
+
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+
+  if (promotion_safe_p (name))
+    {
+      new_name = promote_definition (name, type, &copy_of_name,
+				     &release_def);
+      promote_all_uses (name, new_name, copy_of_name, type);
+    }
+  else
+    promote_all_uses (name, name, name, type);
+  set_ssa_promoted (name);
+
+  if (release_def)
+    release_ssa_name (name);
+  return 0;
+}
+
+/* Mark the candidates for promotion.  */
+static void
+set_ssa_to_be_promoted_flag (gimple stmt)
+{
+  ssa_op_iter i;
+  tree def;
+  use_operand_p op;
+
+  switch (gimple_code (stmt))
+    {
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  def = PHI_RESULT (phi);
+	  bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+	    {
+	      def = USE_FROM_PTR (op);
+	      if (TREE_CODE (def) == SSA_NAME)
+		bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	    }
+	  break;
+	}
+
+    default:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE | SSA_OP_DEF)
+	{
+	  if (TREE_CODE (def) == SSA_NAME)
+	    bitmap_set_bit (ssa_to_be_promoted_bitmap, SSA_NAME_VERSION (def));
+	}
+      break;
+    }
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+      /* Conversion expressions that may need to be preserved.  */
+    CASE_CONVERT:
+      /* If the precision of LHS is greater than that of RHS1, it is not
+	 safe to convert this with a ZEXT/SEXT stmt when the signedness
+	 also changes.  */
+      if ((TYPE_PRECISION (TREE_TYPE (lhs))
+	   > TYPE_PRECISION (TREE_TYPE (rhs1)))
+	  && (TYPE_UNSIGNED (TREE_TYPE (lhs))
+	      != TYPE_UNSIGNED (TREE_TYPE (rhs1))))
+	bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      else if ((TYPE_PRECISION (TREE_TYPE (lhs))
+		<= TYPE_PRECISION (TREE_TYPE (rhs1)))
+	       && !ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    case NE_EXPR:
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+    case EQ_EXPR:
+    case UNLT_EXPR:
+    case UNLE_EXPR:
+    case UNGT_EXPR:
+    case UNGE_EXPR:
+    case UNEQ_EXPR:
+    case LTGT_EXPR:
+    case RSHIFT_EXPR:
+    case LSHIFT_EXPR:
+    case WIDEN_LSHIFT_EXPR:
+    case MIN_EXPR:
+    case MAX_EXPR:
+    case RANGE_EXPR:
+      break;
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case RDIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case EXACT_DIV_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions that may produce results with higher bits set if
+	 computed in the promoted type (i.e. the result may overflow).  */
+    case MULT_HIGHPART_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+    case MULT_EXPR:
+    case BIT_XOR_EXPR:
+    case BIT_NOT_EXPR:
+    case WIDEN_MULT_EXPR:
+    case WIDEN_MULT_PLUS_EXPR:
+    case WIDEN_MULT_MINUS_EXPR:
+    case WIDEN_SUM_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_AND_EXPR:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+      /* Expressions for which the operation has to be performed in the
+	 original type if the promoted operands may have higher bits set.  */
+    case ABS_EXPR:
+    case NEGATE_EXPR:
+    case TRUNC_MOD_EXPR:
+    case CEIL_MOD_EXPR:
+    case FLOOR_MOD_EXPR:
+    case ROUND_MOD_EXPR:
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_USE)
+	{
+	  if (ssa_overflows_p (def))
+	    bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+	}
+      break;
+
+    case COND_EXPR:
+	{
+	  tree rhs2 = gimple_assign_rhs2 (stmt);
+	  tree rhs3 = gimple_assign_rhs3 (stmt);
+
+	  if (ssa_overflows_p (rhs2))
+	    {
+	      set_ssa_overflows (lhs);
+	      changed = true;
+	    }
+	  else if (ssa_overflows_p (rhs3))
+	    {
+	      set_ssa_overflows (lhs);
+	      changed = true;
+	    }
+	}
+      break;
+
+      /* Expressions that have to be done in the original type.  */
+    case LROTATE_EXPR:
+    case RROTATE_EXPR:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+
+      /* To be safe, all others have to be done in the original type.  */
+    default:
+      bitmap_set_bit (ssa_not_safe_bitmap, SSA_NAME_VERSION (lhs));
+      break;
+    }
+  return changed;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple phi = gsi_stmt (gsi);
+
+	  set_ssa_to_be_promoted_flag (phi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+
+	  set_ssa_to_be_promoted_flag (stmt);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+
+      if (changed)
+	{
+	  gimple use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_not_safe_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_not_safe_bitmap);
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_not_safe_bitmap);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index ffa63b5..846ec1b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -271,6 +271,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 711bbed..f15e931 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -268,6 +268,7 @@ DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
 DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 4929629..a766ac6 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3805,6 +3805,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 42ddb9f..8c20089 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3913,6 +3913,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 172bd82..533e4a6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -428,6 +428,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index d7c049f..0045962 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1812,6 +1812,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3432,6 +3440,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e7ab23c..581b1fe 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2408,6 +2408,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2984,6 +2985,49 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi ((HOST_WIDE_INT) 1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((HOST_WIDE_INT) 1 << (prec - 1)) - 1,
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT _may_be_nonzero = may_be_nonzero.to_uhwi ();
+
+	  if (_may_be_nonzero & ((HOST_WIDE_INT) 1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit can be one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & type_max;
+	    }
+	  else
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
diff --git a/gcc/tree.def b/gcc/tree.def
index b4b4164..f58b073 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -747,6 +747,9 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-05-01  4:41                   ` Kugan
@ 2015-05-08 12:48                     ` Richard Biener
  2015-06-01 23:20                       ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2015-05-08 12:48 UTC (permalink / raw)
  To: Kugan; +Cc: Uros Bizjak, gcc-patches, Jakub Jelinek

On Fri, May 1, 2015 at 6:41 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>>> Thanks for the comments. Here is a prototype patch that implements a
>>> type promotion pass. This pass records SSA variables that will have
>>> values in higher bits (than the original type precision) if promoted and
>>> uses this information in inserting appropriate truncations and
>>> extensions. This pass also classifies some of the stmts that sets ssa's
>>> to be unsafe to promote. Here is a gimple difference for the type
>>> promotion as compared to previous dump for a testcase.
>>
>> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
>> it has no convenient way to sign-extend other than truncating to a signed
>> (non-promoted) type and then extending to the promoted type.  Thus
>> I think such pass should be accompanied with a new tree code,
>> SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
>> signed types which later optimizations may be confused about.
>>
>> Not sure if that is the actual issue though.
>>
>> Instead of "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn
>> should be gimple-ssa-type-promote.c.  In the end all targets with
>> non-trivial PROMOTE_MODE should run the pass as a lowering step
>> so it should be enabled even at -O0 (and not disablable).
>>
>> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
>> end I'd like to run it before IVOPTs ... which means moving IVOPTs
>> later, after VRP which should be the pass optimizing away some of
>> the extensions).
>>
>> In get_promoted_type I don't understand why you preserve qualifiers.
>> Also even for targets without PROMOTE_MODE it may be
>> beneficial to expose truncations required by expanding bit-precision
>> arithmetic earlier (that is, if !PROMOTE_MODE at least promote
>> to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
>> for that is for example
>>
>> struct { long i : 33; long j : 33; } a;
>> return a.i + a.j;
>>
>> where bitfields of type > int do not promote so you get a
>> 33 bit add which we expand to a 64bit add plus a sign-extension
>> (and nothing optimizes that later usually).
>>
>> insert_next_bb sounds like you want to use insert_on_edge
>> somewhere.
>>
>> in assign_rhs_promotable_p you handle comparisons special
>> but the ternary COND_EXPR and VEC_COND_EXPR can have
>> comparisons embedded in their first operand.  The comment
>> confuses me though - with proper sign- or zero-extensions inserted
>> you should be able to promote them anyway?
>>
>> You seem to miss that a GIMPLE_ASSIGN can have 3 operands
>> in promote_cst_in_stmt as well.
>>
>> In promote_assign_stmt_use I consider a default: case that ends
>> up doing nothing dangerous ;)  Please either use gcc_unreachable ()
>> or do the safe thing (fix = true;?).  You seem to be working with
>> a lattice of some kind - fixing up stmt uses the way you do - walking
>> over immediate uses - is not very cache friendly.  Why not use
>> a lattice for this - record promoted vars to be used for old SSA names
>> and walk over all stmts instead, replacing SSA uses on them?
>> Btw, you don't need to call update_stmt if you SET_USE and not
>> replace an SSA name with a constant.
>>
>> You seem to "fix" with a single stmt but I don't see where you insert
>> zero- or sign-extensions for ssa_overflows_p cases?
>>
>> Note that at least for SSA names with !SSA_NAME_VAR (thus
>> anonymous vars) you want to do a cheaper promotion by not
>> allocating a new SSA name but simply "fixing" its type by
>> assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
>> there is of course debug-info to consider and thus doing what you
>> do is better (but probably still will wreck debuginfo?).
>>
>> GIMPLE_NOPs are not only used for parameters but also uninitialized
>> uses - for non-parameters you should simply adjust their type.  No
>> need to fixup their value.
>>
>> The pass needs more comments.
>>
>> It looks like you are not promoting all variables but only those
>> where compensation code (zero-/sign-extensions) is not necessary?
>>
>
> Thanks for the comments. Please find an updated version of this which
> addresses your review comments above. I am still to do full benchmarking
> on this, but tried with few small benchmarks. I will do proper
> benchmarking after getting feedback on the implementation. I have
> however bootstrapped on x86-64-none-linux and regression tested on
> x86-64, ARM and AArch64.
>
> I am also not clear with how I should handle the gimple debug statements
> when the intermediate temporary variable that maps to the original
> variable is promoted.

A few notes.

+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)

this needs an extended comment documenting the operands.

+    case SEXT_EXPR:
+       {
+         rtx op0 = expand_normal (treeop0);
+         rtx temp;
+         if (!target)
+           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+         machine_mode inner_mode = smallest_mode_for_size
(tree_to_shwi (treeop1),
+                                                           MODE_INT);
+         temp = convert_modes (inner_mode,
+                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+         convert_move (target, temp, 0);
+         return target;
+       }

I think that if you allow arbitrary treeop1 you have to properly implement
fallbacks for the case where direct expansion to
(sign_extend:<target-mode> (subreg:<inner-mode> reg)) does not work
which is the intended operation modeled by SEXT_EXPR.

Direct expansion to that RTL would also be best I suppose.

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;

you should check for INTEGRAL_TYPE_P before looking at TYPE_PRECISION.

+#ifdef PROMOTE_MODE
+  mode = TYPE_MODE (type);
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#else
+  mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)),
+                                MODE_INT);
+#endif

That smallest_mode_for_size should be a no-op.  Just hoist out
mode = TYPE_MODE (type).

Now before I get into too much details at this point.

You compute which promotions are unsafe, like sources/sinks of memory
(I think you miss call arguments/return values and also asm operands here).
But instead of simply marking those SSA names as not to be promoted
I'd instead split their life-ranges, thus replace

  _1 = mem;

with

  _2 = mem;
  _1 = [zs]ext (_2, ...);

and promote _1 anyway.  So in the first phase I'd do that (and obviously
note that _2 isn't to be promoted in the specific example).

For promotions that apply I wouldn't bother allocating new SSA names
but just "fix" their types (assign to their TREE_TYPE).  This also means
they have to become anonymous and if they didn't have a !DECL_IGNORED_P
decl before then a debug stmt should be inserted at the point of the
promotions.  So

  bar_3 = _1 + _2;

when promoted would become

 _4 = _1 + _2;
 _3 = sext <_4, ...>;
 # DEBUG bar = (orig-type) _4;  // or _3?

so you'd basically always promote defs (you have a lot of stmt/operand
walking code I didn't look too closely at - but it looks like too much) and
the uses get promoted automagically (because you promote the original
SSA name). Promotion of constants has to remain, of course.

I wouldn't promote pointers at all (are targets doing that?)

There are various existing helpers for stuff you re-invent.  I've just
spotted get_single_successor_bb, for which there are single_succ_p ()
plus single_succ ().

Generally most of your stmt walking code could either use walk_stmt ()
or the low-level gimple_op (...) interface and a loop over all gimple_num_ops ()
operands.

You seem to mix optimization and lowering - you are extending
at uses, not at defs, for example.  IMHO that complicates the code -
or do you think that a later optimization pass cannot recover from some
obviously bad decisions here?  If so then I suggest implementing something
less ad-hoc, using an SSA lattice to track this and propagate the
info properly, still emitting the truncations at the defs where necessary.

Thanks,
Richard.

> Thanks,
> Kugan
>
>
> gcc/ChangeLog:
>
> 2015-05-01  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * Makefile.in: Add gimple-ssa-type-promote.o.
>         * cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
>         * common.opt: New option -ftree-type-promote.
>         * expr.c (expand_expr_real_2): Handle SEXT_EXPR.
>         * fold-const.c (int_const_binop_1): Handle SEXT_EXPR.
>         * gimple-ssa-type-promote.c: New file.
>         * passes.def: Define new pass_type_promote.
>         * timevar.def: Define new TV_TREE_TYPE_PROMOTE.
>         * tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
>         * tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
>         * tree-pass.h (make_pass_type_promote): New.
>         * tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
>         (op_symbol_code): Likewise.
>         * tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
>         * tree.def: Define new SEXT_EXPR.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-05-08 12:48                     ` Richard Biener
@ 2015-06-01 23:20                       ` Kugan
  2015-06-19  2:55                         ` Kugan
  2015-07-28 11:05                         ` Richard Biener
  0 siblings, 2 replies; 51+ messages in thread
From: Kugan @ 2015-06-01 23:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2712 bytes --]



On 08/05/15 22:48, Richard Biener wrote:
> You compute which promotions are unsafe, like sources/sinks of memory
> (I think you miss call arguments/return values and also asm operands here).
> But instead of simply marking those SSA names as not to be promoted
> I'd instead split their life-ranges, thus replace
> 
>   _1 = mem;
> 
> with
> 
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
> 
> and promote _1 anyway.  So in the first phase I'd do that (and obviously
> note that _2 isn't to be promoted in the specific example).
> 
> For promotions that apply I wouldn't bother allocating new SSA names
> but just "fix" their types (assign to their TREE_TYPE).  This also means
> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
> decl before then a debug stmt should be inserted at the point of the
> promotions.  So
> 
>   bar_3 = _1 + _2;
> 
> when promoted would become
> 
>  _4 = _1 + _2;
>  _3 = sext <_4, ...>;
>  # DEBUG bar = (orig-type) _4;  // or _3?
> 
> so you'd basically always promote defs (you have a lot of stmt/operand
> walking code I didn't look too closely at - but it looks like too much) and
> the uses get promoted automagically (because you promote the original
> SSA name). Promotion of constants has to remain, of course.


Thanks Richard.  I experimented with this idea to understand it better.
Please see the attached prototype (I am still working on your other
comments, which are not addressed here).  Please have a look and let me
know whether this is along the lines of what you would expect.  I have a
few questions, though.

1. In the following example from above:
  char _1;
  _1 = mem;

when changing with

  char _2;
  int _1;
  _2 = mem;
  _1 = [zs]ext (_2, ...);

for the [zs]ext operation we now use BIT_AND_EXPR and SEXT_EXPR, which
(as of now) require that the LHS and RHS are of the same type.  Are you
suggesting that we should have true ZEXT_EXPR and SEXT_EXPR codes which
can do the above in GIMPLE?  I am currently using CONVERT_EXPR, which is
the source of many optimization issues.

2. For inline asm (a reduced test case that might not make much sense as
a stand-alone test case, but I ran into similar cases with valid programs):

;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
symbol_order=0)

fn1 (short int p1)
{
  <bb 2>:
  __asm__("" : "=r" p1_2 : "0" p1_1(D));
  return;

}


I am generating something like the following, which ICEs.  What is the
expected output?

;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
symbol_order=0)

fn1 (short int p1)
{
  int _1;
  int _2;
  short int _5;

  <bb 2>:
  _1 = (int) p1_4(D);
  _5 = (short int) _1;
  __asm__("" : "=r" p1_6 : "0" _5);
  _2 = (int) p1_6;
  return;

}

Thanks a lot for your time,
Kugan

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 29184 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b59b5d9..16b2df1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1491,6 +1491,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b190f91..bbf70ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4809,6 +4809,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 6e15af2..56231b5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2368,6 +2368,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index 5a931dc..0badfa5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9249,6 +9249,21 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode = smallest_mode_for_size (tree_to_shwi (treeop1),
+							    MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 55196b5..3c42449 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1007,6 +1007,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..6ccda0c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,719 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they are able
+   to optimise redundant truncations and extensions.  This is based on the
+   discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1)
+    return type;
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return single successor (excluding EH edge) for basic block BB.  If there
+   are more than one successors, return NULL.  */
+static basic_block
+get_single_successor_bb (basic_block bb)
+{
+  edge e, res = NULL;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	if (res)
+	  return NULL;
+	res = e;
+      }
+  return res ? res->dest : NULL;
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+ promote only the constants in conditions part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type,
+		     bool promote_cond_expr = false, signop sign = SIGNED)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  if (promote_cond_expr)
+    {
+      /* Promote constant in COND_EXPR.  */
+      gcc_assert (gimple_assign_rhs_code (stmt) == COND_EXPR);
+      op = gimple_assign_rhs1 (stmt);
+      op0 = TREE_OPERAND (op, 0);
+      op1 = TREE_OPERAND (op, 1);
+
+      if (TREE_CODE (op0) == INTEGER_CST)
+	op0 = convert_int_cst (type, op0, sign);
+      if (TREE_CODE (op1) == INTEGER_CST)
+	op1 = convert_int_cst (type, op1, sign);
+
+      tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+      gimple_assign_set_rhs1 (stmt, new_op);
+      return;
+    }
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      op = gimple_assign_rhs1 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs2 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+      op = gimple_assign_rhs3 (stmt);
+
+      if (op && TREE_CODE (op) == INTEGER_CST)
+	gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+    default:
+      break;
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+/* Make TO a default definition, transferring that status from FROM.  */
+void
+duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+
+
+/* Promote definition DEF to NEW_TYPE.  If the DEF is replaced and has to
+   be released, set RELEASE_DEF.  Also return COPY_OF_DEF with the original
+   type for any use statement that needs truncation.  */
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  gphi *phi;
+  tree new_def;
+  bool do_not_promote = false;
+
+  if (gimple_vuse (def_stmt) != NULL_TREE
+      || gimple_vdef (def_stmt) != NULL_TREE)
+    {
+      do_not_promote = true;
+    }
+  else
+    {
+      switch (gimple_code (def_stmt))
+	{
+	case GIMPLE_PHI:
+	    {
+	      phi = as_a <gphi *> (def_stmt);
+	      TREE_TYPE (def) = promoted_type;
+	      gimple_phi_set_result (phi, def);
+	      SET_PHI_RESULT (phi, def);
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	    {
+	      gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	      for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+		{
+		  tree link = gimple_asm_output_op (asm_stmt, i);
+		  tree op = TREE_VALUE (link);
+		  if (op == def)
+		    {
+		      new_def = copy_ssa_name (def);
+		      duplicate_default_ssa (new_def, def);
+		      TREE_VALUE (link) = new_def;
+		      gimple_asm_set_output_op (asm_stmt, i, link);
+
+		      TREE_TYPE (def) = promoted_type;
+		      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						       new_def, NULL_TREE);
+		      SSA_NAME_DEF_STMT (def) = copy_stmt;
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		      break;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_NOP:
+	    {
+	      if (SSA_NAME_VAR (def) == NULL)
+		{
+		  TREE_TYPE (def) = promoted_type;
+		}
+	      else
+		{
+		  /* Create a promoted type copy of parameters.  */
+		  bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+		  bb = get_single_successor_bb (bb);
+		  gcc_assert (bb);
+		  gsi = gsi_after_labels (bb);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  SSA_NAME_DEF_STMT (def) = copy_stmt;
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	      if (code == ARRAY_REF
+		  || code == LROTATE_EXPR
+		  || code == RROTATE_EXPR
+		  || code == VIEW_CONVERT_EXPR
+		  || code == BIT_FIELD_REF
+		  || code == REALPART_EXPR
+		  || code == IMAGPART_EXPR
+		  || code == REDUC_MAX_EXPR
+		  || code == REDUC_PLUS_EXPR
+		  || code == REDUC_MIN_EXPR)
+		{
+		  do_not_promote = true;
+		  break;
+		}
+
+	      if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (def_stmt);
+		  if ((TYPE_PRECISION (TREE_TYPE (rhs)) == TYPE_PRECISION (promoted_type))
+		      && (TYPE_UNSIGNED (TREE_TYPE (rhs)) == TYPE_UNSIGNED (promoted_type)))
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      gimple copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (original_type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      else
+		{
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  promote_cst_in_stmt (def_stmt, promoted_type);
+		  TREE_TYPE (def) = promoted_type;
+		  new_def = copy_ssa_name (def);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (original_type));
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	      break;
+	    }
+
+	default:
+	  do_not_promote = true;
+	  break;
+	}
+    }
+
+  if (do_not_promote)
+    {
+      new_def = copy_ssa_name (def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+}
+
+
+/* Promote all the USE with NEW_USE.  */
+static unsigned int
+promote_all_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  /* Replace all the use with the promoted variable.  */
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      if (gimple_vuse (stmt) != NULL_TREE
+	  || gimple_vdef (stmt) != NULL_TREE)
+	do_not_promote = true;
+      else
+	{
+	  switch (gimple_code (stmt))
+	    {
+
+	    case GIMPLE_DEBUG:
+		{
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_remove (&gsi, true);
+		}
+	      break;
+	    case GIMPLE_ASM:
+		{
+		  gasm *asm_stmt = as_a <gasm *> (stmt);
+		  for (unsigned int i = 0; i < gimple_asm_ninputs (asm_stmt); ++i)
+		    {
+		      tree link = gimple_asm_input_op (asm_stmt, i);
+		      tree op = TREE_VALUE (link);
+		      if (op == use)
+			{
+			  tree temp = make_promoted_copy (use, NULL, old_type);
+			  gsi = gsi_for_stmt (stmt);
+			  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+								  use, NULL_TREE);
+			  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+			  TREE_VALUE (link) = temp;
+			  gimple_asm_set_input_op (asm_stmt, i, link);
+			  break;
+			}
+		    }
+		}
+	      break;
+
+	    case GIMPLE_ASSIGN:
+		{
+		  enum tree_code code = gimple_assign_rhs_code (stmt);
+		  tree lhs = gimple_assign_lhs (stmt);
+		  if (code == VIEW_CONVERT_EXPR
+		      || code == LROTATE_EXPR
+		      || code == RROTATE_EXPR
+		      || code == CONSTRUCTOR
+		      || code == BIT_FIELD_REF
+		      || code == COMPLEX_EXPR
+		      || code == ASM_EXPR
+		      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+		    {
+		      do_not_promote = true;
+		    }
+
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    {
+		      if (TREE_TYPE (use) == promoted_type)
+			promote_cst_in_stmt (stmt, promoted_type);
+		    }
+
+		  if (CONVERT_EXPR_CODE_P (code))
+		    {
+		      tree lhs = gimple_assign_lhs (stmt);
+		      if ((TYPE_PRECISION (TREE_TYPE (lhs)) == TYPE_PRECISION (promoted_type))
+			  && (TYPE_UNSIGNED (TREE_TYPE (lhs)) == TYPE_UNSIGNED (promoted_type)))
+			{
+			  gimple copy_stmt =
+			    zero_sign_extend_stmt (lhs, use,
+						   TYPE_PRECISION (old_type));
+			  gsi = gsi_for_stmt (stmt);
+			  gsi_replace (&gsi, copy_stmt, false);
+			}
+		      else if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (old_type))
+			{
+			  /* do nothing */
+			}
+		      else
+			do_not_promote = true;
+		    }
+		}
+	      break;
+
+	    case GIMPLE_COND:
+		{
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  promote_cst_in_stmt (stmt, promoted_type, false,
+				       TYPE_SIGN (TREE_TYPE (use)));
+		  update_stmt (stmt);
+		}
+	      break;
+	    default:
+	      break;
+	    }
+	}
+
+      if (do_not_promote)
+	{
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  TREE_TYPE (temp) = old_type;
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+
+  return 0;
+}
+
+/* Promote definition of NAME and all its uses.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (TREE_CODE (name) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (name))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (name))
+      || VECTOR_TYPE_P (TREE_TYPE (name))
+      || ssa_promoted_p (name)
+      || (type = get_promoted_type (TREE_TYPE (name))) == TREE_TYPE (name))
+    return 0;
+  tree old_type = TREE_TYPE (name);
+  promote_definition (name, type);
+  promote_all_uses (name, type, old_type);
+  set_ssa_promoted (name);
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
  free_dominance_info (CDI_DOMINATORS);
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..dfa8a5b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -271,6 +271,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_slp_vectorize);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index cf8f37d..57afa8d 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -269,6 +269,7 @@ DEFTIMEVAR (TV_GIMPLE_SLSR           , "straight-line strength reduction")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 99b27c7..3332626 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3805,6 +3805,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
@@ -5235,6 +5247,7 @@ gimple_verify_flow_info (void)
 
 	  if (found_ctrl_stmt)
 	    {
+	      dump_bb (stderr, gimple_bb (stmt), 0, 0);
 	      error ("control flow in the middle of basic block %d",
 		     bb->index);
 	      err = 1;
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 71d75d9..e19ac3d 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3912,6 +3912,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 172bd82..533e4a6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -428,6 +428,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index cf875c8..02bc101 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1812,6 +1812,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3432,6 +3440,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 9c39f65..05eef17 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -209,7 +209,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min().get_precision()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 22587d0..9ceae8d 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2413,6 +2413,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2989,6 +2990,49 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT _may_be_nonzero = may_be_nonzero.to_uhwi ();
+
+	  if (_may_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit can be one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & type_max;
+	    }
+	  else
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..57a1981 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,9 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-06-01 23:20                       ` Kugan
@ 2015-06-19  2:55                         ` Kugan
  2015-07-28 11:05                         ` Richard Biener
  1 sibling, 0 replies; 51+ messages in thread
From: Kugan @ 2015-06-19  2:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

ping?

Thanks,
Kugan

On 02/06/15 09:14, Kugan wrote:
> 
> 
> On 08/05/15 22:48, Richard Biener wrote:
>> You compute which promotions are unsafe, like sources/sinks of memory
>> (I think you miss call arguments/return values and also asm operands here).
>> But instead of simply marking those SSA names as not to be promoted
>> I'd instead split their life-ranges, thus replace
>>
>>   _1 = mem;
>>
>> with
>>
>>   _2 = mem;
>>   _1 = [zs]ext (_2, ...);
>>
>> and promote _1 anyway.  So in the first phase I'd do that (and obviously
>> note that _2 isn't to be promoted in the specific example).
>>
>> For promotions that apply I wouldn't bother allocating new SSA names
>> but just "fix" their types (assign to their TREE_TYPE).  This also means
>> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
>> decl before then a debug stmt should be inserted at the point of the
>> promotions.  So
>>
>>   bar_3 = _1 + _2;
>>
>> when promoted would become
>>
>>  _4 = _1 + _2;
>>  _3 = sext <_4, ...>;
>>  # DEBUG bar = (orig-type) _4;  // or _3?
>>
>> so you'd basically always promote defs (you have a lot of stmt/operand
>> walking code I didn't look too closely at - but it looks like too much) and
>> the uses get promoted automagically (because you promote the original
>> SSA name). Promotion of constants has to remain, of course.
> 
> 
> Thanks Richard. I experimented with this idea to understand it better.
> Please see the attached prototype (I am still working on your other
> comments, which are not addressed here). Please have a look and let me
> know if this is along the lines of what you would expect. I have a few
> questions though.
> 
> 1. In the following example above :
>   char _1;
>   _1 = mem;
> 
> when changing with
> 
>   char _2;
>   int _1;
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
> 
> for the [zs]ext operation we now use BIT_AND_EXPR and ZEXT_EXPR which
> (as of now) requires that the LHS and RHS are of the same type. Are you
> suggesting that we should have a true ZEXT_EXPR and SEXT_EXPR which can
> do the above in the gimple? I am now using CONVERT_EXPR, which is the
> source of many optimization issues.
> 
> 2. for inline asm (a reduced test case that might not make much sense
> as a stand-alone test case, but I ran into similar cases with valid programs)
> 
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
> 
> fn1 (short int p1)
> {
>   <bb 2>:
>   __asm__("" : "=r" p1_2 : "0" p1_1(D));
>   return;
> 
> }
> 
> 
> I am generating something like the following, which ICEs. What is the
> expected output?
> 
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
> 
> fn1 (short int p1)
> {
>   int _1;
>   int _2;
>   short int _5;
> 
>   <bb 2>:
>   _1 = (int) p1_4(D);
>   _5 = (short int) _1;
>   __asm__("" : "=r" p1_6 : "0" _5);
>   _2 = (int) p1_6;
>   return;
> 
> }
> 
> Thanks a lot for your time,
> Kugan
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-06-01 23:20                       ` Kugan
  2015-06-19  2:55                         ` Kugan
@ 2015-07-28 11:05                         ` Richard Biener
  2015-08-05  0:12                           ` kugan
  1 sibling, 1 reply; 51+ messages in thread
From: Richard Biener @ 2015-07-28 11:05 UTC (permalink / raw)
  To: Kugan; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

On Tue, Jun 2, 2015 at 1:14 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>

Sorry for replying so late...

> On 08/05/15 22:48, Richard Biener wrote:
>> You compute which promotions are unsafe, like sources/sinks of memory
>> (I think you miss call arguments/return values and also asm operands here).
>> But instead of simply marking those SSA names as not to be promoted
>> I'd instead split their life-ranges, thus replace
>>
>>   _1 = mem;
>>
>> with
>>
>>   _2 = mem;
>>   _1 = [zs]ext (_2, ...);
>>
>> and promote _1 anyway.  So in the first phase I'd do that (and obviously
>> note that _2 isn't to be promoted in the specific example).
>>
>> For promotions that apply I wouldn't bother allocating new SSA names
>> but just "fix" their types (assign to their TREE_TYPE).  This also means
>> they have to become anonymous and if they didn't have a !DECL_IGNORED_P
>> decl before then a debug stmt should be inserted at the point of the
>> promotions.  So
>>
>>   bar_3 = _1 + _2;
>>
>> when promoted would become
>>
>>  _4 = _1 + _2;
>>  _3 = sext <_4, ...>;
>>  # DEBUG bar = (orig-type) _4;  // or _3?
>>
>> so you'd basically always promote defs (you have a lot of stmt/operand
>> walking code I didn't look too closely at - but it looks like too much) and
>> the uses get promoted automagically (because you promote the original
>> SSA name). Promotion of constants has to remain, of course.
>
>
> Thanks Richard. I experimented with this idea to understand it better.
> Please see the attached prototype (I am still working on your other
> comments, which are not addressed here). Please have a look and let me
> know if this is along the lines of what you would expect. I have a few
> questions though.
>
> 1. In the following example above :
>   char _1;
>   _1 = mem;
>
> when changing with
>
>   char _2;
>   int _1;
>   _2 = mem;
>   _1 = [zs]ext (_2, ...);
>
> for the [zs]ext operation we now use BIT_AND_EXPR and ZEXT_EXPR which
> (as of now) requires that the LHS and RHS are of the same type. Are you
> suggesting that we should have a true ZEXT_EXPR and SEXT_EXPR which can
> do the above in the gimple? I am now using CONVERT_EXPR, which is the
> source of many optimization issues.

You indeed need to use CONVERT_EXPR here; maybe you can elaborate
on the optimization issues.

> 2. for inline asm (a reduced test case that might not make much sense
> as a stand-alone test case, but I ran into similar cases with valid programs)
>
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
>
> fn1 (short int p1)
> {
>   <bb 2>:
>   __asm__("" : "=r" p1_2 : "0" p1_1(D));
>   return;
>
> }
>
>
> I am generating something like the following, which ICEs. What is the
> expected output?
>
> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
> symbol_order=0)
>
> fn1 (short int p1)
> {
>   int _1;
>   int _2;
>   short int _5;
>
>   <bb 2>:
>   _1 = (int) p1_4(D);
>   _5 = (short int) _1;
>   __asm__("" : "=r" p1_6 : "0" _5);
>   _2 = (int) p1_6;
>   return;
>
> }

Parameters are indeed "interesting" to handle ;)  As we now see on ARM
the incoming parameter (the default def) and later assignments to it
can require different promotions (well, different extensions for ARM).

The only sensible way to deal with promoting parameters is to
promote them by changing the function signature.  Thus reflect the
targets ABI for parameters in the GIMPLE representation (which
includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
IMHO we should do this during gimplification of parameters / call
arguments already.

So for your example you'd end up with

fn1 (int p1)
{
  __asm__("" : "=r" p1_6 : "0" p1_4(D));
  return;
}

that is, promotions also apply to asm inputs/outputs (no?)

Richard.

> Thanks a lot for your time,
> Kugan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-07-28 11:05                         ` Richard Biener
@ 2015-08-05  0:12                           ` kugan
  2015-08-05  9:10                             ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: kugan @ 2015-08-05  0:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 9583 bytes --]


> You indeed need to use CONVERT_EXPR here; maybe you can elaborate
> on the optimization issues.
>
>> 2. for inline asm (a reduced test case that might not make much sense
>> as a stand-alone test case, but I ran into similar cases with valid programs)
>>
>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>> symbol_order=0)
>>
>> fn1 (short int p1)
>> {
>>    <bb 2>:
>>    __asm__("" : "=r" p1_2 : "0" p1_1(D));
>>    return;
>>
>> }
>>
>>
>> I am generating something like the following, which ICEs. What is the
>> expected output?
>>
>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>> symbol_order=0)
>>
>> fn1 (short int p1)
>> {
>>    int _1;
>>    int _2;
>>    short int _5;
>>
>>    <bb 2>:
>>    _1 = (int) p1_4(D);
>>    _5 = (short int) _1;
>>    __asm__("" : "=r" p1_6 : "0" _5);
>>    _2 = (int) p1_6;
>>    return;
>>
>> }
>
> Parameters are indeed "interesting" to handle ;)  As we now see on ARM
> the incoming parameter (the default def) and later assignments to it
> can require different promotions (well, different extensions for ARM).
>
> The only sensible way to deal with promoting parameters is to
> promote them by changing the function signature.  Thus reflect the
> targets ABI for parameters in the GIMPLE representation (which
> includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
> IMHO we should do this during gimplification of parameters / call
> arguments already.
>
> So for your example you'd end up with
>
> fn1 (int p1)
> {
>    __asm__("" : "=r" p1_6 : "0" p1_4(D));
>    return;
> }
>
> that is, promotions also apply to asm inputs/outputs (no?)


Thanks for the review and answers. For the time being, I am handling
gimple_asm as one that has to be handled in the original type. I will look
into improving it after getting the basic framework right.

As it is, the attached patch bootstraps on x86_64-linux-gnu, arm-linux-gnu
and aarch64-linux-gnu. There are a few regressions to look into (please
see below).

There are cases where it works well and cases where it can be improved.
I am attaching a couple of test cases (and their results). I am seeing
that some of the BIT_AND_EXPRs inserted by the promotion are not being
optimized away when they are redundant. This is especially the case when
I invalidate the VRP range info from VRP1 during the type promotion. I am
looking into it.

Please note that the attached patch still needs to:
* Add gimple_debug stmts.
* Address the review comment on expr.c's handling of SEXT_EXPR.
* Fix the regression failures.

Based on the feedback, I will address the above and split the patch into
a logical patch set for easier detailed review.

Here are the outputs for the testcases.

--- c5.c.142t.veclower21	2015-08-05 08:50:11.367135339 +1000
+++ c5.c.143t.promotion	2015-08-05 08:50:11.367135339 +1000
@@ -1,34 +1,45 @@

  ;; Function unPack (unPack, funcdef_no=0, decl_uid=4145, cgraph_uid=0, 
symbol_order=0)

  unPack (unsigned char c)
  {
-  short int _1;
-  unsigned short _4;
-  unsigned short _5;
-  short int _6;
-  short int _7;
+  int _1;
+  unsigned int _2;
+  unsigned int _3;
+  unsigned int _4;
+  unsigned int _5;
+  int _6;
+  int _7;
+  unsigned int _9;
+  int _11;
+  int _12;
+  short int _13;

    <bb 2>:
-  c_3 = c_2(D) & 15;
-  if (c_3 > 7)
+  _2 = (unsigned int) c_10(D);
+  _3 = _2 & 15;
+  _9 = _3 & 255;
+  if (_9 > 7)
      goto <bb 3>;
    else
      goto <bb 4>;

    <bb 3>:
-  _4 = (unsigned short) c_3;
-  _5 = _4 + 65531;
-  _6 = (short int) _5;
+  _4 = _3 & 65535;
+  _5 = _4 + 4294967291;
+  _11 = (int) _5;
+  _6 = (_11) sext from bit (16);
    goto <bb 5>;

    <bb 4>:
-  _7 = (short int) c_3;
+  _12 = (int) _3;
+  _7 = (_12) sext from bit (16);

    <bb 5>:
    # _1 = PHI <_6(3), _7(4)>
-  return _1;
+  _13 = (short int) _1;
+  return _13;

  }


--- c5.org.s	2015-08-05 08:51:44.619133892 +1000
+++ c5.new.s	2015-08-05 08:51:29.643134124 +1000
@@ -16,16 +16,14 @@
  	.syntax divided
  	.arm
  	.type	unPack, %function
  unPack:
  	@ args = 0, pretend = 0, frame = 0
  	@ frame_needed = 0, uses_anonymous_args = 0
  	@ link register save eliminated.
  	and	r0, r0, #15
  	cmp	r0, #7
  	subhi	r0, r0, #5
-	uxth	r0, r0
-	sxth	r0, r0
  	bx	lr
  	.size	unPack, .-unPack
  	.ident	"GCC: (GNU) 6.0.0 20150724 (experimental)"
  	.section	.note.GNU-stack,"",%progbits
--- crc.c.142t.veclower21	2015-08-05 08:52:43.811132974 +1000
+++ crc.c.143t.promotion	2015-08-05 08:52:43.811132974 +1000
@@ -1,52 +1,78 @@

  ;; Function crc2 (crc2, funcdef_no=0, decl_uid=4146, cgraph_uid=0, 
symbol_order=0)

  crc2 (short unsigned int crc, unsigned char data)
  {
    unsigned char carry;
    unsigned char x16;
    unsigned char i;
-  unsigned char ivtmp_5;
-  unsigned char _9;
-  unsigned char _10;
-  unsigned char ivtmp_18;
+  unsigned int _2;
+  unsigned int _3;
+  unsigned int _5;
+  unsigned int _7;
+  unsigned int _8;
+  unsigned int _9;
+  unsigned int _10;
+  unsigned int _11;
+  unsigned int _12;
+  unsigned int _13;
+  unsigned int _15;
+  unsigned int _16;
+  unsigned int _18;
+  unsigned int _19;
+  unsigned int _21;
+  unsigned int _22;
+  unsigned int _24;
+  short unsigned int _25;
+  unsigned int _26;
+  unsigned int _27;
+  unsigned int _28;
+  unsigned int _29;

    <bb 2>:
+  _8 = (unsigned int) data_4(D);
+  _7 = (unsigned int) crc_30(D);

    <bb 3>:
-  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
-  # data_29 = PHI <data_12(5), data_8(D)(2)>
-  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
-  _9 = (unsigned char) crc_28;
-  _10 = _9 ^ data_29;
-  x16_11 = _10 & 1;
-  data_12 = data_29 >> 1;
-  if (x16_11 == 1)
+  # _28 = PHI <_2(5), _7(2)>
+  # _29 = PHI <_12(5), _8(2)>
+  # _18 = PHI <_5(5), 8(2)>
+  _9 = _28 & 255;
+  _10 = _9 ^ _29;
+  _11 = _10 & 1;
+  _3 = _29 & 255;
+  _12 = _3 >> 1;
+  _27 = _11 & 255;
+  if (_27 == 1)
      goto <bb 4>;
    else
      goto <bb 7>;

    <bb 4>:
-  crc_13 = crc_28 ^ 16386;
-  crc_24 = crc_13 >> 1;
-  crc_15 = crc_24 | 32768;
+  _13 = _28 ^ 16386;
+  _26 = _13 & 65535;
+  _24 = _26 >> 1;
+  _15 = _24 | 4294934528;

    <bb 5>:
-  # crc_2 = PHI <crc_15(4), crc_21(7)>
-  ivtmp_5 = ivtmp_18 - 1;
-  if (ivtmp_5 != 0)
+  # _2 = PHI <_15(4), _21(7)>
+  _5 = _18 - 1;
+  _22 = _5 & 255;
+  if (_22 != 0)
      goto <bb 3>;
    else
      goto <bb 6>;

    <bb 6>:
-  # crc_19 = PHI <crc_2(5)>
-  return crc_19;
+  # _19 = PHI <_2(5)>
+  _25 = (short unsigned int) _19;
+  return _25;

    <bb 7>:
-  crc_21 = crc_28 >> 1;
+  _16 = _28 & 65535;
+  _21 = _16 >> 1;
    goto <bb 5>;

  }


--- crc.org.s	2015-08-05 08:54:17.491131520 +1000
+++ crc.new.s	2015-08-05 08:53:12.183132534 +1000
@@ -15,27 +15,28 @@
  	.global	crc2
  	.syntax divided
  	.arm
  	.type	crc2, %function
  crc2:
  	@ args = 0, pretend = 0, frame = 0
  	@ frame_needed = 0, uses_anonymous_args = 0
  	mov	ip, #32768
  	movt	ip, 65535
  	str	lr, [sp, #-4]!
-	mov	r3, #8
+	mov	r2, #8
  	movw	lr, #16386
  .L3:
-	eor	r2, r1, r0
-	sub	r3, r3, #1
-	tst	r2, #1
+	uxtb	r3, r0
+	eor	r3, r3, r1
  	mov	r1, r1, lsr #1
+	tst	r3, #1
  	eorne	r0, r0, lr
-	moveq	r0, r0, lsr #1
-	orrne	r0, ip, r0, lsr #1
-	uxthne	r0, r0
-	ands	r3, r3, #255
+	ubfxeq	r0, r0, #1, #15
+	ubfxne	r0, r0, #1, #15
+	orrne	r0, r0, ip
+	subs	r2, r2, #1
  	bne	.L3
+	uxth	r0, r0
  	ldr	pc, [sp], #4
  	.size	crc2, .-crc2
  	.ident	"GCC: (GNU) 6.0.0 20150724 (experimental)"
  	.section	.note.GNU-stack,"",%progbits



Testsuite regression for x86_64-unknown-linux-gnu:
Tests that now fail, but worked before:
gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)


Testsuite regression for  arm-linux-gnu:
Tests that now fail, but worked before:
arm-sim: gcc.dg/fixed-point/convert-sat.c execution test
arm-sim: gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned
int\\)" 0
arm-sim: gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting
initializer" 0
arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized
"\\(int\\)"
arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized
"\\(unsigned char\\)" 8
arm-sim: gcc.target/arm/mla-2.c scan-assembler smlalbb
arm-sim: gcc.target/arm/unsigned-extend-2.c scan-assembler ands
arm-sim: gcc.target/arm/wmul-1.c scan-assembler-times smlabb 2
arm-sim: gcc.target/arm/wmul-2.c scan-assembler-times smulbb 1
arm-sim: gcc.target/arm/wmul-3.c scan-assembler-times smulbb 2
arm-sim: gcc.target/arm/wmul-9.c scan-assembler smlalbb
arm-sim: gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)

Tests that now work, but didn't before:
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 2
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1
arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 3" 1
arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times rev16ne\\t 1
arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times revshne\\t 1
arm-sim: gcc.target/arm/smlaltb-1.c scan-assembler smlaltb\\t
arm-sim: gcc.target/arm/smlaltt-1.c scan-assembler smlaltt\\t


Testsuite regression for  aarch64-linux-gnu:
Tests that now fail, but worked before:
c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned int\\)"
0
gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer"
0
gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized "\\(int\\)"
gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized "\\(unsigned
char\\)" 8

Thanks,
Kugan

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 34824 bytes --]

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 16d5582..63c9dd2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1493,6 +1493,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0b19953..6642c01 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4972,6 +4972,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/common.opt b/gcc/common.opt
index 6d47e94..5afda05 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2374,6 +2374,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/expr.c b/gcc/expr.c
index d601129..7483950 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9194,6 +9194,22 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode
+	    = smallest_mode_for_size (tree_to_shwi (treeop1),
+				      MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 886922f..bac899c 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e69de29..b5b69cc 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,815 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "flags.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "predict.h"
+#include "hard-reg-set.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they are
+   able to optimise the redundant truncations and extensions.  This is
+   based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.  */
+
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) % 8 != 0)
+    return type;
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on the type of NEW_VAR) the low WIDTH bits
+   of VAR and assign the extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+/* Transfer FROM's default-def status and variable to TO.  */
+static void
+duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the defining stmt
+   (DEF_STMT) allows its result to be of PROMOTED_TYPE, change the type
+   of DEF in place.  Otherwise, create NEW_DEF of the original type,
+   make DEF_STMT assign its value to NEW_DEF, and then create a
+   CONVERT_EXPR that converts NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted. */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  /* TODO: If DEF's variable is not DECL_IGNORED_P, insert a debug stmt.  */
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order,
+		     arguments of DEF_STMT will be visited before DEF.
+		     If RHS is already promoted and its type is
+		     compatible, we can convert this into a ZERO/SIGN
+		     EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else
+		{
+		  /* If RHS is not promoted or their types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* The type is now promoted.  Due to this, some of the value ranges
+	 computed by VRP1 will be invalid.  TODO: We can be intelligent in
+	 deciding which ranges to invalidate instead of invalidating
+	 everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, the value in USE has to be zero/sign
+		     extended based on the original type for a correct
+		     result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = old_type;
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, the value in USE has to be zero/sign
+		 extended based on the original type for a correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create a copy
+	     in the original type.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 64fc4d9..254496b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index aee36e6..38b8d7d 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -276,6 +276,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 66f999e..1db888d 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3748,6 +3748,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
@@ -5192,6 +5204,7 @@ gimple_verify_flow_info (void)
 
 	  if (found_ctrl_stmt)
 	    {
+	      dump_bb (stderr, gimple_bb (stmt), 0, 0);
 	      error ("control flow in the middle of basic block %d",
 		     bb->index);
 	      err = 1;
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..272c409 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 7b66a1c..7ddb55c 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -431,6 +431,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..04f6777 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 910cb19..19aa918 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -190,7 +190,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d962683..dee8f6f 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2389,6 +2389,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2949,6 +2950,57 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+      wide_int mask = wi::shwi (((1 << (prec - 1)) - 1),
+				TYPE_PRECISION (TREE_TYPE (vr0.max)));
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
+	  HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
+
+	  if (int_must_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero & mask;
+	    }
+	  else if ((int_may_be_nonzero & (1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero & mask;
+	      tmax = may_be_nonzero & mask;
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec);
+      tmax = wi::sext (tmax, prec);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9279,6 +9331,30 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  gcc_assert (is_gimple_min_invariant (op1));
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int mask;
+	  HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
+	  HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
+	  mask = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  if (must_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if ((may_be_nonzero & (1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9980,6 +10056,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..d98c999 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,10 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign extends the first operand from
+   the sign bit specified by the second operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is

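[Editorial aside, not part of the patch: the semantics of the new SEXT_EXPR <V, W>,
i.e. keep the low W bits of V and sign-extend from bit W-1, can be modelled in
plain C as below.  The helper name `sext` is illustrative only, not a GCC API.]

```c
#include <assert.h>
#include <stdint.h>

/* Model of SEXT_EXPR <v, w>: keep the low W bits of V and replicate
   bit W-1 into all higher bits.  "sext" is an illustrative helper,
   not part of GCC.  Assumes 0 < w < 64.  */
static int64_t
sext (uint64_t v, int w)
{
  uint64_t sign = (uint64_t) 1 << (w - 1);   /* sign bit of the narrow value */
  v &= ((uint64_t) 1 << w) - 1;              /* truncate to W bits */
  return (int64_t) ((v ^ sign) - sign);      /* xor/subtract sign-extends */
}
```

For example, sext (0xFFFB, 16) yields -5, matching the `sext from bit (16)`
operations visible in the promotion dumps quoted later in the thread.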
[-- Attachment #3: log.txt --]
[-- Type: text/plain, Size: 825 bytes --]

gcc/ChangeLog:

2015-07-05  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* Makefile.in: Add gimple-ssa-type-promote.o.
	* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
	* common.opt: New option -ftree-type-promote.
	* expr.c (expand_expr_real_2): Handle SEXT_EXPR.
	* fold-const.c (int_const_binop_1): Handle SEXT_EXPR.
	* gimple-ssa-type-promote.c: New file.
	* passes.def: Define new pass_type_promote.
	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
	* tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
	* tree-inline.c (estimate_operator_cost): Handle SEXT_EXPR.
	* tree-pass.h (make_pass_type_promote): New.
	* tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
	(op_symbol_code): Likewise.
	* tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
	(simplify_bit_ops_using_ranges): Likewise.
	* tree.def: Define new SEXT_EXPR.


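[Editorial note on why the inserted masks are safe (my own illustration, not
from the patch): computing narrow unsigned arithmetic in a wider type and then
truncating with the BIT_AND_EXPR the pass inserts produces the same value the
narrow operation would have produced.]

```c
#include <assert.h>

/* Narrow unsigned addition versus the promoted-plus-mask form the pass
   generates.  The 0xFF mask plays the role of the inserted BIT_AND_EXPR;
   both forms must agree for every input.  */
static int
narrow_matches_promoted (unsigned char a, unsigned char b)
{
  unsigned char narrow = (unsigned char) (a + b);       /* wraps at 8 bits */
  unsigned int wide = ((unsigned int) a + b) & 0xFF;    /* promoted + mask */
  return narrow == wide;
}
```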
[-- Attachment #4: c5.c --]
[-- Type: text/x-csrc, Size: 290 bytes --]

short unPack( unsigned char c )
{
    /* Only want lower four bit nibble */
    c = c & (unsigned char)0x0F ;

    if( c > 7 ) {
        /* Negative nibble */
        return( ( short )( c - 5 ) ) ;

    }
    else
    {
        /* positive nibble */
        return( ( short )c ) ;
    }
}


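[For reference, the expected behaviour of the attached unPack test case, with
values worked out by hand from the source above; handy when comparing the
promoted and unpromoted builds.]

```c
#include <assert.h>

/* Copy of the attached c5.c test case: keep the low nibble, and
   subtract 5 from nibbles above 7.  */
short
unPack (unsigned char c)
{
  c = c & (unsigned char) 0x0F;
  if (c > 7)
    return (short) (c - 5);   /* "negative" nibble */
  else
    return (short) c;         /* positive nibble */
}
```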
[-- Attachment #5: crc.c --]
[-- Type: text/x-csrc, Size: 431 bytes --]

unsigned short
crc2(unsigned short crc, unsigned char data)
{
   unsigned char i, x16, carry;
 
   for (i = 0; i < 8; i++)
     {
       x16 = (data ^ crc) & 1;
       data >>= 1;
 
       if (x16 == 1)
         {
           crc ^= 0x4002;
           carry = 1;
         }
       else
         carry = 0;
 
       crc >>= 1;
 
       if (carry)
         crc |= 0x8000;
       else
         crc &= 0x7fff;
     }
 
   return crc;
}

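[Likewise for the crc2 test case: the loop appears to implement one byte of the
reflected CRC-16 with polynomial 0xA001 (the 0x4002 xor before the shift plus
the carry into bit 15 is equivalent to xoring 0xA001 after the shift), so a
couple of well-known values serve as a sanity check.]

```c
#include <assert.h>

/* Copy of the attached crc.c test case, for sanity checking the
   promoted build against known CRC-16 values.  */
unsigned short
crc2 (unsigned short crc, unsigned char data)
{
  unsigned char i, x16, carry;

  for (i = 0; i < 8; i++)
    {
      x16 = (data ^ crc) & 1;
      data >>= 1;
      if (x16 == 1)
        {
          crc ^= 0x4002;
          carry = 1;
        }
      else
        carry = 0;
      crc >>= 1;
      if (carry)
        crc |= 0x8000;
      else
        crc &= 0x7fff;
    }
  return crc;
}
```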
^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC] Elimination of zext/sext - type promotion pass
  2015-08-05  0:12                           ` kugan
@ 2015-08-05  9:10                             ` Richard Biener
  0 siblings, 0 replies; 51+ messages in thread
From: Richard Biener @ 2015-08-05  9:10 UTC (permalink / raw)
  To: kugan; +Cc: Jeff Law, gcc-patches, Jakub Jelinek

On Wed, Aug 5, 2015 at 2:12 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>> You indeed need to use CONVERT_EXPR here, maybe you can elaborate
>> on the optimization issues.
>>
>>> 2. for inline asm (a reduced test case that might not make much sense
>>> as a stand-alone test case, but I ran into similar cases with valid
>>> programs)
>>>
>>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>>> symbol_order=0)
>>>
>>> fn1 (short int p1)
>>> {
>>>    <bb 2>:
>>>    __asm__("" : "=r" p1_2 : "0" p1_1(D));
>>>    return;
>>>
>>> }
>>>
>>>
>>> I am generating something like the following, which ICEs. What is the
>>> expected output?
>>>
>>> ;; Function fn1 (fn1, funcdef_no=0, decl_uid=4220, cgraph_uid=0,
>>> symbol_order=0)
>>>
>>> fn1 (short int p1)
>>> {
>>>    int _1;
>>>    int _2;
>>>    short int _5;
>>>
>>>    <bb 2>:
>>>    _1 = (int) p1_4(D);
>>>    _5 = (short int) _1;
>>>    __asm__("" : "=r" p1_6 : "0" _5);
>>>    _2 = (int) p1_6;
>>>    return;
>>>
>>> }
>>
>>
>> Parameters are indeed "interesting" to handle ;)  As we now see on ARM
>> the incoming parameter (the default def) and later assignments to it
>> can require different promotions (well, different extensions for ARM).
>>
>> The only sensible way to deal with promoting parameters is to
>> promote them by changing the function signature.  Thus reflect the
>> targets ABI for parameters in the GIMPLE representation (which
>> includes TYPE_ARG_TYPES and DECL_ARGUMENTS).
>> IMHO we should do this during gimplification of parameters / call
>> arguments already.
>>
>> So for your example you'd end up with
>>
>> fn1 (int p1)
>> {
>>    __asm__("" : "=r" p1_6 : "0" p1_4(D));
>>    return;
>> }
>>
>> that is, promotions also apply to asm inputs/outputs (no?)
>
>
>
> Thanks for the review and answers. For the time being, I am handling
> gimple_asm as one that has to be handled in the original type. I will
> look into improving it after getting the basic framework right.

Yeah, that's always a possibility.  I also see from the dumps that we probably
want to promote function arguments and results on GIMPLE as well.  Possibly
very early during gimplification or as an early IPA pass (as it needs to adjust
the IL for calls as well, exposing ABI required promotions / extensions).

> As it is, attached patch bootstraps on x86_64-linux-gnu, arm-linux-gnu and
> aarch64-linux-gnu. There are few regressions to look into (Please see
> below).
>
> There are cases where it is working well and cases where it can be
> improved. I am attaching a couple of test cases (and their results). I am
> seeing that some BIT_AND_EXPRs inserted by the promotion are not being
> optimized away even when they are redundant. This is especially the case
> when I invalidate the VRP range info from VRP1 during the type promotion.
> I am looking into it.
>
> Please note that attached patch still needs to address:
> * Adding gimple_debug stmts.
> * Address review comment for expr.c handling SEXT_EXPR.
> * Address regression failures
>
> Based on the feedback, I will address the above and split the patch into
> logical patch set for easy detailed review.
>
> Here are the outputs for the testcases.
>
> --- c5.c.142t.veclower21        2015-08-05 08:50:11.367135339 +1000
> +++ c5.c.143t.promotion 2015-08-05 08:50:11.367135339 +1000
> @@ -1,34 +1,45 @@
>
>  ;; Function unPack (unPack, funcdef_no=0, decl_uid=4145, cgraph_uid=0,
> symbol_order=0)
>
>  unPack (unsigned char c)
>  {
> -  short int _1;
> -  unsigned short _4;
> -  unsigned short _5;
> -  short int _6;
> -  short int _7;
> +  int _1;
> +  unsigned int _2;
> +  unsigned int _3;
> +  unsigned int _4;
> +  unsigned int _5;
> +  int _6;
> +  int _7;
> +  unsigned int _9;
> +  int _11;
> +  int _12;
> +  short int _13;
>
>    <bb 2>:
> -  c_3 = c_2(D) & 15;
> -  if (c_3 > 7)
> +  _2 = (unsigned int) c_10(D);
> +  _3 = _2 & 15;
> +  _9 = _3 & 255;
> +  if (_9 > 7)
>      goto <bb 3>;
>    else
>      goto <bb 4>;
>
>    <bb 3>:
> -  _4 = (unsigned short) c_3;
> -  _5 = _4 + 65531;
> -  _6 = (short int) _5;
> +  _4 = _3 & 65535;
> +  _5 = _4 + 4294967291;
> +  _11 = (int) _5;
> +  _6 = (_11) sext from bit (16);

Ok, so in GIMPLE we still have sign-changing conversions.  Another
thing we might want to lower at some stage ... ;)

>    goto <bb 5>;
>
>    <bb 4>:
> -  _7 = (short int) c_3;
> +  _12 = (int) _3;
> +  _7 = (_12) sext from bit (16);
>
>    <bb 5>:
>    # _1 = PHI <_6(3), _7(4)>
> -  return _1;
> +  _13 = (short int) _1;
> +  return _13;
>
>  }

Overall this looks like what I'd have expected - also pointing out the
missing argument/return value promotion.

>
> --- c5.org.s    2015-08-05 08:51:44.619133892 +1000
> +++ c5.new.s    2015-08-05 08:51:29.643134124 +1000
> @@ -16,16 +16,14 @@
>         .syntax divided
>         .arm
>         .type   unPack, %function
>  unPack:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         and     r0, r0, #15
>         cmp     r0, #7
>         subhi   r0, r0, #5
> -       uxth    r0, r0
> -       sxth    r0, r0

Nice.

>         bx      lr
>         .size   unPack, .-unPack
>         .ident  "GCC: (GNU) 6.0.0 20150724 (experimental)"
>         .section        .note.GNU-stack,"",%progbits
> --- crc.c.142t.veclower21       2015-08-05 08:52:43.811132974 +1000
> +++ crc.c.143t.promotion        2015-08-05 08:52:43.811132974 +1000
> @@ -1,52 +1,78 @@
>
>  ;; Function crc2 (crc2, funcdef_no=0, decl_uid=4146, cgraph_uid=0,
> symbol_order=0)
>
>  crc2 (short unsigned int crc, unsigned char data)
>  {
>    unsigned char carry;
>    unsigned char x16;
>    unsigned char i;
> -  unsigned char ivtmp_5;
> -  unsigned char _9;
> -  unsigned char _10;
> -  unsigned char ivtmp_18;
> +  unsigned int _2;
> +  unsigned int _3;
> +  unsigned int _5;
> +  unsigned int _7;
> +  unsigned int _8;
> +  unsigned int _9;
> +  unsigned int _10;
> +  unsigned int _11;
> +  unsigned int _12;
> +  unsigned int _13;
> +  unsigned int _15;
> +  unsigned int _16;
> +  unsigned int _18;
> +  unsigned int _19;
> +  unsigned int _21;
> +  unsigned int _22;
> +  unsigned int _24;
> +  short unsigned int _25;
> +  unsigned int _26;
> +  unsigned int _27;
> +  unsigned int _28;
> +  unsigned int _29;
>
>    <bb 2>:
> +  _8 = (unsigned int) data_4(D);
> +  _7 = (unsigned int) crc_30(D);
>
>    <bb 3>:
> -  # crc_28 = PHI <crc_2(5), crc_7(D)(2)>
> -  # data_29 = PHI <data_12(5), data_8(D)(2)>
> -  # ivtmp_18 = PHI <ivtmp_5(5), 8(2)>
> -  _9 = (unsigned char) crc_28;
> -  _10 = _9 ^ data_29;
> -  x16_11 = _10 & 1;
> -  data_12 = data_29 >> 1;
> -  if (x16_11 == 1)
> +  # _28 = PHI <_2(5), _7(2)>
> +  # _29 = PHI <_12(5), _8(2)>
> +  # _18 = PHI <_5(5), 8(2)>
> +  _9 = _28 & 255;
> +  _10 = _9 ^ _29;
> +  _11 = _10 & 1;
> +  _3 = _29 & 255;
> +  _12 = _3 >> 1;
> +  _27 = _11 & 255;
> +  if (_27 == 1)
>      goto <bb 4>;
>    else
>      goto <bb 7>;
>
>    <bb 4>:
> -  crc_13 = crc_28 ^ 16386;
> -  crc_24 = crc_13 >> 1;
> -  crc_15 = crc_24 | 32768;
> +  _13 = _28 ^ 16386;
> +  _26 = _13 & 65535;
> +  _24 = _26 >> 1;
> +  _15 = _24 | 4294934528;
>
>    <bb 5>:
> -  # crc_2 = PHI <crc_15(4), crc_21(7)>
> -  ivtmp_5 = ivtmp_18 - 1;
> -  if (ivtmp_5 != 0)
> +  # _2 = PHI <_15(4), _21(7)>
> +  _5 = _18 - 1;
> +  _22 = _5 & 255;
> +  if (_22 != 0)
>      goto <bb 3>;
>    else
>      goto <bb 6>;
>
>    <bb 6>:
> -  # crc_19 = PHI <crc_2(5)>
> -  return crc_19;
> +  # _19 = PHI <_2(5)>
> +  _25 = (short unsigned int) _19;
> +  return _25;
>
>    <bb 7>:
> -  crc_21 = crc_28 >> 1;
> +  _16 = _28 & 65535;
> +  _21 = _16 >> 1;
>    goto <bb 5>;
>
>  }
>
>
> --- crc.org.s   2015-08-05 08:54:17.491131520 +1000
> +++ crc.new.s   2015-08-05 08:53:12.183132534 +1000
> @@ -15,27 +15,28 @@
>         .global crc2
>         .syntax divided
>         .arm
>         .type   crc2, %function
>  crc2:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         mov     ip, #32768
>         movt    ip, 65535
>         str     lr, [sp, #-4]!
> -       mov     r3, #8
> +       mov     r2, #8
>         movw    lr, #16386
>  .L3:
> -       eor     r2, r1, r0
> -       sub     r3, r3, #1
> -       tst     r2, #1
> +       uxtb    r3, r0
> +       eor     r3, r3, r1
>         mov     r1, r1, lsr #1
> +       tst     r3, #1
>         eorne   r0, r0, lr
> -       moveq   r0, r0, lsr #1
> -       orrne   r0, ip, r0, lsr #1
> -       uxthne  r0, r0
> -       ands    r3, r3, #255
> +       ubfxeq  r0, r0, #1, #15
> +       ubfxne  r0, r0, #1, #15
> +       orrne   r0, r0, ip
> +       subs    r2, r2, #1
>         bne     .L3
> +       uxth    r0, r0
>         ldr     pc, [sp], #4
>         .size   crc2, .-crc2
>         .ident  "GCC: (GNU) 6.0.0 20150724 (experimental)"
>         .section        .note.GNU-stack,"",%progbits

Can't really decipher these changes...

>
>
> Testsuite regression for x86_64-unknown-linux-gnu:
> Tests that now fail, but worked before:
> gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)

I see this on pristine trunk as well.

>
> Testsuite regression for  arm-linux-gnu:
> Tests that now fail, but worked before:
> arm-sim: gcc.dg/fixed-point/convert-sat.c execution test
> arm-sim: gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned
> int\\)" 0
> arm-sim: gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting
> initializer" 0
> arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized
> "\\(int\\)"
> arm-sim: gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized
> "\\(unsigned char\\)" 8
> arm-sim: gcc.target/arm/mla-2.c scan-assembler smlalbb
> arm-sim: gcc.target/arm/unsigned-extend-2.c scan-assembler ands
> arm-sim: gcc.target/arm/wmul-1.c scan-assembler-times smlabb 2
> arm-sim: gcc.target/arm/wmul-2.c scan-assembler-times smulbb 1
> arm-sim: gcc.target/arm/wmul-3.c scan-assembler-times smulbb 2
> arm-sim: gcc.target/arm/wmul-9.c scan-assembler smlalbb
> arm-sim: gfortran.dg/graphite/pr42393-1.f90   -O  (test for excess errors)
>
> Tests that now work, but didn't before:
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 0" 2
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 2" 1
> arm-sim: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 3" 1
> arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times rev16ne\\t 1
> arm-sim: gcc.target/arm/builtin-bswap-1.c scan-assembler-times revshne\\t 1
> arm-sim: gcc.target/arm/smlaltb-1.c scan-assembler smlaltb\\t
> arm-sim: gcc.target/arm/smlaltt-1.c scan-assembler smlaltt\\t
>
>
> Testsuite regression for  aarch64-linux-gnu:
> Tests that now fail, but worked before:
> c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
> c-c++-common/torture/vector-compare-1.c   -O3 -g  (test for excess errors)
> gcc.dg/tree-ssa/20030729-1.c scan-tree-dump-times dom2 "\\(unsigned int\\)"
> 0
> gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer"
> 0
> gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-not optimized "\\(int\\)"
> gcc.dg/tree-ssa/shorten-1.c scan-tree-dump-times optimized "\\(unsigned
> char\\)" 8

tree-dump scan differences are expected, of course.  Others need to be
investigated.

Thanks for continuing to work on this!  I hope to have a closer look
at the updated patch later.

Thanks,
Richard.


> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-07  5:25                                     ` Kugan
@ 2014-08-07  8:09                                       ` Richard Biener
  0 siblings, 0 replies; 51+ messages in thread
From: Richard Biener @ 2014-08-07  8:09 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Thu, Aug 7, 2014 at 7:24 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> On 06/08/14 23:29, Richard Biener wrote:
>> On Wed, Aug 6, 2014 at 3:21 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>> On 06/08/14 22:09, Richard Biener wrote:
>>>> On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
>>>>>> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
>>>>>> on the subreg?  That is, for the created (subreg:lhs_mode
>>>>>> (reg:<PROMOTE_MODE of ssa> N))?
>>>>>
>>>>> SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
>>>>> the subreg is both zero and sign extended, which means
>>>>> that the topmost bit of the narrower mode is known to be zero,
>>>>> and all bits above it in the wider mode are known to be zero too.
>>>>> SRP_SIGNED means that the topmost bit of the narrower mode is
>>>>> either 0 or 1 and depending on that the above wider mode bits
>>>>> are either all 0 or all 1.
>>>>> SRP_UNSIGNED means that regardless of the topmost bit value,
>>>>> all above wider mode bits are 0.
>>>>
>>>> Ok, then from the context of the patch we already know that
>>>> either SRP_UNSIGNED or SRP_SIGNED is true which means
>>>> that the value is sign- or zero-extended.
>>>>
>>>> I suppose inside promoted_for_type_p
>>>> TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure
>>>> why you pass !unsignedp as lhs_uns.
>>>
>>> In expand_expr_real_1, it is already known that it is promoted for
>>> unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp).
>>>
>>> If we can prove that it is also promoted for !unsignedp, we can set
>>> SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED).
>>>
>>> promoted_for_type_p should prove this based on the value range info.
>>>
>>>>
>>>> Now, from 'ssa' alone we can't tell anything about a larger mode
>>>> registers value if that is either zero- or sign-extended.  But we
>>>> know that those bits are properly zero-extended if unsignedp
>>>> and properly sign-extended if !unsignedp?
>>>>
>>>> So what the predicate tries to prove is that sign- and zero-extending
>>>> results in the same larger-mode value.  This is true if the
>>>> MSB of the smaller mode is not set.
>>>>
>>>> Let's assume that smaller mode is that of 'ssa' then the test
>>>> is just
>>>>
>>>>   return (!tree_int_cst_sign_bit (min) && !tree_int_cst_sign_bit (max));
>>>>
>>>> no?
>>>
>>> Hmm, is this because we will never have a call to promoted_for_type_p
>>> with the same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode?
>>> The case with larger mode signed and 'ssa' unsigned will not work.
>>> Therefore larger mode unsigned and 'ssa' signed will be the only case
>>> that we should consider.
>>>
>>> However, with PROMOTE_MODE, isn't it possible that we will miss some cases with this?
>>
>> No, PROMOTE_MODE will still either sign- or zero-extend.  If either
>> results in zeros in the upper bits then PROMOTE_MODE doesn't matter.
>>
>
> Thanks for the explanation. Please find the attached patch that
> implements this. I have updated the comments and predicate to match this.
>
> Bootstrap tested on x86_64-unknown-linux-gnu and regression tested on
> x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new
> regressions. Is this OK?

Ok with changing

+      || (TYPE_PRECISION (TREE_TYPE (ssa)) > GET_MODE_PRECISION (mode)))
+    return false;

to check with != (does that even happen?)

Thanks,
Richard.

> Thanks,
> Kugan
>
> gcc/
> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * calls.c (precompute_arguments): Check
>          promoted_for_signed_and_unsigned_p and set the promoted mode.
>         (promoted_for_signed_and_unsigned_p): New function.
>         (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>         and set the promoted mode.
>         * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>         * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>         SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
>
> gcc/testsuite
> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * gcc.dg/zero_sign_ext_test.c: New test.
>
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-06 13:29                                   ` Richard Biener
@ 2014-08-07  5:25                                     ` Kugan
  2014-08-07  8:09                                       ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-07  5:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3763 bytes --]

On 06/08/14 23:29, Richard Biener wrote:
> On Wed, Aug 6, 2014 at 3:21 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>> On 06/08/14 22:09, Richard Biener wrote:
>>> On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
>>>>> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
>>>>> on the subreg?  That is, for the created (subreg:lhs_mode
>>>>> (reg:<PROMOTE_MODE of ssa> N))?
>>>>
>>>> SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
>>>> the subreg is both zero and sign extended, which means
>>>> that the topmost bit of the narrower mode is known to be zero,
>>>> and all bits above it in the wider mode are known to be zero too.
>>>> SRP_SIGNED means that the topmost bit of the narrower mode is
>>>> either 0 or 1 and depending on that the above wider mode bits
>>>> are either all 0 or all 1.
>>>> SRP_UNSIGNED means that regardless of the topmost bit value,
>>>> all above wider mode bits are 0.
>>>
>>> Ok, then from the context of the patch we already know that
>>> either SRP_UNSIGNED or SRP_SIGNED is true which means
>>> that the value is sign- or zero-extended.
>>>
>>> I suppose inside promoted_for_type_p
>>> TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure
>>> why you pass !unsignedp as lhs_uns.
>>
>> In expand_expr_real_1, it is already known that it is promoted for
>> unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp).
>>
>> If we can prove that it is also promoted for !unsignedp, we can set
>> SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED).
>>
>> promoted_for_type_p should prove this based on the value range info.
>>
>>>
>>> Now, from 'ssa' alone we can't tell anything about a larger mode
>>> registers value if that is either zero- or sign-extended.  But we
>>> know that those bits are properly zero-extended if unsignedp
>>> and properly sign-extended if !unsignedp?
>>>
>>> So what the predicate tries to prove is that sign- and zero-extending
>>> results in the same larger-mode value.  This is true if the
>>> MSB of the smaller mode is not set.
>>>
>>> Let's assume that smaller mode is that of 'ssa' then the test
>>> is just
>>>
>>>   return (!tree_int_cst_sign_bit (min) && !tree_int_cst_sign_bit (max));
>>>
>>> no?
>>
>> Hmm, is this because we will never have a call to promoted_for_type_p
>> with the same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode?
>> The case with larger mode signed and 'ssa' unsigned will not work.
>> Therefore larger mode unsigned and 'ssa' signed will be the only case
>> that we should consider.
>>
>> However, with PROMOTE_MODE, isn't it possible that we will miss some cases with this?
> 
> No, PROMOTE_MODE will still either sign- or zero-extend.  If either
> results in zeros in the upper bits then PROMOTE_MODE doesn't matter.
> 

Thanks for the explanation. Please find the attached patch that
implements this. I have updated the comments and predicate to match this.

Bootstrap tested on x86_64-unknown-linux-gnu and regression tested on
x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new
regressions. Is this OK?

Thanks,
Kugan

gcc/
2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* calls.c (precompute_arguments): Check
	 promoted_for_signed_and_unsigned_p and set the promoted mode.
	(promoted_for_signed_and_unsigned_p): New function.
	(expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
	and set the promoted mode.
	* expr.h (promoted_for_signed_and_unsigned_p): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.


gcc/testsuite
2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gcc.dg/zero_sign_ext_test.c: New test.



[-- Attachment #2: p2.txt --]
[-- Type: text/plain, Size: 8498 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index 00c5028..4285ec1 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (promoted_for_signed_and_unsigned_p (args[i].tree_value, mode))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index f98c322..b14626c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index 1242031..c217b9a 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-address.h"
 #include "cfgexpand.h"
 #include "builtins.h"
+#include "tree-ssa.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -9224,6 +9225,35 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if value in SSA is zero and sign extended for wider mode MODE
+   using value range information stored.  Return FALSE otherwise.
+
+   This is used to check if SUBREG is zero and sign extended and to set
+   promoted mode SRP_SIGNED_AND_UNSIGNED to SUBREG.  */
+
+bool
+promoted_for_signed_and_unsigned_p (tree ssa, enum machine_mode mode)
+{
+  wide_int min, max;
+
+  if (ssa == NULL_TREE
+      || TREE_CODE (ssa) != SSA_NAME
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
+      || (TYPE_PRECISION (TREE_TYPE (ssa)) > GET_MODE_PRECISION (mode)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA.  */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+
+  /* Return true (to set SRP_SIGNED_AND_UNSIGNED to SUBREG) if MSB of the smaller
+     mode is not set (i.e. MSB of ssa is not set).  */
+  if (!wi::neg_p (min, SIGNED) && !wi::neg_p(max, SIGNED))
+    return true;
+  else
+    return false;
+
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9527,7 +9557,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (promoted_for_signed_and_unsigned_p (ssa_name, mode))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 	  return temp;
 	}
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..a429509 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool promoted_for_signed_and_unsigned_p (tree, enum machine_mode);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.
diff --git a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
index e69de29..6a52678 100644
--- a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
+++ b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
@@ -0,0 +1,136 @@
+extern void abort (void);
+
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+#define TYPE_MAX(type, sign)	\
+  ((!sign) ? ((1 << (sizeof (type) * 8 - 1)) - 1) :	\
+   ((1 << (sizeof (type) * 8)) - 1))
+#define TYPE_MIN(type, sign)	\
+  ((!sign) ? -(1 << (sizeof (type) * 8 - 1)) : 0)
+
+#define TEST_FN(NAME, ARG_TYPE, RET_TYPE, CAST_TYPE, VAL, VR_MIN, VR_MAX)\
+  __attribute__((noinline, noclone)) RET_TYPE				\
+      NAME (ARG_TYPE arg){						\
+      RET_TYPE ret = VAL;						\
+      if (arg + 1 < VR_MIN || arg + 1 > VR_MAX) return ret;		\
+      /* Value Range of arg at this point will be  [VR_min, VR_max].  */\
+      arg = arg + VAL;							\
+      ret = (CAST_TYPE)arg;						\
+      return arg;							\
+  }
+
+/* Signed to signed conversion with value in-range.  */
+TEST_FN (foo1, short, short, char, 1, TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+TEST_FN (foo2, short, short, char, 1, TYPE_MIN (char, 0) + 1,\
+	TYPE_MAX (char, 0) - 1);
+
+/* Signed to signed conversion with value not in-range.  */
+TEST_FN (foo3, short, short, char, -1, TYPE_MIN (short, 0) + 1,  100);
+TEST_FN (foo4, short, short, char, 1, 12, TYPE_MAX (short, 0) + 1);
+
+/* Unsigned to unsigned conversion with value in-range.  */
+TEST_FN (foo5, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo6, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+/* Unsigned to unsigned conversion with value not in-range.  */
+TEST_FN (foo7, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1) + 1, TYPE_MAX (short, 1) - 1);
+TEST_FN (foo8, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1), TYPE_MAX (short, 1));
+
+/* Signed to unsigned conversion with value range positive.  */
+TEST_FN (foo9, short, short, unsigned char, -1, 1,\
+	TYPE_MAX (char, 1) - 1);
+TEST_FN (foo10, short, short, unsigned char, 1, 0,\
+	TYPE_MAX (char, 1));
+
+/* Signed to unsigned conversion with value range negative.  */
+TEST_FN (foo11, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo12, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range in signed equiv range.  */
+TEST_FN (foo13, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo14, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range not-in signed range.  */
+TEST_FN (foo15, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo16, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+int main ()
+{
+  /* Signed to signed conversion with value in-range.  */
+  /* arg + 1.  */
+  if (foo1 (-32) != -31)
+    abort ();
+  /* arg + 1.  */
+  if (foo2 (32) != 33)
+    abort ();
+
+  /* Signed to signed conversion with value not in-range.  */
+  /* arg - 1.  */
+  if (foo3 (-512) != -513)
+    abort ();
+  /* arg + 1.  */
+  if (foo4 (512) != 513)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value in-range.  */
+  /* arg + 1.  */
+  if (foo5 (64) != 65)
+    abort ();
+  /* arg + 1.  */
+  if (foo6 (64) != 65)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value not in-range.  */
+  /* arg + 1.  */
+  if (foo7 (512) != 513)
+    abort ();
+  /* arg + 1.  */
+  if (foo8 (512) != 513)
+    abort ();
+
+  /* Signed to unsigned conversion with value range positive.  */
+  /* arg - 1.  */
+  if (foo9 (2) != 1)
+    abort ();
+  /* arg + 1.  */
+  if (foo10 (2) != 3)
+    abort ();
+
+  /* Signed to unsigned conversion with value range negative.  */
+  /* arg + 1.  */
+  if (foo11 (-125) != -124)
+    abort ();
+  /* arg + 1.  */
+  if (foo12 (-125) != -124)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range in signed equiv range.  */
+  /* arg + 1.  */
+  if (foo13 (125) != 126)
+    abort ();
+  /* arg + 1.  */
+  if (foo14 (125) != 126)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range not-in signed range.  */
+  /* arg + 1.  */
+  if (foo15 (250) != 251)
+    abort ();
+  /* arg + 1.  */
+  if (foo16 (250) != 251)
+    abort ();
+
+  return 0;
+}
+


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-06 13:22                                 ` Kugan
@ 2014-08-06 13:29                                   ` Richard Biener
  2014-08-07  5:25                                     ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-08-06 13:29 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Wed, Aug 6, 2014 at 3:21 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> On 06/08/14 22:09, Richard Biener wrote:
>> On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
>>>> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
>>>> on the subreg?  That is, for the created (subreg:lhs_mode
>>>> (reg:<PROMOTE_MODE of ssa> N))?
>>>
>>> SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
>>> the subreg is both zero and sign extended, which means
>>> that the topmost bit of the narrower mode is known to be zero,
>>> and all bits above it in the wider mode are known to be zero too.
>>> SRP_SIGNED means that the topmost bit of the narrower mode is
>>> either 0 or 1 and depending on that the above wider mode bits
>>> are either all 0 or all 1.
>>> SRP_UNSIGNED means that regardless of the topmost bit value,
>>> all above wider mode bits are 0.
>>
>> Ok, then from the context of the patch we already know that
>> either SRP_UNSIGNED or SRP_SIGNED is true which means
>> that the value is sign- or zero-extended.
>>
>> I suppose inside promoted_for_type_p
>> TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure
>> why you pass !unsignedp as lhs_uns.
>
> In expand_expr_real_1, it is already known that it is promoted for
> unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp).
>
> If we can prove that it is also promoted for !unsignedp, we can set
> SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED).
>
> promoted_for_type_p should prove this based on the value range info.
>
>>
>> Now, from 'ssa' alone we can't tell anything about a larger mode
>> registers value if that is either zero- or sign-extended.  But we
>> know that those bits are properly zero-extended if unsignedp
>> and properly sign-extended if !unsignedp?
>>
>> So what the predicate tries to prove is that sign- and zero-extending
>> results in the same larger-mode value.  This is true if the
>> MSB of the smaller mode is not set.
>>
>> Let's assume that smaller mode is that of 'ssa' then the test
>> is just
>>
>>   return (!tree_int_cst_sign_bit (min) && !tree_int_cst_sign_bit (max));
>>
>> no?
>
> Hmm, is this because we will never have a call to promoted_for_type_p
> with the same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode?
> The case with larger mode signed and 'ssa' unsigned will not work.
> Therefore larger mode unsigned and 'ssa' signed will be the only case
> that we should consider.
>
> However, with PROMOTE_MODE, isn't it possible that we will miss some cases with this?

No, PROMOTE_MODE will still either sign- or zero-extend.  If either
results in zeros in the upper bits then PROMOTE_MODE doesn't matter.

Richard.

> Thanks,
> Kugan
>
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-06 12:09                               ` Richard Biener
@ 2014-08-06 13:22                                 ` Kugan
  2014-08-06 13:29                                   ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-06 13:22 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek; +Cc: gcc-patches

On 06/08/14 22:09, Richard Biener wrote:
> On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
>>> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
>>> on the subreg?  That is, for the created (subreg:lhs_mode
>>> (reg:<PROMOTE_MODE of ssa> N))?
>>
>> SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
>> the subreg is both zero and sign extended, which means
>> that the topmost bit of the narrower mode is known to be zero,
>> and all bits above it in the wider mode are known to be zero too.
>> SRP_SIGNED means that the topmost bit of the narrower mode is
>> either 0 or 1 and depending on that the above wider mode bits
>> are either all 0 or all 1.
>> SRP_UNSIGNED means that regardless of the topmost bit value,
>> all above wider mode bits are 0.
> 
> Ok, then from the context of the patch we already know that
> either SRP_UNSIGNED or SRP_SIGNED is true which means
> that the value is sign- or zero-extended.
> 
> I suppose inside promoted_for_type_p
> TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure
> why you pass !unsignedp as lhs_uns.

In expand_expr_real_1, it is already known that it is promoted for
unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp).

If we can prove that it is also promoted for !unsignedp, we can set
SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED).

promoted_for_type_p should prove this based on the value range info.

> 
> Now, from 'ssa' alone we can't tell anything about a larger mode
> registers value if that is either zero- or sign-extended.  But we
> know that those bits are properly zero-extended if unsignedp
> and properly sign-extended if !unsignedp?
> 
> So what the predicate tries to prove is that sign- and zero-extending
> results in the same larger-mode value.  This is true if the
> MSB of the smaller mode is not set.
> 
> Let's assume that smaller mode is that of 'ssa' then the test
> is just
> 
>   return (!tree_int_cst_sign_bit (min) && !tree_int_cst_sign_bit (max));
> 
> no?

Hmm, is this because we will never have a call to promoted_for_type_p
with the same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode?
The case with larger mode signed and 'ssa' unsigned will not work.
Therefore larger mode unsigned and 'ssa' signed will be the only case
that we should consider.

However, with PROMOTE_MODE, isn't it possible that we will miss some cases with this?

Thanks,
Kugan



* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-05 14:21                             ` Jakub Jelinek
@ 2014-08-06 12:09                               ` Richard Biener
  2014-08-06 13:22                                 ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-08-06 12:09 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Kugan, gcc-patches

On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
>> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
>> on the subreg?  That is, for the created (subreg:lhs_mode
>> (reg:<PROMOTE_MODE of ssa> N))?
>
> SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
> the subreg is both zero and sign extended, which means
> that the topmost bit of the narrower mode is known to be zero,
> and all bits above it in the wider mode are known to be zero too.
> SRP_SIGNED means that the topmost bit of the narrower mode is
> either 0 or 1 and depending on that the above wider mode bits
> are either all 0 or all 1.
> SRP_UNSIGNED means that regardless of the topmost bit value,
> all above wider mode bits are 0.

Ok, then from the context of the patch we already know that
either SRP_UNSIGNED or SRP_SIGNED is true which means
that the value is sign- or zero-extended.

I suppose inside promoted_for_type_p
TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure
why you pass !unsignedp as lhs_uns.

Now, from 'ssa' alone we can't tell anything about a larger mode
registers value if that is either zero- or sign-extended.  But we
know that those bits are properly zero-extended if unsignedp
and properly sign-extended if !unsignedp?

So what the predicate tries to prove is that sign- and zero-extending
results in the same larger-mode value.  This is true if the
MSB of the smaller mode is not set.

Let's assume that smaller mode is that of 'ssa' then the test
is just

  return (!tree_int_cst_sign_bit (min) && !tree_int_cst_sign_bit (max));

no?

Thanks,
Richard.

>         Jakub


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-05 14:18                           ` Richard Biener
@ 2014-08-05 14:21                             ` Jakub Jelinek
  2014-08-06 12:09                               ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Jakub Jelinek @ 2014-08-05 14:21 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kugan, gcc-patches

On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote:
> what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
> on the subreg?  That is, for the created (subreg:lhs_mode
> (reg:<PROMOTE_MODE of ssa> N))?

SRP_SIGNED_AND_UNSIGNED on a subreg should mean that
the subreg is both zero and sign extended, which means
that the topmost bit of the narrower mode is known to be zero,
and all bits above it in the wider mode are known to be zero too.
SRP_SIGNED means that the topmost bit of the narrower mode is
either 0 or 1 and depending on that the above wider mode bits
are either all 0 or all 1.
SRP_UNSIGNED means that regardless of the topmost bit value,
all above wider mode bits are 0.

	Jakub


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-01 16:04                         ` Kugan
  2014-08-03 23:56                           ` Kugan
@ 2014-08-05 14:18                           ` Richard Biener
  2014-08-05 14:21                             ` Jakub Jelinek
  1 sibling, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-08-05 14:18 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Fri, Aug 1, 2014 at 6:03 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>>  if (rhs_uns)
>>>>    return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
>>>>  else
>>>>    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
>>>>                     SIGNED));  // if max <= signed-max-of-type then range
>>>>                                // doesn't need sign-extension
>>>
>>> I think we will have to check that ssa has necessary sign/zero extension
>>> when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will
>>> be interpreted differently, the value range of ssa also will have
>>> corresponding range.  In this cases, shouldn’t we have to check for
>>> upper and lower limit for both min and max?
>>
>> Hmm?  That's exactly what the check is testing...  we know that
>> min <= max thus if min >= 0 then max >= 0.
>>
>> zero_extension will never do anything on [0, INF]
>>
>> If max < MAX-SIGNED then sign-extension will not do anything.  Ok,
>> sign-extension will do sth for negative values still.  So rather
>>
>>   if (rhs_uns)
>>     return wi::geu_p (min, 0);
>>   else
>>     return wi::ges_p (min, 0) && wi::les_p (max, wi::max_value
>> (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED));
>>
>> ?
>
> Thanks for the explanation. I agree. Don't we however have to check this
> against lhs_uns, as this function is checking whether ssa is promoted for
> lhs_sign and lhs_mode?
>
> Here is an attempt based on this. I ran regression testing with
> arm-none-linux-gnueabi on qemu-arm without any new regressions.
>
> Since I am not comparing value ranges to see if they can be represented in
> lhs_sign, I can now skip the PROMOTE_MODE check.

Now I'm lost.  You call this function from two contexts:

diff --git a/gcc/calls.c b/gcc/calls.c
index a3e6faa..eac512f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
              args[i].initial_value
                = gen_lowpart_SUBREG (mode, args[i].value);
              SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-             SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+             if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+               SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+             else
+               SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);

and

@@ -9527,7 +9587,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,

          temp = gen_lowpart_SUBREG (mode, decl_rtl);
          SUBREG_PROMOTED_VAR_P (temp) = 1;
-         SUBREG_PROMOTED_SET (temp, unsignedp);
+         if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+           SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+         else
+           SUBREG_PROMOTED_SET (temp, unsignedp);
          return temp;
        }

what's the semantic of setting SRP_SIGNED_AND_UNSIGNED
on the subreg?  That is, for the created (subreg:lhs_mode
(reg:<PROMOTE_MODE of ssa> N))?

it seems that we need to verify that 'ssa', when promoted,
does not have bits set above the target mode's MSB when
we know it is zero-extended (according to PROMOTE_MODE)?
Or has all bits set to one and is sign-extended (according to
PROMOTE_MODE)?

Now it seems that the promotion is according to
promote_{function,decl}_mode in expand_expr_real_1
and according to promote_mode in calls.c.

The function comment above promoted_for_type_p needs to be
more elaborate on what invariant it checks.  You pass in
the subreg mode, but you need to verify that the larger mode is
properly extended.

> I am still using wide_int::from (instead of wi::max_value) to get the
> limit, as I have to match the precision with the min/max precision;
> otherwise wide_int comparisons will not work. Is there a better way to
> do this?

I don't understand.  wi::max_value takes a precision argument.

>
> /* Return TRUE if value in SSA is already zero/sign extended for lhs type
>    (type here is the combination of LHS_MODE and LHS_UNS) using value range
>    information stored.  Return FALSE otherwise.  */
> bool
> promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
> {
>   wide_int min, max, limit;
>   tree lhs_type;
>   bool rhs_uns;
>   signop rhs_signop;
>
>   if (ssa == NULL_TREE
>       || TREE_CODE (ssa) != SSA_NAME
>       || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
>     return false;
>
>   /* Return FALSE if value_range is not recorded for SSA.  */
>   if (get_range_info (ssa, &min, &max) != VR_RANGE)
>     return false;
>
>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>   rhs_signop = rhs_uns ? UNSIGNED : SIGNED;
>   lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
>   limit = wide_int::from (TYPE_MAX_VALUE (lhs_type),
>                           TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED);
>
>   if (lhs_uns)
>     /* If min >= 0 then range contains positive values and doesn't need
>        zero-extension.  */
>     return wi::ge_p (min, 0, rhs_signop);
>   else
>     /* If min >= 0 and max <= signed-max-of-type then range doesn't need
>        sign-extension.  */
>     return wi::ge_p (min, 0, rhs_signop)
>       && wi::le_p (max, limit, rhs_signop);
> }
>
> Thanks,
> Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-01 16:04                         ` Kugan
@ 2014-08-03 23:56                           ` Kugan
  2014-08-05 14:18                           ` Richard Biener
  1 sibling, 0 replies; 51+ messages in thread
From: Kugan @ 2014-08-03 23:56 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches


On 02/08/14 02:03, Kugan wrote:
>>>>  if (rhs_uns)
>>>>    return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
>>>>  else
>>>>    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
>>>>                     SIGNED));  // if max <= signed-max-of-type then range
>>>>                                // doesn't need sign-extension
>>>
>>> I think we will have to check that ssa has necessary sign/zero extension
>>> when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will
>>> be interpreted differently, the value range of ssa also will have
>>> corresponding range.  In this cases, shouldn’t we have to check for
>>> upper and lower limit for both min and max?
>>
>> Hmm?  That's exactly what the check is testing...  we know that
>> min <= max thus if min >= 0 then max >= 0.
>>
>> zero_extension will never do anything on [0, INF]
>>
>> If max < MAX-SIGNED then sign-extension will not do anything.  Ok,
>> sign-extension will do sth for negative values still.  So rather
>>
>>   if (rhs_uns)
>>     return wi::geu_p (min, 0);
>>   else
>>     return wi::ges_p (min, 0) && wi::les_p (max, wi::max_value
>> (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED));
>>
>> ?

Looking at your comments again, I think we have to consider three things
here.

To be able to assign to LHS (of lhs_uns and lhs_mode) without a conversion
of RHS (tree SSA):

* If we ignore the mode changes (i.e. LHS_mode can be different in terms
of precision) and ignore PROMOTE_MODE and consider only the sign of LHS
and RHS
  if (lhs_uns)
    // if min >= 0 then range contains positive values
    return wi::ge_p (min, 0, rhs_signop);
  else if (rhs_uns)
    // if max <= signed-max-of-type then range doesn't need sign-extension
    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
                                         SIGNED));
  else
    return true;


* However, if we consider that PROMOTE_MODE might change the RHS sign
  if (lhs_uns)
    {
      return wi::ge_p (min, 0, rhs_signop);
    }
  else
    {
      signed_max = wide_int::from (TYPE_MAX_VALUE (lhs_type),
				   TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop);
      if (rhs_uns)
	/* If PROMOTE_MODE changed the RHS from signed to unsigned and the
	   SSA range contains negative values, we still have to sign-extend.  */
	return wi::ge_p (min, 0, TYPE_SIGN (TREE_TYPE (ssa)))
	  && wi::le_p (max, signed_max, rhs_signop);
      else
	/* If PROMOTE_MODE changed the RHS from unsigned to signed and the SSA
	   range exceeds signed-max-of-type, we still have to sign-extend.  */
	return wi::le_p (max, signed_max, TYPE_SIGN (TREE_TYPE (ssa)));
    }

* If we also consider that LHS mode and RHS mode precision can be different
  if (lhs_uns)
    {
      unsigned_max = wide_int::from (TYPE_MAX_VALUE (lhs_type),
				     TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop);
      /* If min >= 0 then range contains positive values and doesn't need
	 zero-extension.  If max <= unsigned-max-of-type, then value fits type.  */
      return wi::ge_p (min, 0, rhs_signop)
	&& wi::le_p (max, unsigned_max, rhs_signop);
    }
  else
    {
      signed_max = wide_int::from (TYPE_MAX_VALUE (lhs_type),
				   TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop);
      signed_min = wide_int::from (TYPE_MIN_VALUE (lhs_type),
				   TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop);
      if (rhs_uns)
	/* If PROMOTE_MODE changed the RHS from signed to unsigned and the
	   SSA range contains negative values, we still have to sign-extend.  */
	return wi::ge_p (min, 0, TYPE_SIGN (TREE_TYPE (ssa)))
	  && wi::le_p (max, signed_max, rhs_signop);
      else
	/* If PROMOTE_MODE changed the RHS from unsigned to signed and the SSA
	   range exceeds signed-max-of-type, we still have to sign-extend.  */
	return wi::le_p (max, signed_max, TYPE_SIGN (TREE_TYPE (ssa)))
	  && wi::ge_p (min, signed_min, rhs_signop);
    }
}


Since PROMOTE_MODE can change the sign, and the LHS and RHS mode
precisions can differ, the check should be the third one.  Does that
make sense, or am I still missing something?

Thanks again for your time,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-01 11:16                       ` Richard Biener
@ 2014-08-01 16:04                         ` Kugan
  2014-08-03 23:56                           ` Kugan
  2014-08-05 14:18                           ` Richard Biener
  0 siblings, 2 replies; 51+ messages in thread
From: Kugan @ 2014-08-01 16:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches

>>>  if (rhs_uns)
>>>    return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
>>>  else
>>>    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
>>>                     SIGNED));  // if max <= signed-max-of-type then range
>>>                                // doesn't need sign-extension
>>
>> I think we will have to check that ssa has necessary sign/zero extension
>> when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will
>> be interpreted differently, the value range of ssa also will have
>> corresponding range.  In this cases, shouldn’t we have to check for
>> upper and lower limit for both min and max?
> 
> Hmm?  That's exactly what the check is testing...  we know that
> min <= max thus if min >= 0 then max >= 0.
> 
> zero_extension will never do anything on [0, INF]
> 
> If max < MAX-SIGNED then sign-extension will not do anything.  Ok,
> sign-extension will do sth for negative values still.  So rather
> 
>   if (rhs_uns)
>     return wi::geu_p (min, 0);
>   else
>     return wi::ges_p (min, 0) && wi::les_p (max, wi::max_value
> (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED));
> 
> ?

Thanks for the explanation. I agree. Don't we however have to check this
against lhs_uns, as this function is checking whether ssa is promoted for
lhs_sign and lhs_mode?

Here is an attempt based on this. I ran regression testing with
arm-none-linux-gnueabi on qemu-arm without any new regressions.

Since I am not comparing value ranges to see if they can be represented in
lhs_sign, I can now skip the PROMOTE_MODE check.

I am still using wide_int::from (instead of wi::max_value) to get the
limit, as I have to match the precision with the min/max precision;
otherwise wide_int comparisons will not work. Is there a better way to
do this?

/* Return TRUE if value in SSA is already zero/sign extended for lhs type
   (type here is the combination of LHS_MODE and LHS_UNS) using value range
   information stored.  Return FALSE otherwise.  */
bool
promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
{
  wide_int min, max, limit;
  tree lhs_type;
  bool rhs_uns;
  signop rhs_signop;

  if (ssa == NULL_TREE
      || TREE_CODE (ssa) != SSA_NAME
      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
    return false;

  /* Return FALSE if value_range is not recorded for SSA.  */
  if (get_range_info (ssa, &min, &max) != VR_RANGE)
    return false;

  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
  rhs_signop = rhs_uns ? UNSIGNED : SIGNED;
  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
  limit = wide_int::from (TYPE_MAX_VALUE (lhs_type),
			  TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED);

  if (lhs_uns)
    /* If min >= 0 then range contains positive values and doesn't need
       zero-extension.  */
    return wi::ge_p (min, 0, rhs_signop);
  else
    /* If min >= 0 and max <= signed-max-of-type then range doesn't need
       sign-extension.  */
    return wi::ge_p (min, 0, rhs_signop)
      && wi::le_p (max, limit, rhs_signop);
}

Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-08-01  4:51                     ` Kugan
@ 2014-08-01 11:16                       ` Richard Biener
  2014-08-01 16:04                         ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-08-01 11:16 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Fri, Aug 1, 2014 at 6:51 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>> +  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
>> ...
>> +      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
>> ...
>> +  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
>> +                            TYPE_SIGN (TREE_TYPE (ssa)));
>> +  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
>> +                            TYPE_SIGN (TREE_TYPE (ssa)));
>>
>> you shouldn't try getting at lhs_type.  Btw, do you want to constrain
>> lhs_mode to MODE_INTs somewhere?
>
> Is this in addition to !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))? Do you mean
> that I should check lhs_mode as well?

No, that's probably enough.

>> For TYPE_SIGN use lhs_uns instead, for the min/max value you
>> should use wi::min_value () and wi::max_value () instead.
>>
>> You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later,
>> but we computed rhs_uns "properly" using PROMOTE_MODE.
>> I think  the code with re-setting lhs_uns if rhs_uns != lhs_uns
>> and later using TYPE_SIGN again is pretty hard to follow.
>>
>> Btw, it seems you need to conditionalize the call to PROMOTE_MODE
>> on its availability.
>>
>> Isn't it simply about choosing a proper range we need to restrict
>> ssa to?  That is, dependent on rhs_uns computed by PROMOTE_MODE,
>> simply:
>>
>> +  mode = TYPE_MODE (TREE_TYPE (ssa));
>> +  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>> #ifdef PROMOTE_MODE
>> +  PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
>> #endif
>>
>>  if (rhs_uns)
>>    return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
>>  else
>>    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
>>                     SIGNED));  // if max <= signed-max-of-type then range
>>                                // doesn't need sign-extension
>
> I think we will have to check that ssa has necessary sign/zero extension
> when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will
> be interpreted differently, the value range of ssa also will have
> corresponding range.  In this cases, shouldn’t we have to check for
> upper and lower limit for both min and max?

Hmm?  That's exactly what the check is testing...  we know that
min <= max thus if min >= 0 then max >= 0.

zero_extension will never do anything on [0, INF]

If max < MAX-SIGNED then sign-extension will not do anything.  Ok,
sign-extension will do sth for negative values still.  So rather

  if (rhs_uns)
    return wi::geu_p (min, 0);
  else
    return wi::ges_p (min, 0)
      && wi::les_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
                                        SIGNED));

?

I don't like the use of int_fits_type_p you propose.

Richard.

> How about this?
>
> bool
> promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
> {
>   wide_int min, max;
>   tree lhs_type, rhs_type;
>   bool rhs_uns;
>   enum machine_mode rhs_mode;
>   tree min_tree, max_tree;
>
>   if (ssa == NULL_TREE
>       || TREE_CODE (ssa) != SSA_NAME
>       || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
>     return false;
>
>   /* Return FALSE if value_range is not recorded for SSA.  */
>   if (get_range_info (ssa, &min, &max) != VR_RANGE)
>     return false;
>
>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>   if (rhs_uns != lhs_uns)
>     {
>       /* Signedness of LHS and RHS differs and values also cannot be
>          represented in LHS range.  */
>       unsigned int prec = min.get_precision ();
>       if ((lhs_uns && wi::neg_p (min, rhs_uns ? UNSIGNED : SIGNED))
>           || (!lhs_uns && !wi::le_p (max,
>                                     wi::max_value (prec, SIGNED),
>                                     rhs_uns ? UNSIGNED : SIGNED)))
>         return false;
>     }
>
>   /* In some architectures, modes are promoted and the sign is changed with
>      the target-defined PROMOTE_MODE macro.  If PROMOTE_MODE tells you to
>      promote _not_ according to ssa's sign then honour that.  */
>   rhs_mode = TYPE_MODE (TREE_TYPE (ssa));
> #ifdef PROMOTE_MODE
>   PROMOTE_MODE (rhs_mode, rhs_uns, TREE_TYPE (ssa));
> #endif
>
>   rhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns);
>   lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
>   min_tree = wide_int_to_tree (rhs_type, min);
>   max_tree = wide_int_to_tree (rhs_type, max);
>
>   /* Check if the values lie within the type's range.  */
>   if (int_fits_type_p (min_tree, lhs_type)
>       && int_fits_type_p (max_tree, lhs_type))
>     return true;
>   else
>     return false;
> }
>
>
> Thanks,
> Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-23 14:22                   ` Richard Biener
@ 2014-08-01  4:51                     ` Kugan
  2014-08-01 11:16                       ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-08-01  4:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches

> +  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
> ...
> +      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
> ...
> +  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
> +                            TYPE_SIGN (TREE_TYPE (ssa)));
> +  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
> +                            TYPE_SIGN (TREE_TYPE (ssa)));
> 
> you shouldn't try getting at lhs_type.  Btw, do you want to constrain
> lhs_mode to MODE_INTs somewhere?

Is this in addition to !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))? Do you mean
that I should check lhs_mode as well?

> For TYPE_SIGN use lhs_uns instead, for the min/max value you
> should use wi::min_value () and wi::max_value () instead.
> 
> You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later,
> but we computed rhs_uns "properly" using PROMOTE_MODE.
> I think  the code with re-setting lhs_uns if rhs_uns != lhs_uns
> and later using TYPE_SIGN again is pretty hard to follow.
> 
> Btw, it seems you need to conditionalize the call to PROMOTE_MODE
> on its availability.
> 
> Isn't it simply about choosing a proper range we need to restrict
> ssa to?  That is, dependent on rhs_uns computed by PROMOTE_MODE,
> simply:
> 
> +  mode = TYPE_MODE (TREE_TYPE (ssa));
> +  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
> #ifdef PROMOTE_MODE
> +  PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
> #endif
> 
>  if (rhs_uns)
>    return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
>  else
>    return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
>                     SIGNED));  // if max <= signed-max-of-type then range
>                                // doesn't need sign-extension

I think we will have to check that ssa has necessary sign/zero extension
when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will
be interpreted differently, the value range of ssa also will have
corresponding range.  In this cases, shouldn’t we have to check for
upper and lower limit for both min and max?

How about this?

bool
promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
{
  wide_int min, max;
  tree lhs_type, rhs_type;
  bool rhs_uns;
  enum machine_mode rhs_mode;
  tree min_tree, max_tree;

  if (ssa == NULL_TREE
      || TREE_CODE (ssa) != SSA_NAME
      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
    return false;

  /* Return FALSE if value_range is not recorded for SSA.  */
  if (get_range_info (ssa, &min, &max) != VR_RANGE)
    return false;

  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
  if (rhs_uns != lhs_uns)
    {
      /* Signedness of LHS and RHS differs and values also cannot be
	 represented in LHS range.  */
      unsigned int prec = min.get_precision ();
      if ((lhs_uns && wi::neg_p (min, rhs_uns ? UNSIGNED : SIGNED))
	  || (!lhs_uns && !wi::le_p (max,
				    wi::max_value (prec, SIGNED),
				    rhs_uns ? UNSIGNED : SIGNED)))
	return false;
    }

  /* In some architectures, modes are promoted and the sign is changed with
     the target-defined PROMOTE_MODE macro.  If PROMOTE_MODE tells you to
     promote _not_ according to ssa's sign then honour that.  */
  rhs_mode = TYPE_MODE (TREE_TYPE (ssa));
#ifdef PROMOTE_MODE
  PROMOTE_MODE (rhs_mode, rhs_uns, TREE_TYPE (ssa));
#endif

  rhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns);
  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
  min_tree = wide_int_to_tree (rhs_type, min);
  max_tree = wide_int_to_tree (rhs_type, max);

  /* Check if the values lie within the type's range.  */
  if (int_fits_type_p (min_tree, lhs_type)
      && int_fits_type_p (max_tree, lhs_type))
    return true;
  else
    return false;
}


Thanks,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-14  2:58                 ` Kugan
  2014-07-14 20:11                   ` Bernhard Reutner-Fischer
@ 2014-07-23 14:22                   ` Richard Biener
  2014-08-01  4:51                     ` Kugan
  1 sibling, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-07-23 14:22 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Mon, Jul 14, 2014 at 4:57 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> On 11/07/14 22:47, Richard Biener wrote:
>> On Fri, Jul 11, 2014 at 1:52 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>> Thanks for the review and suggestions.
>>>
>>> On 10/07/14 22:15, Richard Biener wrote:
>>>> On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> [...]
>>>
>>>>>
>>>>> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
>>>>> In the test-case, a function (which has signed char return type) returns
>>>>> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
>>>>> on zero/sign extension generated by RTL again for the correct value. I
>>>>> saw some other targets also defining something similar. I am therefore
>>>>> skipping removing zero/sign extension if the ssa variable can be set to
>>>>> negative integer constants.
>>>>
>>>> Hm?  I think you should rather check that you are removing a
>>>> sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
>>>> zero-extend.  Definitely
>>>>
>>>> +  /* In some architectures, negative integer constants are truncated and
>>>> +     sign changed with target defined PROMOTE_MODE macro. This will impact
>>>> +     the value range seen here and produce wrong code if zero/sign extensions
>>>> +     are eliminated. Therefore, return false if this SSA can have negative
>>>> +     integers.  */
>>>> +  if (is_gimple_assign (stmt)
>>>> +      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
>>>> +    {
>>>> +      tree rhs1 = gimple_assign_rhs1 (stmt);
>>>> +      if (TREE_CODE (rhs1) == INTEGER_CST
>>>> +         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
>>>> +         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
>>>> +       return false;
>>>>
>>>> looks completely bogus ... (an unary op with a constant operand?)
>>>> instead you want to do sth like
>>>
>>> I see that a unary op with a constant operand is not possible in gimple.
>>> What I wanted to check here is any sort of constant load; but it seems
>>> that will not happen in gimple. Are PHI statements the only possible
>>> statements where we will end up with such constants?
>>
>> No, in theory you can have
>>
>>   ssa_1 = -1;
>>
>> but that's not unary but a GIMPLE_SINGLE_RHS and thus
>> gimple_assign_rhs_code (stmt) == INTEGER_CST.
>>
>>>>   mode = TYPE_MODE (TREE_TYPE (ssa));
>>>>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>>>>   PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
>>>>
>>>> instead of initializing rhs_uns from ssa's type.  That is, if
>>>> PROMOTE_MODE tells you to promote _not_ according to ssa's sign then
>>>> honor that.
>>>
>>> This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi.
>>>
>>> where, the gimple statement that cause this looks like:
>>> .....
>>>   # _3 = PHI <_17(7), -1(2)>
>>> bb43:
>>>   return _3;
>>>
>>> ARM PROMOTE_MODE changes the sign for integer constants only and hence
>>> looking at the variable with PROMOTE_MODE is not changing the sign in
>>> this case.
>>>
>>> #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)     \
>>>   if (GET_MODE_CLASS (MODE) == MODE_INT         \
>>>       && GET_MODE_SIZE (MODE) < 4)              \
>>>     {                                           \
>>>       if (MODE == QImode)                       \
>>>         UNSIGNEDP = 1;                          \
>>>       else if (MODE == HImode)                  \
>>>         UNSIGNEDP = 1;                          \
>>>       (MODE) = SImode;                          \
>>>     }
>>
>> Where does it only apply for "constants"?  It applies to all QImode and
>> HImode entities.
>
> oops, sorry. I don’t know what I was thinking or looking at when I wrote
> that :( It indeed fixes my problems. Thanks for that.
>
> Here is the modified patch. Bootstrapped and regression tested for
> x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions.
>
>
> Is this OK?

+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
...
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
...
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
+                            TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
+                            TYPE_SIGN (TREE_TYPE (ssa)));

you shouldn't try getting at lhs_type.  Btw, do you want to constrain
lhs_mode to MODE_INTs somewhere?

For TYPE_SIGN use lhs_uns instead, for the min/max value you
should use wi::min_value () and wi::max_value () instead.

You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later,
but we computed rhs_uns "properly" using PROMOTE_MODE.
I think  the code with re-setting lhs_uns if rhs_uns != lhs_uns
and later using TYPE_SIGN again is pretty hard to follow.

Btw, it seems you need to conditionalize the call to PROMOTE_MODE
on its availability.

Isn't it simply about choosing a proper range we need to restrict
ssa to?  That is, dependent on rhs_uns computed by PROMOTE_MODE,
simply:

+  mode = TYPE_MODE (TREE_TYPE (ssa));
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
#ifdef PROMOTE_MODE
+  PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
#endif

 if (rhs_uns)
   return wi::ge_p (min, 0);  // if min >= 0 then range contains positive values
 else
   return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)),
                    SIGNED));  // if max <= signed-max-of-type then range
                               // doesn't need sign-extension

?

Thanks,
Richard.

> Thanks,
> Kugan
>
>
> gcc/
>
> 2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * calls.c (precompute_arguments): Check is_promoted_for_type
>         and set the promoted mode.
>         (is_promoted_for_type): New function.
>         (expand_expr_real_1): Check is_promoted_for_type
>         and set the promoted mode.
>         * expr.h (is_promoted_for_type): New function definition.
>         * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>         SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
>
> gcc/testsuite
> 2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * gcc.dg/zero_sign_ext_test.c: New test.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-14  2:58                 ` Kugan
@ 2014-07-14 20:11                   ` Bernhard Reutner-Fischer
  2014-07-23 14:22                   ` Richard Biener
  1 sibling, 0 replies; 51+ messages in thread
From: Bernhard Reutner-Fischer @ 2014-07-14 20:11 UTC (permalink / raw)
  To: Kugan, Richard Biener; +Cc: Jakub Jelinek, gcc-patches

On 14 July 2014 04:58:17 Kugan <kugan.vivekanandarajah@linaro.org> wrote:

> On 11/07/14 22:47, Richard Biener wrote:
> > On Fri, Jul 11, 2014 at 1:52 PM, Kugan
> > <kugan.vivekanandarajah@linaro.org> wrote:
> >> Thanks for the review and suggestions.
> >>
> >> On 10/07/14 22:15, Richard Biener wrote:
> >>> On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> >>
> >> [...]
> >>
> >>>>
> >>>> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
> >>>> In the test-case, a function (which has signed char return type) returns
> >>>> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
> >>>> on zero/sign extension generated by RTL again for the correct value. I
> >>>> saw some other targets also defining something similar. I am therefore
> >>>> skipping removing zero/sign extension if the ssa variable can be set to
> >>>> negative integer constants.
> >>>
> >>> Hm?  I think you should rather check that you are removing a
> >>> sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
> >>> zero-extend.  Definitely
> >>>
> >>> +  /* In some architectures, negative integer constants are truncated and
> >>> +     sign changed with target defined PROMOTE_MODE macro. This will impact
> >>> +     the value range seen here and produce wrong code if zero/sign extensions
> >>> +     are eliminated. Therefore, return false if this SSA can have negative
> >>> +     integers.  */
> >>> +  if (is_gimple_assign (stmt)
> >>> +      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
> >>> +    {
> >>> +      tree rhs1 = gimple_assign_rhs1 (stmt);
> >>> +      if (TREE_CODE (rhs1) == INTEGER_CST
> >>> +         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
> >>> +         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
> >>> +       return false;
> >>>
> >>> looks completely bogus ... (an unary op with a constant operand?)
> >>> instead you want to do sth like
> >>
> >> I see that a unary op with a constant operand is not possible in gimple.
> >> What I wanted to check here is any sort of constant load; but it seems
> >> that will not happen in gimple. Are PHI statements the only possible
> >> statements where we will end up with such constants?
> >
> > No, in theory you can have
> >
> >   ssa_1 = -1;
> >
> > but that's not unary but a GIMPLE_SINGLE_RHS and thus
> > gimple_assign_rhs_code (stmt) == INTEGER_CST.
> >
> >>>   mode = TYPE_MODE (TREE_TYPE (ssa));
> >>>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
> >>>   PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
> >>>
> >>> instead of initializing rhs_uns from the ssa's type.  That is, if
> >>> PROMOTE_MODE tells you to promote _not_ according to the ssa's sign then
> >>> honor that.
> >>
> >> This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi.
> >>
> >> where the gimple statement that causes this looks like:
> >> .....
> >>   # _3 = PHI <_17(7), -1(2)>
> >> bb43:
> >>   return _3;
> >>
> >> ARM's PROMOTE_MODE changes the sign for integer constants only, and hence
> >> looking at the variable with PROMOTE_MODE does not change the sign in
> >> this case.
> >>
> >> #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)     \
> >>   if (GET_MODE_CLASS (MODE) == MODE_INT         \
> >>       && GET_MODE_SIZE (MODE) < 4)              \
> >>     {                                           \
> >>       if (MODE == QImode)                       \
> >>         UNSIGNEDP = 1;                          \
> >>       else if (MODE == HImode)                  \
> >>         UNSIGNEDP = 1;                          \
> >>       (MODE) = SImode;                          \
> >>     }
> >
> > Where does it only apply to "constants"?  It applies to all QImode and
> > HImode entities.
>
> oops, sorry. I don’t know what I was thinking or looking at when I wrote
> that :( It indeed fixes my problems. Thanks for that.
>
> Here is the modified patch. Bootstrapped and regression tested for
> x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions.
>
>
> Is this OK?
>
> Thanks,
> Kugan
>
>
> gcc/
>
> 2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
> 	* calls.c (precompute_arguments): Check is_promoted_for_type
> 	and set the promoted mode.
> 	(is_promoted_for_type): New function.

Don't we name predicates more like promoted_for_type_p?

Thanks,
> 	(expand_expr_real_1): Check is_promoted_for_type
> 	and set the promoted mode.
> 	* expr.h (is_promoted_for_type): New function definition.
> 	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
> 	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
>
> gcc/testsuite
> 2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
> 	* gcc.dg/zero_sign_ext_test.c: New test.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-11 12:47               ` Richard Biener
@ 2014-07-14  2:58                 ` Kugan
  2014-07-14 20:11                   ` Bernhard Reutner-Fischer
  2014-07-23 14:22                   ` Richard Biener
  0 siblings, 2 replies; 51+ messages in thread
From: Kugan @ 2014-07-14  2:58 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4365 bytes --]

On 11/07/14 22:47, Richard Biener wrote:
> On Fri, Jul 11, 2014 at 1:52 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Thanks for the review and suggestions.
>>
>> On 10/07/14 22:15, Richard Biener wrote:
>>> On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> [...]
>>
>>>>
>>>> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
>>>> In the test-case, a function (which has signed char return type) returns
>>>> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
>>>> on zero/sign extension generated by RTL again for the correct value. I
>>>> saw some other targets also defining a similar thing. I am therefore
>>>> skipping removing zero/sign extension if the ssa variable can be set to
>>>> negative integer constants.
>>>
>>> Hm?  I think you should rather check that you are removing a
>>> sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
>>> zero-extend.  Definitely
>>>
>>> +  /* In some architectures, negative integer constants are truncated and
>>> +     sign changed with target defined PROMOTE_MODE macro. This will impact
>>> +     the value range seen here and produce wrong code if zero/sign extensions
>>> +     are eliminated. Therefore, return false if this SSA can have negative
>>> +     integers.  */
>>> +  if (is_gimple_assign (stmt)
>>> +      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
>>> +    {
>>> +      tree rhs1 = gimple_assign_rhs1 (stmt);
>>> +      if (TREE_CODE (rhs1) == INTEGER_CST
>>> +         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
>>> +         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
>>> +       return false;
>>>
>>> looks completely bogus ... (an unary op with a constant operand?)
>>> instead you want to do sth like
>>
>> I see that a unary op with a constant operand is not possible in gimple.
>> What I wanted to check here is any sort of constant load; but it seems
>> that will not happen in gimple. Are PHI statements the only possible
>> statements where we will end up with such constants?
> 
> No, in theory you can have
> 
>   ssa_1 = -1;
> 
> but that's not unary but a GIMPLE_SINGLE_RHS and thus
> gimple_assign_rhs_code (stmt) == INTEGER_CST.
> 
>>>   mode = TYPE_MODE (TREE_TYPE (ssa));
>>>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>>>   PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
>>>
>>> instead of initializing rhs_uns from the ssa's type.  That is, if
>>> PROMOTE_MODE tells you to promote _not_ according to the ssa's sign then
>>> honor that.
>>
>> This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi.
>>
>> where the gimple statement that causes this looks like:
>> .....
>>   # _3 = PHI <_17(7), -1(2)>
>> bb43:
>>   return _3;
>>
>> ARM's PROMOTE_MODE changes the sign for integer constants only, and hence
>> looking at the variable with PROMOTE_MODE does not change the sign in
>> this case.
>>
>> #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)     \
>>   if (GET_MODE_CLASS (MODE) == MODE_INT         \
>>       && GET_MODE_SIZE (MODE) < 4)              \
>>     {                                           \
>>       if (MODE == QImode)                       \
>>         UNSIGNEDP = 1;                          \
>>       else if (MODE == HImode)                  \
>>         UNSIGNEDP = 1;                          \
>>       (MODE) = SImode;                          \
>>     }
> 
> Where does it only apply to "constants"?  It applies to all QImode and
> HImode entities.

oops, sorry. I don’t know what I was thinking or looking at when I wrote
that :( It indeed fixes my problems. Thanks for that.

Here is the modified patch. Bootstrapped and regression tested for
x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions.


Is this OK?

Thanks,
Kugan


gcc/

2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* calls.c (precompute_arguments): Check is_promoted_for_type
	and set the promoted mode.
	(is_promoted_for_type): New function.
	(expand_expr_real_1): Check is_promoted_for_type
	and set the promoted mode.
	* expr.h (is_promoted_for_type): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.


gcc/testsuite
2014-07-14  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gcc.dg/zero_sign_ext_test.c: New test.

[-- Attachment #2: p2.txt --]
[-- Type: text/plain, Size: 9696 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index a3e6faa..eac512f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index f98c322..b14626c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index 7356e76..d25f506 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-address.h"
 #include "cfgexpand.h"
 #include "builtins.h"
+#include "tree-ssa.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -9224,6 +9225,65 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if value in SSA is already zero/sign extended for lhs type
+   (type here is the combination of LHS_MODE and LHS_UNS) using value range
+   information stored.  Return FALSE otherwise.  */
+bool
+is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max, limit;
+  unsigned int prec;
+  tree lhs_type;
+  bool rhs_uns;
+  enum machine_mode mode;
+
+  if (ssa == NULL_TREE
+      || TREE_CODE (ssa) != SSA_NAME
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA.  */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+
+  /* In some architectures, modes are promoted and sign changed with
+     target defined PROMOTE_MODE macro.  If PROMOTE_MODE tells you to
+     promote _not_ according to ssa's sign then honour that.  */
+  mode = TYPE_MODE (TREE_TYPE (ssa));
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
+  PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
+
+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
+  prec = min.get_precision ();
+
+  /* Signed maximum value.  */
+  limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED);
+
+  /* Signedness of LHS and RHS differs but values in range.  */
+  if ((rhs_uns != lhs_uns)
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+	  || (lhs_uns && (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1))))
+    lhs_uns = !lhs_uns;
+
+  /* Signedness of LHS and RHS should match.  */
+  if (rhs_uns != lhs_uns)
+    return false;
+
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+
+  /* Check if values lies in-between the type range.  */
+  if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa)))
+       || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) != 1))
+      && (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa)))
+	  || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) != 1)))
+    return true;
+
+  return false;
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9527,7 +9587,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 	  return temp;
 	}
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..e99d000 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool is_promoted_for_type (tree, enum machine_mode, bool);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.
diff --git a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
index e69de29..6a52678 100644
--- a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
+++ b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
@@ -0,0 +1,136 @@
+extern void abort (void);
+
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+#define TYPE_MAX(type, sign)	\
+  ((!sign) ? ((1 << (sizeof (type) * 8 - 1)) - 1) :	\
+   ((1 << (sizeof (type) * 8)) - 1))
+#define TYPE_MIN(type, sign)	\
+  ((!sign) ? -(1 << (sizeof (type) * 8 - 1)) : 0)
+
+#define TEST_FN(NAME, ARG_TYPE, RET_TYPE, CAST_TYPE, VAL, VR_MIN, VR_MAX)\
+  __attribute__((noinline, noclone)) RET_TYPE				\
+      NAME (ARG_TYPE arg){						\
+      RET_TYPE ret = VAL;						\
+      if (arg + 1 < VR_MIN || arg + 1 > VR_MAX) return ret;		\
+      /* Value Range of arg at this point will be  [VR_min, VR_max].  */\
+      arg = arg + VAL;							\
+      ret = (CAST_TYPE)arg;						\
+      return arg;							\
+  }
+
+/* Signed to signed conversion with value in-range.  */
+TEST_FN (foo1, short, short, char, 1, TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+TEST_FN (foo2, short, short, char, 1, TYPE_MIN (char, 0) + 1,\
+	TYPE_MAX (char, 0) - 1);
+
+/* Signed to signed conversion with value not in-range.  */
+TEST_FN (foo3, short, short, char, -1, TYPE_MIN (short, 0) + 1,  100);
+TEST_FN (foo4, short, short, char, 1, 12, TYPE_MAX (short, 0) + 1);
+
+/* Unsigned to unsigned conversion with value in-range.  */
+TEST_FN (foo5, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo6, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+/* Unsigned to unsigned conversion with value not in-range.  */
+TEST_FN (foo7, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1) + 1, TYPE_MAX (short, 1) - 1);
+TEST_FN (foo8, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1), TYPE_MAX (short, 1));
+
+/* Signed to unsigned conversion with value range positive.  */
+TEST_FN (foo9, short, short, unsigned char, -1, 1,\
+	TYPE_MAX (char, 1) - 1);
+TEST_FN (foo10, short, short, unsigned char, 1, 0,\
+	TYPE_MAX (char, 1));
+
+/* Signed to unsigned conversion with value range negative.  */
+TEST_FN (foo11, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo12, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range in signed equiv range.  */
+TEST_FN (foo13, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo14, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range not-in signed range.  */
+TEST_FN (foo15, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo16, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+int main ()
+{
+  /* Signed to signed conversion with value in-range.  */
+  /* arg + 1.  */
+  if (foo1 (-32) != -31)
+    abort ();
+  /* arg + 1.  */
+  if (foo2 (32) != 33)
+    abort ();
+
+  /* Signed to signed conversion with value not in-range.  */
+  /* arg - 1.  */
+  if (foo3 (-512) != -513)
+    abort ();
+  /* arg + 1.  */
+  if (foo4 (512) != 513)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value in-range.  */
+  /* arg + 1.  */
+  if (foo5 (64) != 65)
+    abort ();
+  /* arg + 1.  */
+  if (foo6 (64) != 65)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value not in-range.  */
+  /* arg + 1.  */
+  if (foo7 (512) != 513)
+    abort ();
+  /* arg + 1.  */
+  if (foo8 (512) != 513)
+    abort ();
+
+  /* Signed to unsigned conversion with value range positive.  */
+  /* arg - 1.  */
+  if (foo9 (2) != 1)
+    abort ();
+  /* arg + 1.  */
+  if (foo10 (2) != 3)
+    abort ();
+
+  /* Signed to unsigned conversion with value range negative.  */
+  /* arg + 1.  */
+  if (foo11 (-125) != -124)
+    abort ();
+  /* arg + 1.  */
+  if (foo12 (-125) != -124)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range in signed equiv range.  */
+  /* arg + 1.  */
+  if (foo13 (125) != 126)
+    abort ();
+  /* arg + 1.  */
+  if (foo14 (125) != 126)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range not-in signed range.  */
+  /* arg + 1.  */
+  if (foo15 (250) != 251)
+    abort ();
+  /* arg + 1.  */
+  if (foo16 (250) != 251)
+    abort ();
+
+  return 0;
+}
+


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-11 11:52             ` Kugan
@ 2014-07-11 12:47               ` Richard Biener
  2014-07-14  2:58                 ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-07-11 12:47 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Fri, Jul 11, 2014 at 1:52 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Thanks for the review and suggestions.
>
> On 10/07/14 22:15, Richard Biener wrote:
>> On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
> [...]
>
>>>
>>> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
>>> In the test-case, a function (which has signed char return type) returns
>>> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
>>> on zero/sign extension generated by RTL again for the correct value. I
>>> saw some other targets also defining a similar thing. I am therefore
>>> skipping removing zero/sign extension if the ssa variable can be set to
>>> negative integer constants.
>>
>> Hm?  I think you should rather check that you are removing a
>> sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
>> zero-extend.  Definitely
>>
>> +  /* In some architectures, negative integer constants are truncated and
>> +     sign changed with target defined PROMOTE_MODE macro. This will impact
>> +     the value range seen here and produce wrong code if zero/sign extensions
>> +     are eliminated. Therefore, return false if this SSA can have negative
>> +     integers.  */
>> +  if (is_gimple_assign (stmt)
>> +      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
>> +    {
>> +      tree rhs1 = gimple_assign_rhs1 (stmt);
>> +      if (TREE_CODE (rhs1) == INTEGER_CST
>> +         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
>> +         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
>> +       return false;
>>
>> looks completely bogus ... (an unary op with a constant operand?)
>> instead you want to do sth like
>
> I see that a unary op with a constant operand is not possible in gimple.
> What I wanted to check here is any sort of constant load; but it seems
> that will not happen in gimple. Are PHI statements the only possible
> statements where we will end up with such constants?

No, in theory you can have

  ssa_1 = -1;

but that's not unary but a GIMPLE_SINGLE_RHS and thus
gimple_assign_rhs_code (stmt) == INTEGER_CST.

>>   mode = TYPE_MODE (TREE_TYPE (ssa));
>>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>>   PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
>>
>> instead of initializing rhs_uns from the ssa's type.  That is, if
>> PROMOTE_MODE tells you to promote _not_ according to the ssa's sign then
>> honor that.
>
> This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi.
>
> where the gimple statement that causes this looks like:
> .....
>   # _3 = PHI <_17(7), -1(2)>
> bb43:
>   return _3;
>
> ARM's PROMOTE_MODE changes the sign for integer constants only, and hence
> looking at the variable with PROMOTE_MODE does not change the sign in
> this case.
>
> #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)     \
>   if (GET_MODE_CLASS (MODE) == MODE_INT         \
>       && GET_MODE_SIZE (MODE) < 4)              \
>     {                                           \
>       if (MODE == QImode)                       \
>         UNSIGNEDP = 1;                          \
>       else if (MODE == HImode)                  \
>         UNSIGNEDP = 1;                          \
>       (MODE) = SImode;                          \
>     }

Where does it only apply to "constants"?  It applies to all QImode and
HImode entities.

>>> As for the -fno-strict-overflow case, if the variables overflows, in VRP
>>> dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX.
>>> We therefore should limit the comparison to (TYPE_MIN < VR_MIN && VR_MAX
>>> < TYPE_MAX) instead of (TYPE_MIN <= VR_MIN && VR_MAX <= TYPE_MAX) when
>>> checking to be sure that this is not the overflowing case. Attached
>>> patch changes this.
>>
>> I don't think that's necessary - the overflow cases happen only when
>> that overflow has undefined behavior, thus any valid program will have
>> values <= MAX.
>
> I see that you have now removed +INF(OVF). I will change it this way.

I have not removed anything, I just fixed a bug.

Richard.

> Thanks again,
> Kugan
>


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-10 12:15           ` Richard Biener
@ 2014-07-11 11:52             ` Kugan
  2014-07-11 12:47               ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-07-11 11:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches

Thanks for the review and suggestions.

On 10/07/14 22:15, Richard Biener wrote:
> On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:

[...]

>>
>> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
>> In the test-case, a function (which has signed char return type) returns
>> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
>> on zero/sign extension generated by RTL again for the correct value. I
>> saw some other targets also defining a similar thing. I am therefore
>> skipping removing zero/sign extension if the ssa variable can be set to
>> negative integer constants.
> 
> Hm?  I think you should rather check that you are removing a
> sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
> zero-extend.  Definitely
> 
> +  /* In some architectures, negative integer constants are truncated and
> +     sign changed with target defined PROMOTE_MODE macro. This will impact
> +     the value range seen here and produce wrong code if zero/sign extensions
> +     are eliminated. Therefore, return false if this SSA can have negative
> +     integers.  */
> +  if (is_gimple_assign (stmt)
> +      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
> +    {
> +      tree rhs1 = gimple_assign_rhs1 (stmt);
> +      if (TREE_CODE (rhs1) == INTEGER_CST
> +         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
> +         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
> +       return false;
> 
> looks completely bogus ... (an unary op with a constant operand?)
> instead you want to do sth like

I see that a unary op with a constant operand is not possible in gimple.
What I wanted to check here is any sort of constant load; but it seems
that will not happen in gimple. Are PHI statements the only possible
statements where we will end up with such constants?

>   mode = TYPE_MODE (TREE_TYPE (ssa));
>   rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
>   PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));
> 
> instead of initializing rhs_uns from the ssa's type.  That is, if
> PROMOTE_MODE tells you to promote _not_ according to the ssa's sign then
> honor that.

This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi.

where the gimple statement that causes this looks like:
.....
  # _3 = PHI <_17(7), -1(2)>
bb43:
  return _3;

ARM's PROMOTE_MODE changes the sign for integer constants only, and hence
looking at the variable with PROMOTE_MODE does not change the sign in
this case.

#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)	\
  if (GET_MODE_CLASS (MODE) == MODE_INT		\
      && GET_MODE_SIZE (MODE) < 4)      	\
    {						\
      if (MODE == QImode)			\
	UNSIGNEDP = 1;				\
      else if (MODE == HImode)			\
	UNSIGNEDP = 1;				\
      (MODE) = SImode;				\
    }
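
To make the failure mode concrete, here is a stand-alone C model of that promotion (plain functions standing in for the target macro and the RTL extension; the names are illustrative, not GCC internals): a signed char -1 promoted with UNSIGNEDP = 1 sits in the wide register as 255, and only the later sign extension restores -1, so that extension cannot be dropped based on the tree-level value range alone.

```c
#include <assert.h>
#include <stdint.h>

/* Model of what ARM's PROMOTE_MODE does to a signed char value held in a
   32-bit register: the QImode value is promoted with UNSIGNEDP = 1, so -1
   is kept as 0x000000ff, not 0xffffffff.  */
static uint32_t
promote_qi_unsigned (int8_t v)
{
  return (uint8_t) v;          /* zero-extend, as the target promotion does */
}

/* The sign extension that RTL must re-apply before the value is used as a
   signed char again.  Eliminating it because the tree-level range is
   [-1, -1] would be wrong: the register really holds 255.  */
static int32_t
sign_extend_qi (uint32_t reg)
{
  return (int8_t) (reg & 0xff);
}
```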

>> As for the -fno-strict-overflow case, if the variables overflows, in VRP
>> dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX.
>> We therefore should limit the comparison to (TYPE_MIN < VR_MIN && VR_MAX
>> < TYPE_MAX) instead of (TYPE_MIN <= VR_MIN && VR_MAX <= TYPE_MAX) when
>> checking to be sure that this is not the overflowing case. Attached
>> patch changes this.
> 
> I don't think that's necessary - the overflow cases happen only when
> that overflow has undefined behavior, thus any valid program will have
> values <= MAX.

I see that you have now removed +INF(OVF). I will change it this way.

Thanks again,
Kugan


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-07-07  6:55         ` Kugan
@ 2014-07-10 12:15           ` Richard Biener
  2014-07-11 11:52             ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Richard Biener @ 2014-07-10 12:15 UTC (permalink / raw)
  To: Kugan; +Cc: Jakub Jelinek, gcc-patches

On Mon, Jul 7, 2014 at 8:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>> For -fwrapv I don't see why you'd get into trouble ever, the VRP computation
>> should be well aware of the -fwrapv semantics and the value ranges should
>> reflect that.
>>
>> For -fno-strict-overflow, I have no idea since it is very weirdly defined.
>>
>> In any case, for your example above, the loop is always well defined,
>> because for char/short a++ is performed as:
>> a = (short) ((int) a + 1)
>> So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow
>> or -Os, it is simply a problem with the patch.  VR [1, 32768] looks correct,
>> a++ is performed only if a is >= 0, therefore before addition [0, 32767].
>> But from VR [1, 32768] you can't optimize away the sign extension, make sure
>> you don't have an off-by-one there?

I have fixed the above bug yesterday.

>> It would be nice if the patch contained some testcases, it is easy
>> to construct testcases where you have arbitrary VRs on some SSA_NAMEs,
>> you just need something to stick the VR on, so you can do something like:
>> type foo (type a)
>> {
>>   if (a < VR_min + 1 || a > VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course.
>>   a = a + 1;
>>   // now you can try some cast that your optimization would try to optimize
>>   return a;
>> }
>> Or void bar (type a) { a = (a & mask) + bias; (or similarly) }
>> Make sure to cover the boundary cases, where VR minimum or maximum still
>> allow optimizing away zero and/or sign extensions, and another case where
>> they are +- 1 and already don't allow it.
>
>
> Hi Jakub,
>
> For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end.
> In the test-case, a function (which has signed char return type) returns
> -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies
> on zero/sign extension generated by RTL again for the correct value. I
> saw some other targets also defining a similar thing. I am therefore
> skipping removing zero/sign extension if the ssa variable can be set to
> negative integer constants.

Hm?  I think you should rather check that you are removing a
sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or
zero-extend.  Definitely

+  /* In some architectures, negative integer constants are truncated and
+     sign changed with target defined PROMOTE_MODE macro. This will impact
+     the value range seen here and produce wrong code if zero/sign extensions
+     are eliminated. Therefore, return false if this SSA can have negative
+     integers.  */
+  if (is_gimple_assign (stmt)
+      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
+    {
+      tree rhs1 = gimple_assign_rhs1 (stmt);
+      if (TREE_CODE (rhs1) == INTEGER_CST
+         && !TYPE_UNSIGNED (TREE_TYPE (ssa))
+         && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
+       return false;

looks completely bogus ... (an unary op with a constant operand?)

instead you want to do sth like

  mode = TYPE_MODE (TREE_TYPE (ssa));
  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
  PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa));

instead of initializing rhs_uns from the ssa's type.  That is, if
PROMOTE_MODE tells you to promote _not_ according to the ssa's sign then
honor that.
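
Modelled outside of GCC, the suggested query looks roughly like this (promote_mode here is a stand-in for the target's PROMOTE_MODE macro, and the enum is a toy; neither is the real interface):

```c
#include <assert.h>
#include <stdbool.h>

enum machine_mode { QImode, HImode, SImode };

/* Stand-in for ARM's PROMOTE_MODE: sub-word integer modes are widened to
   SImode and promoted as unsigned, regardless of the type's own sign.  */
static void
promote_mode (enum machine_mode *mode, bool *unsignedp)
{
  if (*mode == QImode || *mode == HImode)
    {
      *unsignedp = true;
      *mode = SImode;
    }
}

/* Initialize rhs_uns from the type, then let the target override it,
   rather than assuming the type's sign survives promotion.  */
static bool
promoted_unsigned_p (enum machine_mode mode, bool type_unsigned)
{
  bool rhs_uns = type_unsigned;
  promote_mode (&mode, &rhs_uns);
  return rhs_uns;
}
```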

> As for the -fno-strict-overflow case, if the variables overflows, in VRP
> dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX.
> We therefore should limit the comparison to (TYPE_MIN < VR_MIN && VR_MAX
> < TYPE_MAX) instead of (TYPE_MIN <= VR_MIN && VR_MAX <= TYPE_MAX) when
> checking to be sure that this is not the overflowing case. Attached
> patch changes this.

I don't think that's necessary - the overflow cases happen only when
that overflow has undefined behavior, thus any valid program will have
values <= MAX.

Richard.

> I have bootstrapped on x86_64-unknown-linux-gnu and regression tested
> for x86_64-unknown-linux-gnu, arm-none-linux-gnueabi (using qemu),
> aarch64_be-none-elf (Foundation model), aarch64-none-elf
> --with-abi=ilp32 (Foundation model) and s390x-ibm-linux (64bit, using
> qemu) with no new regression.
>
> Is this OK?
>
> Thanks,
> Kugan
>
> gcc/
> 2014-07-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * calls.c (precompute_arguments): Check is_promoted_for_type
>         and set the promoted mode.
>         (is_promoted_for_type): New function.
>         (expand_expr_real_1): Check is_promoted_for_type
>         and set the promoted mode.
>         * expr.h (is_promoted_for_type): New function definition.
>         * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>         SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
>
> gcc/testsuite
>
> 2014-07-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * gcc.dg/zero_sign_ext_test.c: New test.


* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-06-25  8:36       ` Jakub Jelinek
@ 2014-07-07  6:55         ` Kugan
  2014-07-10 12:15           ` Richard Biener
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-07-07  6:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3228 bytes --]

> For -fwrapv I don't see why you'd get into trouble ever, the VRP computation
> should be well aware of the -fwrapv semantics and the value ranges should
> reflect that.
> 
> For -fno-strict-overflow, I have no idea since it is very weirdly defined.
> 
> In any case, for your example above, the loop is always well defined,
> because for char/short a++ is performed as:
> a = (short) ((int) a + 1)
> So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow
> or -Os, it is simply a problem with the patch.  VR [1, 32768] looks correct,
> a++ is performed only if a is >= 0, therefore before addition [0, 32767].
> But from VR [1, 32768] you can't optimize away the sign extension, make sure
> you don't have there off-by-one?
> 
> It would be nice if the patch contained some testcases, it is easy
> to construct testcases where you have arbitrary VRs on some SSA_NAMEs,
> you just need something to stick the VR on, so you can do something like:
> type foo (type a)
> {
>   if (a < VR_min + 1 || a > VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course.
>   a = a + 1;
>   // now you can try some cast that your optimization would try to optimize
>   return a;
> }
> Or void bar (type a) { a = (a & mask) + bias; (or similarly) }
> Make sure to cover the boundary cases, where VR minimum or maximum still
> allow optimizing away zero and/or sign extensions, and another case where
> they are +- 1 and already don't allow it.


Hi Jakub,

For -fwrapv, it is due to how PROMOTE_MODE is defined in the arm
back-end. In the test case, a function (which has a signed char return
type) returns -1 in one of the paths. The ARM PROMOTE_MODE changes that
to 255 and relies on the zero/sign extension later generated at the RTL
level to restore the correct value. I saw some other targets defining
PROMOTE_MODE similarly. I am therefore skipping the removal of zero/sign
extensions when the SSA variable can be set to negative integer constants.
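
As a rough illustration of that PROMOTE_MODE behaviour (a hypothetical
sketch in plain C, not GCC internals; the helper names are made up):

```c
#include <assert.h>
#include <stdint.h>

/* What a promoted register holds for a signed char value: only the
   low 8 bits, zero-filled into the wider register.  For -1 this is
   the raw pattern 255.  */
static int32_t
promoted_reg_value (int8_t v)
{
  return (int32_t) (uint8_t) v;
}

/* The later sign extension the RTL relies on to restore the value;
   this is the extension that must not be eliminated in this case.  */
static int32_t
restore_signed_char (int32_t reg)
{
  return (int32_t) (int8_t) (reg & 0xff);
}
```

Eliminating the extension here would leave 255 in the register where
the caller expects -1.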


As for the -fno-strict-overflow case, when a variable overflows, the
VRP dumps show +INF(OVF), but the value range stored on the SSA name
has TYPE_MAX. We should therefore limit the comparison to
(TYPE_MIN < VR_MIN && VR_MAX < TYPE_MAX) instead of
(TYPE_MIN <= VR_MIN && VR_MAX <= TYPE_MAX) to be sure that this is not
the overflowing case. The attached patch changes this.
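
The tightened check can be sketched like this (a hypothetical helper,
not the patch's actual code; plain long arithmetic stands in for the
wide_int comparisons):

```c
#include <assert.h>
#include <stdbool.h>

/* An overflowed range is stored saturated at the type bounds, so a
   value range that touches TYPE_MIN/TYPE_MAX cannot be trusted: the
   containment test must be strict rather than inclusive.  */
static bool
range_safely_inside (long vr_min, long vr_max, long type_min, long type_max)
{
  return type_min < vr_min && vr_max < type_max;
}
```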

I have bootstrapped on x86_64-unknown-linux-gnu and regression tested
for x86_64-unknown-linux-gnu, arm-none-linux-gnueabi (using qemu),
aarch64_be-none-elf (Foundation model), aarch64-none-elf
--with-abi=ilp32 (Foundation model) and s390x-ibm-linux (64bit, using
qemu) with no new regression.

Is this OK?

Thanks,
Kugan

gcc/
2014-07-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* calls.c (precompute_arguments): Check is_promoted_for_type
	and set the promoted mode.
	(is_promoted_for_type): New function.
	(expand_expr_real_1): Check is_promoted_for_type
	and set the promoted mode.
	* expr.h (is_promoted_for_type): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.


gcc/testsuite

2014-07-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gcc.dg/zero_sign_ext_test.c: New test.

[-- Attachment #2: p2.txt --]
[-- Type: text/plain, Size: 10440 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index a3e6faa..eac512f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7a34a2..ac6776d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index 10f4a96..68708c1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-address.h"
 #include "cfgexpand.h"
 #include "builtins.h"
+#include "tree-ssa.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -9210,6 +9211,88 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if the value in SSA is already zero/sign extended for the
+   lhs type (the type here is the combination of LHS_MODE and LHS_UNS),
+   using the stored value range information.  Return FALSE otherwise.  */
+bool
+is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max, limit;
+  unsigned int prec;
+  tree lhs_type;
+  bool rhs_uns;
+  gimple stmt;
+
+  if (ssa == NULL_TREE
+      || TREE_CODE (ssa) != SSA_NAME
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA.  */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+  stmt = SSA_NAME_DEF_STMT (ssa);
+
+  /* In some architectures, negative integer constants are truncated and
+     sign changed with target defined PROMOTE_MODE macro. This will impact
+     the value range seen here and produce wrong code if zero/sign extensions
+     are eliminated. Therefore, return false if this SSA can have negative
+     integers.  */
+  if (is_gimple_assign (stmt)
+      && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary))
+    {
+      tree rhs1 = gimple_assign_rhs1 (stmt);
+      if (TREE_CODE (rhs1) == INTEGER_CST
+	  && !TYPE_UNSIGNED (TREE_TYPE (ssa))
+	  && tree_int_cst_compare (rhs1, integer_zero_node) == -1)
+	return false;
+    }
+  else if (gimple_code (stmt) == GIMPLE_PHI)
+    {
+      unsigned int i;
+      for (i = 0; i < gimple_phi_num_args (stmt); ++i)
+	{
+	  tree arg = gimple_phi_arg_def (stmt, i);
+	  if (TREE_CODE (arg) == INTEGER_CST
+	      && !TYPE_UNSIGNED (TREE_TYPE (ssa))
+	      && tree_int_cst_compare (arg, integer_zero_node) == -1)
+	    return false;
+	}
+    }
+
+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
+  prec = min.get_precision ();
+
+  /* Signed maximum value.  */
+  limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED);
+
+  /* Signedness of LHS and RHS differs but values in range.  */
+  if ((rhs_uns != lhs_uns)
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+	  || (lhs_uns && (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1))))
+    lhs_uns = !lhs_uns;
+
+  /* Signedness of LHS and RHS should match.  */
+  if (rhs_uns != lhs_uns)
+    return false;
+
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+
+  /* Check if the values lie within the type range.  */
+  if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa)))
+       || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) == -1))
+      && (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa)))
+	  || (wi::cmp (min, 0, TYPE_SIGN (TREE_TYPE (ssa))) == 0)
+	  || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) == -1)))
+    return true;
+
+  return false;
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9513,7 +9596,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 	  return temp;
 	}
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..e99d000 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool is_promoted_for_type (tree, enum machine_mode, bool);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.
diff --git a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
index e69de29..ef37612 100644
--- a/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
+++ b/gcc/testsuite/gcc.dg/zero_sign_ext_test.c
@@ -0,0 +1,135 @@
+extern void abort (void);
+
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+#define	TYPE_MAX(type, sign)	\
+  ((!sign) ? ((1 << (sizeof (type) * 8 - 1)) - 1) :	\
+   ((1 << (sizeof (type) * 8)) - 1))
+#define	TYPE_MIN(type, sign)	\
+  ((!sign) ? -(1 << (sizeof (type) * 8 - 1)) : 0)
+
+#define	TEST_FN(NAME, ARG_TYPE, RET_TYPE, CAST_TYPE, VAL, VR_MIN, VR_MAX)\
+  __attribute__((noinline, noclone)) RET_TYPE				\
+      NAME (ARG_TYPE arg){						\
+      RET_TYPE ret = VAL;						\
+      if (arg + 1 < VR_MIN || arg + 1 > VR_MAX) return ret;		\
+      /* Value Range of arg at this point will be  [VR_min, VR_max].  */\
+      arg = arg + VAL;							\
+      ret = (CAST_TYPE)arg;						\
+      return arg;							\
+  }
+
+/* Signed to signed conversion with value in-range. */
+TEST_FN (foo1, short, short, char, 1, TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+TEST_FN (foo2, short, short, char, 1, TYPE_MIN (char, 0) + 1,\
+	TYPE_MAX (char, 0) - 1);
+
+/* Signed to signed conversion with value not in-range. */
+TEST_FN (foo3, short, short, char, -1, TYPE_MIN (short, 0) + 1,  100);
+TEST_FN (foo4, short, short, char, 1, 12, TYPE_MAX (short, 0) + 1);
+
+/* Unsigned to unsigned conversion with value in-range. */
+TEST_FN (foo5, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo6, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+/* Unsigned to unsigned conversion with value not in-range. */
+TEST_FN (foo7, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1) + 1, TYPE_MAX (short, 1) - 1);
+TEST_FN (foo8, unsigned short, unsigned short, unsigned char, 1,\
+	TYPE_MIN (short, 1), TYPE_MAX (short, 1));
+
+/* Signed to unsigned conversion with value range positive. */
+TEST_FN (foo9, short, short, unsigned char, -1, 1,\
+	TYPE_MAX (char, 1) - 1);
+TEST_FN (foo10, short, short, unsigned char, 1, 0,\
+	TYPE_MAX (char, 1));
+
+/* Signed to unsigned conversion with value range negative. */
+TEST_FN (foo11, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo12, short, short, unsigned char, 1,\
+	TYPE_MIN (char, 0), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range in signed equiv range */
+TEST_FN (foo13, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 0) - 1);
+TEST_FN (foo14, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 0));
+
+/* Unsigned to Signed conversion with value range not-in signed range */
+TEST_FN (foo15, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1) + 1, TYPE_MAX (char, 1) - 1);
+TEST_FN (foo16, unsigned short, unsigned short, char, 1,\
+	TYPE_MIN (char, 1), TYPE_MAX (char, 1));
+
+int main ()
+{
+  /* Signed to signed conversion with value in-range. */
+  /* arg + 1 */
+  if (foo1 (-32) != -31)
+    abort ();
+  /* arg + 1 */
+  if (foo2 (32) != 33)
+    abort ();
+
+  /* Signed to signed conversion with value not in-range. */
+  /* arg - 1 */
+  if (foo3 (-512) != -513)
+    abort ();
+  /* arg + 1 */
+  if (foo4 (512) != 513)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value in-range. */
+  /* arg + 1 */
+  if (foo5 (64) != 65)
+    abort ();
+  /* arg + 1 */
+  if (foo6 (64) != 65)
+    abort ();
+
+  /* Unsigned to unsigned conversion with value not in-range. */
+  /* arg + 1 */
+  if (foo7 (512) != 513)
+    abort ();
+  /* arg + 1 */
+  if (foo8 (512) != 513)
+    abort ();
+
+  /* Signed to unsigned conversion with value range positive. */
+  /* arg - 1 */
+  if (foo9 (2) != 1)
+    abort ();
+  /* arg + 1 */
+  if (foo10 (2) != 3)
+    abort ();
+
+  /* Signed to unsigned conversion with value range negative. */
+  /* arg + 1 */
+  if (foo11 (-125) != -124)
+    abort ();
+  /* arg + 1 */
+  if (foo12 (-125) != -124)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range in signed equiv range */
+  /* arg + 1 */
+  if (foo13 (125) != 126)
+    abort ();
+  /* arg + 1 */
+  if (foo14 (125) != 126)
+    abort ();
+
+  /* Unsigned to Signed conversion with value range not-in signed range */
+  /* arg + 1 */
+  if (foo15 (250) != 251)
+    abort ();
+  /* arg + 1 */
+  if (foo16 (250) != 251)
+    abort ();
+
+  return 0;
+}

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-06-25  8:15     ` Kugan
@ 2014-06-25  8:36       ` Jakub Jelinek
  2014-07-07  6:55         ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Jakub Jelinek @ 2014-06-25  8:36 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Wed, Jun 25, 2014 at 06:14:57PM +1000, Kugan wrote:
> For these flags, value ranges generated are not usable for extension
> eliminations. Therefore, without this some of the test cases in
> regression fails. For example:
> 
> short a;
> void
> foo (void)
> {
>   for (a = 0; a >= 0; a++)
>     ;
> }
> -Os  -fno-strict-overflow produces the following range for the index
> increment and hence goes into infinite loop.
> _10: [1, 32768]
> _10 = _4 + 1;

For -fwrapv I don't see why you'd get into trouble ever, the VRP computation
should be well aware of the -fwrapv semantics and the value ranges should
reflect that.

For -fno-strict-overflow, I have no idea since it is very weirdly defined.

In any case, for your example above, the loop is always well defined,
because for char/short a++ is performed as:
a = (short) ((int) a + 1)
So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow
or -Os, it is simply a problem with the patch.  VR [1, 32768] looks correct,
a++ is performed only if a is >= 0, therefore before addition [0, 32767].
But from VR [1, 32768] you can't optimize away the sign extension, make sure
you don't have there off-by-one?
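
The promotion semantics above can be sketched as follows (the helper
name is made up; the wrap-around of the final truncation assumes a
common two's-complement target, as with GCC):

```c
#include <assert.h>
#include <limits.h>

/* What a++ means for a short: the addition is performed in int, so it
   never overflows there, and only the truncation back to short wraps.
   This is why the loop in the example is always well defined.  */
static short
inc_short (short a)
{
  return (short) ((int) a + 1);
}
```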

It would be nice if the patch contained some testcases, it is easy
to construct testcases where you have arbitrary VRs on some SSA_NAMEs,
you just need something to stick the VR on, so you can do something like:
type foo (type a)
{
  if (a < VR_min + 1 || a > VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course.
  a = a + 1;
  // now you can try some cast that your optimization would try to optimize
  return a;
}
Or void bar (type a) { a = (a & mask) + bias; (or similarly) }
Make sure to cover the boundary cases, where VR minimum or maximum still
allow optimizing away zero and/or sign extensions, and another case where
they are +- 1 and already don't allow it.

	Jakub

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-06-24 12:21   ` Jakub Jelinek
@ 2014-06-25  8:15     ` Kugan
  2014-06-25  8:36       ` Jakub Jelinek
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-06-25  8:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2298 bytes --]

On 24/06/14 22:21, Jakub Jelinek wrote:
> On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote:
>> 2014-06-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> 	* gcc/calls.c (precompute_arguments: Check is_promoted_for_type
>> 	and set the promoted mode.
>> 	(is_promoted_for_type) : New function.
>> 	(expand_expr_real_1) : Check is_promoted_for_type
>> 	and set the promoted mode.
>> 	* gcc/expr.h (is_promoted_for_type) : New function definition.
>> 	* gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if
>> 	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
> 
> Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before
> :, watch for too long lines, remove useless ()s around conditions.

Changed it.

>> +bool
>> +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
>> +{
>> +  wide_int type_min, type_max;
>> +  wide_int min, max, limit;
>> +  unsigned int prec;
>> +  tree lhs_type;
>> +  bool rhs_uns;
>> +
>> +  if (flag_wrapv
> 
> Why?
> 
>> +      || (flag_strict_overflow == false)
> 
> Why?  Also, that would be !flag_strict_overflow instead of
> (flag_strict_overflow == false)

For these flags, value ranges generated are not usable for extension
eliminations. Therefore, without this some of the test cases in
regression fails. For example:

short a;
void
foo (void)
{
  for (a = 0; a >= 0; a++)
    ;
}
-Os  -fno-strict-overflow produces the following range for the index
increment and hence goes into infinite loop.
_10: [1, 32768]
_10 = _4 + 1;
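
A quick check of why a range reaching 32768 blocks the optimization
(hypothetical helper; the truncation result assumes a two's-complement
target):

```c
#include <assert.h>
#include <limits.h>

/* 32768 does not fit in a 16-bit short; truncating it yields a
   negative value, so the loop test a >= 0 still depends on the sign
   extension and the extension cannot be removed.  */
static int
fits_in_short (long v)
{
  return SHRT_MIN <= v && v <= SHRT_MAX;
}
```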

> 
>> +      || (ssa == NULL_TREE)
>> +      || (TREE_CODE (ssa) != SSA_NAME)
>> +      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
>> +      || POINTER_TYPE_P (TREE_TYPE (ssa)))
> 
> All pointer types are !INTEGRAL_TYPE_P, so the last condition
> doesn't make any sense.

I have changed this. Please see the attached patch.


Thanks,
Kugan

gcc/
2014-06-25  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* calls.c (precompute_arguments): Check is_promoted_for_type
	and set the promoted mode.
	(is_promoted_for_type): New function.
	(expand_expr_real_1): Check is_promoted_for_type
	and set the promoted mode.
	* expr.h (is_promoted_for_type): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.



[-- Attachment #2: p2.txt --]
[-- Type: text/plain, Size: 4582 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index a3e6faa..eac512f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e8cd87f..0540b4d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index f9103a5..15da092 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9210,6 +9210,59 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if the value in SSA is already zero/sign extended for the
+   lhs type (the type here is the combination of LHS_MODE and LHS_UNS),
+   using the stored value range information.  Return FALSE otherwise.  */
+bool
+is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max, limit;
+  unsigned int prec;
+  tree lhs_type;
+  bool rhs_uns;
+
+  if (flag_wrapv
+      || !flag_strict_overflow
+      || ssa == NULL_TREE
+      || TREE_CODE (ssa) != SSA_NAME
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA.  */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+
+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
+  prec = min.get_precision ();
+
+  /* Signed maximum value.  */
+  limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED);
+
+  /* Signedness of LHS and RHS differs but values in range.  */
+  if ((rhs_uns != lhs_uns)
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+	  || (lhs_uns && (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1))))
+    lhs_uns = !lhs_uns;
+
+  /* Signedness of LHS and RHS should match.  */
+  if (rhs_uns != lhs_uns)
+    return false;
+
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+
+  /* Check if the values lie within the type range.  */
+  if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa)))
+       || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) != 1))
+      && (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa)))
+	  || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) != 1)))
+    return true;
+  return false;
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9513,7 +9566,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 
 	  return temp;
 	}
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..e99d000 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool is_promoted_for_type (tree, enum machine_mode, bool);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Enable elimination of zext/sext
  2014-06-24 11:53 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
@ 2014-06-24 12:21   ` Jakub Jelinek
  2014-06-25  8:15     ` Kugan
  0 siblings, 1 reply; 51+ messages in thread
From: Jakub Jelinek @ 2014-06-24 12:21 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote:
> 2014-06-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
> 
> 	* gcc/calls.c (precompute_arguments: Check is_promoted_for_type
> 	and set the promoted mode.
> 	(is_promoted_for_type) : New function.
> 	(expand_expr_real_1) : Check is_promoted_for_type
> 	and set the promoted mode.
> 	* gcc/expr.h (is_promoted_for_type) : New function definition.
> 	* gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if
> 	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before
:, watch for too long lines, remove useless ()s around conditions.

> +bool
> +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
> +{
> +  wide_int type_min, type_max;
> +  wide_int min, max, limit;
> +  unsigned int prec;
> +  tree lhs_type;
> +  bool rhs_uns;
> +
> +  if (flag_wrapv

Why?

> +      || (flag_strict_overflow == false)

Why?  Also, that would be !flag_strict_overflow instead of
(flag_strict_overflow == false)

> +      || (ssa == NULL_TREE)
> +      || (TREE_CODE (ssa) != SSA_NAME)
> +      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
> +      || POINTER_TYPE_P (TREE_TYPE (ssa)))

All pointer types are !INTEGRAL_TYPE_P, so the last condition
doesn't make any sense.

	Jakub

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 2/2] Enable elimination of zext/sext
  2014-06-24 11:48 [PATCH 0/2] Zext/sext elimination using value range Kugan
@ 2014-06-24 11:53 ` Kugan
  2014-06-24 12:21   ` Jakub Jelinek
  0 siblings, 1 reply; 51+ messages in thread
From: Kugan @ 2014-06-24 11:53 UTC (permalink / raw)
  To: Jakub Jelinek, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 582 bytes --]

This patch sets the proper flags on the SUBREG based on value range
information and enables elimination of zext/sext when possible.

Thanks,
Kugan


gcc/
2014-06-24  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gcc/calls.c (precompute_arguments: Check is_promoted_for_type
	and set the promoted mode.
	(is_promoted_for_type) : New function.
	(expand_expr_real_1) : Check is_promoted_for_type
	and set the promoted mode.
	* gcc/expr.h (is_promoted_for_type) : New function definition.
	* gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

[-- Attachment #2: p2.txt --]
[-- Type: text/plain, Size: 4620 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index c1fe3b8..4ef9df8 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e8cd87f..0540b4d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index a8db9f5..b2c8146 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9209,6 +9209,59 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if the value in SSA is already zero/sign extended for the
+   lhs type (the type here is the combination of LHS_MODE and LHS_UNS),
+   using the stored value range information.  Return FALSE otherwise.  */
+bool
+is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max, limit;
+  unsigned int prec;
+  tree lhs_type;
+  bool rhs_uns;
+
+  if (flag_wrapv
+      || (flag_strict_overflow == false)
+      || (ssa == NULL_TREE)
+      || (TREE_CODE (ssa) != SSA_NAME)
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
+      || POINTER_TYPE_P (TREE_TYPE (ssa)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA. */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+
+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
+  prec = min.get_precision ();
+
+  /* Signed maximum value.  */
+  limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED);
+
+  /* Signedness of LHS and RHS differs but values in range.  */
+  if ((rhs_uns != lhs_uns)
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+	  || (lhs_uns && (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1))))
+    lhs_uns = !lhs_uns;
+
+  /* Signedness of LHS and RHS should match.  */
+  if (rhs_uns != lhs_uns)
+    return false;
+
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, TYPE_SIGN (TREE_TYPE (ssa)));
+
+  /* Check if the values lie within the type range.  */
+  if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa)))
+       || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) != 1))
+      && (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa)))
+	  || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) != 1)))
+    return true;
+
+  return false;
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9512,7 +9565,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 
 	  return temp;
 	}
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..e99d000 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@ extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool is_promoted_for_type (tree, enum machine_mode, bool);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2015-08-05  9:10 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-27 10:01 [PATCH 2/2] Enable elimination of zext/sext Uros Bizjak
2014-08-27 10:07 ` Richard Biener
2014-08-27 10:32   ` Uros Bizjak
2014-08-27 10:32     ` Richard Biener
2014-09-01  8:48     ` Jakub Jelinek
2014-09-01  8:54       ` Uros Bizjak
2014-08-28  7:50   ` Kugan
2014-08-28  8:57     ` Richard Biener
2014-09-04  3:41       ` Kugan
2014-09-04 13:00         ` Richard Biener
2014-09-05  1:33           ` Kugan
2014-09-05  9:51             ` Richard Biener
2014-09-07  9:51               ` Kugan
2014-09-08  9:48                 ` Richard Biener
2014-09-09 10:06                   ` Kugan
2014-09-09 10:28                     ` Richard Biener
2014-11-09 23:30               ` [RFC] Elimination of zext/sext - type promotion pass Kugan
2014-11-10 12:56                 ` Richard Biener
2015-05-01  4:41                   ` Kugan
2015-05-08 12:48                     ` Richard Biener
2015-06-01 23:20                       ` Kugan
2015-06-19  2:55                         ` Kugan
2015-07-28 11:05                         ` Richard Biener
2015-08-05  0:12                           ` kugan
2015-08-05  9:10                             ` Richard Biener
2014-08-27 13:02 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
2014-08-28  3:46   ` Kugan
2014-08-28  6:44     ` Marc Glisse
2014-08-28  7:29       ` Kugan
  -- strict thread matches above, loose matches on Subject: below --
2014-06-24 11:48 [PATCH 0/2] Zext/sext elimination using value range Kugan
2014-06-24 11:53 ` [PATCH 2/2] Enable elimination of zext/sext Kugan
2014-06-24 12:21   ` Jakub Jelinek
2014-06-25  8:15     ` Kugan
2014-06-25  8:36       ` Jakub Jelinek
2014-07-07  6:55         ` Kugan
2014-07-10 12:15           ` Richard Biener
2014-07-11 11:52             ` Kugan
2014-07-11 12:47               ` Richard Biener
2014-07-14  2:58                 ` Kugan
2014-07-14 20:11                   ` Bernhard Reutner-Fischer
2014-07-23 14:22                   ` Richard Biener
2014-08-01  4:51                     ` Kugan
2014-08-01 11:16                       ` Richard Biener
2014-08-01 16:04                         ` Kugan
2014-08-03 23:56                           ` Kugan
2014-08-05 14:18                           ` Richard Biener
2014-08-05 14:21                             ` Jakub Jelinek
2014-08-06 12:09                               ` Richard Biener
2014-08-06 13:22                                 ` Kugan
2014-08-06 13:29                                   ` Richard Biener
2014-08-07  5:25                                     ` Kugan
2014-08-07  8:09                                       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).