From: Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
To: <gcc@gcc.gnu.org>
Cc: <law@redhat.com>, <rguenther@suse.de>, <amacleod@redhat.com>
Subject: VRP causing extra register usage
Date: Thu, 12 Nov 2015 19:10:00 -0000 [thread overview]
Message-ID: <20151112191005.GA2731@jaguar.atmel.com> (raw)
Hi,
When analyzing code size differences between a 4.9.x compiler and
trunk for the AVR target, I found quite a few cases where extra
registers were being used to hold smaller types (two 8 bit registers
for a uint8_t, for example).
On deeper analysis, I found that the VRP pass (gcc/tree-vrp.c) was the point
at which the dumps start to diverge.
For code like this
#include <stdint.h>
uint16_t get(const uint16_t wvalue)
{
const uint8_t type = (wvalue >> 8);
uint16_t size = 0;
if (type == 1)
{
size = 20;
}
else if (type == 2)
{
size = 32;
}
return size;
}
VRP substitutes wvalue for the type variable in the conditionals
(simplify_cond_using_ranges:9506), as it figures out that the range
of wvalue is the same as a uint8_t). That is, code goes from
<bb 2>:
_3 = wvalue_2(D) >> 8;
type_4 = (const uint8_t) _3;
if (type_4 == 1)
goto <bb 5>;
else
goto <bb 3>;
to
<bb 2>:
_3 = wvalue_2(D) >> 8;
type_4 = (const uint8_t) _3;
if (_3 == 1)
goto <bb 5>;
else
goto <bb 3>;
This "widening" causes later passes to allocate extra registers
(holding zeros for the extra bits) and generate extra comparisons
for the AVR target.
I found that in the 4.9.2 compiler I was benchmarking against, VRP
didn't figure out the range for wvalue and therefore the folding
didn't happen, which was why the code was better.
With the native compiler on my machine (gcc 5.2 x86_64) - replacing
uint8_t by uint32_t and uint16_t by uint64_t, and shifting right by
32 bits instead of 8 shows the same effect - the generated code uses
rdi instead of just di to hold the type variable.
Is this a bug? Should the folding happen only if the type
conversion was from a smaller type to a bigger one? Or is the backend
supposed to detect this pattern and deal with it?
Regards
Senthil
details-raw vrp1 dump
;; Function get (get, funcdef_no=0, decl_uid=1522, cgraph_uid=0, symbol_order=0)
;; 1 loops found
;;
;; Loop 0
;; header 0, latch 1
;; depth 0, outer -1
;; nodes: 0 1 2 3 4 5
;; 2 succs { 5 3 }
;; 3 succs { 4 5 }
;; 4 succs { 5 }
;; 5 succs { 1 }
ASSERT_EXPRs to be inserted
Assertions to be inserted for type_4
if (type_4 == 1)
BB #3
EDGE 2->3 2 [55.9%] (FALSE_VALUE,EXECUTABLE)
PREDICATE: type_4 ne_expr 1
Updating SSA:
Registering new PHI nodes in block #2
Updating SSA information for statement type_4 = (const uint8_t) _3;
Updating SSA information for statement if (type_4 == 1)
Registering new PHI nodes in block #3
Updating SSA information for statement type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
Updating SSA information for statement if (type_4 == 2)
Registering new PHI nodes in block #4
Registering new PHI nodes in block #5
SSA replacement table
N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j
type_6 -> { type_4 }
Incremental SSA update started at block: 2
Number of blocks in CFG: 6
Number of blocks to update: 2 ( 33%)
Affected blocks: 2 3
SSA form after inserting ASSERT_EXPRs
get (const uint16_t wvalue, const uint8_t windex, const void * * const address)
{
uint16_t size;
const uint8_t type;
unsigned int _3;
<bb 2>:
_3 = wvalue_2(D) >> 8;
type_4 = (const uint8_t) _3;
if (type_4 == 1)
goto <bb 5>;
else
goto <bb 3>;
<bb 3>:
type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
if (type_6 == 2)
goto <bb 4>;
else
goto <bb 5>;
<bb 4>:
<bb 5>:
# size_1 = PHI <20(2), 0(3), 32(4)>
return size_1;
}
Immediate_uses:
size_1 : --> single use.
return size_1;
wvalue_2(D) : --> single use.
_3 = wvalue_2(D) >> 8;
_3 : --> single use.
type_4 = (const uint8_t) _3;
type_4 : -->3 uses.
type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
if (type_4 == 1)
.MEM_5(D) : --> single use.
# VUSE <.MEM_5(D)>
return size_1;
type_6 : --> single use.
if (type_6 == 2)
Adding destination of edge (0 -> 2) to worklist
Simulating block 2
Visiting statement:
_3 = wvalue_2(D) >> 8;
Intersecting
[0, 255]
and
[0, +INF]
to
[0, 255]
Found new range for _3: [0, 255]
interesting_ssa_edges: adding SSA use in type_4 = (const uint8_t) _3;
marking stmt to be not simulated again
Visiting statement:
type_4 = (const uint8_t) _3;
Found new range for type_4: [0, +INF]
interesting_ssa_edges: adding SSA use in type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
interesting_ssa_edges: adding SSA use in if (type_4 == 1)
marking stmt to be not simulated again
Visiting statement:
if (type_4 == 1)
Visiting conditional with predicate: if (type_4 == 1)
With known ranges
type_4: [0, +INF]
Predicate evaluates to: DON'T KNOW
Adding destination of edge (2 -> 5) to worklist
Adding destination of edge (2 -> 3) to worklist
Simulating block 3
Visiting statement:
type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
Intersecting
~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
and
[0, +INF]
to
~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
Found new range for type_6: ~[1, 1]
interesting_ssa_edges: adding SSA use in if (type_6 == 2)
marking stmt to be not simulated again
Visiting statement:
if (type_6 == 2)
Visiting conditional with predicate: if (type_6 == 2)
With known ranges
type_6: ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
Predicate evaluates to: DON'T KNOW
Adding destination of edge (3 -> 4) to worklist
Simulating block 4
Simulating block 5
Visiting PHI node: size_1 = PHI <20(2), 0(3), 32(4)>
Argument #0 (2 -> 5 executable)
20: [20, 20]
Argument #1 (3 -> 5 executable)
0: [0, 0]
Meeting
[20, 20]
and
[0, 0]
to
[0, 20]
Argument #2 (4 -> 5 executable)
32: [32, 32]
Meeting
[0, 20]
and
[32, 32]
to
[0, 32]
Intersecting
[0, 32]
and
[0, +INF]
to
[0, 32]
Found new range for size_1: [0, 32]
interesting_ssa_edges: adding SSA use in return size_1;
marking stmt to be not simulated again
Visiting statement:
return size_1;
Value ranges after VRP:
size_1: [0, 32]
wvalue_2(D): VARYING
_3: [0, 255]
type_4: [0, +INF]
type_6: ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
Substituting values and folding statements
Folding statement: _3 = wvalue_2(D) >> 8;
Not folded
Folding statement: type_4 = (const uint8_t) _3;
Not folded
Folding statement: if (type_4 == 1)
Folded into: if (_3 == 1)
Folding statement: if (type_6 == 2)
Not folded
Folding PHI node: size_1 = PHI <20(2), 0(3), 32(4)>
No folding possible
Folding statement: return size_1;
Not folded
get (const uint16_t wvalue, const uint8_t windex, const void * * const address)
{
uint16_t size;
const uint8_t type;
unsigned int _3;
<bb 2>:
_3 = wvalue_2(D) >> 8;
type_4 = (const uint8_t) _3;
if (_3 == 1)
goto <bb 5>;
else
goto <bb 3>;
<bb 3>:
if (type_4 == 2)
goto <bb 4>;
else
goto <bb 5>;
<bb 4>:
<bb 5>:
# size_1 = PHI <20(2), 0(3), 32(4)>
return size_1;
}
next reply other threads:[~2015-11-12 19:10 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-12 19:10 Senthil Kumar Selvaraj [this message]
2015-11-12 19:37 ` Richard Biener
2015-11-13 13:36 ` Senthil Kumar Selvaraj
2015-11-13 15:15 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151112191005.GA2731@jaguar.atmel.com \
--to=senthil_kumar.selvaraj@atmel.com \
--cc=amacleod@redhat.com \
--cc=gcc@gcc.gnu.org \
--cc=law@redhat.com \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).