* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
@ 2014-12-21 13:33 ` olegendo at gcc dot gnu.org
2014-12-21 14:23 ` olegendo at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-21 13:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
In PR 22553 c#23 Kaz posted some CSiBE numbers to compare sched1 and no-sched1:
(CSiBE with -O2)

package         test name       sched1    no-sched1  sched1/no-sched1
bzip2-1.0.2     bzip2.d         11.06     10.8067    1.02344
bzip2-1.0.2     bzip2recover    4.54333   4.66333    0.974267
bzip2-1.0.2     bzip2.c         42.8567   44.2067    0.969462
compiler        vam.fib         2.00333   1.97       1.01692
compiler        vam.fact        1.87333   1.88333    0.99469
compiler        vam.test2       0.246667  0.25       0.986667
flex-2.5.31     flex            13.1133   13.25      0.989686
jikespg-1.3     src/jikespg     1.62      1.59333    1.01674
jpeg-6b         jpegtran2       4.62667   4.74667    0.974719
jpeg-6b         djpeg2          2.19333   2.28667    0.960641
jpeg-6b         djpeg1          2.2       2.26667    0.970588
jpeg-6b         cjpeg2          3.19333   3.05333    1.04585
jpeg-6b         djpeg0          0.33      0.33       1
jpeg-6b         cjpeg0          0.496667  0.496667   1
jpeg-6b         cjpeg1          3.18      3.11667    1.02032
jpeg-6b         jpegtran0       0.256667  0.256667   1
jpeg-6b         jpegtran1       1.98667   2.00667    0.990033
libpng-1.2.5    png2pnm0        0.973333  0.97       1.00344
libpng-1.2.5    pnm2png1        44.63     44.6933    0.998583
libpng-1.2.5    pnm2png0        7.94667   7.95       0.999581
libpng-1.2.5    png2pnm1        6.71667   6.71       1.00099
teem-1.6.0-src  src/hex/dehex0  1.66      1.68       0.988095
teem-1.6.0-src  src/hex/dehex1  10.8967   10.92      0.997863
teem-1.6.0-src  src/hex/enhex1  40.5033   41.2133    0.982773
teem-1.6.0-src  src/hex/enhex0  6.76333   6.86667    0.984951
zlib-1.1.4      minigzip0       44.73     45.0633    0.992603
zlib-1.1.4      minigzip        5.41333   5.41333    1

(not CSiBE test with -O3. 201 frames decoded)

mpeg2dec-0.3.1                  2.18      2.22       0.981981
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
2014-12-21 13:33 ` [Bug target/64036] " olegendo at gcc dot gnu.org
@ 2014-12-21 14:23 ` olegendo at gcc dot gnu.org
2015-07-07 12:44 ` olegendo at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-21 14:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
An example function, compiling with -O2 -m4:
int test_0 (unsigned short* x, int y, int z)
{
  return
    (x[0] + x[1] + x[2] + x[3] + x[4] + x[5] + x[6]
     + x[7] + x[8] + x[9] + x[10]) ? y : z;
}
Without sched1, there are lots of dependencies on the results of memory loads.
With sched1, there is generally more code generated and variable live ranges
are longer. The above function will use r8 and r9, which is not really
necessary. Memory load dependencies are reduced and more LS/EX/MT instructions
can be executed in parallel. Code size for the test function increases from
~37 insns to ~50 insns. Approximated cycles on SH4 pipeline should be ~37
cycles without sched1 and ~33 cycles with sched1. On SH4A the latency of a
load is 1 cycle, so without sched1 it should be ~28 cycles.
So basically sched1 seems to fill stall cycles with additional (reg-reg move)
instructions, but the end result is almost the same. The larger code size and
longer live ranges seem to cancel out the benefits. Using post-inc addressing
and appropriate scheduling and RA, it should be possible to get ~27 cycles (38
insns) on SH4 for that function.
If we could get that 37 -> 27 cycle drop with sched1, it'd be worth enabling
it. It looks like addressing mode selection has to be improved first to reduce
pressure on R0. Even then, to avoid code bloat there should be some way to
suppress sched1 when the resulting code would only fill stall cycles with
reg-reg copy insns, since larger code increases the probability of icache
misses.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
2014-12-21 13:33 ` [Bug target/64036] " olegendo at gcc dot gnu.org
2014-12-21 14:23 ` olegendo at gcc dot gnu.org
@ 2015-07-07 12:44 ` olegendo at gcc dot gnu.org
2015-07-07 12:55 ` olegendo at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-07-07 12:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
I've just tried the following example on the AMS branch:
float fun (float* x)
{
  return x[0] + x[1] + x[2] + x[3];
}

no AMS:
        mov     r4,r1
        add     #4,r1
        fmov.s  @r4,fr0
        fmov.s  @r1,fr1
        mov     r4,r1
        add     #8,r1
        fadd    fr1,fr0
        fmov.s  @r1,fr1
        add     #12,r4
        fadd    fr1,fr0
        fmov.s  @r4,fr1
        rts
        fadd    fr1,fr0

AMS:
        fmov.s  @r4+,fr0
        fmov.s  @r4+,fr1
        fadd    fr1,fr0     << load dependency stall
        fmov.s  @r4+,fr1
        fadd    fr1,fr0     << load dependency stall
        fmov.s  @r4,fr1
        rts
        fadd    fr1,fr0

AMS + sched1:
        fmov.s  @r4+,fr0
        fmov.s  @r4+,fr1
        fmov.s  @r4+,fr2
        fadd    fr1,fr0     << no stall
        fmov.s  @r4,fr1
        fadd    fr2,fr0     << no stall
        rts
        fadd    fr1,fr0
The sched1 code seems better and it might make sense to enable sched1 after
AMS.
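
The stall annotations in the AMS and AMS + sched1 listings can be reproduced
with a toy model. The following is only a rough sketch, not a description of
the real SH4 pipeline: it assumes a single-issue in-order machine where a load
result becomes usable two cycles after issue and every other result after one
cycle. Dual issue, FPU latencies and address registers are ignored. All names
(struct insn, count_stalls, ams_code, ams_sched1_code) are made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NO_REG   (-1)
#define NUM_REGS 8

enum { FR0, FR1, FR2 };   /* register numbers used by the toy model */

struct insn {
    int is_load;          /* 1 for a memory load, 0 for anything else */
    int dst;              /* register written, or NO_REG */
    int src1, src2;       /* registers read, or NO_REG */
};

/* Count cycles lost waiting for operands on a single-issue in-order
   machine.  A load result is usable load_latency cycles after issue,
   any other result one cycle after issue (a simplifying assumption). */
int count_stalls(const struct insn *code, size_t n, int load_latency)
{
    int ready[NUM_REGS] = {0};  /* cycle at which each register is usable */
    int next_issue = 0;         /* earliest issue cycle without stalling */
    int stalls = 0;

    for (size_t i = 0; i < n; i++) {
        int issue = next_issue;
        const int srcs[2] = { code[i].src1, code[i].src2 };

        for (int s = 0; s < 2; s++) {
            if (srcs[s] == NO_REG)
                continue;
            assert(srcs[s] >= 0 && srcs[s] < NUM_REGS);
            if (ready[srcs[s]] > issue)
                issue = ready[srcs[s]];
        }
        stalls += issue - next_issue;

        if (code[i].dst != NO_REG) {
            assert(code[i].dst >= 0 && code[i].dst < NUM_REGS);
            ready[code[i].dst] = issue + (code[i].is_load ? load_latency : 1);
        }
        next_issue = issue + 1;
    }
    return stalls;
}

/* The "AMS" listing: every fadd directly follows the load it needs,
   except the last one, which sits in the rts delay slot. */
const struct insn ams_code[] = {
    { 1, FR0, NO_REG, NO_REG },     /* fmov.s @r4+,fr0 */
    { 1, FR1, NO_REG, NO_REG },     /* fmov.s @r4+,fr1 */
    { 0, FR0, FR1, FR0 },           /* fadd   fr1,fr0  */
    { 1, FR1, NO_REG, NO_REG },     /* fmov.s @r4+,fr1 */
    { 0, FR0, FR1, FR0 },           /* fadd   fr1,fr0  */
    { 1, FR1, NO_REG, NO_REG },     /* fmov.s @r4,fr1  */
    { 0, NO_REG, NO_REG, NO_REG },  /* rts             */
    { 0, FR0, FR1, FR0 },           /* fadd   fr1,fr0  */
};

/* The "AMS + sched1" listing: the third load is hoisted above the
   first fadd and uses fr2. */
const struct insn ams_sched1_code[] = {
    { 1, FR0, NO_REG, NO_REG },     /* fmov.s @r4+,fr0 */
    { 1, FR1, NO_REG, NO_REG },     /* fmov.s @r4+,fr1 */
    { 1, FR2, NO_REG, NO_REG },     /* fmov.s @r4+,fr2 */
    { 0, FR0, FR1, FR0 },           /* fadd   fr1,fr0  */
    { 1, FR1, NO_REG, NO_REG },     /* fmov.s @r4,fr1  */
    { 0, FR0, FR2, FR0 },           /* fadd   fr2,fr0  */
    { 0, NO_REG, NO_REG, NO_REG },  /* rts             */
    { 0, FR0, FR1, FR0 },           /* fadd   fr1,fr0  */
};
```

Calling count_stalls (ams_code, 8, 2) and count_stalls (ams_sched1_code, 8, 2)
compares the two sequences under the assumed load latency of 2. Passing 1 as
the latency instead gives a rough idea of a core where load results are
available in the next cycle.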
In sh.c there is ...
      else if (flag_exceptions)
        {
          if (flag_schedule_insns && global_options_set.x_flag_schedule_insns)
            warning (0, "ignoring -fschedule-insns because of exception "
                        "handling bug");
          flag_schedule_insns = 0;
        }
... which makes it impossible to enable sched1 for C++ code unless exceptions
are disabled. Kaz, do you know whether this is still an issue?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
` (2 preceding siblings ...)
2015-07-07 12:44 ` olegendo at gcc dot gnu.org
@ 2015-07-07 12:55 ` olegendo at gcc dot gnu.org
2015-07-07 20:46 ` kkojima at gcc dot gnu.org
2015-07-25 11:21 ` olegendo at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-07-07 12:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> An example function, compiling with -O2 -m4:
>
> int test_0 (unsigned short* x, int y, int z)
> {
> return
> (x[0] + x[1] + x[2] + x[3] + x[4] + x[5] + x[6]
> + x[7] + x[8] + x[9] + x[10]) ? y : z;
> }
>
> Without sched1, there are lots of dependencies on the results of memory
> loads.
>
> With sched1, there is generally more code generated and variable live ranges
> are longer. The above function will use r8 and r9, which is not really
> necessary. Memory load dependencies are reduced and more LS/EX/MT
> instructions can be executed in parallel. Code size for the test function
> increases from ~37 insns to ~50 insns. Approximated cycles on SH4 pipeline
> should be ~37 cycles without sched1 and ~33 cycles with sched1. On SH4A the
> latency of a load is 1 cycle, so without sched1 it should be ~28 cycles.
BTW, with AMS there is no difference in code size between sched1 and
no-sched1, because it uses post-inc loads. With AMS + sched1 the example above
compiles to 38 insns and there are no dependencies/stalls on the mem loads
anymore. The r8,r9 usage issue is also gone.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
` (3 preceding siblings ...)
2015-07-07 12:55 ` olegendo at gcc dot gnu.org
@ 2015-07-07 20:46 ` kkojima at gcc dot gnu.org
2015-07-25 11:21 ` olegendo at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: kkojima at gcc dot gnu.org @ 2015-07-07 20:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #5 from Kazumoto Kojima <kkojima at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #3)
> else if (flag_exceptions)
> {
> if (flag_schedule_insns && global_options_set.x_flag_schedule_insns)
> warning (0, "ignoring -fschedule-insns because of exception "
> "handling bug");
> flag_schedule_insns = 0;
> }
>
> ... which makes it impossible to enable sched1 for C++ code unless
> exceptions are disabled. Kaz, do you know whether this is still an issue?
I think that the original issue is solved already and we simply forgot
this workaround.
From: "su at cs dot ucdavis.edu" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/66793] New: ICE at -Os and above on x86_64-linux-gnu (verify_flow_info failed)
Date: Tue, 07 Jul 2015 21:07:00 -0000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66793
Bug ID: 66793
Summary: ICE at -Os and above on x86_64-linux-gnu
(verify_flow_info failed)
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ipa
Assignee: unassigned at gcc dot gnu.org
Reporter: su at cs dot ucdavis.edu
Target Milestone: ---
The following code causes an ICE when compiled with the current gcc trunk at
-Os and above on x86_64-linux-gnu in both 32-bit and 64-bit modes.
It is a regression from 5.1.x.
$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk
--enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 6.0.0 20150707 (experimental) [trunk revision 225501] (GCC)
$
$ gcc-trunk -O1 small.c
$ gcc-5.1 -Os small.c
$
$ gcc-trunk -Os small.c
small.c: In function ‘main’:
small.c:18:1: error: control flow in the middle of basic block 2
main ()
^
small.c:18:1: internal compiler error: verify_flow_info failed
0x6fa932 verify_flow_info()
../../gcc-trunk/gcc/cfghooks.c:268
0xbfac91 cleanup_tree_cfg_noloop
../../gcc-trunk/gcc/tree-cfgcleanup.c:751
0xbfac91 cleanup_tree_cfg()
../../gcc-trunk/gcc/tree-cfgcleanup.c:800
0xaeeb84 execute_function_todo
../../gcc-trunk/gcc/passes.c:1910
0xaef433 execute_todo
../../gcc-trunk/gcc/passes.c:2014
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$
-----------------------------------
int a, b, c;

struct S0
{
  int f1;
} *d;

void
fn1 (struct S0 p)
{
  for (p.f1 = 0; p.f1 < 1; p.f1++)
    c = a && b ? a && b : 1;
  for (; c;)
    ;
}

int
main ()
{
  struct S0 **f = &d;
  d = 0;
  fn1 (**f);
  return 0;
}
From: "pault at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/52846] [F2008] Support submodules
Date: Tue, 07 Jul 2015 21:13:00 -0000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52846
--- Comment #7 from Paul Thomas <pault at gcc dot gnu.org> ---
Created attachment 35926
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35926&action=edit
A partially cooked patch to complete the implementation of submodules
The attached is a first attempt to complete the submodule implementation such
that private entities are correctly dealt with.
There are two parts to the patch:
(i) Modifications to the front end to write a second half to the module files,
which contains all the information about the private entities in the module.
This is the bulk of the patch; and
(ii) A change in the way that declarations of private entities are handled in
trans-decl.c. This follows a suggestion from Richard Biener to use a technique
borrowed from g++. In this patch it is only applied to variables.
Concerning (i), I am open to opinions as to whether or not it is better to
write the private parts to a separate file. At present, anything other than a
submodule does not visit the private part of the module file. Mention was made
on the list of encrypting this part of the file. Unfortunately, this overwhelms
the compression and module files remain rather large. This might be a strong
argument for emitting two files; one for the public part, as at present, and
the other for the private. In this way, a build could result in a library that
is only propagated with the public module file.
In respect of (ii), I have checked that the map shows no signs of the private
variables and that the library only carries their names.
A typical, basic makefile might look like this:
# Build submodule_8
#
submodule_8 : submodule_8_prog.o submodule_8_lib.so
	~/irun/bin/gfortran -O3 submodule_8_prog.o submodule_8_lib.so -o submodule_8
#
submodule_8_prog.o : submodule_8_prog.f90
	~/irun/bin/gfortran -O3 -c submodule_8_prog.f90 -o submodule_8_prog.o
#
submodule_8_lib.so : submodule_8_lib1.o submodule_8_lib2.o
	~/irun/bin/gfortran -O3 -shared submodule_8_lib1.o submodule_8_lib2.o -o submodule_8_lib.so
#
# Relocation runs into problems without -fPIC for both of these compilations.
#
submodule_8_lib1.o : submodule_8_lib1.f90
	~/irun/bin/gfortran -O3 -fPIC -c submodule_8_lib1.f90 -o submodule_8_lib1.o
#
submodule_8_lib2.o : submodule_8_lib2.f90
	~/irun/bin/gfortran -O3 -fPIC -c submodule_8_lib2.f90 -o submodule_8_lib2.o
In this tested case, ~lib1 contains two modules and ~lib2 the submodules. Some
of the entities in ~lib1 are private but they are correctly propagated to ~lib2
but not to the program, ~prog.
Cheers
Paul
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/64036] [SH] Evaluate re-enabling scheduling before RA
2014-11-23 22:12 [Bug target/64036] New: [SH] Evaluate re-enabling scheduling before RA olegendo at gcc dot gnu.org
` (4 preceding siblings ...)
2015-07-07 20:46 ` kkojima at gcc dot gnu.org
@ 2015-07-25 11:21 ` olegendo at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-07-25 11:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64036
--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
For this test case
static volatile int* const g_0 = (volatile int*)0x1240;
static volatile int* const g_1 = (volatile int*)0x1244;
static volatile int* const g_2 = (volatile int*)0x1248;
static volatile int* const g_3 = (volatile int*)0x124C;

int fun (void)
{
  return *g_0 + *g_1 + *g_2 + *g_3;
}
I'm getting the following with -O2 with AMS (without sched1):
        mov.w   .L2,r1
        mov.l   @r1,r0
        mov.l   @(4,r1),r3
        mov.l   @(8,r1),r2
        mov.l   @(12,r1),r1
        add     r3,r0
        add     r2,r0
        rts
        add     r1,r0
        .align 1
.L2:
        .short  4672
And with sched1:
        mov.w   .L2,r1
        mov.w   .L3,r2
        mov.l   @r1+,r0
        mov.l   @r1,r1
        mov.l   @r2,r2
        add     r1,r0
        mov.w   .L4,r1
        add     r2,r0
        mov.l   @r1,r1
        rts
        add     r1,r0
        .align 1
.L2:
        .short  4672
.L3:
        .short  4680
.L4:
        .short  4684
This is one of the cases where sched1 makes things worse.
^ permalink raw reply [flat|nested] 7+ messages in thread