public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/99286] New: ivopts don't select the best candidates with -Os
@ 2021-02-26 11:15 gengqi at linux dot alibaba.com
From: gengqi at linux dot alibaba.com @ 2021-02-26 11:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99286

            Bug ID: 99286
           Summary: ivopts don't select the best candidates with -Os
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gengqi at linux dot alibaba.com
  Target Milestone: ---

Created attachment 50261
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50261&action=edit
-c -march=rv32imafdc -mabi=ilp32d -Os ivopt_os.c -fdump-tree-ivopts-details

I have compared the assembly and object files generated by two different
versions of GCC.

One is:
$ /lhome/gengq/riscv64-linux-mastertest/bin/riscv64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/lhome/gengq/riscv64-linux-mastertest/bin/riscv64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/lhome/gengq/riscv64-linux-mastertest/libexec/gcc/riscv64-unknown-linux-gnu/11.0.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with:
/lhome/gengq/riscv-gnu-toolchain-master/riscv-gnu-toolchain/riscv-gcc/configure
--target=riscv64-unknown-linux-gnu
--prefix=/lhome/gengq/riscv64-linux-mastertest
--with-sysroot=/lhome/gengq/riscv64-linux-mastertest/sysroot --with-system-zlib
--enable-shared --enable-tls --enable-languages=c,c++,fortran
--disable-libmudflap --disable-libssp --disable-libquadmath
--disable-libsanitizer --disable-nls --disable-bootstrap --src=.././riscv-gcc
--disable-multilib --with-abi=lp64d --with-arch=rv64gc 'CFLAGS_FOR_TARGET=-O2  
-mcmodel=medlow' 'CXXFLAGS_FOR_TARGET=-O2   -mcmodel=medlow'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20210209 (experimental) (GCC)

The command is:
/lhome/gengq/riscv64-linux-mastertest/bin/riscv64-unknown-linux-gnu-gcc
-march=rv32imafdc -mabi=ilp32d -Os ivopt_os.c -c

The other is:
$ /lhome/gengq/riscv64-linux-810test/bin/riscv32-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/lhome/gengq/riscv64-linux-810test/bin/riscv32-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/lhome/gengq/riscv64-linux-810test/libexec/gcc/riscv32-unknown-linux-gnu/8.1.0/lto-wrapper
Target: riscv32-unknown-linux-gnu
Configured with:
/lhome/gengq/riscv-gnu-toolchain-master/riscv-gnu-toolchain/riscv-gcc/configure
--target=riscv32-unknown-linux-gnu --prefix=/lhome/gengq/riscv64-linux-810test
--with-sysroot=/lhome/gengq/riscv64-linux-810test/sysroot --with-newlib
--without-headers --disable-shared --disable-threads --with-system-zlib
--enable-tls --enable-languages=c --disable-libatomic --disable-libmudflap
--disable-libssp --disable-libquadmath --disable-libgomp --disable-nls
--disable-bootstrap --src=.././riscv-gcc --with-pkgversion= --disable-multilib
--with-abi=ilp32d --with-arch=rv32gc 'CFLAGS_FOR_TARGET=-O2  -mcmodel=medlow'
'CXXFLAGS_FOR_TARGET=-O2  -mcmodel=medlow' CC=gcc CXX=g++
Thread model: single
gcc version 8.1.0 ()

The command is:
/lhome/gengq/riscv64-linux-810test/bin/riscv32-unknown-linux-gnu-gcc
-march=rv32imafdc -mabi=ilp32d -Os ivopt_os.c -fdump-tree-all-details -c

The code generated by GCC 11.0 is worse than the code generated by GCC 8.1.0.
I have done some analysis and I think the difference is due to 'ivopts'.

It seems that GCC 11.0 does a more detailed job in 'ivopts'. For GCC 11.0,
there are two best candidate sets:
One is equivalent to what GCC 8.1.0 used.
The other is the final choice of GCC 11.0, and its cost is very close to that
of the first.
I noticed that the second set includes more invariants and fewer induction
variables (ivs). The implementation prefers to use ivs, and because the
difference in cost is minimal, this preference can sway the final choice.
So, why prefer ivs? Is there a better way to handle this? My guess, from
experience, is that invariants are more atomic and have more potential to be
optimized. But this also means that an invariant may generate more
intermediate variables if it is not optimized. In this case, we chose to use
more invariants and also created more intermediate variables, which ended up
overflowing the registers.
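
For illustration only, here is a minimal C sketch of the two shapes discussed
above. It is not the attached ivopt_os.c; the function names and the loop body
are hypothetical. The first form keeps one induction variable per array, while
the second keeps a single induction variable and treats the array bases as loop
invariants, so each access needs an intermediate address computed from base and
index. The compiler is free to rewrite either form into the other, so this only
shows the trade-off at source level.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the attachment. */

/* More induction variables, fewer invariants:
   pa, pb and pc are each advanced on every iteration. */
int sum_with_ivs(const int *a, const int *b, int *c, size_t n)
{
    int total = 0;
    const int *pa = a;
    const int *pb = b;
    int *pc = c;
    const int *end = a + n;   /* loop-invariant exit bound */

    for (; pa != end; ++pa, ++pb, ++pc) {
        *pc = *pa + *pb;
        total += *pc;
    }
    return total;
}

/* Fewer induction variables, more invariants:
   only i changes; a, b and c stay live as invariant bases, and each
   access forms a temporary address from base + i * sizeof(int). */
int sum_with_invariants(const int *a, const int *b, int *c, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
        total += c[i];
    }
    return total;
}
```

Both functions compute the same result. Which form is smaller with -Os would
depend on the target's addressing modes and on how many registers remain free
inside the loop.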

I'm not sure I've hit the nail on the head with my analysis, and I'd like to
try to find a better solution.
