* Sources required...
@ 2023-10-08  7:50 Jacob Navia
From: Jacob Navia @ 2023-10-08  7:50 UTC (permalink / raw)
  To: gcc

Hi,
I have been looking at the code generated by the RISC-V backend.

Consider this C source code:

void shup1(QfloatAccump x)
{
	QELT newbits,bits;
	int i;
	bits = x->mantissa[9] >> 63;
	x->mantissa[9] <<= 1;
	for( i=8; i>0; i-- ) {
		newbits = x->mantissa[i] >> 63;
		x->mantissa[i] <<= 1;
		x->mantissa[i] |= bits;
		bits = newbits;
	}    
	x->mantissa[0] <<= 1;
	x->mantissa[0] |= bits;
}

This code shifts a 640-bit number (10 words of 64 bits) left by 1 position. The algorithm is simple: save the highest bit of each word, shift the word, and insert the bit saved from the previous (less significant) word at the least significant position.
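
For reference, the same algorithm can be written as a self-contained sketch. The type definitions below are assumptions, since the message does not show how QELT and QfloatAccump are declared; mantissa[9] is treated as the least significant word, as the carry direction in the function above implies.

```c
#include <stdint.h>

/* Assumed definitions: the original message does not show them. */
typedef uint64_t QELT;
typedef struct {
    QELT mantissa[10];   /* mantissa[9] is the least significant word */
} QfloatAccum, *QfloatAccump;

/* Shift the 640-bit mantissa left by one bit.
   The top bit of mantissa[0] is discarded, as in the original. */
void shup1(QfloatAccump x)
{
    QELT carry = 0;                         /* bit carried in from the lower word */
    for (int i = 9; i >= 0; i--) {
        QELT next = x->mantissa[i] >> 63;   /* save the highest bit */
        x->mantissa[i] = (x->mantissa[i] << 1) | carry;
        carry = next;
    }
}
```

Folding the first and last word into the loop with an initial carry of 0 gives the same result as handling them separately.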

When compiled with gcc, the generated code looks extremely weird. Instead of loading a 64-bit number into a register, doing the operation, and then storing the result into memory, gcc does the following:

	1) Loads the 64-bit number byte by byte into 8 different registers, so that each 64-bit register contains only one byte.
	2) ORs the 8 registers together into a 64-bit number.
	3) Does the 64-bit operation.
	4) Splits the result into 8 different registers.
	5) Stores the 8 bytes one by one.
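
Written out in C, the five steps correspond roughly to the sketch below. This only illustrates the pattern described; it is not gcc's actual output, the helper names are hypothetical, and little-endian byte order is assumed.

```c
#include <stdint.h>

/* Steps 1-2: load eight separate bytes and OR them into one 64-bit value. */
uint64_t load64_bytewise(const unsigned char *p)
{
    uint64_t v = 0;
    for (int k = 0; k < 8; k++)
        v |= (uint64_t)p[k] << (8 * k);
    return v;
}

/* Steps 4-5: split the value into eight bytes and store them one by one. */
void store64_bytewise(unsigned char *p, uint64_t v)
{
    for (int k = 0; k < 8; k++)
        p[k] = (unsigned char)(v >> (8 * k));
}

/* Step 3 in the middle: the actual 64-bit operation, here a shift by one. */
void shl1_bytewise(unsigned char *p)
{
    uint64_t v = load64_bytewise(p);
    v <<= 1;
    store64_bytewise(p, v);
}
```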

Obviously, I thought that this was a serious bug in gcc. I was going to write a bug report, but my first reflex was to rewrite the function in reasonable assembly, like this:

	1) Load the 10 words, 64 bits at a time, into 10 different registers.
	2) Do the operations.
	3) Store the results 64 bits at a time.

The results are /catastrophic/. Instead of an increase in performance, my version is several times slower than the code generated by gcc.

Now, my question is:
Where did you get this information? I can’t believe that you arrived at that weird way of doing things by "trial and error". There must be some document that pointed you to the right solution. Could you share that information with the public?

Thanks in advance.

Jacob


sipeed@lpi4a:~/lcc/qlibriscv$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/riscv64-linux-gnu/13/lto-wrapper
Target: riscv64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-4revyos1' --with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-13 --program-prefix=riscv64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --disable-multilib --with-arch=rv64gc --with-abi=lp64d --enable-checking=release --build=riscv64-linux-gnu --host=riscv64-linux-gnu --target=riscv64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=16
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-4revyos1) 

