public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/28395] New: Improved division-by-constant code
@ 2006-07-16 15:44 vda dot linux at googlemail dot com
2006-07-16 15:45 ` [Bug c/28395] " vda dot linux at googlemail dot com
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 15:44 UTC (permalink / raw)
To: gcc-bugs
32-bit unsigned division A/B by compile-time constant B can be optimized by
replacing it with multiplication and shift right. For example, division by 10
is done like this: (A*3435973837) >> 35, in i386 asm:
movl $-858993459, %ecx
movl 8(%ebp), %eax
mull %ecx
movl %edx, %esi
shrl $3, %esi
Choosing right constant by which we need to multiply is not trivial: large A
values can give wrong results. I spent a few days researching this. I will
attach the following test programs:
find_fast_div.c: reports fastdiv params for every B in [3..2^32-1]
fast_div_bench.c: measures speed improvement when dividing by B=10.
find_fast_div_random.c: there is fastdiv_params() function which is intended to
go into gcc's optimizer, "constant division" department. It calculates fastdiv
parameters (K,bits) for given B and max_A (max_A is to be supplied by gcc's
value range analyser). main() calls fastdiv_params() with random parameters in
the loop.
I am not familiar with gcc internals. Please tell me what should I do next in
order to integrate it inot gcc.
--
Summary: Improved division-by-constant code
Product: gcc
Version: 4.2.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vda dot linux at googlemail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug c/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
@ 2006-07-16 15:45 ` vda dot linux at googlemail dot com
2006-07-16 15:46 ` vda dot linux at googlemail dot com
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 15:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from vda dot linux at googlemail dot com 2006-07-16 15:45 -------
Created an attachment (id=11898)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11898&action=view)
find_fast_div.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug c/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
2006-07-16 15:45 ` [Bug c/28395] " vda dot linux at googlemail dot com
2006-07-16 15:46 ` vda dot linux at googlemail dot com
@ 2006-07-16 15:46 ` vda dot linux at googlemail dot com
2006-07-16 15:50 ` [Bug middle-end/28395] " pinskia at gcc dot gnu dot org
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 15:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from vda dot linux at googlemail dot com 2006-07-16 15:46 -------
Created an attachment (id=11900)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11900&action=view)
find_fast_div_random.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug c/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
2006-07-16 15:45 ` [Bug c/28395] " vda dot linux at googlemail dot com
@ 2006-07-16 15:46 ` vda dot linux at googlemail dot com
2006-07-16 15:46 ` vda dot linux at googlemail dot com
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 15:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from vda dot linux at googlemail dot com 2006-07-16 15:46 -------
Created an attachment (id=11899)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11899&action=view)
fast_div_bench.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (2 preceding siblings ...)
2006-07-16 15:46 ` vda dot linux at googlemail dot com
@ 2006-07-16 15:50 ` pinskia at gcc dot gnu dot org
2006-07-16 15:52 ` pinskia at gcc dot gnu dot org
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-07-16 15:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2006-07-16 15:50 -------
GCC already does something like this.
For /10, GCC produces:
f:
movl $-858993459, %eax
mull 4(%esp)
shrl $3, %edx
movl %edx, %eax
ret
Maybe I don't understand what you are requesting.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (3 preceding siblings ...)
2006-07-16 15:50 ` [Bug middle-end/28395] " pinskia at gcc dot gnu dot org
@ 2006-07-16 15:52 ` pinskia at gcc dot gnu dot org
2006-07-16 15:54 ` pinskia at gcc dot gnu dot org
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-07-16 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2006-07-16 15:52 -------
In fact we do it also for signed integers (PPC asm this time):
_f:
lis r0,0x6666
srawi r2,r3,31
ori r0,r0,26215
mulhw r3,r3,r0
srawi r3,r3,2
subf r3,r2,r3
blr
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (4 preceding siblings ...)
2006-07-16 15:52 ` pinskia at gcc dot gnu dot org
@ 2006-07-16 15:54 ` pinskia at gcc dot gnu dot org
2006-07-16 16:22 ` vda dot linux at googlemail dot com
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-07-16 15:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from pinskia at gcc dot gnu dot org 2006-07-16 15:54 -------
This has been done in GCC since at least 1994 revision 7598 in the SVN.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (5 preceding siblings ...)
2006-07-16 15:54 ` pinskia at gcc dot gnu dot org
@ 2006-07-16 16:22 ` vda dot linux at googlemail dot com
2006-07-16 16:51 ` steven at gcc dot gnu dot org
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 16:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from vda dot linux at googlemail dot com 2006-07-16 16:22 -------
Oh my.
It looks that use of -Os played a joke on me. gcc 3.4.3 -Os uses a division
instruction, even though it results in slower and _also bigger_ code.
Maybe it makes sense to enable this optimization for -Os too?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (6 preceding siblings ...)
2006-07-16 16:22 ` vda dot linux at googlemail dot com
@ 2006-07-16 16:51 ` steven at gcc dot gnu dot org
2006-07-16 18:47 ` vda dot linux at googlemail dot com
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: steven at gcc dot gnu dot org @ 2006-07-16 16:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from steven at gcc dot gnu dot org 2006-07-16 16:51 -------
No. At -Os, we care about smaller code. Unless that sequence of insns with
muls and shifts is smaller than a div, we should produce the div at -Os. And
as far as I can see, the div will always be smaller.
Not a bug.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |WORKSFORME
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (7 preceding siblings ...)
2006-07-16 16:51 ` steven at gcc dot gnu dot org
@ 2006-07-16 18:47 ` vda dot linux at googlemail dot com
2006-07-16 18:55 ` vda dot linux at googlemail dot com
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 18:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from vda dot linux at googlemail dot com 2006-07-16 18:47 -------
The test program below shows that in this case doing division with div insn
takes more instructions than with mul+shift.
Also mul+shift path has absolutely useless "movl %edx, %eax" insn, shaving that
will make it even smaller. Need to build newer gcc and retest...
# cat t.c
enum { B = 10 };
enum { shift_bits = 35 };
enum { K = (1ULL<<shift_bits)/B + 1 };
unsigned a,b;
void f(unsigned A)
{
asm("#1");
a = A/B;
asm("#2");
b = (((unsigned long long)A) * K) >> shift_bits;
asm("#3");
}
# gcc -Os -fomit-frame-pointer -S t.c
# cat t.s
.file "t.c"
.text
.globl f
.type f, @function
f:
pushl %ebx
#APP
#1
#NO_APP
movl $10, %edx
movl %edx, %ebx
movl 8(%esp), %eax
xorl %edx, %edx
divl %ebx
movl %eax, a
#APP
#2
#NO_APP
movl $-858993459, %eax
mull 8(%esp)
movl %edx, %eax
shrl $3, %eax
movl %eax, b
#APP
#3
#NO_APP
popl %ebx
ret
.size f, .-f
.comm a,4,4
.comm b,4,4
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.4.3"
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (8 preceding siblings ...)
2006-07-16 18:47 ` vda dot linux at googlemail dot com
@ 2006-07-16 18:55 ` vda dot linux at googlemail dot com
2006-07-17 10:32 ` vda dot linux at googlemail dot com
2006-07-17 11:46 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-16 18:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from vda dot linux at googlemail dot com 2006-07-16 18:54 -------
gcc-4.1.1 differs only by insterting one more useless insn:
movl $-858993459, %eax
mull 8(%esp)
movl %edx, %eax
+ xorl %edx, %edx
shrl $3, %eax
movl %eax, b
Huh?
--
vda dot linux at googlemail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vda dot linux at googlemail
| |dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (9 preceding siblings ...)
2006-07-16 18:55 ` vda dot linux at googlemail dot com
@ 2006-07-17 10:32 ` vda dot linux at googlemail dot com
2006-07-17 11:46 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-17 10:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from vda dot linux at googlemail dot com 2006-07-17 10:32 -------
Andrew, I must be extremely dumb today. I looked in the source and for the life
of me, I can't see where that optimization is done. I even generated a
2.6.1->2.6.1 diff where it was added and still don't see it.
But it is surely there, because it works :)
Can you give me a hint where to look?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug middle-end/28395] Improved division-by-constant code
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
` (10 preceding siblings ...)
2006-07-17 10:32 ` vda dot linux at googlemail dot com
@ 2006-07-17 11:46 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-07-17 11:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from pinskia at gcc dot gnu dot org 2006-07-17 11:46 -------
the code is in the function expand_divmod in expmed.c.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2006-07-17 11:46 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-16 15:44 [Bug c/28395] New: Improved division-by-constant code vda dot linux at googlemail dot com
2006-07-16 15:45 ` [Bug c/28395] " vda dot linux at googlemail dot com
2006-07-16 15:46 ` vda dot linux at googlemail dot com
2006-07-16 15:46 ` vda dot linux at googlemail dot com
2006-07-16 15:50 ` [Bug middle-end/28395] " pinskia at gcc dot gnu dot org
2006-07-16 15:52 ` pinskia at gcc dot gnu dot org
2006-07-16 15:54 ` pinskia at gcc dot gnu dot org
2006-07-16 16:22 ` vda dot linux at googlemail dot com
2006-07-16 16:51 ` steven at gcc dot gnu dot org
2006-07-16 18:47 ` vda dot linux at googlemail dot com
2006-07-16 18:55 ` vda dot linux at googlemail dot com
2006-07-17 10:32 ` vda dot linux at googlemail dot com
2006-07-17 11:46 ` pinskia at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).