[Bug c/28417] New: suboptimal 'division by constant' optimization

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug c/28417]  New: suboptimal 'division by constant' optimization
@ 2006-07-18  8:34 vda dot linux at googlemail dot com
  2006-07-18  8:40 ` [Bug c/28417] " vda dot linux at googlemail dot com
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-18  8:34 UTC (permalink / raw)
  To: gcc-bugs

It looks like expmed.c::choose_multiplier(), which is responsible for finding
parameters needed for replacing division by constant with mul+shift, sometimes
fails to find optimal parameters.

One example: A/1577682821 can be calculated by executing A*365384439 >>
(27+32), for any 32-bit unsigned A. However, gcc-4.1.1 and all previous
versions generate much more elaborate code.

The below program demonstrates tis and also two more similar examples. It also
checks that mul+shift works for any unsigned A, by testing all possibe values
of A.

#include <stdint.h>
#include <stdio.h>

unsigned v;
void f9188_mul365384439_shift27(unsigned A) { v = A/(unsigned)1577682821; }
void f9188_mul365384439_shift27_prime(unsigned A) { v =
(((uint64_t)A)*(unsigned)365384439) >> (27+32); }
void f8399_mul2283243215_shift29(unsigned A) { v = A/(unsigned)1009898111; }
void f8399_mul2283243215_shift29_prime(unsigned A) { v =
(((uint64_t)A)*(unsigned)2283243215) >> (29+32); }
void f8267_mul2482476753_shift30(unsigned A) { v = A/(unsigned)1857695551; }
void f8267_mul2482476753_shift30_prime(unsigned A) { v =
(((uint64_t)A)*(unsigned)2482476753) >> (30+32); }

/* Generated code is suboptimal (gcc 4.1.1):
void f9188_mul365384439_shift27(unsigned A) { v = A/1577682821; }
f9188_mul365384439_shift27:
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        movl    16(%esp), %ebx
        movl    $1551183727, %ecx
        movl    %ebx, %eax
        mull    %ecx
        subl    %edx, %ebx
        shrl    %ebx
        leal    (%ebx,%edx), %eax
        shrl    $30, %eax
        movl    %eax, v
        popl    %ebx
        popl    %esi
        popl    %edi
        ret
void f8399_mul2283243215_shift29(unsigned A) { v = A/1009898111; }
f8399_mul2283243215_shift29:
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        movl    16(%esp), %ebx
        movl    $271519133, %ecx
        movl    %ebx, %eax
        mull    %ecx
        subl    %edx, %ebx
        shrl    %ebx
        leal    (%ebx,%edx), %eax
        shrl    $29, %eax
        movl    %eax, v
        popl    %ebx
        popl    %esi
        popl    %edi
        ret
void f8267_mul2482476753_shift30(unsigned A) { v = A/1857695551; }
f8267_mul2482476753_shift30:
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        movl    16(%esp), %ebx
        movl    $669986209, %ecx
        movl    %ebx, %eax
        mull    %ecx
        subl    %edx, %ebx
        shrl    %ebx
        leal    (%ebx,%edx), %eax
        shrl    $30, %eax
        movl    %eax, v
        popl    %ebx
        popl    %esi
        popl    %edi
        ret

These operations can be done like this:
f9188_mul365384439_shift27_prime:
        movl    $365384439, %eax
        mull    4(%esp)
        movl    %edx, %eax
        shrl    $27, %eax
        movl    %eax, v
        ret
f8399_mul2283243215_shift29_prime:
        movl    $-2011724081, %eax
        mull    4(%esp)
        movl    %edx, %eax
        shrl    $29, %eax
        movl    %eax, v
        ret
f8267_mul2482476753_shift30_prime:
        movl    $-1812490543, %eax
        mull    4(%esp)
        movl    %edx, %eax
        shrl    $30, %eax
        movl    %eax, v
        ret

(Why there is that silly movl %edx, %eax - that's another good question...)

The verification program is below. Was compiled with -Os
(in order to force gcc to use div insn for a/b - we want to do true
division for the purpose of correctness check!)
and didn't report a failure.
*/

int main()
{
        unsigned A=0,u,v;
        while(++A) {
                (A & 0xffff) || printf("A=%u\r",A);

                u = (((uint64_t)A)*(unsigned)365384439) >> (27+32);
                v = A / (unsigned)1577682821;
                if(u!=v) { printf("1: A=%u u=%u v=%u\n", A,u,v); return 0; }

                u = (((uint64_t)A)*(unsigned)2283243215) >> (29+32);
                v = A / (unsigned)1009898111;
                if(u!=v) { printf("2: A=%u u=%u v=%u\n", A,u,v); return 0; }

                u = (((uint64_t)A)*(unsigned)2482476753) >> (30+32);
                v  =A / (unsigned)1857695551;
                if(u!=v) { printf("3: A=%u u=%u v=%u\n", A,u,v); return 0; }
        }
        puts("");
        return 0;
}


-- 
           Summary: suboptimal 'division by constant' optimization
           Product: gcc
           Version: 4.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: vda dot linux at googlemail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug c/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
@ 2006-07-18  8:40 ` vda dot linux at googlemail dot com
  2006-07-18 12:07 ` vda dot linux at googlemail dot com
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-18  8:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from vda dot linux at googlemail dot com  2006-07-18 08:40 -------
I didn't look too close at choose_multiplier(), yet.
If anybody will work upon it in order to make it better,
this is the routine which prints values 'K' and 'bits'
so that
    (A*K)>>bits == A/B
is true for given fixed B and all A's in range [0...max_A].
You may want to add similar code to choose_multiplier().

Mathematical explanation is in the comment.

/*
[below: 'div' is unsigned integer division]
['/' is real division with infinite precision]
[A,B,C... - integers, a,b,c... - reals]

Theory: we want to calculate A div B, fast.
B is const > 2 and is not a power of 2.

In fp: A/B = A*(L/B)/L. (If L is a large power of 2,
say 2^32, then it could be done really fast).
Let k := L/B, K := L div B + 1.
Then A/B = A*k/L.

Then this is true:

A div B <= A * K div L.

For small enough A: A div B = A * K div L.
Lets find first A for which it is not true.

Lets compare k/L and K/L. K/L is larger by a small value d:

d := K/L - k/L = (L div B + 1) / L - L/B/L =
= (L div B * B + B) / (L*B) - L/(L*B) =
= (L div B * B + B - L) / (L*B)

A*K/L is larger than A*k/L by A*d.

A*k/L is closest to 'overflowing into next integer'
when A = N*B-1. The 'deficit' with such A is exactly 1/B.
If A*d >= 1/B, then A*K/L will 'overflow'.

Thus bad_A >= 1/B / d = (1/B) * (L*B)/(L div B * B + B - L) =
= L/(L div B * B + B - L). Then you need to 'walk up' to next
A representable as N*B-1: bad_A2 = (bad_A div B) * B + B-1
This is the first A which will have wrong result
(i.e. for which (A*K div L) = (A div B)+1, not (A div B).

Practical use.

Suppose that A and B are 32-bit unsigned integers.

Unfortunately, there exist only two B values in 3..2^32-1 range
for which _any_ 32-bit unsigned A can be fast divided using L=2^32
(because bad_A=2^32 and any A is less than that):

B=641 K=6700417 d=1/2753074036736 bad_A=4294967296 A=unlimited
B=6700417 K=641 d=1/28778071884562432 bad_A=4294967296 A=unlimited

We need to use larger powers of 2 for L if we need to handle
many more B's.
*/

void fastdiv_params(unsigned B, unsigned max_A)
{
        unsigned K, d_LB, bits, max_bits;
        uint64_t L, KL, mask, bad_A, max_bad_A;

        if (!B || ((B-1)&B) == 0) { // B is a power of 2
//if(0)
                printf("B=%u: power of 2, division by shift\n", B);
                return;
        }
        L = (max_ull/max_unsigned - 1) * B;
        bits = 63;
        mask = 1ULL << 63;
        while( !(L & mask))
                bits--, mask >>= 1;
        L = mask;

        while ( (KL = L/B + 1) > max_unsigned)
                bits--, L >>= 1;
        K = KL;
        // There is not much point in multiplying by even number
        // and then shifting right. Reduce K & L to avoid it:
        while (!(K & 1) && bits > 32)
                bits--, L >>= 1, K = L/B + 1;

        d_LB = ((L/B) * B + B - L);
        bad_A = L / d_LB;
        bad_A = (bad_A / B) * B + B-1;
        if (bad_A <= max_A) {
                printf("B=%u,A<=%u: no suitable fastdiv params found\n", B,
max_A);
                return;
        }
        max_bits = bits;
        max_bad_A = bad_A;
        while(1) {
                uint64_t last_L = L;
                uint64_t last_bad_A = bad_A;
                unsigned last_K = K;
                bits--;
                L >>= 1;
                K = L/B + 1;
                d_LB = ((L/B) * B + B - L);
                bad_A = L / d_LB;
                bad_A = (bad_A / B) * B + B-1;
                if (bad_A <= max_A || bits < 32) {
                        // new params are bad, report last ones and bail out
//if(0)
                        printf("B=%u,A<=%u: L=%llx(bits=%u) K=%u  (bad_A=%llu,
bad_A=%llu at bits=%u)\n",
                                B, max_A, last_L, bits+1, last_K, last_bad_A, 
max_bad_A, max_bits);
                        return;
                }
        }
}


-- 

vda dot linux at googlemail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vda dot linux at googlemail
                   |                            |dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug c/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
  2006-07-18  8:40 ` [Bug c/28417] " vda dot linux at googlemail dot com
@ 2006-07-18 12:07 ` vda dot linux at googlemail dot com
  2006-07-19 16:24 ` [Bug middle-end/28417] " vda dot linux at googlemail dot com
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-18 12:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from vda dot linux at googlemail dot com  2006-07-18 12:06 -------
Pseudo-code to help in reading above code:

void fastdiv_params(unsigned B, unsigned max_A)
{
        L = (max_unsigned_long_long/max_unsigned - 1) * B;
        L = largest_pow2_smaller_or_eq_to(L);
        bits = log2(L);
        while ((K = L/B + 1) > max_unsigned)
                bits--, L >>= 1;
        // There is not much point in multiplying by even 'K'
        // and then shifting right (dividing by 2*n). Reduce K & L to avoid it:
        while (!(K & 1) && bits > 32)
                bits--, L >>= 1, K = L/B + 1;

        d_LB = ((L/B) * B + B - L);
        bad_A = L / d_LB;
        bad_A = (bad_A / B) * B + B-1;
        if (bad_A <= max_A)
                bail_out_no_params_exist;

        // 'K' and 'bits' params are already determined, can return them...
        /* return_params(K,bits); */

        // ...but if max_A is smaller then max_unsigned, we can
        // optionally use smaller 'K' and 'bits'. Find smallest which
        // still work:
        max_bits = bits;
        max_bad_A = bad_A;
        while(1) {
                uint64_t last_L = L;
                unsigned last_K = K;
                bits--;
                L >>= 1;
                K = L/B + 1;
                d_LB = ((L/B) * B + B - L);
                bad_A = L / d_LB;
                bad_A = (bad_A / B) * B + B-1;
                if (bad_A <= max_A || bits < 32)
                        // new params are bad, report last ones and bail out
                        return_params(last_K,bits+1);
        }
}


-- 

vda dot linux at googlemail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |c


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
  2006-07-18  8:40 ` [Bug c/28417] " vda dot linux at googlemail dot com
  2006-07-18 12:07 ` vda dot linux at googlemail dot com
@ 2006-07-19 16:24 ` vda dot linux at googlemail dot com
  2006-07-30 13:35 ` vda dot linux at googlemail dot com
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-19 16:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from vda dot linux at googlemail dot com  2006-07-19 16:24 -------
I instrumented choose_multiplier(). When called like this:
choose_multiplier(d=1577682821,n=32,precision=32),
it returns *post_shift_ptr=31,multiplier=5846151023

whereas optimal one is *post_shift_ptr=27,multiplier=365384439
In the code: "////" comments show variable values 
if d=1577682821.


The instrumented version is below.
Too bad there are no comments how the code works, what is mlow and mhigh.

Note that the good value, if shifted left 4 times,
365384439<<4, is equal to 5846151024, which is mhigh+1!

With mhigh=5846151024 "Reduce to lowest terms" part of code would reduce
mhigh/mlow pair exactly to correct value of 365384439, I think...

Isn't this a reason why choose_multiplier() doesn't consider 365384439
to be a valid multiplier? Maybe mhigh calcucation is off-by-one?

unsigned HOST_WIDE_INT

choose_multiplier (unsigned HOST_WIDE_INT d, int n, int precision,

                   rtx *multiplier_ptr, int *post_shift_ptr, int *lgup_ptr)

{

  HOST_WIDE_INT mhigh_hi, mlow_hi;

  unsigned HOST_WIDE_INT mhigh_lo, mlow_lo;

  int lgup, post_shift;

  int pow, pow2;

  unsigned HOST_WIDE_INT nl, dummy1;

  HOST_WIDE_INT nh, dummy2;



static int divdebug = -1;



if (divdebug == -1)

    divdebug = NULL!=getenv("DIV_DEBUG");



divdebug && printf("choose_multiplier(d=%llu,n=%i,precision=%i)\n",(unsigned
long long)d,n,precision);

//// d=1577682821,n=32,precision=32

  /* lgup = ceil(log2(divisor)); */

  lgup = ceil_log2 (d); //// 31

  gcc_assert (lgup <= n);



  pow = n + lgup; //// 63

  pow2 = n + lgup - precision; //// 31



  /* We could handle this with some effort, but this case is much

     better handled directly with a scc insn, so rely on caller using

     that.  */

  gcc_assert (pow != 2 * HOST_BITS_PER_WIDE_INT);



  /* mlow = 2^(N + lgup)/d */

 if (pow >= HOST_BITS_PER_WIDE_INT) //// yes

    {

      nh = (HOST_WIDE_INT) 1 << (pow - HOST_BITS_PER_WIDE_INT); //// 1<<31

      nl = 0;

    }

  else

    {

      nh = 0;

      nl = (unsigned HOST_WIDE_INT) 1 << pow;

    }

//// 1<<63 / d: mlow=5846151022

  div_and_round_double (TRUNC_DIV_EXPR, 1, nl, nh, d, (HOST_WIDE_INT) 0,

                        &mlow_lo, &mlow_hi, &dummy1, &dummy2);

divdebug && printf("choose_multiplier: mlow=%llu\n",((unsigned long
long)mlow_hi<<32)+mlow_lo);



  /* mhigh = (2^(N + lgup) + 2^N + lgup - precision)/d */

  if (pow2 >= HOST_BITS_PER_WIDE_INT) //// no

    nh |= (HOST_WIDE_INT) 1 << (pow2 - HOST_BITS_PER_WIDE_INT);

  else

    nl |= (unsigned HOST_WIDE_INT) 1 << pow2; //// 1<<31

//// (1<<63+1<<31) / d: mhigh=5846151023

  div_and_round_double (TRUNC_DIV_EXPR, 1, nl, nh, d, (HOST_WIDE_INT) 0,

                        &mhigh_lo, &mhigh_hi, &dummy1, &dummy2);

divdebug && printf("choose_multiplier: mhigh=%llu\n",((unsigned long
long)mhigh_hi<<32)+mhigh_lo);



  gcc_assert (!mhigh_hi || nh - d < d);

  gcc_assert (mhigh_hi <= 1 && mlow_hi <= 1);

  /* Assert that mlow < mhigh.  */

  gcc_assert (mlow_hi < mhigh_hi

              || (mlow_hi == mhigh_hi && mlow_lo < mhigh_lo));



  /* If precision == N, then mlow, mhigh exceed 2^N

     (but they do not exceed 2^(N+1)).  */



  /* Reduce to lowest terms.  */

  for (post_shift = lgup; post_shift > 0; post_shift--)

    {

      unsigned HOST_WIDE_INT ml_lo = (mlow_hi << (HOST_BITS_PER_WIDE_INT - 1))
| (mlow_lo >> 1);

      unsigned HOST_WIDE_INT mh_lo = (mhigh_hi << (HOST_BITS_PER_WIDE_INT - 1))
| (mhigh_lo >> 1);

      if (ml_lo >= mh_lo) //// if mlow/2 == mhigh/2 - bail out

        break;



      mlow_hi = 0;

      mlow_lo = ml_lo;

      mhigh_hi = 0;

      mhigh_lo = mh_lo;

    }



  *post_shift_ptr = post_shift;

  *lgup_ptr = lgup;

  if (n < HOST_BITS_PER_WIDE_INT)

    {

      unsigned HOST_WIDE_INT mask = ((unsigned HOST_WIDE_INT) 1 << n) - 1;

      *multiplier_ptr = GEN_INT (mhigh_lo & mask);

divdebug && printf("choose_multiplier:
*post_shift_ptr=%i,*lgup_ptr=%i,multiplier=%llu\n",*post_shift_ptr,*lgup_ptr,(unsigned
long long)(mhigh_lo & mask) + ((unsigned long long)(mhigh_lo >= mask) << 32));

      return mhigh_lo >= mask;

    }

  else

    {

      *multiplier_ptr = GEN_INT (mhigh_lo);

divdebug && printf("choose_multiplier:
*post_shift_ptr=%i,*lgup_ptr=%i,multiplier=%llu\n",*post_shift_ptr,*lgup_ptr,(unsigned
long long)(mhigh_lo) + ((unsigned long long)(mhigh_hi) << 32));

      return mhigh_hi;

    }

}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (2 preceding siblings ...)
  2006-07-19 16:24 ` [Bug middle-end/28417] " vda dot linux at googlemail dot com
@ 2006-07-30 13:35 ` vda dot linux at googlemail dot com
  2006-07-30 13:37 ` vda dot linux at googlemail dot com
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-30 13:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from vda dot linux at googlemail dot com  2006-07-30 13:35 -------
Created an attachment (id=11971)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11971&action=view)
Use alternative algorithm if it gives better results

New algorithm lives in separate function gcc_fastdiv_params() which is called
from choose_multiplier(). Then we run old algorithm. Then we compare results
and if new algorithm gives different one (new result is always better or same
than old), we use it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (3 preceding siblings ...)
  2006-07-30 13:35 ` vda dot linux at googlemail dot com
@ 2006-07-30 13:37 ` vda dot linux at googlemail dot com
  2006-07-30 13:38 ` vda dot linux at googlemail dot com
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-30 13:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from vda dot linux at googlemail dot com  2006-07-30 13:37 -------
Created an attachment (id=11972)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11972&action=view)
Test program

Compile with -O2 -S to see the assembly.
Compile with -Os and run to check the correctness.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (4 preceding siblings ...)
  2006-07-30 13:37 ` vda dot linux at googlemail dot com
@ 2006-07-30 13:38 ` vda dot linux at googlemail dot com
  2006-07-30 13:43 ` vda dot linux at googlemail dot com
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-30 13:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from vda dot linux at googlemail dot com  2006-07-30 13:38 -------
Patched versus stock gcc-4.1.1 on this code

unsigned v;
void f9188_mul365384439_shift27(unsigned A) { v = A/(unsigned)1577682821; }

is:

# diff -u suboptimal-411-O2.s suboptimal-411-O2-fixed.s
--- suboptimal-411-O2.s 2006-07-29 19:34:09.000000000 +0200
+++ suboptimal-411-O2-fixed.s   2006-07-29 19:37:14.000000000 +0200
@@ -19,21 +19,10 @@
 f9188_mul365384439_shift27:
        pushl   %ebp
        movl    %esp, %ebp
-       pushl   %edi
-       pushl   %esi
-       pushl   %ebx
-       movl    8(%ebp), %ebx
-       movl    $1551183727, %ecx
-       movl    %ebx, %eax
-       mull    %ecx
-       subl    %edx, %ebx
-       shrl    %ebx
-       leal    (%ebx,%edx), %eax
-       shrl    $30, %eax
-       movl    %eax, v
-       popl    %ebx
-       popl    %esi
-       popl    %edi
+       movl    $365384439, %eax
+       mull    8(%ebp)
+       shrl    $27, %edx
+       movl    %edx, v
        leave
        ret
...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (5 preceding siblings ...)
  2006-07-30 13:38 ` vda dot linux at googlemail dot com
@ 2006-07-30 13:43 ` vda dot linux at googlemail dot com
  2006-07-31 14:39 ` amylaar at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-07-30 13:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from vda dot linux at googlemail dot com  2006-07-30 13:43 -------
Created an attachment (id=11973)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11973&action=view)
Random tester

Randomly chooses a/b and compares "gcc patch code" results with code which was
previously verified to be ok. Just my internal tool to check that I did not
screw up while making a patch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (6 preceding siblings ...)
  2006-07-30 13:43 ` vda dot linux at googlemail dot com
@ 2006-07-31 14:39 ` amylaar at gcc dot gnu dot org
  2006-08-02 19:05 ` vda dot linux at googlemail dot com
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2006-07-31 14:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from amylaar at gcc dot gnu dot org  2006-07-31 14:39 -------
(In reply to comment #3)
> Too bad there are no comments how the code works, what is mlow and mhigh.

Tege has written a paper: http://swox.se/~tg/divcnst-pldi94.pdf

Also, you might consider adding him to the CC list of this bug.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (7 preceding siblings ...)
  2006-07-31 14:39 ` amylaar at gcc dot gnu dot org
@ 2006-08-02 19:05 ` vda dot linux at googlemail dot com
  2006-08-02 19:05 ` vda dot linux at googlemail dot com
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-08-02 19:05 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #9 from vda dot linux at googlemail dot com  2006-08-02 19:05 -------
Thanks for the link to .pdf! Who's Tege? And his email address is ...?

Ok. This new version of the patch is working for signed divisions too, and is
giving the results which are always same or better than "old" code.

If it would be deemed acceptable for mainline, it can be simply be integrated
into choose_multiplier().

Since new code needs to know maximum possible dividend instead of (n,prec)
tuple, and it can generate better code if dividend is known to be small enough
(example: (n & 0xffff) / 10 will need only a mul, not mul+shift), it will make
sense to propagate possible dividend range into choose_multiplier() intead of
(n,prec). TODO later...

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (8 preceding siblings ...)
  2006-08-02 19:05 ` vda dot linux at googlemail dot com
@ 2006-08-02 19:05 ` vda dot linux at googlemail dot com
  2006-08-02 19:34 ` amylaar at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-08-02 19:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from vda dot linux at googlemail dot com  2006-08-02 19:05 -------
Created an attachment (id=12000)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12000&action=view)
Alternative algorithm v. 2


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (9 preceding siblings ...)
  2006-08-02 19:05 ` vda dot linux at googlemail dot com
@ 2006-08-02 19:34 ` amylaar at gcc dot gnu dot org
  2006-08-03 17:06 ` vda dot linux at googlemail dot com
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2006-08-02 19:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from amylaar at gcc dot gnu dot org  2006-08-02 19:34 -------
(In reply to comment #9)
> Thanks for the link to .pdf! Who's Tege? And his email address is ...?

Torbjoern Granlund; homepage:  http://swox.se/~tg/ .  From MAINTAINERS:
*synthetic multiply     Torbjorn Granlund       tege@swox.com


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (10 preceding siblings ...)
  2006-08-02 19:34 ` amylaar at gcc dot gnu dot org
@ 2006-08-03 17:06 ` vda dot linux at googlemail dot com
  2006-08-04 11:57 ` tege at swox dot com
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-08-03 17:06 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from vda dot linux at googlemail dot com  2006-08-03 17:06 -------
Created an attachment (id=12005)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12005&action=view)
Alternative algorithm v. 3

Formatting fixes mostly


-- 

vda dot linux at googlemail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #11971|0                           |1
        is obsolete|                            |
  Attachment #12000|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (11 preceding siblings ...)
  2006-08-03 17:06 ` vda dot linux at googlemail dot com
@ 2006-08-04 11:57 ` tege at swox dot com
  2006-08-04 18:14 ` vda dot linux at googlemail dot com
  2009-06-06 15:08 ` aldot at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: tege at swox dot com @ 2006-08-04 11:57 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #13 from tege at swox dot com  2006-08-04 11:57 -------
Subject: Re:  suboptimal 'division by constant' optimization

I looked briefly into the suggested change, without trying to
assess its correctness.  But since these sort of transformations
are of a very mathematical nature, I trust the author of the
suggested changes has proven the algorithms?

I am aware of that Peter Montgomery's and my algorithms do not
generate optimal sequences for v/C where C is close to the
largest possible value of v.  We deemed it to have very little
practical use to optimize these cases.

If it is decided that these cases are indeed important enough to
require optimization, the choose_multiplier function should
course be amended, of course.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (12 preceding siblings ...)
  2006-08-04 11:57 ` tege at swox dot com
@ 2006-08-04 18:14 ` vda dot linux at googlemail dot com
  2009-06-06 15:08 ` aldot at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: vda dot linux at googlemail dot com @ 2006-08-04 18:14 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #14 from vda dot linux at googlemail dot com  2006-08-04 18:14 -------
>But since these sort of transformations are of
>a very mathematical nature, I trust the author of the
>suggested changes has proven the algorithms?

I did not write a paper about it, but I think I have
the proof and it is documented in the patch.

I will happily ament that comment if it has any unclear spots.
I don't want anydoby to have headaches over it :)

> If it is decided that these cases are indeed important enough to
require optimization, the choose_multiplier function should
course be amended, of course.

I think that we can get more gains if we will use new algorithm
_and_ will propagate dividend value range info into
choose_multiplier(). New algorithm (of should I say "modified",
it is not that new) takes max_A parameter already, so it's trivial.

For example, with value range propagation (n & 0xffff) / 10
will generate code with just a mul without shift, because there exist
suitable K for L=2^32 -> we can just take higher half of multiply.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug middle-end/28417] suboptimal 'division by constant' optimization
  2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
                   ` (13 preceding siblings ...)
  2006-08-04 18:14 ` vda dot linux at googlemail dot com
@ 2009-06-06 15:08 ` aldot at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: aldot at gcc dot gnu dot org @ 2009-06-06 15:08 UTC (permalink / raw)
  To: gcc-bugs



-- 

aldot at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingO|                            |16996
              nThis|                            |
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-06-06 15:08:07
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2009-06-06 15:08 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-18  8:34 [Bug c/28417] New: suboptimal 'division by constant' optimization vda dot linux at googlemail dot com
2006-07-18  8:40 ` [Bug c/28417] " vda dot linux at googlemail dot com
2006-07-18 12:07 ` vda dot linux at googlemail dot com
2006-07-19 16:24 ` [Bug middle-end/28417] " vda dot linux at googlemail dot com
2006-07-30 13:35 ` vda dot linux at googlemail dot com
2006-07-30 13:37 ` vda dot linux at googlemail dot com
2006-07-30 13:38 ` vda dot linux at googlemail dot com
2006-07-30 13:43 ` vda dot linux at googlemail dot com
2006-07-31 14:39 ` amylaar at gcc dot gnu dot org
2006-08-02 19:05 ` vda dot linux at googlemail dot com
2006-08-02 19:05 ` vda dot linux at googlemail dot com
2006-08-02 19:34 ` amylaar at gcc dot gnu dot org
2006-08-03 17:06 ` vda dot linux at googlemail dot com
2006-08-04 11:57 ` tege at swox dot com
2006-08-04 18:14 ` vda dot linux at googlemail dot com
2009-06-06 15:08 ` aldot at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).