* [Bug optimization/13724] New: Bad code generated for unsigned int -> long long multiplication
@ 2004-01-17 23:45 bonzini at gnu dot org
From: bonzini at gnu dot org @ 2004-01-17 23:45 UTC (permalink / raw)
  To: gcc-bugs

The following function, which is meant to compute the unsigned
remainder of division by 3, produces really awful code on x86
(gcc 3.3.1) at any optimization level:

int f(unsigned int x)
{
  return ((x * 0x2AAAAAAABULL) >> 33) & 3;
}

	...
	movl	$0xAAAAAAAB, %eax
	movl	8(%ebp), %ebx
	mull	%ebx
	xorl	%esi, %esi			(1)
	leal	(%edx,%ebx,2), %ecx
	movl	%esi, %ebx
	imull	$0xAAAAAAAB, %ebx, %ebx		(2)
	leal	(%ebx,%ecx), %edx
	movl	%edx, %eax
	shrl	$1, %eax
	andl	$3, %eax
	...
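
For readers who want to see what the listing computes, here is a minimal C sketch. It repeats f from the report and adds a hypothetical helper, f_by_parts (not part of the report or of gcc), that performs the same 64-bit multiplication in 32-bit pieces in roughly the way the listing does. The variable x_hi stands in for the high word of the zero-extended argument, which is why the multiplication marked (2) can only ever add zero.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report, unchanged. */
int f(unsigned int x)
{
  return ((x * 0x2AAAAAAABULL) >> 33) & 3;
}

/* Hypothetical helper: the same computation split into 32-bit pieces.
   0x2AAAAAAAB has high word 2 and low word 0xAAAAAAAB.  */
static int f_by_parts(uint32_t x)
{
  uint32_t x_hi = 0;                               /* (1): high word of zero-extended x */
  uint64_t lo_prod = (uint64_t) x * 0xAAAAAAABu;   /* mull */
  uint32_t hi = (uint32_t) (lo_prod >> 32) + 2u * x;   /* leal (%edx,%ebx,2) */

  hi += x_hi * 0xAAAAAAABu;                        /* (2): always adds zero */
  return (hi >> 1) & 3;                            /* shrl $1, then andl $3 */
}
```

Only bits 33 and 34 of the product are kept, so it does not matter that the full product can exceed 64 bits for large x.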

The insns I marked with (1) and (2) are, respectively, the following
after RTL generation:

(insn 11 10 12 (nil) (parallel [
            (set (reg:DI 61)
                (zero_extend:DI (reg/v:SI 59)))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 17 16 18 (nil) (parallel [
            (set (reg:SI 66)
                (mult:SI (subreg:SI (reg:DI 61) 4)
                    (const_int -1431655765 [0xaaaaaaab])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

CSE does not recognize that (subreg:SI (reg:DI 61) 4), the high word of
reg 61, must be zero, since reg 61 is set from (zero_extend:DI (reg/v:SI 59)).
I've made a quick patch for 3.2.1:

2004-01-17  Paolo Bonzini  <bonzini@gnu.org>

	* gcc/cse.c (fold_rtx): Simplify a SUBREG to zero if it
	is the high part of the source, and the source is a REG
	equivalent to a ZERO_EXTEND.

The important part of the patch is:

,-----
+  if (GET_CODE (folded_arg0) == REG
+      && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (folded_arg0))
+      && !subreg_lowpart_p (x))
+    {
+      struct table_elt *elt;
+
+      /* We can use HASH here since we know that canon_hash won't be
+         called.  */
+      elt = lookup (folded_arg0,
+		    HASH (folded_arg0, GET_MODE (folded_arg0)),
+		    GET_MODE (folded_arg0));
+
+      if (elt)
+        elt = elt->first_same_value;
+
+      /* If this is a SUBREG representing the high part of a REG, check if
+         the register is the result of a zero extension.  We can fold
+         the SUBREG to zero if the zero-extended expression's mode is
+         smaller or as big as the SUBREG's (for example, a subreg:HI of
+         a zero-extended reg:SI could be non-zero).  */
+      for (; elt; elt = elt->next_same_value)
+        {
+          if (GET_CODE (elt->exp) == ZERO_EXTEND
+    	      && GET_MODE_SIZE (GET_MODE (XEXP (elt->exp, 0)))
+    	         <= GET_MODE_SIZE (mode))
+    	    return const0_rtx;
+        }
+    }
`-----
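
The size condition in the comment above can be illustrated with a small toy model in plain C. This is only a sketch under stated assumptions, not gcc code: the names piece and can_fold_to_zero are hypothetical, byte offsets are little-endian as on x86, offset 0 plays the role of subreg_lowpart_p, and offsets are assumed to be multiples of the piece size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a SUBREG: extract `size` bytes starting at byte
   `offset` (little-endian numbering) from a 64-bit register value.  */
static uint64_t piece(uint64_t v, unsigned offset, unsigned size)
{
  uint64_t mask = size >= 8 ? ~0ULL : ((1ULL << (8 * size)) - 1);
  return (v >> (8 * offset)) & mask;
}

/* Hypothetical model of the patch's test: the piece is not the low part,
   and the zero-extended source is no wider than the piece.  Assumes
   `offset` is a multiple of `piece_size`.  */
static int can_fold_to_zero(unsigned src_size, unsigned piece_size,
                            unsigned offset)
{
  return offset != 0 && src_size <= piece_size;
}
```

Under these assumptions, a 4-byte piece at offset 4 of a zero-extended 4-byte value is foldable and is indeed always zero, while a 2-byte piece at offset 2 of the same value is not foldable, matching the subreg:HI example in the comment.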

This patch fixes the problem and bootstraps cleanly (C, C++, Java); please
tell me if this is the right approach, so I can port it to mainline and
submit it properly.  Would it be better to put the transformation in
simplify-rtx.c instead?  A good deal of the SUBREG manipulation that cse.c
does after looking up the operand in the available-expression table does
not look pass-dependent, but that is probably too much for me to do without
some guidance (this is my first foray into patching gcc).

Paolo

-- 
           Summary: Bad code generated for unsigned int -> long long
                    multiplication
           Product: gcc
           Version: 3.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bonzini at gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13724

