public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/50088] New: movzbl is generated instead of movl
@ 2011-08-15 13:08 enkovich.gnu at gmail dot com
  2011-08-15 13:14 ` [Bug rtl-optimization/50088] " rguenth at gcc dot gnu.org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-15 13:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

             Bug #: 50088
           Summary: movzbl is generated instead of movl
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: enkovich.gnu@gmail.com


Created attachment 25016
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25016
Reproducer

When a spilled register is going to be used in a subreg expression, a short
load is generated to fill the register.

Example:
    movl  %edx, 0x34(%esp)
    jz 0x1498 <Block 54>
  Block 34:
    movzxb  0x34(%esp), %ecx
    shl %cl, %eax

This is correct but may cause performance problems. I doubt there are situations
where a zero-extended load is better than a natural one.

On Atom processors (and probably some others) such situations cause stalls
because store forwarding does not work for a store/load pair that uses different
access sizes.
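
The rule described above can be sketched as a small C model. This is only a simplified illustration built from the statement in this report (same address and same access size are needed for forwarding); real hardware rules are more detailed, and `struct access` and `forwards_in_model` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memory access: start address and width in bytes. */
struct access {
    uintptr_t addr;
    size_t size;
};

/* Simplified model taken only from the description in this report:
   a load is forwarded from a pending store when both use the same
   address and the same access size.  Not a hardware specification. */
static bool forwards_in_model(struct access store, struct access load)
{
    assert(store.size > 0 && load.size > 0);
    return store.addr == load.addr && store.size == load.size;
}
```

In this model the 4-byte store to 0x34(%esp) followed by a 1-byte load from the same slot does not forward, while a 4-byte load does.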

For example, EEMBC 2.0/huffde shows a ~6% performance improvement on Atom if we
replace such movzbl instructions with movl.

The attached reproducer demonstrates fills performed via movzbl.
Compiler and options used:

Target: x86_64-unknown-linux-gnu
Configured with: ../gcc1/configure --prefix=/export/users/gcc-perf/install
--enable-languages=c,c++,fortran
Thread model: posix
gcc version 4.7.0 20110615 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-O2' '-m32' '-S' '-v' '-mtune=generic' '-march=x86-64'
 /export/users/gcc-perf/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/cc1
-quiet -v -imultilib 32 test_movzbl.c -quiet -dumpbase test_movzbl.c -m32
-mtune=generic -march=x86-64 -auxbase test_movzbl -O2 -version -o test_movzbl.s
GNU C (GCC) version 4.7.0 20110615 (experimental) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.4.3, GMP version 4.3.1, MPFR version 2.4.2,
MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
@ 2011-08-15 13:14 ` rguenth at gcc dot gnu.org
  2011-08-15 13:30 ` enkovich.gnu at gmail dot com
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-08-15 13:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-15 13:07:50 UTC ---
I don't think we know at this point that the data is already properly
zero-extended.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
  2011-08-15 13:14 ` [Bug rtl-optimization/50088] " rguenth at gcc dot gnu.org
@ 2011-08-15 13:30 ` enkovich.gnu at gmail dot com
  2011-08-15 14:56 ` hjl.tools at gmail dot com
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-15 13:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #2 from Ilya Enkovich <enkovich.gnu at gmail dot com> 2011-08-15 13:24:05 UTC ---
Actually we do not need any zero extension here. The zero-extended load appears
only after IRA, if we have to spill/fill the register.

Here is the C code from the reproducer:

      n1 = (n1 + 1) & 15;
      s += arr[i] << n1;

RTL before IRA:

(insn 67 66 68 4 (parallel [
            (set (reg/v:SI 97 [ n1 ])
                (plus:SI (reg/v:SI 97 [ n1 ])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) test_movzbl.c:18 249 {*addsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 68 67 70 4 (parallel [
            (set (reg/v:SI 97 [ n1 ])
                (and:SI (reg/v:SI 97 [ n1 ])
                    (const_int 15 [0xf])))
            (clobber (reg:CC 17 flags))
        ]) test_movzbl.c:18 385 {*andsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 70 68 71 4 (set (reg:SI 262)
        (mem:SI (reg:SI 224 [ ivtmp.52 ]) [2 MEM[base: D.2889_232, offset: 0B]+0 S4 A32])) test_movzbl.c:20 64 {*movsi_internal}
     (nil))

(insn 71 70 72 4 (parallel [
            (set (reg:SI 262)
                (ashift:SI (reg:SI 262)
                    (subreg:QI (reg/v:SI 97 [ n1 ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) test_movzbl.c:20 502 {*ashlsi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_EQUAL (ashift:SI (mem:SI (reg:SI 224 [ ivtmp.52 ]) [2 MEM[base: D.2889_232, offset: 0B]+0 S4 A32])
                (subreg:QI (reg/v:SI 97 [ n1 ]) 0))
            (nil))))

IRA then introduces a fill for the shift instruction and uses a byte load for it:

(insn 155 70 71 4 (set (reg:QI 2 cx)
        (mem/c:QI (reg/f:SI 7 sp) [4 %sfp+-28 S1 A32])) test_movzbl.c:20 66 {*movqi_internal}
     (nil))

(insn 71 155 72 4 (parallel [
            (set (reg:SI 5 di [262])
                (ashift:SI (reg:SI 5 di [262])
                    (reg:QI 2 cx)))
            (clobber (reg:CC 17 flags))
        ]) test_movzbl.c:20 502 {*ashlsi3_1}
     (expr_list:REG_EQUAL (ashift:SI (mem:SI (reg:SI 0 ax [orig:224 ivtmp.52 ] [224]) [2 MEM[base: D.2889_232, offset: 0B]+0 S4 A32])
            (subreg:QI (mem/c:SI (reg/f:SI 7 sp) [4 %sfp+-28 S4 A32]) 0))
        (nil)))

The load for the shift is then emitted as movzbl.
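
To illustrate why no zero extension is needed for the result, here is a small C sketch (not part of the reproducer) that compares a byte fill with a full 32-bit fill of the same slot. It assumes the shift masks its count to 5 bits, as x86 does for 32-bit operands with the count in CL; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the least significant byte inside a 32-bit object
   (0 on little-endian targets such as x86). */
static size_t low_byte_offset(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1 ? 0 : sizeof probe - 1;
}

/* Byte fill: read only the low byte of the slot and zero-extend it,
   as movzbl does. */
static uint32_t fill_byte(const uint32_t *slot)
{
    unsigned char b;
    memcpy(&b, (const unsigned char *)slot + low_byte_offset(), 1);
    return b;
}

/* Full fill: read the whole 32-bit slot, as movl does. */
static uint32_t fill_word(const uint32_t *slot)
{
    return *slot;
}

/* 32-bit left shift with the count masked to 5 bits, which is how x86
   treats the count in CL for 32-bit operands. */
static uint32_t shl32(uint32_t value, uint32_t count)
{
    return (uint32_t)(value << (count & 31u));
}

/* Shift by the count stored in the slot and check that both kinds of
   fill lead to the same result. */
static uint32_t shift_from_slot(uint32_t value, const uint32_t *slot)
{
    uint32_t via_byte = shl32(value, fill_byte(slot));
    uint32_t via_word = shl32(value, fill_word(slot));
    assert(via_byte == via_word);
    return via_word;
}
```

Both fills agree because the shift only consumes the low bits of the count, so the choice between them affects only performance, not the result.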



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
  2011-08-15 13:14 ` [Bug rtl-optimization/50088] " rguenth at gcc dot gnu.org
  2011-08-15 13:30 ` enkovich.gnu at gmail dot com
@ 2011-08-15 14:56 ` hjl.tools at gmail dot com
  2011-08-15 15:32 ` rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-15 14:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-15 14:47:51 UTC ---
It is done on purpose:

  /* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
     partial dependencies.  */
  m_PPRO | m_P4_NOCONA | m_CORE2I7 | m_ATOM | m_GEODE | m_AMD_MULTIPLE | m_GENERIC,

You can remove m_ATOM to see its performance impact with -mtune=atom
on Atom.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (2 preceding siblings ...)
  2011-08-15 14:56 ` hjl.tools at gmail dot com
@ 2011-08-15 15:32 ` rguenth at gcc dot gnu.org
  2011-08-15 16:01 ` hjl.tools at gmail dot com
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-08-15 15:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-15 15:26:19 UTC ---
(In reply to comment #3)
> It is done on purpose:
> 
>   /* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
>      partial dependencies.  */
>   m_PPRO | m_P4_NOCONA | m_CORE2I7 | m_ATOM | m_GEODE | m_AMD_MULTIPLE  |
> m_GENERIC,
> 
> You can remove m_ATOM to see its performance impact with -mtune=atom
> on Atom.

Well, yes, I think the proposal was to spill/load the full SImode instead
which would avoid both the partial dependency and the mismatched load/store
size.  No?



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (3 preceding siblings ...)
  2011-08-15 15:32 ` rguenth at gcc dot gnu.org
@ 2011-08-15 16:01 ` hjl.tools at gmail dot com
  2011-08-15 17:27 ` hjl.tools at gmail dot com
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-15 16:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-15 15:47:39 UTC ---
(In reply to comment #4)
> (In reply to comment #3)
> > It is done on purpose:
> > 
> >   /* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
> >      partial dependencies.  */
> >   m_PPRO | m_P4_NOCONA | m_CORE2I7 | m_ATOM | m_GEODE | m_AMD_MULTIPLE  |
> > m_GENERIC,
> > 
> > You can remove m_ATOM to see its performance impact with -mtune=atom
> > on Atom.
> 
> Well, yes, I think the proposal was to spill/load the full SImode instead
> which would avoid both the partial dependency and the mismatched load/store
> size.  No?

It is for movqi.  We can only safely replace movzbl with movl if
the source is 4-byte aligned.  It should be a new backend option.
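
A minimal sketch of that alignment condition, not the actual patch. It assumes the alignment is known in bits, like the A32 attribute on the spill slot in the RTL dump from comment #2; `struct mem_attrs` and `widened_load_size` are hypothetical names.

```c
#include <assert.h>

/* Size and known alignment of a memory operand, mirroring the
   "S1 A32" style attributes shown in the RTL dumps:
   size in bytes, alignment in bits. */
struct mem_attrs {
    unsigned size_bytes;
    unsigned align_bits;
};

/* Hypothetical helper: return the load size to use for a byte load.
   The load is widened to 4 bytes only when the operand is known to be
   4-byte aligned; otherwise the original size is kept. */
static unsigned widened_load_size(struct mem_attrs m)
{
    assert(m.size_bytes > 0);
    if (m.size_bytes == 1 && m.align_bits >= 32)
        return 4;
    return m.size_bytes;
}
```

With the attributes of insn 155 (S1 A32) this returns 4, while a byte operand with only byte alignment keeps its 1-byte load.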



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (4 preceding siblings ...)
  2011-08-15 16:01 ` hjl.tools at gmail dot com
@ 2011-08-15 17:27 ` hjl.tools at gmail dot com
  2011-08-15 17:35 ` hjl.tools at gmail dot com
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-15 17:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-15 17:19:22 UTC ---
Created attachment 25019
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25019
A patch



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (5 preceding siblings ...)
  2011-08-15 17:27 ` hjl.tools at gmail dot com
@ 2011-08-15 17:35 ` hjl.tools at gmail dot com
  2011-08-16  6:57 ` enkovich.gnu at gmail dot com
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-15 17:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-08-15
     Ever Confirmed|0                           |1

--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-15 17:26:50 UTC ---
(In reply to comment #6)
> Created attachment 25019 [details]
> A patch

misaligned_operand is used instead of aligned_operand because
aligned_operand has

  /* All patterns using aligned_operand on memory operands ends up
     in promoting memory operand to 64bit and thus causing memory mismatch.  */
  if (TARGET_MEMORY_MISMATCH_STALL && !optimize_insn_for_size_p ())
    return false;

which isn't true for *movqi_internal.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (6 preceding siblings ...)
  2011-08-15 17:35 ` hjl.tools at gmail dot com
@ 2011-08-16  6:57 ` enkovich.gnu at gmail dot com
  2011-08-16  7:31 ` enkovich.gnu at gmail dot com
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-16  6:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #8 from Ilya Enkovich <enkovich.gnu at gmail dot com> 2011-08-16 06:55:34 UTC ---
(In reply to comment #4)
> 
> Well, yes, I think the proposal was to spill/load the full SImode instead
> which would avoid both the partial dependency and the mismatched load/store
> size.  No?

Yes, I think we should generate full SImode spill/load.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (7 preceding siblings ...)
  2011-08-16  6:57 ` enkovich.gnu at gmail dot com
@ 2011-08-16  7:31 ` enkovich.gnu at gmail dot com
  2011-08-16 14:46 ` hjl.tools at gmail dot com
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-16  7:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #9 from Ilya Enkovich <enkovich.gnu at gmail dot com> 2011-08-16 07:28:33 UTC ---
(In reply to comment #5)
> 
> It is for movqi.  We can only safely replace movzbl with movl if
> the source is 4-byte aligned.  It should be a new backend option.

That should work. 

A better solution here would be not to generate movqi at all. But it was probably
done intentionally and is profitable for some platforms. In that case
we should choose to generate movl for movqi.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (8 preceding siblings ...)
  2011-08-16  7:31 ` enkovich.gnu at gmail dot com
@ 2011-08-16 14:46 ` hjl.tools at gmail dot com
  2011-08-16 14:47 ` hjl.tools at gmail dot com
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-16 14:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #10 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-16 14:23:39 UTC ---
The real problem is the store-forwarding issue on Atom:

        addl    $1, 4(%esp)     # 67    *addsi_1/2      [length = 5]
        andl    $15, 4(%esp)    # 68    *andsi_1/1      [length = 5]
        movl    (%eax), %edi    # 70    *movsi_internal/1       [length = 2]
        movzbl  4(%esp), %ecx   # 154   *movqi_internal/3       [length = 5]

That is, we write 32 bits and read 8 bits, which performs very
poorly on Atom. We should write and read the same size.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (9 preceding siblings ...)
  2011-08-16 14:46 ` hjl.tools at gmail dot com
@ 2011-08-16 14:47 ` hjl.tools at gmail dot com
  2011-08-16 17:13 ` hjl.tools at gmail dot com
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-16 14:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-16 14:45:27 UTC ---
Can we model shift instructions to take any QI/HI/SI/DI register
as the shift count and make IRA match the size when reading/writing
the shift count?



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (10 preceding siblings ...)
  2011-08-16 14:47 ` hjl.tools at gmail dot com
@ 2011-08-16 17:13 ` hjl.tools at gmail dot com
  2011-08-17  9:12 ` enkovich.gnu at gmail dot com
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-16 17:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25019|0                           |1
        is obsolete|                            |

--- Comment #12 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-16 16:43:55 UTC ---
Created attachment 25025
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25025
A patch to use the same mode for shift count

This is an untested patch to use the same mode for shift count.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (11 preceding siblings ...)
  2011-08-16 17:13 ` hjl.tools at gmail dot com
@ 2011-08-17  9:12 ` enkovich.gnu at gmail dot com
  2011-08-17 13:42 ` hjl.tools at gmail dot com
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-17  9:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #13 from Ilya Enkovich <enkovich.gnu at gmail dot com> 2011-08-17 09:07:20 UTC ---
(In reply to comment #12)
> Created attachment 25025 [details]
> A patch to use the same mode for shift count
> 
> This is an untested patch to use the same mode for shift count.

We should find a solution for the general problem, not for its specific
appearance in the reproducer.

We may have the same issue for any other instruction consuming a byte register,
and it is better to fix the source of the problem (which I suppose is in IRA)
than to introduce a workaround for each such instruction.

BTW I think you should not increase the size of immediate operands in your patch.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (12 preceding siblings ...)
  2011-08-17  9:12 ` enkovich.gnu at gmail dot com
@ 2011-08-17 13:42 ` hjl.tools at gmail dot com
  2011-08-17 14:29 ` enkovich.gnu at gmail dot com
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-17 13:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #14 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-17 13:37:39 UTC ---
(In reply to comment #13)
> (In reply to comment #12)
> > Created attachment 25025 [details]
> > A patch to use the same mode for shift count
> > 
> > This is an untested patch to use the same mode for shift count.
> 
> We should find solution for the general problem. Not for its specific
> appearance in reproducer. 
> 
> We may have the same issue for any other instructions consuming byte register
> and it is better to fix the source of the problem (which is I suppose in IRA)
> and do not introduce workaround for each such instruction.

I think this problem is unique to x86 since some instructions have
different sizes in register operands.  In this example, the shift count
is CL regardless of the source operand size. I am not sure how much RA
can help here. Making register operands in shift instructions
have the same size (32bit or less) may work for most cases.

> BTW I think you should not increase size of immediate operands in your patch.

My patch is totally untested and is probably wrong.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (13 preceding siblings ...)
  2011-08-17 13:42 ` hjl.tools at gmail dot com
@ 2011-08-17 14:29 ` enkovich.gnu at gmail dot com
  2011-08-17 14:43 ` hjl.tools at gmail dot com
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: enkovich.gnu at gmail dot com @ 2011-08-17 14:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #15 from Ilya Enkovich <enkovich.gnu at gmail dot com> 2011-08-17 14:16:27 UTC ---
(In reply to comment #14)
> 
> I think this problem is unique to x86 since some instructions have
> different sizes in register operands.  In this example, shift count
> is CL regardless the source operand size. I am not sure how much RA
> can help here. By making register operands in shift instructions to
> have the same size (32bit or less), it may work for most cases.
> 
We have a problem due to the different sizes of the spill and the load generated by IRA for
the same variable. I'm not sure that patching shift instructions covers all
cases where IRA may do that.



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (14 preceding siblings ...)
  2011-08-17 14:29 ` enkovich.gnu at gmail dot com
@ 2011-08-17 14:43 ` hjl.tools at gmail dot com
  2011-08-17 14:54 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-17 14:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #16 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-17 14:37:35 UTC ---
The testcase has ...

  int n8 = (arr[7] * 9 + 8) & 15;

  for (i = 0; i < len; i+=8)
    {
      n1 = (n1 + 1) & 15;

      s += arr[i] << n1;

The shift count is 32 bits wide, which causes a 32-bit spill. Since shift/rotate
instructions only take an 8-bit register (CL) as the shift count, we load 8 bits
into CL. How do we solve this?



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (15 preceding siblings ...)
  2011-08-17 14:43 ` hjl.tools at gmail dot com
@ 2011-08-17 14:54 ` rguenth at gcc dot gnu.org
  2011-08-17 21:32 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-08-17 14:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

--- Comment #17 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-17 14:42:11 UTC ---
(In reply to comment #16)
> The testcase has ...
> 
>   int n8 = (arr[7] * 9 + 8) & 15;
> 
>   for (i = 0; i < len; i+=8)
>     {
>       n1 = (n1 + 1) & 15;
> 
>       s += arr[i] << n1;
> 
> The shift count is 32bit, which causes 32bit spill. Since shift/rotate
> instructions only take 8bit register (CL) as shift count, we load 8bit
> into CL. How do we solve this?

Well, always load from the spill slot with the mode in which the reg was spilled
if the arch says that is preferred.
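
As a rough sketch of that policy (not GCC code): the fill uses the width of the spill when the target asks for matching sizes, and the width the consumer needs otherwise. The function name and the boolean target preference are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy sketch: choose the width, in bytes, of the load
   that refills a register from its spill slot.
   spill_size         - width used when the register was spilled
   use_size           - width the consuming instruction needs
   prefer_spill_mode  - target preference for matching store/load sizes */
static unsigned choose_fill_size(unsigned spill_size, unsigned use_size,
                                 bool prefer_spill_mode)
{
    assert(use_size > 0 && use_size <= spill_size);
    return prefer_spill_mode ? spill_size : use_size;
}
```

For the case in this bug (4-byte spill, 1-byte use in CL), the sketch picks a 4-byte fill when the preference is set and a 1-byte fill otherwise.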



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (16 preceding siblings ...)
  2011-08-17 14:54 ` rguenth at gcc dot gnu.org
@ 2011-08-17 21:32 ` hjl.tools at gmail dot com
  2011-08-17 22:32 ` hjl.tools at gmail dot com
  2011-08-17 22:49 ` hjl.tools at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-17 21:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25025|0                           |1
        is obsolete|                            |

--- Comment #18 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-17 21:30:50 UTC ---
Created attachment 25040
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25040
A new patch



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (17 preceding siblings ...)
  2011-08-17 21:32 ` hjl.tools at gmail dot com
@ 2011-08-17 22:32 ` hjl.tools at gmail dot com
  2011-08-17 22:49 ` hjl.tools at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-17 22:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25040|0                           |1
        is obsolete|                            |

--- Comment #19 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-17 22:28:55 UTC ---
Created attachment 25042
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25042
An updated patch



* [Bug rtl-optimization/50088] movzbl is generated instead of movl
  2011-08-15 13:08 [Bug rtl-optimization/50088] New: movzbl is generated instead of movl enkovich.gnu at gmail dot com
                   ` (18 preceding siblings ...)
  2011-08-17 22:32 ` hjl.tools at gmail dot com
@ 2011-08-17 22:49 ` hjl.tools at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: hjl.tools at gmail dot com @ 2011-08-17 22:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50088

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25042|0                           |1
        is obsolete|                            |

--- Comment #20 from H.J. Lu <hjl.tools at gmail dot com> 2011-08-17 22:38:40 UTC ---
Created attachment 25043
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25043
An updated patch

