public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd
@ 2015-09-17 11:37 bisqwit at iki dot fi
  2015-09-17 12:51 ` [Bug target/67609] [5/6 Regression] " rguenth at gcc dot gnu.org
                   ` (31 more replies)
  0 siblings, 32 replies; 33+ messages in thread
From: bisqwit at iki dot fi @ 2015-09-17 11:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

            Bug ID: 67609
           Summary: [Regression] Generates wrong code for SSE2 _mm_load_pd
           Product: gcc
           Version: 5.2.1
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: regression
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this program (needs -msse2 to compile):

    #include <emmintrin.h>
    __m128d reg;
    void set_lower(double b)
    {
        double v[2];
        _mm_store_pd(v, reg);
        v[0] = b;
        reg = _mm_load_pd(v);
    }

On optimization levels -O1 and up, GCC 5.2 incorrectly generates code that
destroys the upper half of reg.
        movapd  %xmm0, %xmm1
        movaps  %xmm1, reg(%rip)

On -O0, the bug does not occur.
If the index expression is changed into an expression whose value is not known
at compile-time, the code will work properly.

GCC 4.9 does this correctly (if with a bit too much labor):

        movdqa  reg(%rip), %xmm1
        movaps  %xmm1, -24(%rsp)
        movsd   %xmm0, -24(%rsp)
        movapd  -24(%rsp), %xmm2
        movaps  %xmm2, reg(%rip)

For comparison, Clang 3.4 and 3.5:
        movlpd  %xmm0, reg(%rip)

For comparison, Clang 3.6:
        movaps  reg(%rip), %xmm1
        movsd   %xmm0, %xmm1
        movaps  %xmm1, reg(%rip)


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
@ 2015-09-17 12:51 ` rguenth at gcc dot gnu.org
  2015-09-17 13:13 ` ubizjak at gmail dot com
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-09-17 12:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
             Status|UNCONFIRMED                 |NEW
      Known to work|                            |4.9.3
           Keywords|                            |ra, wrong-code
   Last reconfirmed|                            |2015-09-17
          Component|regression                  |target
                 CC|                            |uros at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
     Ever confirmed|0                           |1
            Summary|[Regression] Generates      |[5/6 Regression] Generates
                   |wrong code for SSE2         |wrong code for SSE2
                   |_mm_load_pd                 |_mm_load_pd
   Target Milestone|---                         |5.3

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
;; MEM[(__m128d * {ref-all})&v] = reg.0_2;

(insn 6 5 0 (set (reg/v:TI 90 [ v ])
        (mem/c:TI (symbol_ref:DI ("reg") [flags 0x2]  <var_decl 0x7fba2a03eb40
reg>) [0 reg+0 S16 A128]))
/abuild/rguenther/trunk-g/gcc/include/emmintrin.h:161 -1
     (nil))


;; v[0] = b_4(D);

(insn 7 6 0 (set (subreg:DF (reg/v:TI 90 [ v ]) 0)
        (reg/v:DF 88 [ b ])) t.c:7 -1
     (nil))

;; reg = _6;

(insn 8 7 0 (set (mem/c:V2DF (symbol_ref:DI ("reg") [flags 0x2]  <var_decl
0x7fba2a03eb40 reg>) [0 reg+0 S16 A128])
        (subreg:V2DF (reg/v:TI 90 [ v ]) 0)) t.c:8 -1
     (nil))

The subreg set is expected to preserve the upper part.  It later gets
assigned *movdf_internal, so eventually we'd have expected
a lowpart instead.  Not sure about bigger-than wordmode regs and subregs...

GCC 4.9 and before go through the stack extensively (otherwise identical
GIMPLE IL and RTL expansion though).  So it looks like a backend
(pattern constraints?) or RA related bug (the issue appears after IRA/reload).


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
  2015-09-17 12:51 ` [Bug target/67609] [5/6 Regression] " rguenth at gcc dot gnu.org
@ 2015-09-17 13:13 ` ubizjak at gmail dot com
  2015-09-17 13:48 ` rguenth at gcc dot gnu.org
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: ubizjak at gmail dot com @ 2015-09-17 13:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
The doc says:

          When used as an lvalue, 'subreg' is a word-based accessor.
          Storing to a 'subreg' modifies all the words of REG that
          overlap the 'subreg', but it leaves the other words of REG
          alone.

          When storing to a normal 'subreg' that is smaller than a word,
          the other bits of the referenced word are usually left in an
          undefined state.  This laxity makes it easier to generate
          efficient code for such instructions.  To represent an
          instruction that preserves all the bits outside of those in
          the 'subreg', use 'strict_low_part' or 'zero_extract' around
          the 'subreg'.

However, we expand assignment to v[0] with:

;; v[0] = b_4(D);

(insn 7 6 0 (set (subreg:DF (reg/v:TI 90 [ v ]) 0)
        (reg/v:DF 88 [ b ])) pr67609.c:8 -1
     (nil))

According to the above explanation, a strict_low_part should be used here.

I think this is middle-end, not a target problem.

From: ebotcazou at gcc dot gnu.org @ 2015-09-17 13:42 UTC
  To: gcc-bugs
Subject: [Bug rtl-optimization/66790] Invalid uninitialized register handling in REE

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66790

--- Comment #22 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> A native English speaker would say that this could be written better.   I
> would suggest the term "on at least one path" and this is what a MAY problem
> is generally defined as.

Thanks.  Here's a proposed amended wording:

Index: df-problems.c
===================================================================
--- df-problems.c       (revision 227819)
+++ df-problems.c       (working copy)
@@ -1309,22 +1309,23 @@ df_lr_verify_transfer_functions (void)


 /*----------------------------------------------------------------------------
-   LIVE AND MUST-INITIALIZED REGISTERS.
+   LIVE AND MAY-INITIALIZED REGISTERS.

    This problem first computes the IN and OUT bitvectors for the
-   must-initialized registers problems, which is a forward problem.
-   It gives the set of registers for which we MUST have an available
-   definition on any path from the entry block to the entry/exit of
-   a basic block.  Sets generate a definition, while clobbers kill
+   may-initialized registers problems, which is a forward problem.
+   It gives the set of registers for which we MAY have an available
+   definition, i.e. for which there is an available definition on
+   at least one path from the entry block to the entry/exit of a
+   basic block.  Sets generate a definition, while clobbers kill
    a definition.

    In and out bitvectors are built for each basic block and are indexed by
    regnum (see df.h for details).  In and out bitvectors in struct
-   df_live_bb_info actually refers to the must-initialized problem;
+   df_live_bb_info actually refers to the may-initialized problem;

    Then, the in and out sets for the LIVE problem itself are computed.
    These are the logical AND of the IN and OUT sets from the LR problem
-   and the must-initialized problem.
+   and the may-initialized problem.
 ----------------------------------------------------------------------------*/

 /* Private data used to verify the solution for this problem.  */
@@ -1531,7 +1532,7 @@ df_live_confluence_n (edge e)
 }


-/* Transfer function for the forwards must-initialized problem.  */
+/* Transfer function for the forwards may-initialized problem.  */

 static bool
 df_live_transfer_function (int bb_index)
@@ -1555,7 +1556,7 @@ df_live_transfer_function (int bb_index)
 }


-/* And the LR info with the must-initialized registers, to produce the LIVE info.  */
+/* And the LR info with the may-initialized registers, to produce the LIVE info.  */

 static void
 df_live_finalize (bitmap all_blocks)


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
  2015-09-17 12:51 ` [Bug target/67609] [5/6 Regression] " rguenth at gcc dot gnu.org
  2015-09-17 13:13 ` ubizjak at gmail dot com
@ 2015-09-17 13:48 ` rguenth at gcc dot gnu.org
  2015-09-17 18:09 ` [Bug regression/67609] " bisqwit at iki dot fi
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-09-17 13:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #3)
> The doc says:
> 
>           When used as an lvalue, 'subreg' is a word-based accessor.
>           Storing to a 'subreg' modifies all the words of REG that
>           overlap the 'subreg', but it leaves the other words of REG
>           alone.

But UNITS_PER_WORD is 8 so (subreg:DF (TI)) should leave the upper half
of the TImode register unchanged.

>           When storing to a normal 'subreg' that is smaller than a word,
>           the other bits of the referenced word are usually left in an
>           undefined state.  This laxity makes it easier to generate
>           efficient code for such instructions.  To represent an
>           instruction that preserves all the bits outside of those in
>           the 'subreg', use 'strict_low_part' or 'zero_extract' around
>           the 'subreg'.
> 
> However, we expand assignment to v[0] with:
> 
> ;; v[0] = b_4(D);
> 
> (insn 7 6 0 (set (subreg:DF (reg/v:TI 90 [ v ]) 0)
>         (reg/v:DF 88 [ b ])) pr67609.c:8 -1
>      (nil))
> 
> According to the above explanation, a strict_low_part should be used here.
> 
> I think this is middle-end, not a target problem.

From: rguenth at gcc dot gnu.org @ 2015-09-17 13:50 UTC
  To: gcc-bugs
Subject: [Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
It does the "correct" thing for

#include <emmintrin.h>
__m128d reg;
void set_lower(float b)
{
  float v[4];
  _mm_store_pd((double *)v, reg);
  v[0] = b;
  reg = _mm_load_pd((double *)v);
}


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug regression/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (2 preceding siblings ...)
  2015-09-17 13:48 ` rguenth at gcc dot gnu.org
@ 2015-09-17 18:09 ` bisqwit at iki dot fi
  2015-09-17 19:03 ` [Bug rtl-optimization/67609] " ubizjak at gmail dot com
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: bisqwit at iki dot fi @ 2015-09-17 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Joel Yliluoma <bisqwit at iki dot fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |regression

--- Comment #6 from Joel Yliluoma <bisqwit at iki dot fi> ---
And also for _mm_load_ps in a similar situation. I did manage to get some error
to occur with floats too, but I have yet to isolate the problem.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (3 preceding siblings ...)
  2015-09-17 18:09 ` [Bug regression/67609] " bisqwit at iki dot fi
@ 2015-09-17 19:03 ` ubizjak at gmail dot com
  2015-10-16 12:44 ` vmakarov at gcc dot gnu.org
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: ubizjak at gmail dot com @ 2015-09-17 19:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #4)
> (In reply to Uroš Bizjak from comment #3)
> > The doc says:
> > 
> >           When used as an lvalue, 'subreg' is a word-based accessor.
> >           Storing to a 'subreg' modifies all the words of REG that
> >           overlap the 'subreg', but it leaves the other words of REG
> >           alone.
> 
> But UNITS_PER_WORD is 8 so (subreg:DF (TI)) should leave the upper half
> of the TImode register unchanged.

Indeed, and -m32 creates correct code. So it is the register allocator that
fails.

Reconfirmed as rtl-optimization problem.
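
The word-based rule discussed above can be modelled in plain C.  This is only
an illustrative sketch, not GCC code: struct reg128, subreg_store_df and
read_word are hypothetical names, and the model assumes an 8-byte double to
match the UNITS_PER_WORD value of 8 cited for x86_64.

```c
#include <stddef.h>
#include <string.h>

#define UNITS_PER_WORD 8  /* value cited in the thread for x86_64 */

_Static_assert(sizeof(double) == UNITS_PER_WORD,
               "the model assumes a double fills exactly one word");

/* Hypothetical stand-in for a 16-byte (TImode) register: two words. */
struct reg128 {
    unsigned char bytes[2 * UNITS_PER_WORD];
};

/* Model of (set (subreg:DF (reg:TI r) byte_off) value).  The store covers
   exactly one word, so only that word changes and the other is left alone. */
void subreg_store_df(struct reg128 *r, size_t byte_off, double value)
{
    memcpy(r->bytes + byte_off, &value, sizeof value);
}

/* Read word number `word` (0 = low, 1 = high) back as a double. */
double read_word(const struct reg128 *r, size_t word)
{
    double d;
    memcpy(&d, r->bytes + word * UNITS_PER_WORD, sizeof d);
    return d;
}
```

Under this model a DF store at byte offset 0 changes word 0 only, which is the
behaviour the expanded insn 7 is expected to have.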

From: jakub at gcc dot gnu.org @ 2015-09-17 19:08 UTC
  To: gcc-bugs
Subject: [Bug tree-optimization/59124] [4.9/5/6 Regression] Wrong warnings "array subscript is above array bounds"

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

--- Comment #26 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to baoshan from comment #25)
> > Why wrapping is well defined for unsigned types so adding 4294967295 is the
> > same as subtracting by 1.
>
> What is wrapping? and where it is defined? I don't know this part and I like
> to learn it.
> Thanks.

Just read the C or C++ standards?
E.g. C99, 6.2.5/9:
... "A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer
type is reduced modulo the number that is one greater than the largest value
that can be represented by the resulting type." ...
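
The modulo rule quoted above can be checked with a few lines of C.  This is a
minimal sketch using uint32_t so that the width is fixed at 32 bits; the
function name add_all_ones is made up for illustration.

```c
#include <stdint.h>

/* Adding 2^32 - 1 to a 32-bit unsigned value is reduced modulo 2^32,
   so the result equals x - 1 (also computed modulo 2^32). */
uint32_t add_all_ones(uint32_t x)
{
    return (uint32_t)(x + UINT32_C(4294967295));
}
```

For example, add_all_ones(5) yields 4, and add_all_ones(0) wraps around to
UINT32_MAX.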


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (4 preceding siblings ...)
  2015-09-17 19:03 ` [Bug rtl-optimization/67609] " ubizjak at gmail dot com
@ 2015-10-16 12:44 ` vmakarov at gcc dot gnu.org
  2015-10-16 12:50 ` vmakarov at gcc dot gnu.org
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-16 12:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #8 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #7)
> (In reply to Richard Biener from comment #4)
> > (In reply to Uroš Bizjak from comment #3)
> > > The doc says:
> > > 
> > >           When used as an lvalue, 'subreg' is a word-based accessor.
> > >           Storing to a 'subreg' modifies all the words of REG that
> > >           overlap the 'subreg', but it leaves the other words of REG
> > >           alone.
> > 
> > But UNITS_PER_WORD is 8 so (subreg:DF (TI)) should leave the upper half
> > of the TImode register unchanged.
> 
> Indeed, and -m32 creates correct code. So it is the register allocator that
> fails.
> 
> Reconfirmed as rtl-optimization problem.

It is quite an interesting PR, which reveals a long-standing latent bug in GCC.

Basically we have before LRA

    2: r90:DF=xmm0:DF
      REG_DEAD xmm0:DF
    3: NOTE_INSN_FUNCTION_BEG
    6: r89:TI=[`reg']
    7: r89:TI#0=r90:DF
      REG_DEAD r90:DF
    8: [`reg']=r89:TI#0

The LRA and reload passes produce

    6: xmm1:TI=[`reg']
    7: xmm1:DF=xmm0:DF
    8: [`reg']=xmm1:V2DF

They do not do any transformations except rewriting the subreg of the hard
register in insn #7.  After that, insn #7 looks like a full set of xmm1, so
insn #6 is removed as dead by subsequent optimizations.  To avoid removing
insn #6, we need to keep the subreg until the final pass:

    7: xmm1:TI#0=xmm0:DF

Why do LRA and reload remove subregs of hard registers? That is because some
subsequent optimizations can handle them.

For the last two days I've been struggling to find a solution that involves
only LRA (partially removing subregs of hard regs), so far without success.

In any case, even if I find such a solution in LRA, it will need extensive
testing on other targets and will probably be ready next week at best.
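
Why insn #6 becomes dead can be sketched with a toy backward-liveness pass
over the three insns.  Everything here is a simplified assumption made for
illustration (struct insn, mark_dead and insn6_is_dead are hypothetical names);
it is not how GCC represents RTL or implements dead code elimination.

```c
#include <stdbool.h>
#include <stddef.h>

enum { XMM0, XMM1, NREGS };

/* Hypothetical, simplified instruction: at most one register written and
   one register read.  This is not GCC's RTL representation. */
struct insn {
    int  id;           /* insn number as shown in the dump */
    int  def;          /* register written, or -1 */
    bool partial_def;  /* true: the write keeps the other bits of def */
    int  use;          /* register read, or -1 */
    bool side_effect;  /* true for a store to memory; never removable */
};

/* Backward liveness over straight-line code.  dead[i] is set when insn i
   has no side effect and writes a register nobody needs afterwards. */
static void mark_dead(const struct insn *code, size_t n, bool *dead)
{
    bool live[NREGS] = { false };

    for (size_t i = n; i-- > 0; ) {
        const struct insn *in = &code[i];
        bool needed = in->side_effect || (in->def >= 0 && live[in->def]);

        dead[i] = !needed;
        if (!needed)
            continue;
        /* A full write ends the live range of the old value; a partial
           write still depends on it, so the register stays live. */
        if (in->def >= 0 && !in->partial_def)
            live[in->def] = false;
        if (in->use >= 0)
            live[in->use] = true;
    }
}

/* Returns true when insn 6 (the load of reg into xmm1) is found dead.
   keep_subreg selects whether insn 7 is modelled as a partial write. */
bool insn6_is_dead(bool keep_subreg)
{
    const struct insn code[] = {
        { 6, XMM1, false,       -1,   false },  /* xmm1 = [reg]           */
        { 7, XMM1, keep_subreg, XMM0, false },  /* (low part of) xmm1 = xmm0 */
        { 8, -1,   false,       XMM1, true  },  /* [reg] = xmm1           */
    };
    bool dead[3];

    mark_dead(code, 3, dead);
    return dead[0];
}
```

With insn #7 modelled as a full write (subreg removed), insn6_is_dead(false)
reports the load as dead; with the subreg kept, insn6_is_dead(true) keeps it.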

From: rguenth at gcc dot gnu.org @ 2015-10-16 12:44 UTC
  To: gcc-bugs
Subject: [Bug middle-end/66311] [5 Regression] Problems with some integer(16) values

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66311

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P4                          |P2
          Component|fortran                     |middle-end
            Summary|[5/6 Regression] Problems   |[5 Regression] Problems
                   |with some integer(16)       |with some integer(16)
                   |values                      |values

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
Still waiting for backport.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (5 preceding siblings ...)
  2015-10-16 12:44 ` vmakarov at gcc dot gnu.org
@ 2015-10-16 12:50 ` vmakarov at gcc dot gnu.org
  2015-10-20 16:26 ` vmakarov at gcc dot gnu.org
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-16 12:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #9 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #8)
> 
> Why do LRA and reload remove subregs of hard registers? That is because some
> subsequent optimizations can handle them.
> 

Sorry, it should be *can not handle*.

Basically keeping subregisters crashes other passes on GCC bootstrap.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (6 preceding siblings ...)
  2015-10-16 12:50 ` vmakarov at gcc dot gnu.org
@ 2015-10-20 16:26 ` vmakarov at gcc dot gnu.org
  2015-10-20 16:32 ` vmakarov at gcc dot gnu.org
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-20 16:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #10 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Author: vmakarov
Date: Tue Oct 20 16:26:05 2015
New Revision: 229087

URL: https://gcc.gnu.org/viewcvs?rev=229087&root=gcc&view=rev
Log:
2015-10-20  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimization/67609
        * lra-spills.c (lra_final_code_change): Don't remove all
        sub-registers.

2015-10-20  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimization/67609
        * gcc.target/i386/pr67609.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr67609.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-spills.c
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (7 preceding siblings ...)
  2015-10-20 16:26 ` vmakarov at gcc dot gnu.org
@ 2015-10-20 16:32 ` vmakarov at gcc dot gnu.org
  2015-10-20 19:12 ` ubizjak at gmail dot com
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-20 16:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #11 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
I've committed the patch into the trunk.  As the patch is not trivial, I'd wait
for a week before committing it into gcc-5-branch to see how it is doing on the
trunk first.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (8 preceding siblings ...)
  2015-10-20 16:32 ` vmakarov at gcc dot gnu.org
@ 2015-10-20 19:12 ` ubizjak at gmail dot com
  2015-10-21  7:52 ` ubizjak at gmail dot com
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: ubizjak at gmail dot com @ 2015-10-20 19:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #12 from Uroš Bizjak <ubizjak at gmail dot com> ---
Unfortunately, the patch doesn't fix similar PR67124 and (dup) PR68011.

From: dominiq at lps dot ens.fr @ 2015-10-20 19:37 UTC
  To: gcc-bugs
Subject: [Bug fortran/52970] OpenMP Scoping Incorrect for Arrays of Parameters

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52970

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #3 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> It looks like a dup of pr59488.

Agreed. Marked as DUPLICATE.

*** This bug has been marked as a duplicate of bug 59488 ***


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (9 preceding siblings ...)
  2015-10-20 19:12 ` ubizjak at gmail dot com
@ 2015-10-21  7:52 ` ubizjak at gmail dot com
  2015-10-21 16:30 ` vmakarov at gcc dot gnu.org
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: ubizjak at gmail dot com @ 2015-10-21  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
The runtime version of the test still fails:

--cut here--
#include <stdlib.h>
#include <emmintrin.h>

__m128d reg = { 2.0, 4.0 };

void
__attribute__((noinline))
set_lower (double b)
{
  double v[2];
  _mm_store_pd(v, reg);
  v[0] = b;
  reg = _mm_load_pd(v);
}

int
main ()
{
  set_lower (6.0);

  if (reg[1] != 4.0)
    abort ();

  return 0;
}
--cut here--

gcc -O2 pr67609.c

$ ./a.out
Aborted

set_lower:
.LFB518:
        movdqa  reg(%rip), %xmm1        # 6     *movti_internal/4
>>      movapd  %xmm0, %xmm1    # 7     *movdf_internal/14
        movaps  %xmm1, reg(%rip)        # 8     *movv2df_internal/3
        ret     # 14    simple_return_internal

The marked insn moves the whole register and clobbers the high word of xmm1.
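
To make the difference between a full-register move and a low-part move
concrete, here is a minimal C model.  It is only a sketch, not compiler code:
xmm_model, full_move, low_move and demo_set_lower are hypothetical names, and
the upper lane of the argument register is assumed to hold unrelated data.

```c
#include <assert.h>
#include <string.h>

/* Toy model of a 128-bit SSE register holding two doubles. */
typedef struct { double lane[2]; } xmm_model;

/* Models a full-register move such as "movapd %xmm0, %xmm1":
   every lane of dst is overwritten. */
void full_move(xmm_model *dst, const xmm_model *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Models a low-part move such as the register form of movsd:
   only lane 0 changes, lane 1 is preserved. */
void low_move(xmm_model *dst, const xmm_model *src)
{
    dst->lane[0] = src->lane[0];
}

/* Replays the test case on the model: reg = {2.0, 4.0}, b = 6.0. */
void demo_set_lower(void)
{
    xmm_model reg = { { 2.0, 4.0 } };
    xmm_model arg = { { 6.0, -1.0 } };  /* upper lane: assumed junk */

    xmm_model wrong = reg;
    full_move(&wrong, &arg);
    assert(wrong.lane[1] != 4.0);       /* upper half clobbered */

    xmm_model right = reg;
    low_move(&right, &arg);
    assert(right.lane[0] == 6.0 && right.lane[1] == 4.0);
}
```

In this model the full move loses the 4.0 in the upper lane, which corresponds
to the abort seen above, while the low-part move keeps it.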
From: "paolo.carlini at oracle dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/67904] g++ crashes and asks for bugreport
Date: Wed, 21 Oct 2015 07:55:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67904

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nickc at redhat dot com,
                   |                            |ramana.radhakrishnan at arm dot co
                   |                            |m, richard.earnshaw at arm dot com
           Severity|critical                    |normal

--- Comment #2 from Paolo Carlini <paolo.carlini at oracle dot com> ---
I can't either, with -m32. Let's add arm maintainers in CC.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (10 preceding siblings ...)
  2015-10-21  7:52 ` ubizjak at gmail dot com
@ 2015-10-21 16:30 ` vmakarov at gcc dot gnu.org
  2015-10-21 17:34 ` ubizjak at gmail dot com
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-21 16:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #14 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #13)
> The runtime version of the test still fails:
> 
> 
> gcc -O2 pr67609.c
> 
> $ ./a.out
> Aborted
> 
> set_lower:
> .LFB518:
>         movdqa  reg(%rip), %xmm1        # 6     *movti_internal/4
> >>      movapd  %xmm0, %xmm1    # 7     *movdf_internal/14
>         movaps  %xmm1, reg(%rip)        # 8     *movv2df_internal/3
>         ret     # 14    simple_return_internal
> 
> Marked insn moves the whole register and clobbers the high word of the xmm1.

Yes, right.  The bug is not fixed yet, although I believe it is no longer an
RA problem.

Before final we have

(insn:TI 7 6 8 2 (set (subreg:DF (reg/v:TI 22 xmm1 [orig:89 v ] [89]) 0)
        (reg/v:DF 21 xmm0 [orig:90 b ] [90])) b3.c:7 126 {*movdf_internal}
     (expr_list:REG_DEAD (reg/v:DF 21 xmm0 [orig:90 b ] [90])
        (nil)))

How can the final pass use movapd for this?  RTL semantics say the high part of
xmm1 should not be changed.

I can only guess that the final pass removes the subreg first, as was done in
LRA earlier.

I see two solutions.  Either prevent the final pass from removing the subreg
first and generate the corresponding insn (which most probably needs some
additions to i386.md), or prevent the use of movapd in *movdf_internal and use
movlpd instead (which, by the way, would also solve other bugs such as 67124).

The first solution is less safe, as it may affect all targets, although it
could be implemented safely: remove the subreg only if there is no insn
definition with a subreg.  But I am not enough of a specialist in writing md
files to be sure (e.g. how to treat an insn as *movdf_internal in all passes
and as an insn with a subreg only in the final pass).
From: "mpolacek at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/68024] Diagnose variadic functions defined without prototypes
Date: Wed, 21 Oct 2015 17:30:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68024

--- Comment #2 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Author: mpolacek
Date: Wed Oct 21 17:30:20 2015
New Revision: 229131

URL: https://gcc.gnu.org/viewcvs?rev=229131&root=gcc&view=rev
Log:
        PR c/68024
        * c-decl.c (start_function): Warn about vararg functions without
        a prototype.

        * gcc.dg/pr68024.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr68024.c
Modified:
    trunk/gcc/c/ChangeLog
    trunk/gcc/c/c-decl.c
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (11 preceding siblings ...)
  2015-10-21 16:30 ` vmakarov at gcc dot gnu.org
@ 2015-10-21 17:34 ` ubizjak at gmail dot com
  2015-10-21 18:16 ` law at redhat dot com
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: ubizjak at gmail dot com @ 2015-10-21 17:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Vladimir Makarov from comment #14)

> The first solution is less safe as it may affect all targets.  Although it
> could be implemented in a safe way: remove the subreg only if there is no
> insn definition with subreg.  But I am not a specialist in writing md files
> to be sure (e.g. how to treat insn as *movdf_internal on all passes and only
> as insn with subreg on the final pass).

I think that the RTL infrastructure should be fixed/enhanced first to allow
proper handling of subregs through all passes in a consistent way. There is no
point in a special workaround applicable to only one target, as the same
problem will also trigger on other targets.

Adding Jeff to CC.
From: "gong_su at hotmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug go/67968] go1: internal compiler error: in write_specific_type_functions, at go/gofrontend/types.cc:1812
Date: Wed, 21 Oct 2015 17:39:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67968

--- Comment #2 from gong_su at hotmail dot com ---
Hi Dominik, the command that failed with go1 internal error has a lot of .go
files specified so I don't know which one actually caused the problem. So
instead I can send you the entire Ethereum build tree so that you should be
able to reproduce the problem by simply unpacking the tar.bz2 file somewhere
and doing a "make geth". I can't attach the build tree here since it's 64MB and
the limit for attachment is 1MB. Let me know if you want me to email you the
build tree. Thanks.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (12 preceding siblings ...)
  2015-10-21 17:34 ` ubizjak at gmail dot com
@ 2015-10-21 18:16 ` law at redhat dot com
  2015-10-21 18:42 ` vmakarov at gcc dot gnu.org
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: law at redhat dot com @ 2015-10-21 18:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #16 from Jeffrey A. Law <law at redhat dot com> ---
reload has traditionally removed subregs of hardregs and passes after reload
have depended on that behaviour.  Doing something similar in lra is obviously
necessary.  In fact, subregs of multi-word hard regs aren't ever supposed to
appear except during allocation & reloading.

I'm not sure why final has another call to cleanup_subreg_operands.  While git
blame blames me, I was just refactoring existing code back in '98.

Does that shed any light on what the right behaviour for LRA ought to be?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (13 preceding siblings ...)
  2015-10-21 18:16 ` law at redhat dot com
@ 2015-10-21 18:42 ` vmakarov at gcc dot gnu.org
  2015-10-22  8:23 ` rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-21 18:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #17 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #16)
> reload has traditionally removed subregs of hardregs and passes after reload
> have depended on that behaviour.  Doing something similar in lra is
> obviously necessary.  In fact, subregs of multi-word hard regs aren't ever
> supposed to appear except during allocation & reloading.
> 
> I'm not sure why final has another call to cleanup_subreg_operands.  While
> git blame blames me, I was just refactoring existing code back in '98.
> 
> Does that shed any light on what the right behaviour for LRA ought to be?

LRA before the patch did the same as reload.  This case shows that some
subregs should still stay after reload; otherwise we get wrong transformations
(such as treating an insn that sets the high part of the register as dead).

Of course, some optimizations cannot deal with subregs of multi-regs.  The
patch avoids keeping such subregs.  I also found that the x86 reg-stack pass
cannot deal with any subregisters of stack fp regs.

In any case, I spent a lot of time on this small patch, which works for at
least 5 tested targets.

I guess what we need more is to make the final pass (at least for x86-64) deal
with the remaining subregs correctly.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (14 preceding siblings ...)
  2015-10-21 18:42 ` vmakarov at gcc dot gnu.org
@ 2015-10-22  8:23 ` rguenth at gcc dot gnu.org
  2015-10-22 12:49 ` law at redhat dot com
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-22  8:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #16)
> reload has traditionally removed subregs of hardregs and passes after reload
> have depended on that behaviour.  Doing something similar in lra is
> obviously necessary.  In fact, subregs of multi-word hard regs aren't ever
> supposed to appear except during allocation & reloading.

But how do you then keep the semantics of preserving the upper part?
Do recognized insns never change after reload, so that the target needs
to make sure insns are "separated" enough that the property is preserved
solely by means of the selected insn (we don't record the chosen alternatives
in the insns, do we?)?

> I'm not sure why final has another call to cleanup_subreg_operands.  While
> git blame blames me, I was just refactoring existing code back in '98.
> 
> Does that shed any light on what the right behaviour for LRA ought to be?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (15 preceding siblings ...)
  2015-10-22  8:23 ` rguenth at gcc dot gnu.org
@ 2015-10-22 12:49 ` law at redhat dot com
  2015-10-22 13:40 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: law at redhat dot com @ 2015-10-22 12:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #19 from Jeffrey A. Law <law at redhat dot com> ---
Preserving the upper part when setting the low part would be expressed via
STRICT_LOW_PART.  At least that's what I'd expect to see.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (16 preceding siblings ...)
  2015-10-22 12:49 ` law at redhat dot com
@ 2015-10-22 13:40 ` rguenth at gcc dot gnu.org
  2015-10-22 19:34 ` law at redhat dot com
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-22 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #19)
> Preserving the upper part when setting the low part would be expressed via
> STRICT_LOW_PART.  At least that's what I'd expect to see.

According to the documentation, this is only needed for smaller-than-word
subregs:

"When used as an lvalue, @code{subreg} is a word-based accessor.
Storing to a @code{subreg} modifies all the words of @var{reg} that
overlap the @code{subreg}, but it leaves the other words of @var{reg}
alone."

for the case in question, while for strict_low_part it says:

"When storing to a normal @code{subreg} that is smaller than a word,
the other bits of the referenced word are usually left in an undefined
state.  This laxity makes it easier to generate efficient code for
such instructions.  To represent an instruction that preserves all the
bits outside of those in the @code{subreg}, use @code{strict_low_part}
or @code{zero_extract} around the @code{subreg}."
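
The word-based rule from the first quote can be modelled in a few lines of C.
This is only an illustrative sketch under stated assumptions (64-bit words, a
128-bit register, 8-byte doubles); subreg_store and demo_subreg_store are
hypothetical names and do not correspond to anything in GCC.

```c
#include <assert.h>
#include <string.h>

enum { WORD_BYTES = 8, REG_BYTES = 16 };  /* assumed sizes */

/* Models a store to a subreg that covers whole words of a multi-word
   register: only the overlapped words change, the others are left alone. */
void subreg_store(unsigned char reg[REG_BYTES], size_t byte_off,
                  const void *value, size_t size)
{
    assert(byte_off % WORD_BYTES == 0 && size % WORD_BYTES == 0);
    assert(byte_off + size <= REG_BYTES);
    memcpy(reg + byte_off, value, size);
}

/* A DFmode store at byte 0 of a 16-byte register, as in insn 7. */
void demo_subreg_store(void)
{
    double init[2] = { 2.0, 4.0 }, b = 6.0, out[2];
    unsigned char reg[REG_BYTES];

    memcpy(reg, init, sizeof init);
    subreg_store(reg, 0, &b, sizeof b);
    memcpy(out, reg, sizeof out);

    assert(out[0] == 6.0);   /* low word replaced */
    assert(out[1] == 4.0);   /* high word untouched */
}
```

Under this reading, the store in the test case must leave the second word
(the 4.0) intact, with no strict_low_part needed.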


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (17 preceding siblings ...)
  2015-10-22 13:40 ` rguenth at gcc dot gnu.org
@ 2015-10-22 19:34 ` law at redhat dot com
  2015-10-22 21:55 ` rth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: law at redhat dot com @ 2015-10-22 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #21 from Jeffrey A. Law <law at redhat dot com> ---
On 10/22/2015 07:40 AM, rguenth at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609
>
> --- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
> (In reply to Jeffrey A. Law from comment #19)
>> Preserving the upper part when setting the low part would be expressed via
>> STRICT_LOW_PART.  At least that's what I'd expect to see.
>
> According to documentation this is only needed for smaller-than-word subregs:
Yea, you're probably right.  Sigh.

So going back to the original problem, for a subreg of a multi-word reg, 
why can't we simplify that down to a suitably sized reg?

Jeff


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (18 preceding siblings ...)
  2015-10-22 19:34 ` law at redhat dot com
@ 2015-10-22 21:55 ` rth at gcc dot gnu.org
  2015-10-22 22:54 ` rth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-22 21:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #22 from Richard Henderson <rth at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #21)
> So going back to the original problem, for a subreg of a multi-word reg, 
> why can't we simplify that down to a suitably sized reg?

Because we're dealing with registers of different sizes.

Assigning to a subreg as the low-part of a multi-word pseudo only
makes sense when talking about general registers, which is the only
place that "word_mode" applies.

When talking about vector registers, which are universally larger
than word-mode, we cannot simply assign to a subreg.

There is a vec_set named pattern that can perform an insertion into
a vector element, like what's being demonstrated in the test source
here.  Ideally that's how we'd have expanded this originally.

We already have ix86_cannot_change_mode_class to avoid the cases
that we knew we couldn't support, e.g. QI and HImode loads/stores.
But perhaps we should prevent all size-changing mode changes for
the vector registers.
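
At the source level, an element insertion can be written directly with the GNU
C generic vector extension.  The sketch below is an assumption-laden
illustration: v2df is a hypothetical typedef with the same shape as __m128d,
and whether a given compiler version actually expands the assignment through
vec_set is not confirmed here.

```c
#include <assert.h>

/* GNU C generic vector type: two doubles in 16 bytes. */
typedef double v2df __attribute__ ((vector_size (16)));

v2df reg = { 2.0, 4.0 };

/* Replace element 0 in place, without a store/load round trip
   through a temporary array. */
void
set_lower (double b)
{
  reg[0] = b;
}

/* Same check as the runtime test from comment #13, using assert. */
void
check_upper_preserved (void)
{
  assert (reg[1] == 4.0);
}
```

Calling set_lower (6.0) followed by check_upper_preserved () mirrors the
runtime test; the upper element has to survive whichever insn is chosen.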


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (19 preceding siblings ...)
  2015-10-22 21:55 ` rth at gcc dot gnu.org
@ 2015-10-22 22:54 ` rth at gcc dot gnu.org
  2015-10-23  1:58 ` vmakarov at gcc dot gnu.org
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-22 22:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #23 from Richard Henderson <rth at gcc dot gnu.org> ---
Created attachment 36563
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36563&action=edit
possible patch

Certainly this fixes the executable test case from #c13.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (20 preceding siblings ...)
  2015-10-22 22:54 ` rth at gcc dot gnu.org
@ 2015-10-23  1:58 ` vmakarov at gcc dot gnu.org
  2015-10-23  5:02 ` vmakarov at gcc dot gnu.org
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-23  1:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #24 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Richard Henderson from comment #23)
> Created attachment 36563 [details]
> possible patch
> 
> Certainly this fixes the executable test case from #c13.

This patch could be a way to generate correct code.  Unfortunately, the code
it generates is inefficient:

        movdqa  reg(%rip), %xmm1
        movaps  %xmm1, -24(%rsp)
        movsd   %xmm0, -24(%rsp)
        movapd  -24(%rsp), %xmm2
        movaps  %xmm2, reg(%rip)

instead of the better code that would be expected for this case:

        movdqa  reg(%rip), %xmm1
        movlpd  %xmm0, %xmm1
        movaps  %xmm1, reg(%rip)

The patch will force any pseudo which is accessed as a subregister of DImode to
be spilled, although I don't know how that will affect performance on the SPEC
benchmarks.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (21 preceding siblings ...)
  2015-10-23  1:58 ` vmakarov at gcc dot gnu.org
@ 2015-10-23  5:02 ` vmakarov at gcc dot gnu.org
  2015-10-23  8:30 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-23  5:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #25 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #24)
> (In reply to Richard Henderson from comment #23)
> > Created attachment 36563 [details]
> > possible patch
> > 
> > Certainly this fixes the executable test case from #c13.
> 

> The patch will force any pseudo which is accessed as a subregister of DImode
> to be spilled.

I probably exaggerated with "any", but the patch will still result in more
loads/stores.  So it would be nice to benchmark it.  I'll try to do this on
Friday.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (22 preceding siblings ...)
  2015-10-23  5:02 ` vmakarov at gcc dot gnu.org
@ 2015-10-23  8:30 ` rguenth at gcc dot gnu.org
  2015-10-23 13:14 ` vmakarov at gcc dot gnu.org
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-23  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #26 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Henderson from comment #22)
> (In reply to Jeffrey A. Law from comment #21)
> > So going back to the original problem, for a subreg of a multi-word reg, 
> > why can't we simplify that down to a suitably sized reg?
> 
> Because we're dealing with registers of different sizes.
> 
> Assigning to a subreg as the low-part of a multi-word pseudo only
> makes sense when talking about general registers, which is the only
> place that "word_mode" applies.

Hmm, I don't see this documented anywhere.  In fact there is no such
thing as a "vector register", there are only vector modes.  And we
are using %xmm for plain SF/DFmode all over the place.

Note that in the particular case the mode we subreg is TImode,
not a vector mode.

> When talking about vector registers, which are universally larger
> than word-mode, we cannot simply assign to a subreg.
> 
> There is a vec_set named pattern that can perform an insertion into
> a vector element, like what's being demonstrated in the test source
> here.  Ideally that's how we'd have expanded this originally.

Indeed, if expand can see we are setting the low part of a vector
then it should try using vec_set.  Auditing of other targets might
be necessary here though.  And of course the i386 backend might
end up choosing movdf for this operation anyway...

> We already have ix86_cannot_change_mode_class to avoid the cases
> that we knew we couldn't support, e.g. QI and HImode loads/stores.
> But perhaps we should prevent all size-changing mode changes for
> the vector registers.

That may be a workaround for x86 but I don't see how this fixes the
issue in general given that targets may have general registers larger
than word_mode (is x32 TARGET_64BIT?).


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (23 preceding siblings ...)
  2015-10-23  8:30 ` rguenth at gcc dot gnu.org
@ 2015-10-23 13:14 ` vmakarov at gcc dot gnu.org
  2015-10-23 19:06 ` rth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-23 13:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #27 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #25)
> So it would be nice to benchmark it.  I'll try to do this on
> Friday.

Practically every SPEC2000 benchmark failed to compile with this patch.  GCC
crashes in the split2 pass with messages like this:

Error: unrecognizable insn:
(insn 381 345 67 3 (set (subreg:V2DF (reg:DF 22 xmm1 [287]) 0)
        (xor:V2DF (subreg:V2DF (reg:DF 22 xmm1 [287]) 0)
            (mem/u/c:V2DF (symbol_ref/u:DI ("*.LC6") [flags 0x2]) [2  S16 A128]))) apsi.f:3324 -1
     (nil))
apsi.f:3336:0: internal compiler error: in extract_insn, at recog.c:2297
0x9b9848 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/rtl-error.c:109
0x9b9879 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/rtl-error.c:117
0x988907 extract_insn(rtx_insn*)
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:2297
0x988991 extract_insn_cached(rtx_insn*)
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:2188
0x7e3f5d cleanup_subreg_operands(rtx_insn*)
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/final.c:3112
0x98638c split_insn
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:2910
0x98ab77 split_all_insns()
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:2964
0x98abd8 rest_of_handle_split_after_reload
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:3902
0x98abd8 execute
        /home/cygnus/vmakarov/build1/trunk3/gcc/gcc/recog.c:3931
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (24 preceding siblings ...)
  2015-10-23 13:14 ` vmakarov at gcc dot gnu.org
@ 2015-10-23 19:06 ` rth at gcc dot gnu.org
  2015-10-26 19:18 ` rth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-23 19:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #28 from Richard Henderson <rth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #26)
> Hmm, I don't see this documented anywhere.  In fact there is no such
> thing as a "vector register", there are only vector modes.  And we
> are using %xmm for plain SF/DFmode all over the place.
> 
> Note that in the particular case the mode we subreg is TImode,
> not a vector mode.

You're right, my language was sloppy.  The problem I describe is going
to be true for any register whose reg_raw_mode is larger than word_mode.

The assumption is that assignment to a word_mode subreg both (1) cannot
affect bits outside the word_mode and (2) can be simplified to a plain
hard register post-reload.

Part deux is impossible when reg_raw_mode is larger than word_mode,
because the subreg-y assignment result is indistinguishable from a
normal word_mode assignment.
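
To make the two assumptions concrete, here is a small portable C model.
It is only a sketch and not GCC code: the type reg128, both function
names, and the junk value used for the clobbered bits are hypothetical
and exist purely for illustration.  The model treats one 128-bit hard
register as two 64-bit words and compares a subreg-style store with the
plain word_mode store it would be simplified to after reload.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: one 128-bit hard register viewed as
   two 64-bit words (word_mode is 64 bits in this model).  */
typedef struct { uint64_t word[2]; } reg128;

/* Assumption (1): a store to the word_mode subreg at offset 0 changes only
   the low word and leaves the high word alone.  */
static reg128 store_low_as_subreg(reg128 r, uint64_t x)
{
    r.word[0] = x;
    return r;
}

/* Assumption (2) replaces the subreg by a plain word_mode hard register.
   On a register wider than word_mode nothing promises that the bits above
   the word survive; the model marks them with an arbitrary junk value.  */
static reg128 store_low_as_plain_reg(reg128 r, uint64_t x)
{
    r.word[0] = x;
    r.word[1] = UINT64_C(0xDEADBEEFDEADBEEF);  /* model only: unspecified bits */
    return r;
}

/* Self-check: both forms agree on the low word, only the first keeps the
   high word.  */
static void check_model(void)
{
    reg128 r = { { 1, 2 } };
    assert(store_low_as_subreg(r, 42).word[0] == 42);
    assert(store_low_as_subreg(r, 42).word[1] == 2);
    assert(store_low_as_plain_reg(r, 42).word[0] == 42);
    assert(store_low_as_plain_reg(r, 42).word[1] != 2);
}
```

Once the subreg has been dropped, the second form is all that remains in
the insn stream, so nothing records that the high word had to be kept.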

> That may be a workaround for x86 but I don't see how this fixes the
> issue in general given that targets may have general registers larger
> than word_mode

It doesn't fix other targets, true.  But I don't see how to do that
without changes to each target.

> (is x32 TARGET_64BIT?).

Yes.



* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (25 preceding siblings ...)
  2015-10-23 19:06 ` rth at gcc dot gnu.org
@ 2015-10-26 19:18 ` rth at gcc dot gnu.org
  2015-10-26 21:48 ` vmakarov at gcc dot gnu.org
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-26 19:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #29 from Richard Henderson <rth at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #27)
> (In reply to Vladimir Makarov from comment #25)
> > So it would be nice to benchmark it.  I'll try to do this on
> > Friday.
> 
> Practically every SPEC2000 benchmark failed to compile with this patch.  GCC
> crashes in split2 pass with messages like this
> 
> Error: unrecognizable insn:
> (insn 381 345 67 3 (set (subreg:V2DF (reg:DF 22 xmm1 [287]) 0)
>         (xor:V2DF (subreg:V2DF (reg:DF 22 xmm1 [287]) 0)
>             (mem/u/c:V2DF (symbol_ref/u:DI ("*.LC6") [flags 0x2]) [2  S16 A128])))

Ho hum.  Sorry, Vlad, if I'd bothered bootstrapping I'd have seen this myself.
Please change != to < in the patch to re-try.  (That is, allow the TO mode to
be wider than the FROM mode in order to support the paradoxical subregs seen
above.)
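
The difference between the two comparisons can be sketched in standalone
C.  This is an illustration under assumptions, not the patch itself: the
function names and the SIZE_* constants are hypothetical, and the operand
order (TO size compared against FROM size) is inferred from the sentence
above.  Each predicate returns true when the mode change must be
rejected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for machine mode sizes, in bytes.  */
enum { SIZE_DI = 8, SIZE_DF = 8, SIZE_TI = 16, SIZE_V2DF = 16 };

/* First attempt ("!="): reject every size-changing subreg, which also
   rejects paradoxical ones such as (subreg:V2DF (reg:DF ...) 0).  */
static bool cannot_change_mode_strict(unsigned from_size, unsigned to_size)
{
    return to_size != from_size;
}

/* Retry ("<"): reject only narrowing subregs, so a TO mode wider than the
   FROM mode is allowed again.  */
static bool cannot_change_mode_narrowing(unsigned from_size, unsigned to_size)
{
    return to_size < from_size;
}

/* Self-check of the cases discussed in this thread.  */
static void check_predicates(void)
{
    /* DF -> V2DF (paradoxical): rejected by "!=", accepted by "<".  */
    assert(cannot_change_mode_strict(SIZE_DF, SIZE_V2DF));
    assert(!cannot_change_mode_narrowing(SIZE_DF, SIZE_V2DF));
    /* TI -> DI (narrowing): rejected by both.  */
    assert(cannot_change_mode_strict(SIZE_TI, SIZE_DI));
    assert(cannot_change_mode_narrowing(SIZE_TI, SIZE_DI));
}
```

With the strict form, the DF to V2DF subreg in the insn quoted above is
refused, which matches the "unrecognizable insn" failure; the relaxed
form lets it through while still refusing the narrowing case.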



* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (26 preceding siblings ...)
  2015-10-26 19:18 ` rth at gcc dot gnu.org
@ 2015-10-26 21:48 ` vmakarov at gcc dot gnu.org
  2015-10-27 10:18 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2015-10-26 21:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #30 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Richard Henderson from comment #29)
> Ho hum.  Sorry, Vlad, if I'd bothered bootstrapping I'd have seen this
> myself.  Please change != to < in the patch to re-try.  (That is, allow
> the TO mode to be wider than the FROM mode in order to support the
> paradoxical subregs seen above.)

Thanks.  I've checked the patch on Haswell x86-64 with SPEC2000 using -O3
-mtune=corei7.  The patch changes the code of 3 out of 12 SPECInt tests and 9
of 14 SPECFP tests.  The code size always increases, which is no surprise, but
the change is insignificant (about 0.001%).  The biggest SPEC rate decrease is
on SPECFP and is only about 0.15%.

So I guess the patch is ok to fix the problem.



* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (27 preceding siblings ...)
  2015-10-26 21:48 ` vmakarov at gcc dot gnu.org
@ 2015-10-27 10:18 ` rguenth at gcc dot gnu.org
  2015-10-27 20:00 ` rth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-27 10:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #31 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Henderson from comment #28)
> (In reply to Richard Biener from comment #26)
> > Hmm, I don't see this documented anywhere.  In fact there is no such
> > thing as a "vector register", there are only vector modes.  And we
> > are using %xmm for plain SF/DFmode all over the place.
> > 
> > Note that in the particular case the mode we subreg is TImode,
> > not a vector mode.
> 
> You're right, my language was sloppy.  The problem I describe is going
> to be true for any register whose reg_raw_mode is larger than word_mode.
> 
> The assumption is that assignment to a word_mode subreg both (1) cannot
> affect bits outside the word_mode and (2) can be simplified to a plain
> hard register post-reload.
> 
> Part deux is impossible when reg_raw_mode is larger than word_mode,
> because the subreg-y assignment result is indistinguishable from a
> normal word_mode assignment.
> 
> > That may be a workaround for x86 but I don't see how this fixes the
> > issue in general given that targets may have general registers larger
> > than word_mode
> 
> It doesn't fix other targets, true.  But I don't see how to do that
> without changes to each target.

Drop (2)?  But that would require touching every target as well, I guess
(and can't be done incrementally).

If we go with your fix, can you please amend the documentation to mention
this problem and say what targets need to do (or avoid doing) so as not to
run into it?

> > (is x32 TARGET_64BIT?).
> 
> Yes.



* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (28 preceding siblings ...)
  2015-10-27 10:18 ` rguenth at gcc dot gnu.org
@ 2015-10-27 20:00 ` rth at gcc dot gnu.org
  2015-10-29 18:37 ` rth at gcc dot gnu.org
  2015-10-29 18:45 ` [Bug rtl-optimization/67609] [5 " rth at gcc dot gnu.org
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-27 20:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #32 from Richard Henderson <rth at gcc dot gnu.org> ---
Author: rth
Date: Tue Oct 27 19:59:41 2015
New Revision: 229458

URL: https://gcc.gnu.org/viewcvs?rev=229458&root=gcc&view=rev
Log:
PR rtl-opt/67609

        * config/i386/i386.c (ix86_cannot_change_mode_class): Disallow
        narrowing subregs on SSE and MMX registers.
        * doc/tm.texi.in (CANNOT_CHANGE_MODE_CLASS): Clarify when subregs that
        appear to be sub-words of multi-register pseudos must be rejected.
        * doc/tm.texi: Regenerate.
testsuite/
        * gcc.target/i386/pr67609-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr67609-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/doc/tm.texi
    trunk/gcc/doc/tm.texi.in
    trunk/gcc/testsuite/ChangeLog



* [Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (29 preceding siblings ...)
  2015-10-27 20:00 ` rth at gcc dot gnu.org
@ 2015-10-29 18:37 ` rth at gcc dot gnu.org
  2015-10-29 18:45 ` [Bug rtl-optimization/67609] [5 " rth at gcc dot gnu.org
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-29 18:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #33 from Richard Henderson <rth at gcc dot gnu.org> ---
Author: rth
Date: Thu Oct 29 18:36:39 2015
New Revision: 229550

URL: https://gcc.gnu.org/viewcvs?rev=229550&root=gcc&view=rev
Log:
Fix target/68124

        PR target/68124
        PR rtl-opt/67609
        * config/i386/i386.c (ix86_cannot_change_mode_class): Tighten
        sse check to the exact conditions of PR 67609.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c



* [Bug rtl-optimization/67609] [5 Regression] Generates wrong code for SSE2 _mm_load_pd
  2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
                   ` (30 preceding siblings ...)
  2015-10-29 18:37 ` rth at gcc dot gnu.org
@ 2015-10-29 18:45 ` rth at gcc dot gnu.org
  31 siblings, 0 replies; 33+ messages in thread
From: rth at gcc dot gnu.org @ 2015-10-29 18:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Richard Henderson <rth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[5/6 Regression] Generates  |[5 Regression] Generates
                   |wrong code for SSE2         |wrong code for SSE2
                   |_mm_load_pd                 |_mm_load_pd

--- Comment #34 from Richard Henderson <rth at gcc dot gnu.org> ---
Fixed for 6; let's wait a bit and see if there's any more fallout
before backporting to 5.



end of thread, other threads:[~2015-10-29 18:45 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-17 11:37 [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd bisqwit at iki dot fi
2015-09-17 12:51 ` [Bug target/67609] [5/6 Regression] " rguenth at gcc dot gnu.org
2015-09-17 13:13 ` ubizjak at gmail dot com
2015-09-17 13:48 ` rguenth at gcc dot gnu.org
2015-09-17 18:09 ` [Bug regression/67609] " bisqwit at iki dot fi
2015-09-17 19:03 ` [Bug rtl-optimization/67609] " ubizjak at gmail dot com
2015-10-16 12:44 ` vmakarov at gcc dot gnu.org
2015-10-16 12:50 ` vmakarov at gcc dot gnu.org
2015-10-20 16:26 ` vmakarov at gcc dot gnu.org
2015-10-20 16:32 ` vmakarov at gcc dot gnu.org
2015-10-20 19:12 ` ubizjak at gmail dot com
2015-10-21  7:52 ` ubizjak at gmail dot com
2015-10-21 16:30 ` vmakarov at gcc dot gnu.org
2015-10-21 17:34 ` ubizjak at gmail dot com
2015-10-21 18:16 ` law at redhat dot com
2015-10-21 18:42 ` vmakarov at gcc dot gnu.org
2015-10-22  8:23 ` rguenth at gcc dot gnu.org
2015-10-22 12:49 ` law at redhat dot com
2015-10-22 13:40 ` rguenth at gcc dot gnu.org
2015-10-22 19:34 ` law at redhat dot com
2015-10-22 21:55 ` rth at gcc dot gnu.org
2015-10-22 22:54 ` rth at gcc dot gnu.org
2015-10-23  1:58 ` vmakarov at gcc dot gnu.org
2015-10-23  5:02 ` vmakarov at gcc dot gnu.org
2015-10-23  8:30 ` rguenth at gcc dot gnu.org
2015-10-23 13:14 ` vmakarov at gcc dot gnu.org
2015-10-23 19:06 ` rth at gcc dot gnu.org
2015-10-26 19:18 ` rth at gcc dot gnu.org
2015-10-26 21:48 ` vmakarov at gcc dot gnu.org
2015-10-27 10:18 ` rguenth at gcc dot gnu.org
2015-10-27 20:00 ` rth at gcc dot gnu.org
2015-10-29 18:37 ` rth at gcc dot gnu.org
2015-10-29 18:45 ` [Bug rtl-optimization/67609] [5 " rth at gcc dot gnu.org
