[Bug tree-optimization/66675] New: Could improve vector bit_field

public inbox for gcc-bugs@sourceware.org

* [Bug tree-optimization/66675] New: Could improve vector bit_field_ref style optimizations.
@ 2015-06-25 21:03 ramana at gcc dot gnu.org
  2015-06-25 21:04 ` [Bug tree-optimization/66675] " ramana at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-06-25 21:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

            Bug ID: 66675
           Summary: Could improve vector bit_field_ref style
                    optimizations.
           Product: gcc
           Version: 5.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

This example 


#include <arm_neon.h>

int main(int argc, char *argv[])
{
    int8x8_t a = {argc, 1, 2, 3, 4, 5, 6, 7};
    int8x8_t b = {0, 1, 2, 3, 4, 5, 6, 7};
    int8x8_t c = vadd_s8(a, b);
    return c[0];
}


or its variant written in GCC vector speak generates pretty terrible code for
AArch64:

main:
        adr     x1, .LC0
        ld1     {v0.8b}, [x1]
        ins     v0.b[0], w0
        adr     x0, .LC2
        ld1     {v1.8b}, [x0]
        add     v0.8b, v0.8b, v1.8b
        umov    w0, v0.b[0]
        sxtb    w0, w0
        ret
        .size   main, .-main


This could well be folded down to a simple function that just returns argc
(strictly, its low byte sign-extended, since lane 0 of b is 0).  While this is
a bit silly to expect in real life, it does make for an interesting example.


regards
Ramana



* [Bug tree-optimization/66675] Could improve vector bit_field_ref style optimizations.
From: ramana at gcc dot gnu.org @ 2015-06-25 21:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |aarch64*-*-*, arm*-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-06-25
             Blocks|                            |47562
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47562
[Bug 47562] [meta-bug] keep track of Neon Intrinsics enhancements



* [Bug tree-optimization/66675] Could improve vector lane folding style operations.
From: ramana at gcc dot gnu.org @ 2015-06-25 21:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

--- Comment #1 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
The GCC vector speak variant is as below.


typedef char v8qi __attribute__ ((vector_size (8)));


int main(int argc, char *argv[])
{
  v8qi a = {argc, 1, 2, 3, 4, 5, 6, 7};
  v8qi b = {0, 1, 2, 3, 4, 5, 6, 7};
  v8qi c = a + b;    
  return c[0];
}

True on both arm and aarch64 - I haven't checked other targets.
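
To make the expected result concrete: lane 0 of b is 0, so c[0] depends only on the low byte of argc. Below is a minimal sketch in GCC vector-extension C, assuming a signed 8-bit element type. The names v8i8, lane0_vector and lane0_scalar are illustrative and do not come from the report.

```c
/* Hypothetical helper names.  signed char is spelled out because plain char
   is unsigned on arm/aarch64 but signed on some other targets. */
typedef signed char v8i8 __attribute__ ((vector_size (8)));

/* The computation from the testcase: build both vectors, add, read lane 0. */
int lane0_vector (int x)
{
  v8i8 a = {(signed char) x, 1, 2, 3, 4, 5, 6, 7};
  v8i8 b = {0, 1, 2, 3, 4, 5, 6, 7};
  v8i8 c = a + b;
  return c[0];
}

/* What a fully folded version could look like: lane 0 of b is 0, so only the
   truncated, sign-extended low byte of x survives. */
int lane0_scalar (int x)
{
  return (signed char) x;
}
```

Both functions should agree for every input; the second is the shape one would hope the optimizers reach from the first.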



* [Bug tree-optimization/66675] Could improve vector lane folding style operations.
From: pinskia at gcc dot gnu.org @ 2015-06-25 22:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Basically VECTOR_CST + VECTOR_CST is not optimized at all.  I bet almost all
operations that act on VECTOR_CST are not optimized, including but not limited
to PLUS, SUB, MULTIPLY, DIVIDE, SHIFT, IOR, XOR, and AND.



* [Bug tree-optimization/66675] Could improve vector lane folding style operations.
From: pinskia at gcc dot gnu.org @ 2015-06-25 22:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> Basically VECTOR_CST + VECTOR_CST is not optimized at all.  I bet almost all
> operations that act on VECTOR_CST are not optimized, including but not
> limited to PLUS, SUB, MULTIPLY, DIVIDE, SHIFT, IOR, XOR, and AND.

Or rather CONSTRUCTOR + VECTOR_CST.  I suspect having a VECTOR_EXPR instead
of a CONSTRUCTOR could help in those cases.



* [Bug tree-optimization/66675] Could improve vector lane folding style operations.
From: pinskia at gcc dot gnu.org @ 2015-06-25 22:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that for this optimization to be useful there needs to be a heuristic to
find out whether folding CONSTRUCTOR + VECTOR_CST will leave only one add or
none, or whether only one element of the whole vector is used.

In other words, it might not be profitable to fold CONSTRUCTOR + VECTOR_CST to a
CONSTRUCTOR with all scalar additions.
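
The profitability concern can be illustrated in source terms. This is only a sketch and not GCC's actual cost model; the function names and constants are made up for illustration. When a single lane is read, scalarizing leaves at most one scalar add. When every lane is consumed, it would trade one vector add for up to eight scalar adds.

```c
typedef signed char v8i8 __attribute__ ((vector_size (8)));

/* Only lane 0 is used: after folding, one scalar add (x + 10) could replace
   the whole vector add, so scalarizing is a clear win. */
int one_lane_used (int x)
{
  v8i8 a = {(signed char) x, 1, 2, 3, 4, 5, 6, 7};
  v8i8 b = {10, 11, 12, 13, 14, 15, 16, 17};
  v8i8 c = a + b;
  return c[0];
}

/* All lanes are used: folding CONSTRUCTOR + VECTOR_CST into a CONSTRUCTOR of
   scalar additions would replace one vector add with eight scalar adds. */
void all_lanes_used (const signed char *p, signed char *out)
{
  v8i8 a = {p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]};
  v8i8 b = {10, 11, 12, 13, 14, 15, 16, 17};
  v8i8 c = a + b;
  for (int i = 0; i < 8; i++)
    out[i] = c[i];
}
```

A heuristic along the lines described above would want to fold the first function but leave the second one as a vector add.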



* [Bug tree-optimization/66675] Could improve vector lane folding style operations.
From: pinskia at gcc dot gnu.org @ 2021-08-20  6:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2015-06-25 00:00:00         |2021-8-19

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Maybe some match patterns dealing with BFRs (BIT_FIELD_REFs) and VECTOR_CSTs are
needed.

Something like:
(for binary_op (...)
 (simplify
  (BFR (binary_op:s VECTOR_CST@0 @1) ...)
  (binary_op (BFR @0 ...) (BFR @1 ...)))
 (simplify
  (BFR (binary_op:s @0 VECTOR_CST@1 ) ...)
  (binary_op (BFR @0 ...) (BFR @1 ...)))
)

(for unary_op (...)
 (simplify
  (BFR (unary_op:s @0) ...)
  (unary_op (BFR @0 ...)))
)

This pushes the BFR as far back as possible and will solve this testcase, but I
am not 100% sure it will solve all cases.

Note that a BFR might extract a subvector and not just a scalar.
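
The identity such patterns would rely on is that, for lane-wise operations, extracting a lane after the operation gives the same value as applying the scalar operation to the extracted lanes. Below is a small source-level check of that identity using GCC vector extensions. It models the transformation in C and is not match.pd code; the helper name and the constants are hypothetical.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Returns 1 if, for every lane, "operate then extract" matches
   "extract then operate" for a few lane-wise operations. */
int lane_identity_holds (const int *p)
{
  v4si a = {p[0], p[1], p[2], p[3]};  /* non-constant operand (a CONSTRUCTOR) */
  v4si k = {1, 2, 3, 4};              /* plays the role of the VECTOR_CST */
  v4si vsum = a + k;
  v4si vxor = a ^ k;
  v4si vneg = -a;

  for (int i = 0; i < 4; i++)
    {
      if (vsum[i] != a[i] + k[i])   /* binary op: PLUS */
        return 0;
      if (vxor[i] != (a[i] ^ k[i])) /* binary op: XOR */
        return 0;
      if (vneg[i] != -a[i])         /* unary op: NEGATE */
        return 0;
    }
  return 1;
}
```

The check covers scalar lane extraction only; the subvector case mentioned above would need the same argument applied to a group of lanes.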
