* [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization
@ 2014-10-06 17:04 andi-gcc at firstfloor dot org
  2014-10-06 17:08 ` [Bug tree-optimization/63467] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-10-06 17:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

            Bug ID: 63467
           Summary: should have asm statement that does not prevent
                    vectorization
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andi-gcc at firstfloor dot org

Currently any inline asm statement in a loop prevents vectorization, for example:

#define N 100
int a[N], b[N], c[N];

int main(void)
{
        int i;
        for (i = 0; i < N; i++) {
                asm("");
                a[i] = b[i] + c[i];
        }
}

Without the asm the loop vectorizes fine.

This is a problem if you want to add markers to the loop body for static
assembler code analysis (for example with IACA,
https://software.intel.com/en-us/articles/intel-architecture-code-analyzer).

There should be some way to tell the compiler that a particular inline asm
statement has no side effects that prevent vectorization or other
loop transformations.

Perhaps an asm const?



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
@ 2014-10-06 17:08 ` pinskia at gcc dot gnu.org
  2014-10-06 17:10 ` andi-gcc at firstfloor dot org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2014-10-06 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Try asm volatile ("":::); instead.  An asm without any ":::" is treated as
clobbering memory.



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
  2014-10-06 17:08 ` [Bug tree-optimization/63467] " pinskia at gcc dot gnu.org
@ 2014-10-06 17:10 ` andi-gcc at firstfloor dot org
  2014-10-06 17:26 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-10-06 17:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

--- Comment #2 from Andi Kleen <andi-gcc at firstfloor dot org> ---
It's the same with asm("" :::);

At least the vectorizer bombs out on any asm.



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
  2014-10-06 17:08 ` [Bug tree-optimization/63467] " pinskia at gcc dot gnu.org
  2014-10-06 17:10 ` andi-gcc at firstfloor dot org
@ 2014-10-06 17:26 ` pinskia at gcc dot gnu.org
  2014-10-06 17:31 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2014-10-06 17:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
OK, doing this works:
                asm("":"+r"(t)::);


But it looks like it should not vectorize, because the number of times that
asm executes has changed.



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
                   ` (2 preceding siblings ...)
  2014-10-06 17:26 ` pinskia at gcc dot gnu.org
@ 2014-10-06 17:31 ` jakub at gcc dot gnu.org
  2014-10-06 17:33 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-06 17:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
How would that work though?  How exactly would you vectorize inline-asm?
Duplicate it VF (vectorization factor) times, or something else?
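
To make the question concrete, here is a hand-written sketch of what
"duplicate it VF times" could look like, assuming a vectorization factor of 4.
The names add_scalar, add_blocked, check and the VF macro are illustrative
only, and the inner loop merely stands in for one vector add; this is not
output from GCC or a description of what the vectorizer does.

```c
#include <assert.h>

#define N 100
#define VF 4  /* assumed vectorization factor (hypothetical) */

int a[N], b[N], c[N];

/* Scalar loop from the report: the asm runs once per element. */
void add_scalar(void)
{
        for (int i = 0; i < N; i++) {
                __asm__("");  /* same as asm("") in GNU C modes */
                a[i] = b[i] + c[i];
        }
}

/* Hand-blocked stand-in for a vectorized loop. */
void add_blocked(void)
{
        assert(N % VF == 0);  /* no scalar epilogue in this sketch */
        for (int i = 0; i < N; i += VF) {
                /* the asm duplicated VF times, once per original iteration */
                __asm__(""); __asm__(""); __asm__(""); __asm__("");
                for (int j = 0; j < VF; j++)  /* stands in for one vector add */
                        a[i + j] = b[i + j] + c[i + j];
        }
}

/* Fill the inputs, run one variant, and check every sum. */
void check(void (*variant)(void))
{
        for (int i = 0; i < N; i++) {
                b[i] = i;
                c[i] = 2 * i;
        }
        variant();
        for (int i = 0; i < N; i++)
                assert(a[i] == 3 * i);
}
```

Calling check(add_scalar) and check(add_blocked) exercises both variants. In
the blocked form the asm still executes N times in total, but the copies are
grouped ahead of each block instead of being interleaved with the element
operations, which is one reason the right semantics are not obvious.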



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
                   ` (3 preceding siblings ...)
  2014-10-06 17:31 ` jakub at gcc dot gnu.org
@ 2014-10-06 17:33 ` pinskia at gcc dot gnu.org
  2014-10-06 17:46 ` andi-gcc at firstfloor dot org
  2014-10-07  8:11 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2014-10-06 17:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> Ok doing this works:
>                 asm("":"+r"(t)::);
> 
> 
> But it looks like it should not vectorize due to the number of iterations
> happening for that asm has changed.

OK, actually, if the asm result is used outside the loop, vectorization does
not happen; if it is not used, vectorization happens, so there is no wrong code.

You need some output on the inline asm for the loop to vectorize, like this:
{int t; asm ("":"+r"(t)::); }

Otherwise, if the asm is volatile, it does not work, because a volatile asm
must execute as many times as the loop iterates.
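
A minimal sketch of the two marker forms discussed so far, placed in the loop
from the report. The function names and the check helper are made up for
illustration, and t is initialized here (unlike in the snippet above) so that
the "+r" read is well defined. Whether either loop vectorizes depends on the
GCC version and options, so treat this as something to try, for example with
-O3 -fopt-info-vec, rather than as a guaranteed result.

```c
#include <assert.h>

#define N 100

int a[N], b[N], c[N];

/* Marker with no operands: reported in this thread to block the vectorizer. */
void add_plain_marker(void)
{
        for (int i = 0; i < N; i++) {
                __asm__("");  /* same as asm("") in GNU C modes */
                a[i] = b[i] + c[i];
        }
}

/* Marker with a dummy read-write register operand. */
void add_output_marker(void)
{
        for (int i = 0; i < N; i++) {
                int t = 0;  /* dummy value, only there to give the asm an output */
                __asm__("" : "+r"(t));
                a[i] = b[i] + c[i];
        }
}

/* Fill the inputs, run one variant, and check every sum. */
void check(void (*variant)(void))
{
        for (int i = 0; i < N; i++) {
                b[i] = i;
                c[i] = 2 * i;
        }
        variant();
        for (int i = 0; i < N; i++)
                assert(a[i] == 3 * i);
}
```

Both variants compute the same sums. Note that an asm with outputs that is not
volatile may be removed entirely when its outputs are unused, so the marker in
add_output_marker is not guaranteed to survive into the generated assembly;
inspecting the output of gcc -S is the only reliable check.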



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
                   ` (4 preceding siblings ...)
  2014-10-06 17:33 ` pinskia at gcc dot gnu.org
@ 2014-10-06 17:46 ` andi-gcc at firstfloor dot org
  2014-10-07  8:11 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-10-06 17:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

--- Comment #6 from Andi Kleen <andi-gcc at firstfloor dot org> ---
For the marker case it's enough if it just stays in the same position in the
basic block and gets duplicated if the BB does too.

Those are somewhat special semantics, which is why I think it would need some
way to annotate them (asm const?).

OK, maybe Andrew's trick works, but it seems fragile. Would it work for other
loop transformations (like graphite) too?



* [Bug tree-optimization/63467] should have asm statement that does not prevent vectorization
  2014-10-06 17:04 [Bug tree-optimization/63467] New: should have asm statement that does not prevent vectorization andi-gcc at firstfloor dot org
                   ` (5 preceding siblings ...)
  2014-10-06 17:46 ` andi-gcc at firstfloor dot org
@ 2014-10-07  8:11 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-10-07  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63467

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Why not use a label?

#define N 100
int a[N], b[N], c[N];

int main(void)
{
  static void *x __attribute__((used)) = &&bar;
  int i;
  for (i = 0; i < N; i++) {
bar:
      a[i] = b[i] + c[i];
  }
}

will get you

.L2:
        movdqa  b(%rax), %xmm0
        addq    $16, %rax
        paddd   c-16(%rax), %xmm0
        movaps  %xmm0, a-16(%rax)
        cmpq    $400, %rax
        jne     .L2

...

        .type   x.1751, @object
        .size   x.1751, 8
x.1751:
        .quad   .L2

(OK, the label isn't called 'bar' anymore for some dubious reason.)  Maybe
there is a fancier way to mark the label as used than taking its address
(a "used" attribute on the label itself is ignored).

The code_label ("bar") survives until the very end, but it seems that asmout
transforms local labels to the .L<d> form unconditionally.


