public inbox for gcc-bugs@sourceware.org
* [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog
@ 2023-06-27 17:54 amonakov at gcc dot gnu.org
  2023-06-27 17:59 ` [Bug target/110438] " amonakov at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-06-27 17:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

            Bug ID: 110438
           Summary: generating all-ones zmm needs dep-breaking pxor before
                    ternlog
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*

VPTERNLOG is never a dependency-breaking instruction on existing x86
implementations, so generating a vector of all-ones via a bare ternlog can
stall waiting on the destination register. GCC should emit a
dependency-breaking PXOR first; otherwise it will be the
false-dependency-on-popcnt-lzcnt debacle all over again.

#include <immintrin.h>

__m512i g(void)
{
    return (__m512i){ 0 } - 1;
}

g:
        # waits until previous computation
        # of zmm0 has completed
        vpternlogd      zmm0, zmm0, zmm0, 0xFF
        ret
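
To illustrate why the dependency on the destination is a false one, here is a
minimal scalar sketch of one 32-bit lane of VPTERNLOGD. The helper name
ternlog32 is hypothetical and only for illustration; it is not part of GCC or
of immintrin.h. The 8-bit immediate acts as a truth table indexed by the three
input bits, so with 0xFF every entry is 1 and the result cannot depend on any
input value.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of VPTERNLOGD.
   For each bit position, the bits of (a, b, c) form a 3-bit index
   into the 8-bit truth table imm8; a plays the role of the
   destination operand, b and c the two sources. */
static uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t result = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned idx = (((a >> bit) & 1u) << 2)
                     | (((b >> bit) & 1u) << 1)
                     |  ((c >> bit) & 1u);
        result |= (uint32_t)((imm8 >> idx) & 1u) << bit;
    }
    return result;
}
```

With imm8 = 0xFF, ternlog32 returns 0xFFFFFFFF for any a, b and c, which is
why the old contents of zmm0 are irrelevant to the result even though the
instruction still reads the register. By contrast, with all three inputs equal
and imm8 = 0x55 the model computes a bitwise NOT, where the input is genuinely
needed.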


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
@ 2023-06-27 17:59 ` amonakov at gcc dot gnu.org
  2023-06-28  0:35 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-06-27 17:59 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
We might want to omit PXOR when optimizing for size.


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
  2023-06-27 17:59 ` [Bug target/110438] " amonakov at gcc dot gnu.org
@ 2023-06-28  0:35 ` crazylht at gmail dot com
  2023-07-04 19:54 ` amonakov at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: crazylht at gmail dot com @ 2023-06-28  0:35 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Alexander Monakov from comment #1)
> We might want to omit PXOR when optimizing for size.

indeed.


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
  2023-06-27 17:59 ` [Bug target/110438] " amonakov at gcc dot gnu.org
  2023-06-28  0:35 ` crazylht at gmail dot com
@ 2023-07-04 19:54 ` amonakov at gcc dot gnu.org
  2023-07-12  7:51 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-07-04 19:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Patch available:
https://inbox.sourceware.org/gcc-patches/8f73371d732237ed54ede44b7bd88624@ispras.ru/T/#u


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-07-04 19:54 ` amonakov at gcc dot gnu.org
@ 2023-07-12  7:51 ` cvs-commit at gcc dot gnu.org
  2023-07-12  7:52 ` crazylht at gmail dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-12  7:51 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:13c556d6ae84be3ee2bc245a56eafa58221de86a

commit r14-2447-g13c556d6ae84be3ee2bc245a56eafa58221de86a
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu Jun 29 14:25:28 2023 +0800

    Break false dependence for vpternlog by inserting vpxor or setting constraint of input operand to '0'

    False dependency happens when destination is only updated by
    pternlog. There is no false dependency when destination is also used
    in source. So either a pxor should be inserted, or input operand
    should be set with constraint '0'.

    gcc/ChangeLog:

            PR target/110438
            PR target/110202
            * config/i386/predicates.md
            (int_float_vector_all_ones_operand): New predicate.
            * config/i386/sse.md (*vmov<mode>_constm1_pternlog_false_dep): New
            define_insn.
            (*<avx512>_cvtmask2<ssemodesuffix><mode>_pternlog_false_dep):
            Ditto.
            (*<avx512>_cvtmask2<ssemodesuffix><mode>_pternlog_false_dep):
            Ditto.
            (*<avx512>_cvtmask2<ssemodesuffix><mode>): Adjust to
            define_insn_and_split to avoid false dependence.
            (*<avx512>_cvtmask2<ssemodesuffix><mode>): Ditto.
            (<mask_codefor>one_cmpl<mode>2<mask_name>): Adjust constraint
            of operands 1 to '0' to avoid false dependence.
            (*andnot<mode>3): Ditto.
            (iornot<mode>3): Ditto.
            (*<nlogic><mode>3): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr110438.c: New test.
            * gcc.target/i386/pr100711-6.c: Adjust testcase.


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-07-12  7:51 ` cvs-commit at gcc dot gnu.org
@ 2023-07-12  7:52 ` crazylht at gmail dot com
  2023-07-18  3:33 ` cvs-commit at gcc dot gnu.org
  2023-11-30 10:55 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: crazylht at gmail dot com @ 2023-07-12  7:52 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
Should be fixed in GCC14.


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-07-12  7:52 ` crazylht at gmail dot com
@ 2023-07-18  3:33 ` cvs-commit at gcc dot gnu.org
  2023-11-30 10:55 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-18  3:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:c3f1768b21e9d994c4f090405e863feb06a54002

commit r14-2596-gc3f1768b21e9d994c4f090405e863feb06a54002
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jul 17 12:50:17 2023 +0800

    Remove # from <mask_codefor>one_cmpl<mode>2<mask_name> assemble output.

    optimize_insn_for_speed () in assemble output is not aligned with
    the splitter condition, and it causes an ICE when building SPEC2017
    blender_r.

    libpng/pngread.c: In function ‘png_read_image’:
    libpng/pngread.c:786:1: internal compiler error: in final_scan_insn_1, at final.cc:2813
      786 | }
          | ^
    0x73ac3d final_scan_insn_1
            ../../gcc/final.cc:2813
    0xb3420b final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
            ../../gcc/final.cc:2887
    0xb344c4 final_1
            ../../gcc/final.cc:1979
    0xb34f64 rest_of_handle_final
            ../../gcc/final.cc:4240
    0xb34f64 execute
            ../../gcc/final.cc:4318

    gcc/ChangeLog:

            PR target/110438
            * config/i386/sse.md (<mask_codefor>one_cmpl<mode>2<mask_name>):
            Remove # from assemble output.


* [Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
  2023-06-27 17:54 [Bug target/110438] New: generating all-ones zmm needs dep-breaking pxor before ternlog amonakov at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-07-18  3:33 ` cvs-commit at gcc dot gnu.org
@ 2023-11-30 10:55 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: liuhongt at gcc dot gnu.org @ 2023-11-30 10:55 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

liuhongt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from liuhongt at gcc dot gnu.org ---
.

