public inbox for gcc-bugs@sourceware.org
* [Bug target/103774] New: [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: thiago at kde dot org @ 2021-12-20 15:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

            Bug ID: 103774
           Summary: [i386] GCC should swap the arguments to certain
                    functions to generate a single instruction
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

I don't know how widespread this is. Seen in the code generated at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750.

This code:
        __m256i data1 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n));
        __m256i data2 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n) + 1);
        __mmask16 mask1 = _mm256_cmpeq_epu16_mask(data1, mch256);
        __mmask16 mask2 = _mm256_cmpeq_epu16_mask(data2, mch256);
Generates:
        vmovdqu16       (%rdi), %ymm1
        vmovdqu16       32(%rdi), %ymm2
        vpcmpuw $0, %ymm0, %ymm1, %k0
        vpcmpuw $0, %ymm0, %ymm2, %k1

While if you invert the two operands in the cmpeq intrinsics, as in:
        __m256i data1 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n));
        __m256i data2 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n) + 1);
        __mmask16 mask1 = _mm256_cmpeq_epu16_mask(mch256, data1);
        __mmask16 mask2 = _mm256_cmpeq_epu16_mask(mch256, data2);
You get:
        vpcmpuw $0, (%rdi), %ymm0, %k0
        vpcmpuw $0, 32(%rdi), %ymm0, %k1


Godbolt link with full compilable source code:
https://gcc.godbolt.org/z/rKo666MM7

Clang and ICC (the Clang-based one) already do this. MSVC behaves like GCC.
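
The swap is safe for the cmpeq intrinsics because equality is symmetric, so the resulting mask does not depend on operand order. A minimal scalar sketch of that property follows; `Lanes16`, `cmpeq_epu16_mask_model` and `set1_model` are hypothetical stand-ins for illustration, not the real intrinsics, and need no AVX-512 hardware:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a 256-bit vector of sixteen 16-bit lanes.
using Lanes16 = std::array<std::uint16_t, 16>;

// Models _mm256_cmpeq_epu16_mask: bit i of the result is set when
// lane i of a equals lane i of b.
std::uint16_t cmpeq_epu16_mask_model(const Lanes16 &a, const Lanes16 &b)
{
    std::uint16_t mask = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i])
            mask = static_cast<std::uint16_t>(mask | (1u << i));
    }
    return mask;
}

// Models _mm256_set1_epi16: broadcast one value to every lane.
Lanes16 set1_model(std::uint16_t value)
{
    Lanes16 r;
    r.fill(value);
    return r;
}
```

Calling `cmpeq_epu16_mask_model(data, needle)` and `cmpeq_epu16_mask_model(needle, data)` gives the same mask for any inputs, which is why either operand order is a valid encoding of the comparison.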

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: thiago at kde dot org @ 2021-12-20 16:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

Thiago Macieira <thiago at kde dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #1 from Thiago Macieira <thiago at kde dot org> ---
This is a very minor thing because I expect that, at the uop level, the two
code sequences are identical. There are two more macro-instructions to retire
on the front-end, though.

You can lower the priority.

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: pinskia at gcc dot gnu.org @ 2021-12-20 16:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
#include <immintrin.h>

#if defined(__INTEL_COMPILER) || defined(_MSC_VER)
auto __tzcnt_u16(unsigned /*short*/ value)
{
#    ifdef Q_CC_INTEL
    unsigned short res;
    asm("tzcntw %w1, %0" : "=r" (res) : "r" (value));
    return res;
#    else
    return _tzcnt_u32(value | 0xffff0000U);
#    endif
}
#endif

const char16_t *qustrchr1(char16_t *n, char16_t *e, char16_t c) noexcept
{
    __m256i mch256 = _mm256_set1_epi16(c);
    for ( ; n < e; n += 32) {
        __m256i data1 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n));
        __m256i data2 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n) + 1);
        __mmask16 mask1 = _mm256_cmpeq_epu16_mask(data1, mch256);
        __mmask16 mask2 = _mm256_cmpeq_epu16_mask(data2, mch256);
        if (_kortestz_mask16_u8(mask1, mask2))
            continue;

        unsigned idx = _tzcnt_u32(mask1);
        if (mask1 == 0) {
            idx = __tzcnt_u16(mask2);
            n += 16;
        }
        return n + idx;
    }
    return e;
}

const char16_t *qustrchr2(char16_t *n, char16_t *e, char16_t c) noexcept
{
    __m256i mch256 = _mm256_set1_epi16(c);
    for ( ; n < e; n += 32) {
        __m256i data1 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n));
        __m256i data2 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n) + 1);
        __mmask16 mask1 = _mm256_cmpeq_epu16_mask(mch256, data1);
        __mmask16 mask2 = _mm256_cmpeq_epu16_mask(mch256, data2);
        if (_kortestz_mask16_u8(mask1, mask2))
            continue;

        unsigned idx = _tzcnt_u32(mask1);
        if (mask1 == 0) {
            idx = __tzcnt_u16(mask2);
            n += 16;
        }
        return n + idx;
    }
    return e;
}
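
For readers following the test case, the tail of each loop body turns the two 16-bit masks into an element offset inside the 32-element block. The following is only a scalar sketch of that step; `tzcnt16_model` and `first_match_offset` are hypothetical names, not part of the test case:

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit trailing-zero count: returns the index of
// the lowest set bit, or 16 when no bit is set (as tzcnt does).
unsigned tzcnt16_model(std::uint16_t m)
{
    unsigned n = 0;
    while (n < 16 && ((m >> n) & 1u) == 0)
        ++n;
    return n;
}

// Offset of the first matching element within a 32-element block, given the
// compare masks of its two 16-element halves. Returns 32 when neither mask
// has a bit set, which is the case where the loop above takes `continue`.
unsigned first_match_offset(std::uint16_t mask1, std::uint16_t mask2)
{
    if ((mask1 | mask2) == 0)
        return 32;
    if (mask1 != 0)
        return tzcnt16_model(mask1);
    return 16 + tzcnt16_model(mask2);   // match is in the second half
}
```

The functions in the test case reach the same offset by adding 16 to `n` and then `idx`, instead of returning a combined value.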

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: pinskia at gcc dot gnu.org @ 2021-12-20 16:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: pinskia at gcc dot gnu.org @ 2021-12-20 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-12-20
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(define_insn "<avx512>_ucmp<mode>3<mask_scalar_merge_name>"
  [(set (match_operand:<avx512fmaskmode> 0 "register_operand" "=k")
        (unspec:<avx512fmaskmode>
          [(match_operand:VI12_AVX512VL 1 "register_operand" "v")
           (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
           (match_operand:SI 3 "const_0_to_7_operand" "n")]
          UNSPEC_UNSIGNED_PCMP))]
  "TARGET_AVX512BW"
  "vpcmpu<ssemodesuffix>\t{%3, %2, %1, %0<mask_scalar_merge_operand4>|%0<mask_scalar_merge_operand4>, %1, %2, %3}"
  [(set_attr "type" "ssecmp")
   (set_attr "length_immediate" "1")
   (set_attr "prefix" "evex")
   (set_attr "mode" "<sseinsnmode>")])
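
In the pattern above, operand 1 must be a register ("v") while operand 2 may be a register or memory ("vm"), so a load can only be folded when the loaded value ends up as operand 2. Swapping the operands is free for EQ and NE; for the ordered predicates the immediate has to change as well. The following is a scalar sketch of that mapping under the standard VPCMP immediate encoding (the values match the `_MM_CMPINT_*` constants); the names `CmpPred`, `eval_pred`, `swapped_pred` and `swap_is_equivalent` are hypothetical, and this is not a description of how the eventual fix was implemented:

```cpp
#include <cassert>
#include <cstdint>

// Predicate immediates of VPCMP/VPCMPU (same values as _MM_CMPINT_*).
enum CmpPred : int {
    PRED_EQ = 0, PRED_LT = 1, PRED_LE = 2, PRED_FALSE = 3,
    PRED_NE = 4, PRED_NLT = 5, PRED_NLE = 6, PRED_TRUE = 7
};

// Scalar model of one unsigned 16-bit lane of vpcmpuw.
bool eval_pred(std::uint16_t a, std::uint16_t b, int pred)
{
    assert(pred >= 0 && pred <= 7);
    switch (pred) {
    case PRED_EQ:    return a == b;
    case PRED_LT:    return a < b;
    case PRED_LE:    return a <= b;
    case PRED_FALSE: return false;
    case PRED_NE:    return a != b;
    case PRED_NLT:   return a >= b;
    case PRED_NLE:   return a > b;
    default:         return true;   // PRED_TRUE
    }
}

// Predicate to use after exchanging the two vector operands.
int swapped_pred(int pred)
{
    switch (pred) {
    case PRED_LT:  return PRED_NLE;  // a <  b  <=>  b >  a
    case PRED_LE:  return PRED_NLT;  // a <= b  <=>  b >= a
    case PRED_NLT: return PRED_LE;   // a >= b  <=>  b <= a
    case PRED_NLE: return PRED_LT;   // a >  b  <=>  b <  a
    default:       return pred;      // EQ, NE, FALSE, TRUE are symmetric
    }
}

// True when every predicate gives the same result after the swap.
bool swap_is_equivalent(std::uint16_t a, std::uint16_t b)
{
    for (int p = 0; p <= 7; ++p) {
        if (eval_pred(a, b, p) != eval_pred(b, a, swapped_pred(p)))
            return false;
    }
    return true;
}
```

For the `$0` (EQ) immediate used in this report, `swapped_pred` returns the predicate unchanged, so only the operand order differs between the two code sequences.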

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: crazylht at gmail dot com @ 2022-01-07  3:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12 by r12-6338.

* [Bug target/103774] [i386] GCC should swap the arguments to certain functions to generate a single instruction
From: pinskia at gcc dot gnu.org @ 2022-01-07  3:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103774

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed, so closing.
