public inbox for gcc-bugs@sourceware.org
* [Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions
@ 2023-09-18  2:37 guihaoc at gcc dot gnu.org
  2023-10-23  1:17 ` [Bug target/111449] " cvs-commit at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2023-09-18  2:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449

            Bug ID: 111449
           Summary: memcmp (p,q,16) == 0 can be optimized better on ppc64
                    with vector comparison instructions
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: guihaoc at gcc dot gnu.org
  Target Milestone: ---

int compare (const char* s1, const char* s2)
{
  return __builtin_memcmp (s1, s2, 16) == 0;
}


Trunk currently generates:
        ld 10,0(3)
        ld 9,0(4)
        cmpd 0,10,9
        beq 0,.L6
.L2:
        li 3,1
        cntlzw 3,3
        srwi 3,3,5
        blr
        .p2align 4,,15
.L6:
        ld 10,8(3)
        ld 9,8(4)
        li 3,0
        cmpd 0,10,9
        bne 0,.L2
        cntlzw 3,3
        srwi 3,3,5
        blr

Expected: use a vector comparison to eliminate the branches:
        lxv 32,0(3)
        lxv 33,0(4)
        vcmpequb. 0,0,1
        mfcr 3,2
        rlwinm 3,3,25,1
        blr
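
The request above is to replace two scalar compares and their branches with one branch-free comparison. As a rough illustration only, the same idea can be written in portable C by folding both 8-byte halves into a single value before testing it. The name `compare_branchless` is hypothetical and is not part of the report or of GCC; whether a compiler turns it into vector instructions depends on the target and options.

```c
#include <stdint.h>
#include <string.h>

/* Branch-free 16-byte equality test in portable C.
   Returns 1 when the two 16-byte blocks are identical, 0 otherwise.  */
int compare_branchless (const char *s1, const char *s2)
{
  uint64_t a[2], b[2];

  /* memcpy sidesteps alignment and aliasing problems; compilers
     normally turn these fixed-size copies into plain loads.  */
  memcpy (a, s1, sizeof a);
  memcpy (b, s2, sizeof b);

  /* Any differing bit in either half leaves diff nonzero.  */
  uint64_t diff = (a[0] ^ b[0]) | (a[1] ^ b[1]);
  return diff == 0;
}
```

It returns the same result as the `compare` function in the report for any pair of readable 16-byte buffers.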


* [Bug target/111449] memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions
  2023-09-18  2:37 [Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions guihaoc at gcc dot gnu.org
@ 2023-10-23  1:17 ` cvs-commit at gcc dot gnu.org
  2023-10-30  3:03 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-10-23  1:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449

--- Comment #1 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by HaoChen Gui <guihaoc@gcc.gnu.org>:

https://gcc.gnu.org/g:f08ca5903c7a02b450b93143467f70b9fd8e0085

commit r14-4835-gf08ca5903c7a02b450b93143467f70b9fd8e0085
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Mon Oct 23 09:14:13 2023 +0800

    Expand: Enable vector mode for by pieces compares

    Vector mode compare instructions are efficient for equality compares
    on rs6000.  This patch refactors the by pieces operation code to
    enable vector modes for compares.

    gcc/
            PR target/111449
            * expr.cc (can_use_qi_vectors): New function to return true if
            we know how to implement OP using vectors of bytes.
            (qi_vector_mode_supported_p): New function to check whether
            optabs exist for the mode and certain by pieces operations.
            (widest_fixed_size_mode_for_size): Replace the second argument
            with the type of by pieces operations.  Call can_use_qi_vectors
            and qi_vector_mode_supported_p to do the check.  Call
            scalar_mode_supported_p to check if the scalar mode is supported.
            (by_pieces_ninsns): Pass the type of by pieces operation to
            widest_fixed_size_mode_for_size.
            (class op_by_pieces_d): Remove m_qi_vector_mode.  Add m_op to
            record the type of by pieces operations.
            (op_by_pieces_d::op_by_pieces_d): Change last argument to the
            type of by pieces operations, initialize m_op with it.  Pass
            m_op to function widest_fixed_size_mode_for_size.
            (op_by_pieces_d::get_usable_mode): Pass m_op to function
            widest_fixed_size_mode_for_size.
            (op_by_pieces_d::smallest_fixed_size_mode_for_size): Call
            can_use_qi_vectors and qi_vector_mode_supported_p to do the
            check.
            (op_by_pieces_d::run): Pass m_op to function
            widest_fixed_size_mode_for_size.
            (move_by_pieces_d::move_by_pieces_d): Set m_op to MOVE_BY_PIECES.
            (store_by_pieces_d::store_by_pieces_d): Set m_op with the op.
            (can_store_by_pieces): Pass the type of by pieces operations to
            widest_fixed_size_mode_for_size.
            (clear_by_pieces): Initialize class store_by_pieces_d with
            CLEAR_BY_PIECES.
            (compare_by_pieces_d::compare_by_pieces_d): Set m_op to
            COMPARE_BY_PIECES.
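
The ChangeLog above makes the mode choice depend on the type of by pieces operation. A toy model of that idea, assuming an imaginary target, might look like the following. The operation names echo the ChangeLog, but every function, struct, and table entry here is hypothetical and nothing is copied from gcc/expr.cc. In particular, which operations may use byte vectors is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; not GCC code.  */
enum toy_op { MOVE_BY_PIECES, CLEAR_BY_PIECES, COMPARE_BY_PIECES, TOY_NUM_OPS };

struct toy_mode
{
  size_t bytes;                 /* size of the mode */
  bool is_qi_vector;            /* vector of bytes, e.g. V16QI */
  bool has_optab[TOY_NUM_OPS];  /* assumed per-operation support */
};

/* Assumed mode table for an imaginary target.  */
static const struct toy_mode toy_modes[] = {
  { 1,  false, { true,  true, true } },  /* QI */
  { 2,  false, { true,  true, true } },  /* HI */
  { 4,  false, { true,  true, true } },  /* SI */
  { 8,  false, { true,  true, true } },  /* DI */
  { 16, true,  { false, true, true } },  /* V16QI */
};

/* Assumption of this sketch: byte vectors are usable for clear and
   compare, but not for move.  */
static bool toy_can_use_qi_vectors (enum toy_op op)
{
  return op == CLEAR_BY_PIECES || op == COMPARE_BY_PIECES;
}

/* Return the size in bytes of the widest usable mode that is no larger
   than MAX_BYTES for OP, or 0 if there is none.  */
size_t toy_widest_mode_bytes (size_t max_bytes, enum toy_op op)
{
  size_t best = 0;
  for (size_t i = 0; i < sizeof toy_modes / sizeof toy_modes[0]; i++)
    {
      const struct toy_mode *m = &toy_modes[i];
      if (m->bytes > max_bytes || m->bytes <= best)
        continue;
      if (m->is_qi_vector && !toy_can_use_qi_vectors (op))
        continue;
      if (!m->has_optab[op])
        continue;
      best = m->bytes;
    }
  return best;
}
```

With this table, a 16-byte compare can pick the 16-byte vector mode, while a 16-byte move falls back to the 8-byte scalar mode.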


* [Bug target/111449] memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions
  2023-09-18  2:37 [Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions guihaoc at gcc dot gnu.org
  2023-10-23  1:17 ` [Bug target/111449] " cvs-commit at gcc dot gnu.org
@ 2023-10-30  3:03 ` cvs-commit at gcc dot gnu.org
  2023-11-17  9:20 ` cvs-commit at gcc dot gnu.org
  2023-11-17  9:25 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-10-30  3:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by HaoChen Gui <guihaoc@gcc.gnu.org>:

https://gcc.gnu.org/g:8111b5c23bd14f80607bd35af58ec31e38a0378e

commit r14-5001-g8111b5c23bd14f80607bd35af58ec31e38a0378e
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Mon Oct 30 10:59:51 2023 +0800

    Expand: Checking available optabs for scalar modes in by pieces operations

    The former patch (f08ca5903c7) examines the scalar modes by the target
    hook scalar_mode_supported_p.  This causes some regressions on i386
    because XImode and OImode are not enabled in the i386 target function.
    This patch instead examines the scalar mode by checking whether the
    corresponding optabs are available for the mode.

    gcc/
            PR target/111449
            * expr.cc (qi_vector_mode_supported_p): Rename to...
            (by_pieces_mode_supported_p): ...this, and extend it to do
            the checking for both scalar and vector modes.
            (widest_fixed_size_mode_for_size): Call
            by_pieces_mode_supported_p to examine the mode.
            (op_by_pieces_d::smallest_fixed_size_mode_for_size): Likewise.
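
The difference between the two checks can be pictured with a small toy model. It assumes a wide integer mode that a "mode supported" hook rejects even though the instruction pattern the operation needs is available. All names and table values below are hypothetical stand-ins chosen for illustration; they are not taken from GCC or from the i386 backend.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; not GCC code.  */
struct toy_scalar_mode
{
  size_t bytes;
  bool hook_says_supported;  /* stand-in for scalar_mode_supported_p */
  bool has_optab;            /* stand-in for "the needed optab exists" */
};

/* Assumed table: the 32-byte mode is rejected by the hook, but the
   pattern needed for the operation is assumed to be present.  */
static const struct toy_scalar_mode toy_scalar_modes[] = {
  { 8,  true,  true },   /* DI */
  { 16, true,  true },   /* TI */
  { 32, false, true },   /* OI */
};

/* Return the size in bytes of the widest accepted mode that is no
   larger than MAX_BYTES, or 0 if there is none.  USE_OPTAB_CHECK
   selects between the two acceptance tests.  */
size_t toy_widest_scalar_bytes (size_t max_bytes, bool use_optab_check)
{
  size_t best = 0;
  size_t n = sizeof toy_scalar_modes / sizeof toy_scalar_modes[0];
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_scalar_mode *m = &toy_scalar_modes[i];
      bool ok = use_optab_check ? m->has_optab : m->hook_says_supported;
      if (ok && m->bytes <= max_bytes && m->bytes > best)
        best = m->bytes;
    }
  return best;
}
```

Under the hook-based test the 32-byte mode is skipped and the 16-byte mode is chosen; under the optab-based test the 32-byte mode is chosen.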


* [Bug target/111449] memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions
  2023-09-18  2:37 [Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions guihaoc at gcc dot gnu.org
  2023-10-23  1:17 ` [Bug target/111449] " cvs-commit at gcc dot gnu.org
  2023-10-30  3:03 ` cvs-commit at gcc dot gnu.org
@ 2023-11-17  9:20 ` cvs-commit at gcc dot gnu.org
  2023-11-17  9:25 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-17  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by HaoChen Gui <guihaoc@gcc.gnu.org>:

https://gcc.gnu.org/g:cd295a80c91040fd4d826528c8e8e07fe909ae62

commit r14-5548-gcd295a80c91040fd4d826528c8e8e07fe909ae62
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Fri Nov 17 17:12:32 2023 +0800

    rs6000: Enable vector mode for by pieces equality compare

    This patch adds a new expand pattern, cbranchv16qi4, to enable vector
    mode by pieces equality compare on rs6000.  The macro MOVE_MAX_PIECES
    (COMPARE_MAX_PIECES) is set to 16 bytes when EFFICIENT_UNALIGNED_VSX
    is enabled; otherwise it is unchanged.  The macro STORE_MAX_PIECES
    defaults to the value of MOVE_MAX_PIECES, so it is now defined
    explicitly to keep its value unchanged.

    gcc/
            PR target/111449
            * config/rs6000/altivec.md (cbranchv16qi4): New expand pattern.
            * config/rs6000/rs6000.cc (rs6000_generate_compare): Generate
            insn sequence for V16QImode equality compare.
            * config/rs6000/rs6000.h (MOVE_MAX_PIECES): Define.
            (STORE_MAX_PIECES): Define.

    gcc/testsuite/
            PR target/111449
            * gcc.target/powerpc/pr111449-1.c: New.
            * gcc.dg/tree-ssa/sra-17.c: Add additional options for 32-bit
            powerpc.
            * gcc.dg/tree-ssa/sra-18.c: Likewise.
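
The effect on the size limits described in the commit message can be summarized with a small model. This is a sketch under stated assumptions, not the contents of rs6000.h: the struct, the function names, and the `previous_max_bytes` field (a stand-in for whatever the limit was before the patch) are all hypothetical.

```c
#include <stdbool.h>

/* Toy model only; not GCC code.  */
struct toy_target
{
  bool efficient_unaligned_vsx;  /* stand-in for EFFICIENT_UNALIGNED_VSX */
  unsigned previous_max_bytes;   /* assumed limit before the patch */
};

/* Move (and compare) limit: 16 bytes when unaligned VSX accesses are
   efficient, otherwise the previous value.  */
unsigned toy_move_max_pieces (const struct toy_target *t)
{
  return t->efficient_unaligned_vsx ? 16 : t->previous_max_bytes;
}

/* Store limit: defined separately so that it keeps the previous value
   instead of following the raised move limit.  */
unsigned toy_store_max_pieces (const struct toy_target *t)
{
  return t->previous_max_bytes;
}
```

The point of the separate store function is that raising the move limit does not silently raise the store limit as well.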

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by HaoChen Gui <guihaoc@gcc.gnu.org>:

https://gcc.gnu.org/g:10615c8a10d6b61e813254924d76be728dbd4688

commit r14-5549-g10615c8a10d6b61e813254924d76be728dbd4688
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Fri Nov 17 17:17:59 2023 +0800

    rs6000: Fix regression cases caused by 16-byte by pieces move

    The previous patch enables 16-byte by pieces move.  Originally, a
    16-byte move was implemented via a pattern; expand_block_move does an
    optimization on P8 LE to leverage V2DI reversed load/store for
    memory-to-memory moves.  Now a 16-byte move is implemented via by
    pieces move and is finally split into two DI loads/stores.  This patch
    creates an insn_and_split pattern to restore the optimization.

    gcc/
            PR target/111449
            * config/rs6000/vsx.md (*vsx_le_mem_to_mem_mov_ti): New.

    gcc/testsuite/
            PR target/111449
            * gcc.target/powerpc/pr111449-2.c: New.
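
For orientation, the kind of source that exercises a 16-byte memory-to-memory move is a fixed-size copy like the one below. This is an assumed minimal example, not the contents of pr111449-2.c, and the function name `copy16` is hypothetical. Which instructions the compiler emits for it depends on the target CPU, endianness, and options.

```c
/* A fixed-size 16-byte memory-to-memory copy.  */
void copy16 (void *dst, const void *src)
{
  __builtin_memcpy (dst, src, 16);
}
```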


* [Bug target/111449] memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions
  2023-09-18  2:37 [Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions guihaoc at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-11-17  9:20 ` cvs-commit at gcc dot gnu.org
@ 2023-11-17  9:25 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2023-11-17  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
Fixed
