From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x136.google.com (mail-lf1-x136.google.com [IPv6:2a00:1450:4864:20::136]) by sourceware.org (Postfix) with ESMTPS id B5879384AB4D for ; Wed, 15 May 2024 06:49:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B5879384AB4D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B5879384AB4D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::136 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715755800; cv=none; b=DalKsHlGnacFt4yw1hM0xZ09haOYwwzZczrt24QsITZNrTsdbQGVOCvCYRvKbB2Q3dFGMWRBKgFmVgNOZC0oa8vTDpatqDsgPf5XvYrim/YvrWUUXAwsGyQ2FrSeVy8zI6azzVwh7HleRS1UZsu6i6RuQzCNYBVzVTKoEwklEoY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715755800; c=relaxed/simple; bh=ox90k4BocPMmATqRxcXJyNPs8I0RHA0B+h9Z2xpEdJ8=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=h9nHOj6OeGY4lGMu7yjs3KPcaRimn7VKpaC69/p16nJjRqYaDMZ4kFuAFrGrVGsjWH2dxzqKgJyMeAV/RsNjhSTMyBgDDtb/PpemSF+xoQC0FgYSTOqkrYBzmHUAyIGuwfqLV72Un4lS6ElEUXpqa3ZhXMjqZe0l6+cVgMhcUYU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x136.google.com with SMTP id 2adb3069b0e04-51fdc9af005so10424536e87.3 for ; Tue, 14 May 2024 23:49:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1715755794; x=1716360594; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=Nqr4oYhh2Ky6XY7PS2AeiSAu/yWNviLGtEh8YTV1IaE=; b=GcFyUAwDsjdE1/TicKchQNHSSi4MoooTEC0YvGUe3wkYTeOD2xh8+uOTjukneckSNm VfoRIeBXu4WLh6Qy6rYM63TxM4SuSc1+28X7eXs2pgiWs7M2FEywBSCSt/YssLOv/LlP GCn6YiNqryQqt/FCYnKzoJsRfgVIkMOvaKdJJv2jCpMBSeV/17eAZjA+ZbOf/A8uKyEK e4LhOIY0YjAXMykukA1M/Hw7fD9Pj9G0Z/QqmAsTsCI9CO02j1lR07nHeiw9Wyq0RR/9 1H2VAjCdQwqQTnO6t4QwiKt1t6xWRuAOdQnw2Q/AtksNqiv+5uFJcyCX+rhhMd5zHO1X 9udA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715755794; x=1716360594; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Nqr4oYhh2Ky6XY7PS2AeiSAu/yWNviLGtEh8YTV1IaE=; b=pVxuQGjkFnrJ/qMt0hG0rswCtm3jMo7ph+IOrFQk/hezf02PyteJsgDTOghFY9tnRp nYbki7XDYm/WpmCF80oPmMUO6IwO4vIfqja7mZYJbAGBJ3KMWs0b2TCjfQwH9O6i7/ef ygHKBY52+BGNvHpk+SY30wZ27Hu915+7ZtJXuNhaaJ1lWyCRSHcvwJlR16XMElSm9QVD gJR1FUPF8O0TqHYjXzhYDMki7wLQjUHWlVmB2IDWYjeUHEmMYMaCzdlh9AOXi4CKqMRI BS7EoT93yEtl3RM5gDf/JQG632YRemBT9htgjFYo+Td+8wBhnVr0MzLdhmfxGk+vmOnS Z6Tg== X-Gm-Message-State: AOJu0YzxCB+2za25UYrb1UPnF3sQg95/MewHj4zCRrk+Nls+pIwQMtpu UdEunuKigoaMZeS6BONnbp01tw0hSZPtjwZ24pm1e335tadPW4P8LByWVc/FU5YjQQl+FeS3pOx 9U4c= X-Google-Smtp-Source: AGHT+IEZ10qgvfrwJ0TVioKi3eAJXgSCOvIpDlMmcLm87BKydEwCWhhRLRjEVeia034lJi1IRhyA2g== X-Received: by 2002:ac2:5398:0:b0:523:54a2:3836 with SMTP id 2adb3069b0e04-52354a239d3mr4413049e87.33.1715755793959; Tue, 14 May 2024 23:49:53 -0700 (PDT) Received: from antares.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a5a17b178b1sm815471666b.191.2024.05.14.23.49.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 14 May 2024 23:49:52 -0700 (PDT) From: =?UTF-8?q?Christoph=20M=C3=BCllner?= To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?UTF-8?q?Christoph=20M=C3=BCllner?= Subject: [PATCH v2 1/2] RISC-V: Add cmpmemsi expansion Date: Wed, 15 May 2024 08:49:25 +0200 Message-ID: <20240515064926.2441787-1-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.44.0 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_MANYTO,KAM_SHORT,LIKELY_SPAM_BODY,RCVD_IN_DNSWL_NONE,SCC_10_SHORT_WORD_LINES,SCC_5_SHORT_WORD_LINES,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: GCC has a generic cmpmemsi expansion via the by-pieces framework, which shows some room for target-specific optimizations. E.g. for comparing two aligned memory blocks of 15 bytes we get the following sequence: my_mem_cmp_aligned_15: li a4,0 j .L2 .L8: bgeu a4,a7,.L7 .L2: add a2,a0,a4 add a3,a1,a4 lbu a5,0(a2) lbu a6,0(a3) addi a4,a4,1 li a7,15 // missed hoisting subw a5,a5,a6 andi a5,a5,0xff // useless beq a5,zero,.L8 lbu a0,0(a2) // loading again! lbu a5,0(a3) // loading again! subw a0,a0,a5 ret .L7: li a0,0 ret Diff first byte: 15 insns Diff second byte: 25 insns No diff: 25 insns Possible improvements: * unroll the loop and use load-with-displacement to avoid offset increments * load and compare multiple (aligned) bytes at once * Use the bitmanip/strcmp result calculation (reverse words and synthesize (a2 >= a3) ? 1 : -1 in a branchless sequence) When applying these improvements we get the following sequence: my_mem_cmp_aligned_15: ld a5,0(a0) ld a4,0(a1) bne a5,a4,.L2 ld a5,8(a0) ld a4,8(a1) slli a5,a5,8 slli a4,a4,8 bne a5,a4,.L2 li a0,0 .L3: sext.w a0,a0 ret .L2: rev8 a5,a5 rev8 a4,a4 sltu a5,a5,a4 neg a5,a5 ori a0,a5,1 j .L3 Diff first byte: 11 insns Diff second byte: 16 insns No diff: 11 insns This patch implements this improvements. The tests consist of a execution test (similar to gcc/testsuite/gcc.dg/torture/inline-mem-cmp-1.c) and a few tests that test the expansion conditions (known length and alignment). Similar to the cpymemsi expansion this patch does not introduce any gating for the cmpmemsi expansion (on top of requiring the known length, alignment and Zbb). Bootstrapped and SPEC CPU 2017 tested. gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_expand_block_compare): New prototype. * config/riscv/riscv-string.cc (GEN_EMIT_HELPER2): New helper for zero_extendhi. (do_load_from_addr): Add support for HI and SI/64 modes. (do_load): Add helper for zero-extended loads. (emit_memcmp_scalar_load_and_compare): New helper to emit memcmp. (emit_memcmp_scalar_result_calculation): Likewise. (riscv_expand_block_compare_scalar): Likewise. (riscv_expand_block_compare): New RISC-V expander for memory compare. * config/riscv/riscv.md (cmpmemsi): New cmpmem expansion. gcc/testsuite/ChangeLog: * gcc.target/riscv/cmpmemsi-1.c: New test. * gcc.target/riscv/cmpmemsi-2.c: New test. * gcc.target/riscv/cmpmemsi-3.c: New test. * gcc.target/riscv/cmpmemsi.c: New test. Signed-off-by: Christoph Müllner --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-string.cc | 218 ++++++++++++++++++-- gcc/config/riscv/riscv.md | 15 ++ gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c | 6 + gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c | 42 ++++ gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c | 43 ++++ gcc/testsuite/gcc.target/riscv/cmpmemsi.c | 22 ++ 7 files changed, 333 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 5c8a52b78a2..565ead1382a 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -189,6 +189,7 @@ rtl_opt_pass * make_pass_avlprop (gcc::context *ctxt); rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt); /* Routines implemented in riscv-string.c. */ +extern bool riscv_expand_block_compare (rtx, rtx, rtx, rtx); extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_expand_block_clear (rtx, rtx); diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index 87f5fdee3c1..a4c7f27d904 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -86,35 +86,47 @@ GEN_EMIT_HELPER2(th_rev) /* do_th_rev2 */ GEN_EMIT_HELPER2(th_tstnbz) /* do_th_tstnbz2 */ GEN_EMIT_HELPER3(xor) /* do_xor3 */ GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */ +GEN_EMIT_HELPER2(zero_extendhi) /* do_zero_extendhi2 */ #undef GEN_EMIT_HELPER2 #undef GEN_EMIT_HELPER3 -/* Helper function to load a byte or a Pmode register. +/* Helper function to emit zero-extended loads. - MODE is the mode to use for the load (QImode or Pmode). + MODE is the mode to use for the load. DEST is the destination register for the data. - ADDR_REG is the register that holds the address. - ADDR is the address expression to load from. - - This function returns an rtx containing the register, - where the ADDR is stored. */ + MEM is the source to load from. */ -static rtx -do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr) +static void +do_load (machine_mode mode, rtx dest, rtx mem) { - rtx mem = gen_rtx_MEM (mode, addr_reg); - MEM_COPY_ATTRIBUTES (mem, addr); - set_mem_size (mem, GET_MODE_SIZE (mode)); - if (mode == QImode) do_zero_extendqi2 (dest, mem); + else if (mode == HImode) + do_zero_extendhi2 (dest, mem); + else if (mode == SImode && TARGET_64BIT) + emit_insn (gen_zero_extendsidi2 (dest, mem)); else if (mode == Xmode) emit_move_insn (dest, mem); else gcc_unreachable (); +} + +/* Helper function to emit zero-extended loads. + + MODE is the mode to use for the load (QImode or Pmode). + DEST is the destination register for the data. + ADDR_REG is the register that holds the address. + ADDR is the address expression to load from. */ - return addr_reg; +static void +do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr) +{ + rtx mem = gen_rtx_MEM (mode, addr_reg); + MEM_COPY_ATTRIBUTES (mem, addr); + set_mem_size (mem, GET_MODE_SIZE (mode)); + + do_load (mode, dest, mem); } /* Generate a sequence to compare single characters in data1 and data2. @@ -617,6 +629,184 @@ riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align) return false; } +/* Generate the sequence of load and compares for memcmp using Zbb. + + RESULT is the register where the return value of memcmp will be stored. + The source pointers are SRC1 and SRC2 (NBYTES bytes to compare). + DATA1 and DATA2 are registers where the data chunks will be stored. + DIFF_LABEL is the location of the code that calculates the return value. + FINAL_LABEL is the location of the code that comes after the calculation + of the return value. */ + +static void +emit_memcmp_scalar_load_and_compare (rtx result, rtx src1, rtx src2, + unsigned HOST_WIDE_INT nbytes, + rtx data1, rtx data2, + rtx diff_label, rtx final_label) +{ + const unsigned HOST_WIDE_INT xlen = GET_MODE_SIZE (Xmode); + unsigned HOST_WIDE_INT offset = 0; + + while (nbytes > 0) + { + unsigned HOST_WIDE_INT cmp_bytes = xlen < nbytes ? xlen : nbytes; + machine_mode load_mode; + + /* Special cases to avoid masking of trailing bytes. */ + if (cmp_bytes == 1) + load_mode = QImode; + else if (cmp_bytes == 2) + load_mode = HImode; + else if (cmp_bytes == 4) + load_mode = SImode; + else + load_mode = Xmode; + + rtx addr1 = adjust_address (src1, load_mode, offset); + do_load (load_mode, data1, addr1); + rtx addr2 = adjust_address (src2, load_mode, offset); + do_load (load_mode, data2, addr2); + + /* Fast-path for a single byte. */ + if (cmp_bytes == 1) + { + rtx tmp = gen_reg_rtx (Xmode); + do_sub3 (tmp, data1, data2); + emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp))); + emit_jump_insn (gen_jump (final_label)); + emit_barrier (); /* No fall-through. */ + return; + } + + /* Shift off trailing bytes in words if needed. */ + unsigned int load_bytes = GET_MODE_SIZE (load_mode).to_constant (); + if (cmp_bytes < load_bytes) + { + int shamt = (load_bytes - cmp_bytes) * BITS_PER_UNIT; + do_ashl3 (data1, data1, GEN_INT (shamt)); + do_ashl3 (data2, data2, GEN_INT (shamt)); + } + + /* Break out if data1 != data2 */ + rtx cond = gen_rtx_NE (VOIDmode, data1, data2); + emit_unlikely_jump_insn (gen_cbranch4 (Pmode, cond, data1, + data2, diff_label)); + /* Fall-through on equality. */ + + offset += cmp_bytes; + nbytes -= cmp_bytes; + } +} + +/* memcmp result calculation. + + RESULT is the register where the return value will be stored. + The two data chunks are in DATA1 and DATA2. */ + +static void +emit_memcmp_scalar_result_calculation (rtx result, rtx data1, rtx data2) +{ + /* Get bytes in big-endian order and compare as words. */ + do_bswap2 (data1, data1); + do_bswap2 (data2, data2); + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + rtx tmp = gen_reg_rtx (Xmode); + emit_insn (gen_slt_3 (LTU, Xmode, Xmode, tmp, data1, data2)); + do_neg2 (tmp, tmp); + do_ior3 (tmp, tmp, const1_rtx); + emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp))); +} + +/* Expand memcmp using scalar instructions (incl. Zbb). + + RESULT is the register where the return value will be stored. + The source pointers are SRC1 and SRC2 (NBYTES bytes to compare). */ + +static bool +riscv_expand_block_compare_scalar (rtx result, rtx src1, rtx src2, rtx nbytes) +{ + const unsigned HOST_WIDE_INT xlen = GET_MODE_SIZE (Xmode); + + if (optimize_function_for_size_p (cfun)) + return false; + + /* We don't support big endian. */ + if (BYTES_BIG_ENDIAN) + return false; + + if (!CONST_INT_P (nbytes)) + return false; + + /* We need the rev (bswap) instruction. */ + if (!TARGET_ZBB) + return false; + + unsigned HOST_WIDE_INT length = UINTVAL (nbytes); + + /* Limit to 12-bits (maximum load-offset). */ + if (length > IMM_REACH) + length = IMM_REACH; + + /* We need xlen-aligned memory. */ + unsigned HOST_WIDE_INT align = MIN (MEM_ALIGN (src1), MEM_ALIGN (src2)); + if (align < (xlen * BITS_PER_UNIT)) + return false; + + if (length > RISCV_MAX_MOVE_BYTES_STRAIGHT) + return false; + + /* Overall structure of emitted code: + Load-and-compare: + - Load data1 and data2 + - Compare strings and either: + - Fall-through on equality + - Jump to end_label if data1 != data2 + // Fall-through + Set result to 0 and jump to final_label + diff_label: + Calculate result value with the use of data1 and data2 + Jump to final_label + final_label: + // Nothing. */ + + rtx data1 = gen_reg_rtx (Xmode); + rtx data2 = gen_reg_rtx (Xmode); + rtx diff_label = gen_label_rtx (); + rtx final_label = gen_label_rtx (); + + /* Generate a sequence of zbb instructions to compare out + to the length specified. */ + emit_memcmp_scalar_load_and_compare (result, src1, src2, length, + data1, data2, + diff_label, final_label); + + emit_insn (gen_rtx_SET (result, gen_rtx_CONST_INT (SImode, 0))); + emit_jump_insn (gen_jump (final_label)); + emit_barrier (); /* No fall-through. */ + + emit_label (diff_label); + emit_memcmp_scalar_result_calculation (result, data1, data2); + emit_jump_insn (gen_jump (final_label)); + emit_barrier (); /* No fall-through. */ + + emit_label (final_label); + return true; +} + +/* Expand memcmp operation. + + RESULT is the register where the return value will be stored. + The source pointers are SRC1 and SRC2 (NBYTES bytes to compare). */ + +bool +riscv_expand_block_compare (rtx result, rtx src1, rtx src2, rtx nbytes) +{ + if (stringop_strategy & STRATEGY_SCALAR) + return riscv_expand_block_compare_scalar (result, src1, src2, nbytes); + + return false; +} + /* Emit straight-line code to move LENGTH bytes from SRC to DEST. Assume that the areas do not overlap. */ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 893040f2854..4d3daaf00a9 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2635,6 +2635,21 @@ (define_split DONE; }) +(define_expand "cmpmemsi" + [(parallel [(set (match_operand:SI 0) + (compare:SI (match_operand:BLK 1) + (match_operand:BLK 2))) + (use (match_operand:SI 3)) + (use (match_operand:SI 4))])] + "!optimize_size" +{ + if (riscv_expand_block_compare (operands[0], operands[1], operands[2], + operands[3])) + DONE; + else + FAIL; +}) + (define_expand "cpymem" [(parallel [(set (match_operand:BLK 0 "general_operand") (match_operand:BLK 1 "general_operand")) diff --git a/gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c b/gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c new file mode 100644 index 00000000000..d7e0bc47407 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c @@ -0,0 +1,6 @@ +/* { dg-do run } */ +/* { dg-options "-march=rv32gc_zbb -save-temps -g0 -fno-lto" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc_zbb -save-temps -g0 -fno-lto" { target { rv64 } } } */ +/* { dg-timeout-factor 2 } */ + +#include "../../gcc.dg/memcmp-1.c" diff --git a/gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c b/gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c new file mode 100644 index 00000000000..77aa88b5b9c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c @@ -0,0 +1,42 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zbb" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc_zbb" { target { rv64 } } } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */ + +#include +#define aligned32 __attribute__ ((aligned (32))) + +const char myconst15[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; +const char myconst23[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; +const char myconst31[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; + +/* No expansion (unknown alignment) */ +#define MY_MEM_CMP_N(N) \ +int my_mem_cmp_##N(const char *b1, const char *b2) \ +{ \ + return __builtin_memcmp (b1, b2, N); \ +} + +/* No expansion (unknown alignment) */ +#define MY_MEM_CMP_CONST_N(N) \ +int my_mem_cmp_const_##N(const char *b1) \ +{ \ + return __builtin_memcmp (b1, myconst##N, sizeof(myconst##N)); \ +} + +MY_MEM_CMP_N(15) +MY_MEM_CMP_CONST_N(15) + +MY_MEM_CMP_N(23) +MY_MEM_CMP_CONST_N(23) + +MY_MEM_CMP_N(31) +MY_MEM_CMP_CONST_N(31) + +/* { dg-final { scan-assembler-times "\t(call|tail)\tmemcmp" 6 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c b/gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c new file mode 100644 index 00000000000..193cd4a343e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zbb" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc_zbb" { target { rv64 } } } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */ + +#include +#define aligned32 __attribute__ ((aligned (32))) + +const char myconst15[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; +const char myconst23[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; +const char myconst31[] aligned32 = { 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7, + 0, 1, 2, 3, 4, 5, 6, 7 }; + +#define MY_MEM_CMP_ALIGNED_N(N) \ +int my_mem_cmp_aligned_##N(const char *b1, const char *b2) \ +{ \ + b1 = __builtin_assume_aligned (b1, 4096); \ + b2 = __builtin_assume_aligned (b2, 4096); \ + return __builtin_memcmp (b1, b2, N); \ +} + +#define MY_MEM_CMP_ALIGNED_CONST_N(N) \ +int my_mem_cmp_aligned_const_##N(const char *b1) \ +{ \ + b1 = __builtin_assume_aligned (b1, 4096); \ + return __builtin_memcmp (b1, myconst##N, sizeof(myconst##N)); \ +} + +MY_MEM_CMP_ALIGNED_N(15) +MY_MEM_CMP_ALIGNED_CONST_N(15) + +MY_MEM_CMP_ALIGNED_N(23) +MY_MEM_CMP_ALIGNED_CONST_N(23) + +MY_MEM_CMP_ALIGNED_N(31) +MY_MEM_CMP_ALIGNED_CONST_N(31) + +/* { dg-final { scan-assembler-not "\t(call|tail)\tmemcmp" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/cmpmemsi.c b/gcc/testsuite/gcc.target/riscv/cmpmemsi.c new file mode 100644 index 00000000000..f4ccf269924 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/cmpmemsi.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zbb" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc_zbb" { target { rv64 } } } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */ + +#include + +/* No expansion (unknown size) */ +int my_mem_cmp_n(const char *b1, const char *b2, size_t n) +{ + return __builtin_memcmp (b1, b2, n); +} + +/* No expansion (unknown size) */ +int my_mem_cmp_aligned(const char *b1, const char *b2, size_t n) +{ + b1 = __builtin_assume_aligned (b1, 4096); + b2 = __builtin_assume_aligned (b2, 4096); + return __builtin_memcmp (b1, b2, n); +} + +/* { dg-final { scan-assembler-times "\t(call|tail)\tmemcmp" 2 } } */ -- 2.44.0