From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529])
    by sourceware.org (Postfix) with ESMTPS id 50970383B7AB
    for ; Sat, 11 Jun 2022 23:15:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 50970383B7AB
Received: by mail-pg1-x529.google.com with SMTP id q140so2428037pgq.6
    for ; Sat, 11 Jun 2022 16:15:41 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20210112;
    h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version
     :content-transfer-encoding;
    bh=g9FfN0k7uq4wSnTkLucITzpWGXadtjlHLj2SySUqspc=;
    b=FM3gW+LNRD4Y2H9Xdd4/skA5XbRtb1CwHyEjPqKEqGVtqNLtNtYe5Fy8WAxt3ofZ/u
     SSD2+SvLa+upNeNBrgGG8J0t2ETw5R4pSVm1c0VE/KXesk7hMVU1iNHxN/VmZZi/O/+/
     yKDY5QulhZxmPVSAB4NSGpmOAjWkTnnUm+m0blco/HUDRxhLv+zFdIYIUXi9CMCiFPNy
     Xkgr5xsCEkR5i8P+DS00sIvn7Ri+7t1rj2W5TRGDVhvA7JrunQTHL1WOeYGNItzPf/ro
     DKFu6XCCdf8XxjMvSuWgRIlHf1ojfT/Gqg+gPF/vOmk9Ad8/yLzMgOXOlknW7h83ll7u
     0qYg==
X-Gm-Message-State: AOAM5320mnfTVfgtLBBRDesEJJLQtcjxklbhiKH0gP4mHKXVGfUzWgua
    skv52XuIQNr/2ewm02EC37PKrfcgsdcJVA==
X-Google-Smtp-Source: ABdhPJyFswTMItUPNdNPTae1A0Wkweiw0WhumyF4K5nFEhB3SOvmpZvPZ9fI7vvNo/StuWbW/xxozQ==
X-Received: by 2002:a63:4447:0:b0:3fc:d3d1:cea9 with SMTP id t7-20020a634447000000b003fcd3d1cea9mr44115768pgk.269.1654989340060;
    Sat, 11 Jun 2022 16:15:40 -0700 (PDT)
Received: from octofox.hsd1.ca.comcast.net ([2601:641:401:1d20:7de4:71bd:2837:5355])
    by smtp.gmail.com with ESMTPSA id j8-20020aa79288000000b005183f333721sm2111979pfa.87.2022.06.11.16.15.39
    (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
    Sat, 11 Jun 2022 16:15:39 -0700 (PDT)
From: Max Filippov
To: gcc-patches@gcc.gnu.org
Subject: [COMMITTED] xtensa: Consider the Loop Option when setmemsi is expanded to small loop
Date: Sat, 11 Jun 2022 16:15:26 -0700
Message-Id: <20220611231526.4036217-1-jcmvbkbc@gmail.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
    DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
    FROM_LOCAL_NOVOWEL, GIT_PATCH_0, HK_RANDOM_ENVFROM, HK_RANDOM_FROM,
    RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
    T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 11 Jun 2022 23:15:43 -0000

From: Takayuki 'January June' Suwa

Now apply to almost any size of aligned block under such circumstances.

gcc/ChangeLog:

    * config/xtensa/xtensa.cc (xtensa_expand_block_set_small_loop):
    Pass through the block length / loop count conditions if
    zero-overhead looping is configured and active,
---
 gcc/config/xtensa/xtensa.cc | 71 ++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 21 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index c7b54babc370..bc3330f836f3 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1483,7 +1483,7 @@ xtensa_expand_block_set_unrolled_loop (rtx *operands)
 int
 xtensa_expand_block_set_small_loop (rtx *operands)
 {
-  HOST_WIDE_INT bytes, value, align;
+  HOST_WIDE_INT bytes, value, align, count;
   int expand_len, funccall_len;
   rtx x, dst, end, reg;
   machine_mode unit_mode;
@@ -1503,17 +1503,25 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   /* Totally-aligned block only. */
   if (bytes % align != 0)
     return 0;
+  count = bytes / align;

-  /* If 4-byte aligned, small loop substitution is almost optimal, thus
-     limited to only offset to the end address for ADDI/ADDMI instruction. */
-  if (align == 4
-      && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
-    return 0;
+  /* If the Loop Option (zero-overhead looping) is configured and active,
+     almost no restrictions about the length of the block. */
+  if (! (TARGET_LOOPS && optimize))
+    {
+      /* If 4-byte aligned, small loop substitution is almost optimal,
+         thus limited to only offset to the end address for ADDI/ADDMI
+         instruction. */
+      if (align == 4
+          && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
+        return 0;

-  /* If no 4-byte aligned, loop count should be treated as the constraint. */
-  if (align != 4
-      && bytes / align > ((optimize > 1 && !optimize_size) ? 8 : 15))
-    return 0;
+      /* If no 4-byte aligned, loop count should be treated as the
+         constraint. */
+      if (align != 4
+          && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
+        return 0;
+    }

   /* Insn expansion: holding the init value.
      Either MOV(.N) or L32R w/litpool. */
@@ -1523,16 +1531,33 @@ xtensa_expand_block_set_small_loop (rtx *operands)
     expand_len = TARGET_DENSITY ? 2 : 3;
   else
     expand_len = 3 + 4;
-  /* Insn expansion: Either ADDI(.N) or ADDMI for the end address. */
-  expand_len += bytes > 127 ? 3
-                : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
-
-  /* Insn expansion: the loop body and branch instruction.
-     For store, one of S8I, S16I or S32I(.N).
-     For advance, ADDI(.N).
-     For branch, BNE. */
-  expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
-                + (TARGET_DENSITY ? 2 : 3) + 3;
+  if (TARGET_LOOPS && optimize) /* zero-overhead looping */
+    {
+      /* Insn translation: Either MOV(.N) or L32R w/litpool for the
+         loop count. */
+      expand_len += xtensa_simm12b (count) ? xtensa_sizeof_MOVI (count)
+                                            : 3 + 4;
+      /* Insn translation: LOOP, the zero-overhead looping setup
+         instruction. */
+      expand_len += 3;
+      /* Insn expansion: the loop body instructions.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N). */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3);
+    }
+  else /* NO zero-overhead looping */
+    {
+      /* Insn expansion: Either ADDI(.N) or ADDMI for the end address. */
+      expand_len += bytes > 127 ? 3
+                    : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
+      /* Insn expansion: the loop body and branch instruction.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).
+         For branch, BNE. */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3) + 3;
+    }

   /* Function call: preparing two arguments. */
   funccall_len = xtensa_sizeof_MOVI (value);
@@ -1555,7 +1580,11 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   dst = gen_reg_rtx (SImode);
   emit_move_insn (dst, x);
   end = gen_reg_rtx (SImode);
-  emit_insn (gen_addsi3 (end, dst, operands[1] /* the length */));
+  if (TARGET_LOOPS && optimize)
+    x = force_reg (SImode, operands[1] /* the length */);
+  else
+    x = operands[1];
+  emit_insn (gen_addsi3 (end, dst, x));
   switch (align)
     {
     case 1:
-- 
2.30.2
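
For readers who want to see the effect of the relaxed conditions without building GCC, the block length / loop count gate from the patch can be modelled as a stand-alone function. This is only a sketch: `small_loop_gate` and its parameters are hypothetical names, the plain arguments `target_loops`, `optimize` and `optimize_size` stand in for GCC's `TARGET_LOOPS`, `optimize` and `optimize_size`, and the checks that surround the gate in `xtensa_expand_block_set_small_loop` (including the later comparison of `expand_len` against `funccall_len`) are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-alone model of the block length / loop count gate
   in xtensa_expand_block_set_small_loop after the patch.  Returns true
   if the gate lets the block through to the later size comparison.  */
static bool
small_loop_gate (long bytes, long align, bool target_loops,
                 int optimize, bool optimize_size)
{
  long count;

  /* Totally-aligned block only (align > 0 is an added safety check).  */
  if (align <= 0 || bytes % align != 0)
    return false;
  count = bytes / align;

  /* With zero-overhead looping configured and active, the length and
     loop count restrictions below are skipped.  */
  if (!(target_loops && optimize))
    {
      /* 4-byte aligned: limited by the ADDI/ADDMI offset range.  */
      if (align == 4
          && !(bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
        return false;

      /* Not 4-byte aligned: limited by the loop count.  */
      if (align != 4
          && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
        return false;
    }
  return true;
}
```

Under these assumptions, a 1000-byte block with 4-byte alignment is rejected without the Loop Option, because 1000 is above 127 and not a multiple of 256, and passes the gate with it. Passing the gate does not by itself mean the loop is emitted, since the real function still compares the estimated code sizes afterwards.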