From: Philipp Tomsich
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov, Philipp Tomsich, Di Zhao
Subject: [PATCH] aarch64: disable LDP via tuning structure for -mcpu=ampere1
Date: Fri, 14 Apr 2023 01:21:57 +0200
Message-Id: <20230413232157.1487389-1-philipp.tomsich@vrull.eu>

AmpereOne (-mcpu=ampere1) breaks LDP instructions into two uops.
Given the chance that this causes instructions to slip into the next
decoding cycle and the additional overheads when handling
cacheline-crossing LDP instructions, we use the tuning structure to
disable the generation of LDP instructions from instruction combining
(such as in peephole2).

Given the code-density benefits in builtins and prologue/epilogue
expansion, we allow LDPs there.

This commit:
 * adds a new tuning option AARCH64_EXTRA_TUNE_NO_LDP_COMBINE
 * allows -moverride=tune=... to override this

Signed-off-by: Philipp Tomsich
Co-Authored-By: Di Zhao

gcc/ChangeLog:

	* config/aarch64/aarch64-tuning-flags.def
	(AARCH64_EXTRA_TUNING_OPTION): Add
	AARCH64_EXTRA_TUNE_NO_LDP_COMBINE.
	* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp):
	Check for the above tuning option when processing loads.

---
 gcc/config/aarch64/aarch64-tuning-flags.def | 3 +++
 gcc/config/aarch64/aarch64.cc               | 8 +++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index 712895a5263..52112ba7c48 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -44,6 +44,9 @@ AARCH64_EXTRA_TUNING_OPTION ("cheap_shift_extend", CHEAP_SHIFT_EXTEND)
 /* Disallow load/store pair instructions on Q-registers.  */
 AARCH64_EXTRA_TUNING_OPTION ("no_ldp_stp_qregs", NO_LDP_STP_QREGS)
 
+/* Disallow load-pair instructions to be formed in combine/peephole.  */
+AARCH64_EXTRA_TUNING_OPTION ("no_ldp_combine", NO_LDP_COMBINE)
+
 AARCH64_EXTRA_TUNING_OPTION ("rename_load_regs", RENAME_LOAD_REGS)
 
 AARCH64_EXTRA_TUNING_OPTION ("cse_sve_vl_constants", CSE_SVE_VL_CONSTANTS)
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f4ef22ce02f..8dc1a9ceb17 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -1971,7 +1971,7 @@ static const struct tune_params ampere1a_tunings =
   2,	/* min_div_recip_mul_df.  */
   0,	/* max_case_values.  */
   tune_params::AUTOPREFETCHER_WEAK,	/* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_NONE),		/* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_NO_LDP_COMBINE),	/* tune_flags.  */
   &ampere1_prefetch_tune
 };
 
@@ -26053,6 +26053,12 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,
   enum reg_class rclass_1, rclass_2;
   rtx mem_1, mem_2, reg_1, reg_2;
 
+  /* Allow the tuning structure to disable LDP instruction formation
+     from combining instructions (e.g., in peephole2).  */
+  if (load && (aarch64_tune_params.extra_tuning_flags
+	       & AARCH64_EXTRA_TUNE_NO_LDP_COMBINE))
+    return false;
+
   if (load)
     {
       mem_1 = operands[1];
-- 
2.34.1
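
For readers who want to see the gating logic in isolation, here is a minimal standalone sketch of the early-out that the last hunk adds. It is not GCC code: the enum values, the reduced `tune_params` struct, and the helper `pair_formation_allowed` are hypothetical stand-ins, and the real aarch64_operands_ok_for_ldpstp goes on to validate the operands after this check. The sketch only shows that the flag rejects load pairs while leaving store pairs untouched.

```cpp
#include <cstdint>

// Hypothetical stand-ins for GCC's extra tuning flags; the bit
// positions are illustrative and do not match the real definitions.
enum extra_tune_flag : std::uint32_t {
  TUNE_NONE = 0,
  TUNE_NO_LDP_STP_QREGS = 1u << 0,
  TUNE_NO_LDP_COMBINE = 1u << 1,
};

// Reduced model of a tuning structure: only the flag word is kept.
struct tune_params {
  std::uint32_t extra_tuning_flags;
};

// Models the check added by the patch: when the flag is set, loads
// are rejected before any operand analysis; stores are unaffected.
constexpr bool pair_formation_allowed(const tune_params &tune, bool load) {
  return !(load && (tune.extra_tuning_flags & TUNE_NO_LDP_COMBINE));
}

constexpr tune_params generic_like{TUNE_NONE};
constexpr tune_params ampere_like{TUNE_NO_LDP_COMBINE};

static_assert(pair_formation_allowed(generic_like, true),
              "without the flag, load pairs may be formed");
static_assert(!pair_formation_allowed(ampere_like, true),
              "with the flag, load pairs are rejected");
static_assert(pair_formation_allowed(ampere_like, false),
              "with the flag, store pairs are still allowed");
```

Because the check sits in the predicate used when combining instructions, paths that emit LDP directly (builtins, prologue/epilogue expansion, per the commit message) are not modeled here and would not be affected by it.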