From: Jan Hubička
Date: Wed, 16 Nov 2022 13:21:18 +0100
Subject: Re: [PATCH 2/2] i386: correct x87&SSE multiplication modeling in znver.md
To: "Kumar, Venkataramanan"
Cc: Alexander Monakov, "gcc-patches@gcc.gnu.org", "Joshi, Tejas Sanjay"
References: <20221101162637.14238-1-amonakov@ispras.ru> <20221101162637.14238-3-amonakov@ispras.ru>

Hello,

On Wed, Nov 16, 2022 at 12:53 PM Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com> wrote:

> [AMD Official Use Only - General]
>
> Hi,
>
> > Top znver table sizes in insn-automata.o:
> >
> > Before:
> >
> > 30056 r znver1_fp_min_issue_delay
> > 120224 r znver1_fp_transitions
> >
> > After:
> >
> > 6720 r znver1_fp_min_issue_delay
> > 53760 r znver1_fp_transitions
>

This looks really promising. I will experiment with the patch for a separate
znver3 model, but I think we should be able to keep them unified and
hopefully get both less code duplication and smaller table sizes.

> > gcc/ChangeLog:
> >
> >         PR target/87832
> >         * config/i386/znver.md: (znver1_fp_op_mul): Correct cycles in
> >         the reservation.
> >         (znver1_fp_op_mul_load): Ditto.
> >         (znver1_mmx_mul): Ditto.
> >         (znver1_mmx_load): Ditto.
> >         (znver1_ssemul_ss_ps): Ditto.
> >         (znver1_ssemul_ss_ps_load): Ditto.
> >         (znver1_ssemul_avx256_ps): Ditto.
> >         (znver1_ssemul_avx256_ps_load): Ditto.
> >         (znver1_ssemul_sd_pd): Ditto.
> >         (znver1_ssemul_sd_pd_load): Ditto.
> >         (znver2_ssemul_sd_pd): Ditto.
> >         (znver2_ssemul_sd_pd_load): Ditto.
> >         (znver1_ssemul_avx256_pd): Ditto.
> >         (znver1_ssemul_avx256_pd_load): Ditto.
> >         (znver1_sseimul): Ditto.
> >         (znver1_sseimul_avx256): Ditto.
> >         (znver1_sseimul_load): Ditto.
> >         (znver1_sseimul_avx256_load): Ditto.
> >         (znver1_sseimul_di): Ditto.
> >         (znver1_sseimul_load_di): Ditto.
> > ---
> >  gcc/config/i386/znver.md | 40 ++++++++++++++++++++--------------------
> >  1 file changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/gcc/config/i386/znver.md b/gcc/config/i386/znver.md
> > index c52f8b532..882f250f1 100644
> > --- a/gcc/config/i386/znver.md
> > +++ b/gcc/config/i386/znver.md
> > @@ -573,13 +573,13 @@ (define_insn_reservation "znver1_fp_op_mul" 5
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> >                                (and (eq_attr "type" "fop,fmul")
> >                                     (eq_attr "memory" "none")))
> > -                         "znver1-direct,znver1-fp0*5")
> > +                         "znver1-direct,znver1-fp0")
> >
> >  (define_insn_reservation "znver1_fp_op_mul_load" 12
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> >                                (and (eq_attr "type" "fop,fmul")
> >                                     (eq_attr "memory" "load")))
> > -                         "znver1-direct,znver1-load,znver1-fp0*5")
> > +                         "znver1-direct,znver1-load,znver1-fp0")
> >
> >  (define_insn_reservation "znver1_fp_op_imul_load" 16
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> > @@ -684,13 +684,13 @@ (define_insn_reservation "znver1_mmx_mul" 3
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> >                                (and (eq_attr "type" "mmxmul")
> >                                     (eq_attr "memory" "none")))
> > -                         "znver1-direct,znver1-fp0*3")
> > +                         "znver1-direct,znver1-fp0")
> >
> >  (define_insn_reservation "znver1_mmx_load" 10
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> >                                (and (eq_attr "type" "mmxmul")
> >                                     (eq_attr "memory" "load")))
> > -                         "znver1-direct,znver1-load,znver1-fp0*3")
> > +                         "znver1-direct,znver1-load,znver1-fp0")
> >
> >  ;; TODO
> >  (define_insn_reservation "znver1_avx256_log" 1
> > @@ -1161,7 +1161,7 @@ (define_insn_reservation "znver1_ssemul_ss_ps" 3
> >                                          (eq_attr "mode" "V8SF,V4SF,SF,V4DF,V2DF,DF")))
> >                                (and (eq_attr "type" "ssemul")
> >                                     (eq_attr "memory" "none")))
> > -                         "znver1-direct,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-direct,znver1-fp0|znver1-fp1")
> >
> >  (define_insn_reservation "znver1_ssemul_ss_ps_load" 10
> >                           (and (ior (and (eq_attr "cpu" "znver1")
> > @@ -1172,47 +1172,47 @@ (define_insn_reservation "znver1_ssemul_ss_ps_load" 10
> >                                          (eq_attr "mode" "V8SF,V4SF,SF")))
> >                                (and (eq_attr "type" "ssemul")
> >                                     (eq_attr "memory" "load")))
> > -                         "znver1-direct,znver1-load,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-direct,znver1-load,znver1-fp0|znver1-fp1")
> >
> >  (define_insn_reservation "znver1_ssemul_avx256_ps" 3
> >                           (and (eq_attr "cpu" "znver1")
> >                                (and (eq_attr "mode" "V8SF")
> >                                     (and (eq_attr "type" "ssemul")
> >                                          (eq_attr "memory" "none"))))
> > -                         "znver1-double,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-double,znver1-fp0*2|znver1-fp1*2")
> >
> >  (define_insn_reservation "znver1_ssemul_avx256_ps_load" 10
> >                           (and (eq_attr "cpu" "znver1")
> >                                (and (eq_attr "mode" "V8SF")
> >                                     (and (eq_attr "type" "ssemul")
> >                                          (eq_attr "memory" "load"))))
> > -                         "znver1-double,znver1-load,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-double,znver1-load,znver1-fp0*2|znver1-fp1*2")
> >
> >  (define_insn_reservation "znver1_ssemul_sd_pd" 4
> >                           (and (eq_attr "cpu" "znver1")
> >                                (and (eq_attr "mode" "V2DF,DF")
> >                                     (and (eq_attr "type" "ssemul")
> >                                          (eq_attr "memory" "none"))))
> > -                         "znver1-direct,(znver1-fp0|znver1-fp1)*4")
> > +                         "znver1-direct,znver1-fp0|znver1-fp1")
> >
> >  (define_insn_reservation "znver1_ssemul_sd_pd_load" 11
> >                           (and (eq_attr "cpu" "znver1")
> >                                (and (eq_attr "mode" "V2DF,DF")
> >                                     (and (eq_attr "type" "ssemul")
> >                                          (eq_attr "memory" "load"))))
> > -                         "znver1-direct,znver1-load,(znver1-fp0|znver1-fp1)*4")
> > +                         "znver1-direct,znver1-load,znver1-fp0|znver1-fp1")
> >
> >  (define_insn_reservation "znver2_ssemul_sd_pd" 3
> >                           (and (eq_attr "cpu" "znver2,znver3")
> >                                (and (eq_attr "type" "ssemul")
> >                                     (eq_attr "memory" "none")))
> > -                         "znver1-direct,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-direct,znver1-fp0|znver1-fp1")
> >
> >  (define_insn_reservation "znver2_ssemul_sd_pd_load" 10
> >                           (and (eq_attr "cpu" "znver2,znver3")
> >                                (and (eq_attr "type" "ssemul")
> >                                     (eq_attr "memory" "load")))
> > -                         "znver1-direct,znver1-load,(znver1-fp0|znver1-fp1)*3")
> > +                         "znver1-direct,znver1-load,znver1-fp0|znver1-fp1")
> >
> >
> >  (define_insn_reservation "znver1_ssemul_avx256_pd" 5
> > @@ -1220,14 +1220,14 @@ (define_insn_reservation "znver1_ssemul_avx256_pd" 5
> >                                (and (eq_attr "mode" "V4DF")
> >                                     (and (eq_attr "type" "ssemul")
> >                                          (eq_attr "memory" "none"))))
> > -                         "znver1-double,(znver1-fp0|znver1-fp1)*4")
> > +                         "znver1-double,znver1-fp0*2|znver1-fp1*2")
>
> Do we need to include "znver1" check here?
>

If people use nonsensical combinations like -mtune=znver1 -march=znver2 this
may help a bit. I do it from time to time to see differences between pipeline
models, but it is not too important.

> >
> >  (define_insn_reservation "znver1_sseimul_avx256" 4
> >                           (and (eq_attr "cpu" "znver1,znver2,znver3")
> >                                (and (eq_attr "mode" "OI")
> >                                     (and (eq_attr "type" "sseimul")
> >                                          (eq_attr "memory" "none"))))
> > -                         "znver1-double,znver1-fp0*4")
> > +                         "znver1-double,znver1-fp0*2")
>
> znver1 native path is 128 and znver2/3 has 256 bit paths.
> We need to split this into two reservations. One for znver1 and the other
> for znver2/3.
>

Isn't it znver2 for 128 and znver3 for 256?

> The patch looks good.
>

Patch is OK then :)
Thanks a lot!
Honza

>
> Regards,
> Venkat.
>