From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kumar, Venkataramanan"
To: Alexander Monakov, Jan Hubička
CC: Jakub Jelinek, Richard Biener, "Joshi, Tejas Sanjay", "gcc-patches@gcc.gnu.org"
Subject: RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen4 CPU
Date: Wed, 26 Oct 2022 18:07:12 +0000
References: <4549f27b-238a-7d77-f72b-cc77df8ae36e@ispras.ru>
In-Reply-To: <4549f27b-238a-7d77-f72b-cc77df8ae36e@ispras.ru>

[AMD Official Use Only - General]

Hi Alexander,

Thank you for looking into this issue.

> -----Original Message-----
> From: Alexander Monakov
> Sent: Tuesday, October 25, 2022 12:18 AM
> To: Jan Hubička
> Cc: Kumar, Venkataramanan; Jakub Jelinek; Richard Biener;
> Joshi, Tejas Sanjay; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [X86_64]: Enable support for next generation AMD
> Zen4 CPU
>
> On Mon, 24 Oct 2022, Jan Hubička wrote:
>
> > > By the way, it appears pre-existing znver[123] models are also
> > > causing some kind of combinatorial blow-up, but before znver4 it was
> > > not a blocking issue:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832
> >
> > It is really easy to make the DFA size grow if there are possibly many
> > instructions in the pipeline (as every possible state of a modelled
> > pipeline needs to be a new state of the automaton). This is
> > essentially depth_of_pipeline * number_of_units with additional states
> > to represent special instructions and this naturally keeps growing.
> >
> > We could try to break the FP automata into multiple ones, but there
> > are instructions that can go down any pipe, which makes this hard, or we
> > can try to reduce the number of different reservation types (possibly by
> > breaking the automaton into znver1-3 and 4 or so).
> > With the znver2 model I experimented with a broken-up version and a
> > common one and ended up with a smaller binary for the combined one.
>
> Looking at znver1.md again, I think the problem is caused by incorrect
> modeling of division instructions: they have descriptions like
>
> (define_insn_reservation "znver1_idiv_DI" 41
>                          (and (eq_attr "cpu" "znver1,znver2")
>                               (and (eq_attr "type" "idiv")
>                                    (and (eq_attr "mode" "DI")
>                                         (eq_attr "memory" "none"))))
>                          "znver1-double,znver1-ieu2*41")
>
> which says that DImode idiv has latency 41 (which is correct) and that it
> occupies 2nd integer execution unit for 41 consecutive cycles, but that is
> not correct:

Yes, you are correct. It does not block the 2nd integer execution pipe for 41 consecutive cycles.

>
> 1) the division instruction is partially pipelined, and has throughput 1/14

The "Div" unit takes one instruction, and in the worst case the latency will be 41 cycles on znver1/2.
But I agree that we can use the best-case latency of 14 cycles in the scheduler model for znver1/2.

>
> 2) for the most part it occupies a separate division unit, not the general
> arithmetic unit.

Agreed.

>
> (incidentally, I think the blowup is caused by interaction of such super-long
> 41-cycle paths with the rest of reservations)
>
> I think we should fix this by modeling the separate division unit properly,
> and fixing reservations to use the measured reciprocal throughput of those
> instructions (available from uops.info). The following patch does that for
> integer divisions and completely eliminates the integer part of the problem;
> the issue with floating-point divisions remains.
>
> Top 5 znver table sizes, before:
>
> 68692 r znver1_ieu_check
> 68692 r znver1_ieu_transitions
> 99792 r znver1_ieu_min_issue_delay
> 428108 r znver1_fp_min_issue_delay
> 856216 r znver1_fp_transitions
>
> After:
>
> 1454 r znver1_ieu_translate
> 1454 r znver1_translate
> 2304 r znver1_ieu_transitions
> 428108 r znver1_fp_min_issue_delay
> 856216 r znver1_fp_transitions
>
> Will you help getting this reviewed for trunk?
>
> diff --git a/gcc/config/i386/znver1.md b/gcc/config/i386/znver1.md
> index 9c25b4e27..39b59343d 100644
> --- a/gcc/config/i386/znver1.md
> +++ b/gcc/config/i386/znver1.md
> @@ -24,7 +24,7 @@
>  ;; AMD znver1, znver2 and znver3 Scheduling
>  ;; Modeling automatons for zen decoders, integer execution pipes,
>  ;; AGU pipes and floating point execution units.
> -(define_automaton "znver1, znver1_ieu, znver1_fp, znver1_agu")
> +(define_automaton "znver1, znver1_ieu, znver1_fp, znver1_agu, znver1_idiv")
>
>  ;; Decoders unit has 4 decoders and all of them can decode fast path
>  ;; and vector type instructions.
> @@ -50,6 +50,7 @@
>  (define_cpu_unit "znver1-ieu1" "znver1_ieu")
>  (define_cpu_unit "znver1-ieu2" "znver1_ieu")
>  (define_cpu_unit "znver1-ieu3" "znver1_ieu")
> +(define_cpu_unit "znver1-idiv" "znver1_idiv")
>  (define_reservation "znver1-ieu" "znver1-ieu0|znver1-ieu1|znver1-ieu2|znver1-ieu3")
>
>  ;; 2 AGU pipes in znver1 and 3 AGU pipes in znver2 and znver3
> @@ -176,28 +177,28 @@
>                               (and (eq_attr "type" "idiv")
>                                    (and (eq_attr "mode" "DI")
>                                         (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*41")
> +                         "znver1-double,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_SI" 25
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "SI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*25")
> +                         "znver1-double,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_HI" 17
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "HI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*17")
> +                         "znver1-double,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_QI" 12
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "QI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-direct,znver1-ieu2*12")
> +                         "znver1-direct,znver1-idiv*13")
>
>  ;; Mem operands
>  (define_insn_reservation "znver1_idiv_mem_DI" 45
> @@ -205,84 +206,84 @@
>                               (and (eq_attr "type" "idiv")
>                                    (and (eq_attr "mode" "DI")
>                                         (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*41")
> +                         "znver1-double,znver1-load,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_mem_SI" 29
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "SI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*25")
> +                         "znver1-double,znver1-load,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_mem_HI" 21
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "HI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*17")
> +                         "znver1-double,znver1-load,znver1-idiv*14")
>
>  (define_insn_reservation "znver1_idiv_mem_QI" 16
>                           (and (eq_attr "cpu" "znver1,znver2")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "QI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-direct,znver1-load,znver1-ieu2*12")
> +                         "znver1-direct,znver1-load,znver1-idiv*13")
>
>  (define_insn_reservation "znver3_idiv_DI" 18
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "DI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*18")
> +                         "znver1-double,znver1-idiv*7")
>
>  (define_insn_reservation "znver3_idiv_SI" 12
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "SI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*12")
> +                         "znver1-double,znver1-idiv*6")
>
>  (define_insn_reservation "znver3_idiv_HI" 10
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "HI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-double,znver1-ieu2*10")
> +                         "znver1-double,znver1-idiv*4")
>
>  (define_insn_reservation "znver3_idiv_QI" 9
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "QI")
>                                          (eq_attr "memory" "none"))))
> -                         "znver1-direct,znver1-ieu2*9")
> +                         "znver1-direct,znver1-idiv*4")
>
>  (define_insn_reservation "znver3_idiv_mem_DI" 22
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "DI")
>                                          (eq_attr "memory" "load"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*22")
> +                         "znver1-double,znver1-load,znver1-idiv*7")
>
>  (define_insn_reservation "znver3_idiv_mem_SI" 16
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "SI")
>                                          (eq_attr "memory" "load"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*16")
> +                         "znver1-double,znver1-load,znver1-idiv*6")
>
>  (define_insn_reservation "znver3_idiv_mem_HI" 14
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "HI")
>                                          (eq_attr "memory" "load"))))
> -                         "znver1-double,znver1-load,znver1-ieu2*10")
> +                         "znver1-double,znver1-load,znver1-idiv*4")
>
>  (define_insn_reservation "znver3_idiv_mem_QI" 13
>                           (and (eq_attr "cpu" "znver3")
>                                (and (eq_attr "type" "idiv")
>                                     (and (eq_attr "mode" "QI")
>                                          (eq_attr "memory" "load"))))
> -                         "znver1-direct,znver1-load,znver1-ieu2*9")
> +                         "znver1-direct,znver1-load,znver1-idiv*4")
>
>  ;; STR ISHIFT which are micro coded.
>  ;; Fix me: Latency need to be rechecked.

The changes look good, but we will do some quick benchmarking with your patch and update you.

Regards,
Venkat.