From: "Joshi, Tejas Sanjay"
To: Alexander Monakov, "gcc-patches@gcc.gnu.org"
CC: "honza.hubicka@gmail.com", "Kumar, Venkataramanan"
Subject: RE: [PATCH][X86_64] Separate znver4 insn reservations from older znvers
Date: Tue, 15 Nov 2022 12:08:58 +0000

[Public]

Hi,

Thank you for reviewing the patch.

> Hi.
> I'm still waiting for feedback on fixes for existing models:
> https://inbox.sourceware.org/gcc-patches/5ae6fc21-edc6-133-aee2-a41e16eb5b7@ispras.ru/T/#t
> did you have a chance to look at those?

I have yet to evaluate that patch; I will get back to you soon.

> Why are you modeling 'fdiv' and 'ssediv' separately? When preparing the
> above patches, I checked that x87 and SSE divisions use the same hardware
> unit, and I don't see a strong reason to artificially clone it in the model.

I modelled them separately because they belong to different ISA groups.
But yes, since they execute in the same unit, we can model them in the same
automaton.

> I have a question on AVX512 modeling in your patch:
>
> > +;; AVX instructions
> > +(define_insn_reservation "znver4_sse_log" 1
> > +                         (and (eq_attr "cpu" "znver4")
> > +                              (and (eq_attr "type" "sselog,sselog1")
> > +                                   (and (eq_attr "mode" "V4SF,V8SF,V2DF,V4DF")
> > +                                        (eq_attr "memory" "none"))))
> > +                         "znver4-direct,znver4-fpu")
> > +
> > +(define_insn_reservation "znver4_sse_log_evex" 1
> > +                         (and (eq_attr "cpu" "znver4")
> > +                              (and (eq_attr "type" "sselog,sselog1")
> > +                                   (and (eq_attr "mode" "V16SF,V8DF")
> > +                                        (eq_attr "memory" "none"))))
> > +
> > +"znver4-direct,znver4-fpu0+znver4-fpu1|znver4-fpu2+znver4-fpu3")
> > +
>
> This is an AVX512 instruction, and you're modeling that it occupies two ports
> at once and thus has half throughput, but later in the AVX512 section:
>
> > +;; AVX512 instructions
> > +(define_insn_reservation "znver4_sse_mul_evex" 3
> > +                         (and (eq_attr "cpu" "znver4")
> > +                              (and (eq_attr "type" "ssemul")
> > +                                   (and (eq_attr "mode" "V16SF,V8DF")
> > +                                        (eq_attr "memory" "none"))))
> > +                         "znver4-double,znver4-fpu0|znver4-fpu3")
>
> none of the instructions are modeled this way. If that's on purpose, can you
> add a comment? It's surprising, since generally AVX512 has half throughput
> compared to AVX256 on Zen 4, but the model doesn't seem to reflect that.

> > +"znver4-direct,znver4-fpu0+znver4-fpu1|znver4-fpu2+znver4-fpu3")

AVX512 instructions (512 bits wide) occupy two consecutive cycles in the
pipes in which they execute.
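
As a rough illustration of the difference between reserving two pipes in one
cycle and reserving a pipe over two consecutive cycles, here is a minimal
sketch of the reservation-string operators as described in the GCC internals
manual. The "demo" automaton and all demo-* unit and reservation names are
hypothetical and are not part of the patch.

```
;; Hypothetical automaton and units, for illustration only.
(define_automaton "demo")
(define_cpu_unit "demo-fpu0,demo-fpu1" "demo")

;; "|" picks one alternative: either pipe, for a single cycle.
(define_reservation "demo-fpu" "demo-fpu0|demo-fpu1")

;; "+" reserves both units in the same cycle (two pipes at once).
(define_reservation "demo-two-pipes" "demo-fpu0+demo-fpu1")

;; "," advances to the next cycle; "*2" is shorthand for repeating the
;; expression twice with a cycle advance in between, so the next two
;; reservations describe the same thing.
(define_reservation "demo-two-cycles" "(demo-fpu)*2")
(define_reservation "demo-two-cycles-expanded" "demo-fpu,demo-fpu")
```

In this sketch, demo-two-pipes takes two pipes for a single cycle, whereas
demo-two-cycles reserves a pipe in each of two consecutive cycles.
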
So, it should be modelled as shown below:

(define_insn_reservation "znver4_sse_log_evex" 1
                         (and (eq_attr "cpu" "znver4")
                              (and (eq_attr "type" "sselog")
                                   (and (eq_attr "mode" "V16SF,V8DF,XI")
                                        (eq_attr "memory" "none"))))
                         "znver4-double,(znver4-fpu)*2")

(define_insn_reservation "znver4_sse_mul_evex" 3
                         (and (eq_attr "cpu" "znver4")
                              (and (eq_attr "type" "ssemul")
                                   (and (eq_attr "mode" "V16SF,V8DF")
                                        (eq_attr "memory" "none"))))
                         "znver4-double,(znver4-fpu0|znver4-fpu1)*2")

Doing it this way increased the size of insn-automata.cc from 201402 lines to
212189. I hope that is a tolerable increase; do you have any suggestions?
I will revise all AVX512 instructions and post an updated patch.

Thanks and Regards,
Tejas