From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 15 Jun 2023 08:03:11 +0200
From: Jan Beulich
To: "gcc-patches@gcc.gnu.org"
Cc: Hongtao Liu, Kirill Yukhin
Subject: [PATCH] x86: correct and improve "*vec_dupv2di"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/

	* config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
	constraint.  Add new AVX512F alternative.
---
Strictly speaking the new alternative could be enabled from AVX2
onwards, but vmovddup can frequently be a shorter encoding (VEX2 vs
VEX3).
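
As a reading aid, the enabling condition and operand width of the new vpbroadcastq alternative can be modeled in plain C. This is only a sketch: `struct target_flags` and both function names are hypothetical stand-ins for GCC's TARGET_AVX512F, TARGET_AVX512VL and TARGET_PREFER_AVX256 macros, and nothing below is part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's TARGET_AVX512F, TARGET_AVX512VL and
   TARGET_PREFER_AVX256; in GCC these are macros, not struct fields.  */
struct target_flags
{
  bool avx512f;
  bool avx512vl;
  bool prefer_avx256;
};

/* Same boolean expression as the "enabled" attribute that the patch
   attaches to alternative 2.  */
static bool
broadcast_alt_enabled (struct target_flags t)
{
  return t.avx512vl || (t.avx512f && !t.prefer_avx256);
}

/* Destination width, in bits, of the vpbroadcastq emitted for
   alternative 2: %0 (XMM) with AVX512VL, %g0 (ZMM) otherwise.  */
static int
broadcast_dest_bits (struct target_flags t)
{
  assert (broadcast_alt_enabled (t));
  return t.avx512vl ? 128 : 512;
}
```

With only `avx512f` set, the alternative is enabled and the broadcast writes the full ZMM register; additionally setting `prefer_avx256` disables the alternative unless `avx512vl` is also set.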
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -25851,19 +25851,39 @@
        (symbol_ref "true")))])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,v,x")
        (vec_duplicate:V2DI
-         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
+         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
   "TARGET_SSE"
-  "@
-   punpcklqdq\t%0, %0
-   vpunpcklqdq\t{%d1, %0|%0, %d1}
-   %vmovddup\t{%1, %0|%0, %1}
-   movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "punpcklqdq\t%0, %0";
+    case 1:
+      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
+    case 2:
+      if (TARGET_AVX512VL)
+        return "vpbroadcastq\t{%1, %0|%0, %1}";
+      return "vpbroadcastq\t{%1, %g0|%g0, %1}";
+    case 3:
+      return "%vmovddup\t{%1, %0|%0, %1}";
+    case 4:
+      return "movlhps\t%0, %0";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
+   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
+   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
+   (set_attr "mode" "TI,TI,TI,DF,V4SF")
+   (set (attr "enabled")
+        (if_then_else
+          (eq_attr "alternative" "2")
+          (symbol_ref "TARGET_AVX512VL
+                       || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
+          (const_string "*")))])
 
 (define_insn "avx2_vbroadcasti128_"
   [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")