From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 16 Jun 2023 08:19:59 +0200
From: Jan Beulich
Subject: [PATCH v2] x86: correct and improve "*vec_dupv2di"
To: "gcc-patches@gcc.gnu.org"
Cc: Hongtao Liu, Kirill Yukhin
Content-Type: text/plain; charset=UTF-8

The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/

	* config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
	constraint. Add new AVX512F alternative.
---
Strictly speaking the new alternative could be enabled from AVX2 onwards,
but vmovddup can frequently be a shorter encoding (VEX2 vs VEX3).

It was suggested that the previously flawed %vmovddup alternative could
use "xm" as source constraint. But then its destination would better
also use "x", I think?
---
v2: Use "* return ..." form. Set "mode" to XI for new alternative without
    AVX512VL.

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -26033,19 +26033,35 @@
                (symbol_ref "true")))])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,x")
+  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,v,x")
         (vec_duplicate:V2DI
-          (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
+          (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
   "TARGET_SSE"
   "@
    punpcklqdq\t%0, %0
    vpunpcklqdq\t{%d1, %0|%0, %d1}
+   * return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" : \"vpbroadcastq\t{%1, %g0|%g0, %1}\";
    %vmovddup\t{%1, %0|%0, %1}
    movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
+  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
+   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
+   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
+   (set (attr "mode")
+        (cond [(and (eq_attr "alternative" "2")
+                    (match_test "!TARGET_AVX512VL"))
+                 (const_string "XI")
+               (eq_attr "alternative" "3")
+                 (const_string "DF")
+               (eq_attr "alternative" "4")
+                 (const_string "V4SF")
+              ]
+              (const_string "TI")))
+   (set (attr "enabled")
+        (if_then_else
+          (eq_attr "alternative" "2")
+          (symbol_ref "TARGET_AVX512VL
+                       || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
+          (const_string "*")))])
 
 (define_insn "avx2_vbroadcasti128_"
   [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")
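
For readers who want to check the new alternative's conditions without building GCC, here is a minimal sketch that models them. It is not GCC code. The `Target` class and the `broadcast_alt_*` helpers are hypothetical names made up for illustration, and the three booleans only stand in for TARGET_AVX512F, TARGET_AVX512VL and TARGET_PREFER_AVX256. The logic mirrors only the "enabled" and "mode" expressions and the "* return ..." template in the hunk above (alternative 2); treating %g0 as the 512-bit form of the destination is an assumption based on the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-ins for GCC's TARGET_* macros (not a GCC API)."""
    avx512f: bool = False
    avx512vl: bool = False
    prefer_avx256: bool = False


def broadcast_alt_enabled(t: Target) -> bool:
    # Mirrors the "enabled" attribute for alternative 2:
    # TARGET_AVX512VL || (TARGET_AVX512F && !TARGET_PREFER_AVX256)
    return t.avx512vl or (t.avx512f and not t.prefer_avx256)


def broadcast_alt_mode(t: Target) -> str:
    # Mirrors the "mode" attribute for alternative 2:
    # XI when AVX512VL is unavailable, otherwise the default TI.
    return "TI" if t.avx512vl else "XI"


def broadcast_alt_dest(t: Target) -> str:
    # Mirrors the "* return ..." template: plain %0 with AVX512VL,
    # %g0 (assumed here to be the full 512-bit register) without it.
    return "%0" if t.avx512vl else "%g0"


if __name__ == "__main__":
    cases = [
        Target(),
        Target(avx512f=True),
        Target(avx512f=True, prefer_avx256=True),
        Target(avx512f=True, avx512vl=True),
        Target(avx512f=True, avx512vl=True, prefer_avx256=True),
    ]
    for t in cases:
        if broadcast_alt_enabled(t):
            print(t, "-> enabled, mode", broadcast_alt_mode(t),
                  "dest", broadcast_alt_dest(t))
        else:
            print(t, "-> disabled")
```

The model makes one consequence of the patch easy to see: with AVX512F but neither AVX512VL nor a 256-bit preference, the alternative is enabled and uses the 512-bit form, whereas preferring 256-bit vectors without AVX512VL disables it.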