Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Mon, 29 Jan 2018 12:37:00 -0000
From: James Greenhalgh
To: Kyrill Tkachov
Subject: Re: [AArch64] Fix sve/extract_[12].c for big-endian SVE
Message-ID: <20180129104816.GD6406@arm.com>
References: <87d11wpx0r.fsf@linaro.org> <5A6B345C.9040105@foss.arm.com> <87po5wmz0x.fsf@linaro.org>
In-Reply-To: <87po5wmz0x.fsf@linaro.org>

On Fri, Jan 26, 2018 at 03:15:58PM +0000, Richard Sandiford wrote:
> Kyrill Tkachov writes:
> > On 26/01/18 13:31, Richard Sandiford wrote:
> >> sve/extract_[12].c were relying on the target-independent optimisation
> >> that removes a redundant vec_select, so that we don't end up with
> >> things like:
> >>
> >>     dup v0.4s, v0.4s[0]
> >>     ...use s0...
> >>
> >> But that optimisation rightly doesn't trigger for big-endian targets,
> >> because GCC expects lane 0 to be in the high part of the register
> >> rather than the low part.
> >>
> >> SVE breaks this assumption -- see the comment at the head of
> >> aarch64-sve.md for details -- so the optimisation is valid for
> >> both endiannesses. Long term, we probably need some kind of target
> >> hook to make GCC aware of this.

This explanation is scary - it implies there might be more surprises
waiting for us.

> >>
> >> But there's another problem with the current extract pattern: it doesn't
> >> tell the register allocator how cheap an extraction of lane 0 is with
> >> tied registers. It seems better to split the lane 0 case out into
> >> its own pattern and use tied operands for the FPR<-SIMD case,
> >> so that using different registers has the cost of an extra reload.
> >> I think we want this for both endiannesses, regardless of the hook
> >> described above.
> >>
> >> Also, the gen_lowpart in this pattern fails for aarch64_be due to
> >> TARGET_CAN_CHANGE_MODE_CLASS restrictions, so the patch uses gen_rtx_REG
> >> instead. We're only creating this rtl in order to print it, so there's
> >> no need for anything fancier.
> >>
> >> Tested on aarch64_be-elf and aarch64-linux-gnu. OK to install?

OK.

Thanks,
James
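
For anyone following the lane-numbering point in the quoted explanation, here is a minimal sketch of the idea in Python. It is only a toy model, not GCC code: the helper names `arch_lane` and `lane0_is_low_part` and the `lane0_in_high_part` flag are made up for illustration. It assumes, as the quoted text says, that for big-endian targets GCC normally expects lane 0 in the high part of the register, while SVE keeps lane 0 in the low part for both endiannesses.

```python
def arch_lane(gcc_lane: int, nunits: int, lane0_in_high_part: bool) -> int:
    """Map a GCC lane number to the register position that holds it.

    Position 0 is the low part of the register. `lane0_in_high_part` is a
    hypothetical flag standing in for "big-endian, non-SVE" in this model.
    """
    if not 0 <= gcc_lane < nunits:
        raise ValueError("lane out of range")
    # When lane 0 sits in the high part, the numbering is reversed.
    return nunits - 1 - gcc_lane if lane0_in_high_part else gcc_lane


def lane0_is_low_part(nunits: int, lane0_in_high_part: bool) -> bool:
    """True if extracting lane 0 is just a read of the register's low part."""
    return arch_lane(0, nunits, lane0_in_high_part) == 0


if __name__ == "__main__":
    cases = [
        ("little-endian, 4 lanes", False),
        ("big-endian, lane 0 in high part, 4 lanes", True),
        ("SVE (either endianness), 4 lanes", False),
    ]
    for name, high in cases:
        print(f"{name}: lane 0 is low part = {lane0_is_low_part(4, high)}")
```

In this model the redundant-extract removal is only safe when `lane0_is_low_part` holds, which is the reason the quoted text gives for the optimisation not triggering on big-endian targets and for it still being valid with SVE.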