From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Tue, 12 Sep 2017 16:28:00 -0000
From: James Greenhalgh
To: Jackson Woodruff
CC: Wilco Dijkstra, GCC Patches, "richard.sandiford@linaro.org", nd, Richard Earnshaw
Subject: Re: [AArch64, PATCH] Improve Neon store of zero
Message-ID: <20170912162806.GB33912@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
User-Agent: Mutt/1.5.21 (2010-09-15)
X-SW-Source: 2017-09/txt/msg00724.txt.bz2

On Wed, Sep 06, 2017 at 10:02:52AM +0100, Jackson Woodruff wrote:
> Hi all,
>
> I've attached a new patch that addresses some of the issues raised with
> my original patch.
>
> On 08/23/2017 03:35 PM, Wilco Dijkstra wrote:
> > Richard Sandiford wrote:
> >>
> >> Sorry for only noticing now, but the call to aarch64_legitimate_address_p
> >> is asking whether the MEM itself is a legitimate LDP/STP address. Also,
> >> it might be better to pass false for strict_p, since this can be called
> >> before RA.
> >> So maybe:
> >>
> >>   if (GET_CODE (operands[0]) == MEM
> >>       && !(aarch64_simd_imm_zero (operands[1], mode)
> >>            && aarch64_mem_pair_operand (operands[0], mode)))
>
> There were also some issues with the choice of mode for the call to
> aarch64_mem_pair_operand.
>
> For a 128-bit wide mode, we want to check `aarch64_mem_pair_operand
> (operands[0], DImode)` since that's what the stp will be.
>
> For a 64-bit wide mode, we don't need to do that check because a normal
> `str` can be issued.
>
> I've updated the condition as such.
>
> >
> > Is there any reason for doing this check at all (or at least this early during
> > expand)?
>
> Not doing this check means that the zero is forced into a register, so
> we then carry around a bit more RTL and rely on combine to merge things.
>
> >
> > There is a similar issue with this part:
> >
> > (define_insn "*aarch64_simd_mov"
> >   [(set (match_operand:VQ 0 "nonimmediate_operand"
> > -        "=w, m, w, ?r, ?w, ?r, w")
> > +        "=w, Ump, m, w, ?r, ?w, ?r, w")
> >
> > The Ump causes the instruction to always split off the address offset. Ump
> > cannot be used in patterns that are generated before register allocation as it
> > also calls aarch64_legitimate_address_p with strict_p set to true.
>
> I've changed the constraint to a new constraint 'Umq', that acts the
> same as Ump, but calls aarch64_legitimate_address_p with strict_p set to
> false and uses DImode for the mode to pass.

This looks mostly OK to me, but this conditional:

> +  if (GET_CODE (operands[0]) == MEM
> +      && !(aarch64_simd_imm_zero (operands[1], mode)
> +           && ((GET_MODE_SIZE (mode) == 16
> +                && aarch64_mem_pair_operand (operands[0], DImode))
> +               || GET_MODE_SIZE (mode) == 8)))

has grown a bit too big in such a general pattern to live without a comment
explaining what is going on.

> +(define_memory_constraint "Umq"
> +  "@internal
> +   A memory address which uses a base register with an offset small enough for
> +   a load/store pair operation in DI mode."
> +   (and (match_code "mem")
> +        (match_test "aarch64_legitimate_address_p (DImode, XEXP (op, 0),
> +                                                   PARALLEL, 0)")))

And here you want 'false' rather than '0'.

I'll happily merge the patch with those changes; please send an update.

Thanks,
James

>
> ChangeLog:
>
> gcc/
>
> 2017-08-29 Jackson Woodruff
>
> * config/aarch64/constraints.md (Umq): New constraint.
> * config/aarch64/aarch64-simd.md (*aarch64_simd_mov):
> Change to use Umq.
> (mov): Update condition.
>
> gcc/testsuite
>
> 2017-08-29 Jackson Woodruff
>
> * gcc.target/aarch64/simd/vect_str_zero.c:
> Update testcase.
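
For reference, the logic of the conditional discussed above can be modelled
outside GCC as a small C sketch, with the kind of comment the review asks
for. This is only an illustration under stated assumptions: the struct and
function names are hypothetical, and each boolean field stands in for the
result of the corresponding GCC check named in its comment rather than
calling any real GCC interface.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the quoted condition.  None of these
   names exist in GCC; each field stands in for one check from the patch.  */
struct store_query {
  bool dest_is_mem;        /* GET_CODE (operands[0]) == MEM */
  bool src_is_simd_zero;   /* aarch64_simd_imm_zero (operands[1], mode) */
  unsigned mode_size;      /* GET_MODE_SIZE (mode), in bytes */
  bool dest_is_pair_addr;  /* aarch64_mem_pair_operand (operands[0], DImode) */
};

/* Return true when the source of a vector store has to be forced into a
   register.  A store of zero is left alone in two cases:
     - 64-bit modes, where a normal str can be issued;
     - 128-bit modes, but only if the address is valid for a DImode
       store pair, since that is what the stp will be.  */
static bool
must_force_source_into_reg (const struct store_query *q)
{
  if (!q->dest_is_mem)
    return false;

  bool zero_store_ok = q->src_is_simd_zero
    && (q->mode_size == 8
        || (q->mode_size == 16 && q->dest_is_pair_addr));

  return !zero_store_ok;
}
```

Splitting the test into a named intermediate (zero_store_ok here) is one way
to make room for the explanatory comment; the patch itself may well keep a
single expression with a comment above it.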