From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12232 invoked by alias); 16 May 2014 10:48:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 12222 invoked by uid 89); 16 May 2014 10:48:10 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-wi0-f181.google.com Received: from mail-wi0-f181.google.com (HELO mail-wi0-f181.google.com) (209.85.212.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Fri, 16 May 2014 10:48:08 +0000 Received: by mail-wi0-f181.google.com with SMTP id n15so733608wiw.8 for ; Fri, 16 May 2014 03:48:05 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.180.78.5 with SMTP id x5mr35991015wiw.12.1400237285834; Fri, 16 May 2014 03:48:05 -0700 (PDT) Received: by 10.194.119.193 with HTTP; Fri, 16 May 2014 03:48:05 -0700 (PDT) In-Reply-To: <000001cf70ee$9aa2ed90$cfe8c8b0$@arm.com> References: <006f01cf6b71$1cf10df0$56d329d0$@arm.com> <000001cf70ee$9aa2ed90$cfe8c8b0$@arm.com> Date: Fri, 16 May 2014 10:48:00 -0000 Message-ID: Subject: Re: [PATCH] Fix PR54733 Optimize endian independent load/store From: Richard Biener To: "Thomas Preud'homme" Cc: GCC Patches Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2014-05/txt/msg01261.txt.bz2 On Fri, May 16, 2014 at 12:07 PM, Thomas Preud'homme wrote: > Ping? Sorry ... > Best regards, > > Thomas Preud'homme > >> -----Original Message----- >> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- >> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme >> Sent: Friday, May 09, 2014 6:26 PM >> To: GCC Patches >> Subject: RE: [PATCH] Fix PR54733 Optimize endian independent load/store >> >> Sorry, took longer than expected as I got distracted by some other patch. >> I merged the whole patchset in a single patch as I was told the current setup >> is actually more difficult to read. >> >> Here are the updated ChangeLogs: >> >> *** gcc/ChangeLog *** >> >> 2014-05-09 Thomas Preud'homme >> >> PR tree-optimization/54733 >> * expr.c (get_inner_reference): Add a parameter to control whether >> a >> MEM_REF should be split into base + offset. >> * tree.h (get_inner_reference): Default new parameter to false. >> * tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure. >> (CMPNOP): Define. >> (find_bswap_or_nop_load): New. >> (find_bswap_1): Renamed to ... >> (find_bswap_or_nop_1): This. Also add support for memory source. >> (find_bswap): Renamed to ... >> (find_bswap_or_nop): This. Also add support for memory source and >> detection of bitwise operations equivalent to load in host endianness. >> (execute_optimize_bswap): Likewise. Also move its leading >> comment back >> in place and split statement transformation into ... >> (bswap_replace): This. Add assert when updating bswap_stats. >> >> *** gcc/testsuite/ChangeLog *** >> >> 2014-05-09 Thomas Preud'homme >> >> PR tree-optimization/54733 >> * gcc.dg/optimize-bswapdi-3.c: New test to check extension of >> bswap >> optimization to support memory sources and bitwise operations >> equivalent to load in host endianness. >> * gcc.dg/optimize-bswaphi-1.c: Likewise. >> * gcc.dg/optimize-bswapsi-2.c: Likewise. >> * gcc.c-torture/execute/bswap-2.c: Likewise. >> >> Ok for trunk? Ok, I now decided otherwise and dislike the new parameter to get_inner_reference. Can you please revert that part and just deal with a MEM_REF result in your only caller? And (of course) I found another possible issue. The way you compute load_type and use it here: + /* Perform the load. */ + load_offset_ptr = build_int_cst (n->alias_set, 0); + val_expr = fold_build2 (MEM_REF, load_type, addr_tmp, + load_offset_ptr); makes the load always appear aligned according to the mode of load_type. On strict-alignment targets this may cause faults. So what you have to do is either (simpler) unsigned int align = get_pointer_alignment (addr_tmp); tree al_load_type = load_type; if (align < TYPE_ALIGN (load_type)) al_load_type = build_aligned_type (load_type, align); ... val_expr = fold_build2 (MEM_REF, al_load_type, addr_tmp, load_offset_ptr); or keep track of the "first" actual load and use unsigned int align = get_object_alignment (that_first_load); "first" in the one that corresponds to addr_tmp. From that there is a much better chance to derive good alignment values. Of course on STRICT_ALIGNMENT targets a not aligned load will be decomposed again, so eventually doing the transformation may no longer be profitable(?). Thanks and sorry again for the delay. Otherwise the patch looks good to me. Richard. >> Best regards, >> >> Thomas >> >> > -----Original Message----- >> > From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- >> > owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme >> > Sent: Monday, May 05, 2014 7:30 PM >> > To: GCC Patches >> > Subject: RE: [PATCH][2/3] Fix PR54733 Optimize endian independent >> > load/store >> > >> > I found a way to improve the function find_bswap/find_bswap_or_nop >> > and reduce its size. Please hold for the review, I will post an updated >> > version as soon as I finish testing. >> > >> > Best regards, >> > >> > Thomas Preud'homme >> > >> > >> > > >