From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31599 invoked by alias); 15 Jan 2013 14:31:31 -0000
Received: (qmail 30808 invoked by uid 48); 15 Jan 2013 14:31:06 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/15184] [4.6/4.7/4.8 Regression] Direct access to byte inside word not working with -march=pentiumpro
Date: Tue, 15 Jan 2013 14:31:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: minor
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.6.4
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2013-01/txt/msg01352.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15184

--- Comment #24 from Richard Biener 2013-01-15 14:31:02 UTC ---
We fail to fold

  (short unsigned int) ( X | (signed short) Y )

to

  (short unsigned int ) X | (short unsigned int) Y

and thus end up with needlessly many conversions on the tree level.

Index: gcc/convert.c
===================================================================
--- gcc/convert.c (revision 195194)
+++ gcc/convert.c (working copy)
@@ -750,7 +750,7 @@ convert_to_integer (tree type, tree expr
                             || ex_form == MULT_EXPR)))
                  typex = unsigned_type_for (typex);
                else
-                 typex = signed_type_for (typex);
+                 ;
                return convert (type,
                                fold_build2 (ex_form, typex,
                                             convert (typex, arg0),

"fixes" that (but has no effect on the resulting assembly).
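
The fold described above can be sanity-checked at the source level with a small C sketch. This is only an illustration, not the testcase from the bug: it assumes X is a `short` and Y is an `unsigned char`, and all function names are hypothetical. It compares "OR, then narrow" against "narrow each operand, then OR" over every input pair.

```c
#include <assert.h>
#include <limits.h>

/* Unfolded form: OR with a signed-short conversion of Y, then narrow.
   Assumption: X is a short and Y an unsigned char (not from the bug). */
unsigned short or_then_narrow(short x, unsigned char y)
{
    return (unsigned short)(x | (signed short)y);
}

/* Folded form: convert each operand to unsigned short first, then OR.
   The outer cast only undoes C's integer promotion of the operands. */
unsigned short narrow_then_or(short x, unsigned char y)
{
    return (unsigned short)((unsigned short)x | (unsigned short)y);
}

/* Count input pairs on which the two forms disagree. */
long count_fold_mismatches(void)
{
    long mismatches = 0;
    for (int x = SHRT_MIN; x <= SHRT_MAX; x++)
        for (int y = 0; y <= UCHAR_MAX; y++)
            if (or_then_narrow((short)x, (unsigned char)y)
                != narrow_then_or((short)x, (unsigned char)y))
                mismatches++;
    return mismatches;
}

/* Aborts if the two forms ever differ. */
void assert_fold_is_safe(void)
{
    assert(count_fold_mismatches() == 0);
}
```

Calling `assert_fold_is_safe()` from a test driver exercises all 65536 x 256 input pairs. The two forms agree because truncation to 16 bits commutes with bitwise OR on two's complement values, which is why the intermediate signed conversion is unnecessary.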
Btw, even with generic tuning the generated assembly is bad, which is IMHO the important fact. I don't think combine is the right mechanism to deal with this, as partial register writes are certainly not wanted. Instead it's necessary to see that we can store into memory directly, which would require combine to merge too many things. This is rather to be dealt with on the tree level - the bswap pass infrastructure should provide a starting point for value composition like this.
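
To illustrate what "store into memory directly" means for this kind of value composition, here is a minimal C sketch. It is not the bug's testcase and not code from the bswap pass; the function names are hypothetical, and the shape of the expression (replacing the low byte of a 16-bit word) is an assumption based on the bug title. It shows that the mask-and-OR composition yields the same result as a single byte store into the word's memory image.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value composition in a register: clear the low byte, OR in the new one. */
uint16_t set_low_byte_by_masking(uint16_t word, uint8_t byte)
{
    return (uint16_t)((word & 0xFF00u) | byte);
}

/* Same effect as a one-byte store into the word's memory image.
   The offset of the low byte depends on the host byte order, so it is
   probed at run time instead of being assumed. */
uint16_t set_low_byte_by_store(uint16_t word, uint8_t byte)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    size_t low_offset = first ? 0 : 1;  /* little-endian: low byte first */
    memcpy((unsigned char *)&word + low_offset, &byte, 1);
    return word;
}

/* Count (word, byte) pairs on which the two forms disagree. */
long count_store_mismatches(void)
{
    long mismatches = 0;
    for (unsigned w = 0; w <= 0xFFFFu; w++)
        for (unsigned b = 0; b <= 0xFFu; b++)
            if (set_low_byte_by_masking((uint16_t)w, (uint8_t)b)
                != set_low_byte_by_store((uint16_t)w, (uint8_t)b))
                mismatches++;
    return mismatches;
}

/* Aborts if the byte store ever differs from the masked composition. */
void assert_store_matches_masking(void)
{
    assert(count_store_mismatches() == 0);
}
```

Recognizing the first form as equivalent to the second is the kind of pattern a tree-level pass would have to detect before the single byte store can be emitted.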