From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 25716 invoked by alias); 12 May 2013 17:36:46 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 25642 invoked by uid 48); 12 May 2013 17:36:38 -0000
From: "ubizjak at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/55278] [4.8/4.9 Regression] Botan performance regressions, other compilers generate better code than gcc
Date: Sun, 12 May 2013 17:36:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 4.8.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ubizjak at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.8.1
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-05/txt/msg00778.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55278

--- Comment #14 from Uros Bizjak ---
(In reply to Jakub Jelinek from comment #12)
> (force gcc to avoid xorw memory, %hireg and instead use movzwl memory,
> %sireg; ... xorl %sireg, %sireg2) and p2 was something similar for *xorqi_1.
>
> Looking at icc generated assembly, it is interesting to see that the only
> HImode instructions it ever uses are rolw and movw stores, for everything
> else it uses movzwl loads and SImode arithmetics (well, I guess shift right
> shrw/sarw/rorw can't be avoided either). Similarly, icc on the testcase
> doesn't emit any QImode instructions at all, while gcc emits tons of them
> and llvm something in between.
>
> So perhaps this bug is not about LRA, but about instruction selection, and
> when not optimizing for size at least on some CPUs we should consider using
> SImode arithmetics instead of QImode/HImode much more aggressively than we
> do now.
> Not sure if it is better done by (Kai's?) type optimization pass, which
> shortly before expansion using target hints would just try to get rid of as
> many QImode and especially HImode operations as possible, guess we can often
> keep complete garbage in the upper bits, or if it is better done at the *.md
> level.

Please note that it is possible to tune the use of HImode and QImode
arithmetic with X86_TUNE_QIMODE_MATH and X86_TUNE_HIMODE_MATH. Also,
X86_TUNE_PROMOTE_QI_REGS, X86_TUNE_PROMOTE_HI_REGS and possibly
X86_TUNE_PARTIAL_REG_STALL can be used to fine-tune the use of partial
registers.
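
For illustration only, here is a minimal C sketch of why promoting a HImode XOR to SImode is safe even when the upper bits of the register hold garbage. It is not taken from the Botan testcase; the function names (xor16_narrow, xor16_wide, self_check) are hypothetical, and the comments only indicate which instructions each form would roughly correspond to, not what any particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the XOR as written on 16-bit values
   (roughly the "xorw memory, %hireg" form). */
static uint16_t xor16_narrow(const uint16_t *mem, uint16_t reg)
{
    return (uint16_t)(reg ^ *mem);
}

/* Hypothetical helper: the same operation done in 32 bits
   (roughly "movzwl memory, %sireg; xorl %sireg, %sireg2").
   The upper 16 bits of reg32 may contain arbitrary garbage. */
static uint16_t xor16_wide(const uint16_t *mem, uint32_t reg32)
{
    uint32_t loaded = *mem;          /* zero-extending load */
    uint32_t wide = reg32 ^ loaded;  /* full 32-bit arithmetic */
    return (uint16_t)wide;           /* only the low 16 bits are stored */
}

/* Check that both forms agree for every 16-bit memory value and a few
   register values, with the given garbage placed in the upper half. */
static void self_check(uint32_t garbage_upper)
{
    static const uint16_t regs[] = { 0x0000, 0x1234, 0x8000, 0xFFFF };
    for (uint32_t m = 0; m <= 0xFFFFu; m++) {
        uint16_t mem = (uint16_t)m;
        for (unsigned i = 0; i < sizeof regs / sizeof regs[0]; i++) {
            uint32_t reg32 = (garbage_upper << 16) | regs[i];
            assert(xor16_wide(&mem, reg32) == xor16_narrow(&mem, regs[i]));
        }
    }
}
```

The same argument holds for other operations whose low bits do not depend on the high bits of the inputs (add, sub, and, or, left shift). It does not hold for right shifts or rotates, which matches the remark in the quoted text that shrw/sarw/rorw can't be avoided.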