From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-479680-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 92817 invoked by alias); 8 Mar 2015 10:24:27 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 92755 invoked by uid 48); 8 Mar 2015 10:24:20 -0000
From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/64785] [5 Regression][SH] and|or|xor #imm not used
Date: Sun, 08 Mar 2015 10:24:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: olegendo at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P4
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID: <bug-64785-4-F732MwhaXZ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-64785-4@http.gcc.gnu.org/bugzilla/>
References: <bug-64785-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-03/txt/msg00824.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64785
--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Kazumoto Kojima from comment #6)
> 
> I like your pre-RA pass even if it's a too big hammer for
> this specific problem.  It should wait the next stage1, though.

It seems that this PR's issue is not a frequent use case (no hits in CSiBE at
all).  So yes, stage1 is good.


> Also it would be better to look for another use cases of that
> pass as you suggested so as to justify the cost of scanning
> all insns.

Some use cases for the pre-RA pass:
- R0 pre-allocation

- reduction of the number of pseudos and reg-reg copies
  some passes leave pseudos and copies which can be removed
  to make the RA task easier.

- 2 operand / commutative operands optimization
  on SH the dest operand is always one of the source operands.
  I've seen several times that the generic RA makes not-so-good
  choices which result in more live regs and unnecessary reg-reg
  copies.  Very often output operands are put in different pseudos
  than the input operands before RA and RA has to fix this somehow.

- the last time I played with the fipr insn (PR 55295) RA had trouble
  allocating FV regs.  For example:
     void func (float a, float b, float c, float d)
  would not allocate (a,b,c,d) to FV4, although the operands are already
  in the appropriate FR regs.  It resulted in many unnecessary reg-reg
  copies.  I haven't tried this with LRA though.
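
To make the 2 operand point above a bit more concrete, here is a toy cost
model in C.  It is only a sketch, not GCC code: "struct op3" and
"copies_needed" are hypothetical names made up for this illustration, and
the model assumes that a source operand must stay live unless it already
shares a register with the destination.

```c
/* Toy model, not GCC code.  All names here are hypothetical.  */

/* A three-address operation "dst = src1 OP src2" on pseudo numbers.  */
struct op3
{
  int dst, src1, src2;
  int commutative;   /* nonzero if src1 and src2 may be swapped */
};

/* Number of reg-reg copies needed to express the operation in
   two-operand form "dst OP= src", where the destination is always
   also the first source.  */
static int
copies_needed (struct op3 o)
{
  if (o.dst == o.src1)
    return 0;   /* already tied: dst OP= src2 */

  if (o.dst == o.src2)
    /* Commutative: swap the operands, dst OP= src1.
       Otherwise: tmp = src1; tmp OP= dst; dst = tmp.  */
    return o.commutative ? 0 : 2;

  return 1;     /* dst = src1; dst OP= src2 */
}
```

In this model, an operation whose output sits in a pseudo different from
both inputs always costs one copy, which is the situation described above
where RA has to fix things up afterwards.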


There are some more things which I'd do before RA:

- Forming SH2A movu.{b|w} insns (PR 64792)
- Various constant optimizations (PR 63390, PR 51708, PR 54682, PR 65069)
- 64 bit FP load/store fusion (PR 64305)

It would be possible to write one huge pre-RA RTL pass to do all of that.
However, I'd like to avoid accidents such as reload.c and rather keep things
separated as much as possible.  I don't have evidence, but I don't think that
scanning all insns is too bad.  It's being done multiple times during
compilation and there are other places which could be optimized.  For example,
as far as I know, the split4 and split5 passes are not needed on SH and could be
disabled.  Or maybe the conditions in define_split such as
"can_create_pseudo_p ()" should be evaluated *before* all insns are
scanned/recog'ed.
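
A rough sketch of that last idea, as a toy model in C.  None of this is GCC
code: "toy_split", "toy_is_even" and "toy_count_match_attempts" are
hypothetical names, and a plain flag stands in for a pass-invariant
condition.  The point is only that a condition which cannot change during
the pass can be checked once per pattern, so that a disabled pattern never
causes any insn to be looked at.

```c
#include <stddef.h>

/* Toy model, not GCC code.  All names here are hypothetical.  */
struct toy_split
{
  int enabled_in_this_pass;    /* stands in for a pass-invariant condition */
  int (*matches) (int insn);   /* per-insn pattern test */
};

/* Example per-insn test used only for illustration.  */
static int
toy_is_even (int insn)
{
  return insn % 2 == 0;
}

/* Returns how many per-insn match attempts were made.  */
static size_t
toy_count_match_attempts (const struct toy_split *splits, size_t n_splits,
                          const int *insns, size_t n_insns)
{
  size_t attempts = 0;

  for (size_t s = 0; s < n_splits; s++)
    {
      /* Check the pass-invariant condition once, before any insn
         is scanned for this pattern.  */
      if (!splits[s].enabled_in_this_pass)
        continue;

      for (size_t i = 0; i < n_insns; i++)
        {
          attempts++;
          (void) splits[s].matches (insns[i]);
        }
    }

  return attempts;
}
```

If every pattern is disabled for a given pass, the attempt count is zero
and the insn list is never walked.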