From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-395286-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 7468 invoked by alias); 16 Apr 2015 16:55:18 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 7459 invoked by uid 89); 16 Apr 2015 16:55:17 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-lb0-f172.google.com
Received: from mail-lb0-f172.google.com (HELO mail-lb0-f172.google.com) (209.85.217.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Thu, 16 Apr 2015 16:55:16 +0000
Received: by lbbuc2 with SMTP id uc2so63458141lbb.2        for <gcc-patches@gcc.gnu.org>; Thu, 16 Apr 2015 09:55:13 -0700 (PDT)
X-Received: by 10.112.159.162 with SMTP id xd2mr28818001lbb.67.1429203312967; Thu, 16 Apr 2015 09:55:12 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.25.163.129 with HTTP; Thu, 16 Apr 2015 09:54:52 -0700 (PDT)
In-Reply-To: <552FE79E.2040509@gjlay.de>
References: <20150416064347.GA31491@atmel.com> <552F7A8D.1080709@gjlay.de> <20150416092802.GC31491@atmel.com> <552FE79E.2040509@gjlay.de>
From: Denis Chertykov <chertykov@gmail.com>
Date: Thu, 16 Apr 2015 16:55:00 -0000
Message-ID: <CADOs=zaeexQLyST9m0dWtJRkDZKnMbx+ae5eYZ3oFqux7VdVUg@mail.gmail.com>
Subject: Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call
To: Georg-Johann Lay <avr@gjlay.de>
Cc: Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2015-04/txt/msg00826.txt.bz2

2015-04-16 19:47 GMT+03:00 Georg-Johann Lay <avr@gjlay.de>:
>
> Am 04/16/2015 um 11:28 AM schrieb Senthil Kumar Selvaraj:
>>
>> On Thu, Apr 16, 2015 at 11:02:05AM +0200, Georg-Johann Lay wrote:
>>>
>>> Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:
>>>>
>>>> This patch fixes PR 65657.
>>>
>>>
>>> The following artifact appears to be PR63633.
>>>
>>
>> I did see that one - unfortunately, that fix won't help here. IIUC, you
>> check if input/output operand hard regs are in the clobber list,
>> and if yes, you generate pseudos to save and restore clobbered hard
>> regs.
>>
>> In this case, the reg is actually clobbered by a different insn (one
>
>
> Arrgh, yes...
>
>> that loads the next argument to the function). So unless I blindly generate pseudos for
>> all regs in the clobber list, the clobbering will still happen.
>>
>> FWIW, I did try saving/restoring all clobbered regs, and it did fix the
>> problem - just that it appeared like a (worse) hack to me. Aren't we
>> manually replicating what RA/reload should be doing?
>
>
> As it appears, we'll have to do it by hand.  The attaches patch is just a sketch that indicates how the problem could be approached.  Notice the new assertions in the split expanders; they will throw ICE until the fix is actually installed.
>
> The critical insn are generated in movMM expander and are changed to have no clobbers (SCRATCHes actually).  An a later pass, when appropriate life info can be made available, run yet another avr pass that
>
> 1a) Save-restore needed hard regs around the insn.
>
> 2a) Kick out hard regs overlapping the clobbers, e.g. in set_src, into new pseudos.  Maybe that could happen due to some hard regs progagation, or we can use a new predicate similar combine_pseudo_register_operand.
>
> 3) Replace scratch -> hard regs for all scratch_operands.
>
> 2b) Restore respective hard regs from their pseudos.
>
> 1b) Restore respective hard regs from their pseudos.
>
>
> And maybe we can get better code by allocating things like address register by hand and get better code then.
>
> When I implemented some of the libgcc insns I tried to express the operand by means of constraints, e.h. for (reg:HI 22) and let register allocator do the job.
>
> The resulting code was functional but *horrific*.
>
> The register allocator is not yet ready to generate efficient code in such demanding situations...
>
>>
>> What do you think?
>>
>
> IMO sooner or later we'll need such an infrastructure; maybe also for non-mov insn that are implemented by transparent libcalls like divmod, mul, etc.
>
>>>> When cfgexpand.c expands a function call, it first figures out the
>>>> registers in which the arguments will go, followed by expansion of the
>>>> arguments themselves (right to left). It later emits mov insns to set
>>>> the precomputed registers with the expanded RTXes.
>>>>
>>>> If one of the arguments is a __memx char pointer dereference, the mov
>>>> eventually expands to gen_xload<mode>_A (for certain cases), which
>>>> clobbers R22, R21 and Z.  This causes problems if one of those
>>>> clobbered registers was precomputed to hold another argument.
>>>>
>>>> In general, call expansion does not appear to take clobbers into account -
>
>
> We had been warned that using hard regs is evil...  But without that technique the code quality would decrease way too much.
>
>>>> when it checks for argument overlap, the RTX (args[i].value) is only a MEM
>>>> in QImode for the memx deref - the clobber shows up when it eventually
>>>> calls emit_move_insn, at which point, it is too late.
>
>
> Such situations could only be handled by a target hook which allowed to expand specific trees by hand...  Such a hook could cater for insn that must use hard registers.
>
>>>> This does not happen for a int pointer dereference - turns out that
>>>> precompute_register_parameters does a copy_to_mode_reg if the
>>>> cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
>>>> pseudo and later assigns the pseudo to the arg register. This is done
>>>> before any moves to arg registers is done, so other arguments are not
>>>> overwritten.
>>>>
>>>> Doing the same thing - providing a better cost estimate for a MEM rtx in
>>>> the non-default address space, makes this problem go away, and that is
>>>> what this patch does. Regression testing does not show any new failures.
>
>
> Can you tell something about overall code quality? If it is not significantly worse then I'd propose to apply your rtx-costs solution soon.  The full fix will take more time to work it out.

I'm agree with Georg.