From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 3A8AA3858C20
 for <gcc-patches@gcc.gnu.org>; Tue, 16 Aug 2022 08:14:09 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3A8AA3858C20
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9FA351FB;
 Tue, 16 Aug 2022 01:14:09 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3BF193F70D;
 Tue, 16 Aug 2022 01:14:08 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
Mail-Followup-To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>,
 Roger Sayle <roger@nextmovesoftware.com>,
 Richard Biener <richard.guenther@gmail.com>, richard.sandiford@arm.com
Cc: Roger Sayle <roger@nextmovesoftware.com>,
 Richard Biener <richard.guenther@gmail.com>
Subject: Re: [x86 PATCH] PR target/106577: force_reg may clobber operands
 during split.
References: <002201d8ae8b$d11fa9b0$735efd10$@nextmovesoftware.com>
 <CAFiYyc0xK7BMJkR3xR3=Ku7_hhqyyZDEzPiWSUe2e7cYiTS2zw@mail.gmail.com>
Date: Tue, 16 Aug 2022 09:14:06 +0100
In-Reply-To: <CAFiYyc0xK7BMJkR3xR3=Ku7_hhqyyZDEzPiWSUe2e7cYiTS2zw@mail.gmail.com>
 (Richard Biener via Gcc-patches's message of "Mon, 15 Aug 2022
 09:45:51 +0200")
Message-ID: <mptleromwq9.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Status: No, score=-45.7 required=5.0 tests=BAYES_00, KAM_DMARC_NONE,
 KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2022 08:14:15 -0000

Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Fri, Aug 12, 2022 at 10:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>>
>>
>> This patch fixes PR target/106577 which is a recent ICE on valid regression
>> caused by my introduction of a *testti_doubleword pre-reload splitter in
>> i386.md.  During the split pass before reload, this converts the virtual
>> *testti_doubleword into an *andti3_doubleword and *cmpti_doubleword,
>> checking that any immediate operand is a valid "x86_64_hilo_general_operand"
>> and placing it into a TImode register using force_reg if it isn't.
>>
>> The unexpected behaviour (that caught me out) is that calling force_reg
>> may occasionally clobber the contents of the global operands array, or
>> more accurately recog_data.operand[0], which means that by the time
>> split_XXX calls gen_split_YYY the replacement insn's operands have been
>> corrupted.
>>
>> It's difficult to tell who (if anyone is at fault).  The re-entrant
>> stack trace (for the attached PR) looks like:
>>
>> gen_split_203 (*testti_doubleword) calls
>> force_reg calls
>> emit_move_insn calls
>> emit_move_insn_1 calls
>> gen_movti calls
>> ix86_expand_move calls
>> ix86_convert_const_wide_int_to_broadcast calls
>> ix86_vector_duplicate_value calls
>> recog_memoized calls
>> recog.
>>
>> By far the simplest and possibly correct fix is rather than attempt
>> to push and pop recog_data, to simply (in pre-reload splits) save a
>> copy of any operands that will be needed after force_reg, and use
>> these copies afterwards.  Many pre-reload splitters avoid this issue
>> using "[(clobber (const_int 0))]" and so avoid gen_split_YYY functions,
>> but in our case we still need to save a copy of operands[0] (even if we
>> call emit_insn or expand_* ourselves), so we might as well continue to
>> use the conveniently generated gen_split.
>>
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check, both with and without --target_board=unix{-m32},
>> with no new failures. Ok for mainline?
>
> Why this obviously fixes the issue seen I wonder whether there's
> more of recog_data that might be used after control flow returns
> to recog_memoized and thus the fix would be there, not in any
> backend pattern triggering the issue like this?
>
> The "easiest" fix would maybe to add a in_recog flag and
> simply return FAIL from recog when recursing.  Not sure what
> the effect on this particular pattern would be though?
>
> The better(?) fix might be to push/pop recog_data in 'recog', but
> of course give that recog_data is currently a global leakage
> in intermediate code can still happen.
>
> That said - does anybody know of similar fixes for this issue in other
> backends patterns?

I don't think it's valid for a simple query function like
ix86_vector_duplicate_value to clobber global state.  Doing that
could cause problems in other situations, not just splits.

Ideally, it would be good to wean insn-recog.cc:recog off global state.
The only parts of recog_data it uses (if I didn't miss something)
are recog_data.operands and recog_data.insn (but only to nullify
it for recog_memoized, which wouldn't be necessary if recog didn't
clobber recog_data.operands).  But I guess some .md expand/insn
conditions probably rely on the operands array being in recog_data,
so that might not be easy.

IMO the correct low-effort fix is to save and restore recog_data
in ix86_vector_duplicate_value.  It's a relatively big copy,
but the current code is pretty wasteful anyway (allocating at
least a new SET and INSN for every query).  Compared to the
overhead of doing that, a copy to and from the stack shouldn't
be too bad.

Thanks,
Richard