Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
From: Jiong Wang
To: Andrew Pinski
Cc: gcc-patches
Subject: Re: [AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt
Date: Mon, 28 Sep 2015 15:17:00 -0000

Andrew Pinski writes:

> On Tue, Jul 28, 2015 at 6:12 AM, Jiong Wang wrote:
>>
>> The instruction sequences for preparing the argument for the TLS descriptor
>> runtime resolver, and the later function call to the resolver, can actually
>> be hoisted out of the loop.
>>
>> Currently we can't because we have exposed the hard register X0 as the
>> destination of "set".
>> GCC's RTL data flow infrastructure skips, or makes very conservative
>> assumptions, when hard registers are involved, and thus some loop IV
>> opportunities are missed.
>>
>> This patch adds another "tlsdesc_small_pseudo_" pattern, and avoids
>> exposing x0 to gcc generic code.
>>
>> Generally, we define a new register class FIXED_REG0 which only contains
>> register 0, so the instruction sequences generated from the newly added
>> pattern are the same as tlsdesc_small_, while operand 0 is wrapped as a
>> pseudo register that RTL IV opt can handle.
>>
>> Ideally, we should allow operand 0 to be any pseudo register, but then
>> we can't model the override of x0 caused by the function call which is
>> hidden by the UNSPEC.
>>
>> So here, by restricting operand 0 to be x0, the override of x0 can be
>> reflected to gcc.
>>
>> OK for trunk?
>
>
> This patch broke ILP32 because we used mode rather than ptr_mode for
> the pseudo. I have an idea on how to fix it (like the tlsie_small_sidi
> case) but I still need to test it fully.

I have quickly revisited the code, and the use of "mode" instead of
"ptr_mode" looks OK to me.

What looks strange to me is that under ILP32 the symbol_ref is DImode:

  (symbol_ref:DI ("*.LANCHOR0")

My understanding is that the symbol_ref should be SImode under ILP32
instead of DImode.

So it would be better to fix "create_block_symbol" in varasm and let it
use ptr_mode instead of Pmode, as Pmode describes the underlying
hardware mode rather than the mode as viewed at the C language level.

>
> This is the smallest testcase where the problem is:
> struct dtor_list
> {
>   struct dtor_list *next;
> };
> static __thread struct dtor_list *tls_dtor_list;
> __cxa_thread_atexit_impl ( struct dtor_list *new)
> {
>   new->next = tls_dtor_list;
>   tls_dtor_list = new;
> }
>
>
> Thanks,
> Andrew
>
>>
>> 2015-07-28  Ramana Radhakrishnan
>>             Jiong Wang
>>
>> gcc/
>>   * config/aarch64/aarch64.md (tlsdesc_small_pseudo_): New pattern.
>>   * config/aarch64/aarch64.h (reg_class): New enumeration FIXED_REG0.
>>   (REG_CLASS_NAMES): Likewise.
>>   (REG_CLASS_CONTENTS): Likewise.
>>   * config/aarch64/aarch64.c (aarch64_class_max_nregs): Likewise.
>>   (aarch64_register_move_cost): Likewise.
>>   (aarch64_load_symref_appropriately): Invoke the newly added pattern if
>>   possible.
>>   * config/aarch64/constraints.md (Uc0): New constraint.
>>
>> gcc/testsuite/
>>   * gcc.target/aarch64/tlsdesc_hoist.c: New testcase.
>>
>> --
>> Regards,
>> Jiong
>>

-- 
Regards,
Jiong
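
P.S. For readers following the thread, the hoisting opportunity described in
the quoted patch description can be roughly sketched in plain C. This is only
an illustration under assumptions: it is not the tlsdesc_hoist.c testcase from
the patch, and the names tls_data, cal, sum_in_loop and sum_hoisted are
hypothetical. The first function reads a thread-local variable inside a loop,
so the address computation for it is loop-invariant; the second shows, by
hand, what hoisting that computation out of the loop would amount to.

```c
/* Hypothetical illustration only; none of these names come from the patch. */

/* A thread-local variable.  When built as position-independent code for a
   target using TLS descriptors, taking its address may involve a call to a
   runtime resolver.  */
static __thread int tls_data = 3;

/* Stand-in for an opaque callee in the loop body.  Marked noinline so the
   loop keeps a real function call in it.  */
__attribute__((noinline)) static int cal(int sum, int x)
{
  return sum + x;
}

/* TLS access written inside the loop.  The address of tls_data does not
   change between iterations, so computing it is a candidate for hoisting.  */
int sum_in_loop(int bound)
{
  int sum = 0;
  for (int i = 0; i < bound; i++)
    sum = cal(sum, tls_data);
  return sum;
}

/* Hand-hoisted equivalent: the address is taken once before the loop and
   only the load remains inside it.  */
int sum_hoisted(int bound)
{
  int *p = &tls_data;
  int sum = 0;
  for (int i = 0; i < bound; i++)
    sum = cal(sum, *p);
  return sum;
}
```

Both functions compute the same value; the difference would only show up in
the generated code, for example by comparing the assembly for the two loops
when compiling with -O2 -fpic for an AArch64 target.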