From: Adhemerval Zanella
To: Ben Woodard
Cc: Florian Weimer, John Mellor-Crummey, libc-alpha@sourceware.org
Subject: Re: A collection of LD_AUDIT bugs that are important for tools (with better formatting for this list)
Date: Fri, 30 Jul 2021 11:58:49 -0300
In-Reply-To: <3F6132F0-1042-4285-A309-0365D014422A@redhat.com>
References: <8A8FF420-8316-4A22-AC4D-DA1F2D5625A5@rice.edu> <2fc830b9-35da-9b94-369f-4df683078a5c@linaro.org> <8735tguubc.fsf@oldenburg.str.redhat.com> <75b2e838-a32e-6a2a-27b2-75bc06c01118@linaro.org> <3F6132F0-1042-4285-A309-0365D014422A@redhat.com>
List-Id: Libc-alpha mailing list

On 23/06/2021 14:42, Ben Woodard wrote:
>
>
>> On Jun 17, 2021, at 4:06 PM, Adhemerval Zanella wrote:
>>
>>
>>
>> On 17/06/2021 17:09, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>>
>>>>
>>>> * SVE support: as indicated by Szabolcs SVE calls are marked with the
>>>> STO_AARCH64_VARIANT_PCS and thus explicit not supported by dynamic loader.
>
> To me this sounds like partly a toolchain issue.
> The aarch64 PCS does define the ABI for SVE calls. I haven’t checked what GCC/binutils does in quite a while. It seems like the STO_AARCH64_VARIANT_PCS was an expedient for SVE when it first came out and the semantics that it defines where all the registers are caller preserved makes it very difficult to implement around.
>
> For LAV_CURRENT=2 what I planned to do was:

Now that I have finished the various audit issues that John
Mellor-Crummey has brought up, I think I have a better idea of how to
address it.

>
> diff --git a/sysdeps/aarch64/bits/link.h b/sysdeps/aarch64/bits/link.h
> index ca76087ee1..390b12a826 100644
> --- a/sysdeps/aarch64/bits/link.h
> +++ b/sysdeps/aarch64/bits/link.h
> @@ -20,13 +20,24 @@
>  # error "Never include <bits/link.h> directly; use <link.h> instead."
>  #endif
>
> +typedef struct La_sve_regs {
> +  uint16_t *lr_preg[3];
> +  __uint128_t *lr_zreg[8];
> +} La_sve_regs;
> +
>  /* Registers for entry into PLT on AArch64. */
>  typedef struct La_aarch64_regs
>  {
>    uint64_t lr_xreg[9];
> -  __uint128_t lr_vreg[8];
>    uint64_t lr_sp;
>    uint64_t lr_lr;
> +  char lr_sve; /* 0 - no SVE,
> +                  1-16 length of the SVE registers in vq (128 bits) */
> +  union {
> +    /* when sve!=0 accessing the lr_vreg is undefined */
> +    __uint128_t lr_vreg[8];
> +    La_sve_regs lr_zreg;
> +  };
>  } La_aarch64_regs;
>
>  /* Return values for calls from PLT on AArch64. */
> @@ -34,9 +45,14 @@ typedef struct La_aarch64_retval
>  {
>    /* Up to eight integer registers can be used for a return value. */
>    uint64_t lrv_xreg[8];
> -  /* Up to eight V registers can be used for a return value. */
> -  __uint128_t lrv_vreg[8];
> -
> +  char lrv_sve; /* 0 - no SVE,
> +                   1-16 length of the SVE registers in vq (128 bits) */
> +  union{
> +    /* Up to eight V registers can be used for a return value.
> +       When sve!=0 accessing the lr_vreg is undefined */
> +    __uint128_t lrv_vreg[8];
> +    La_sve_regs lrv_zreg;
> +  };
>  } La_aarch64_retval;
>  __BEGIN_DECLS

My idea is to do something similar:

---
typedef struct La_sve_regs
{
  uint16_t    *lr_preg;
  long double *lr_zreg;
} La_sve_regs;

typedef struct La_aarch64_regs
{
  [...]
  uint8_t      lr_sve;       /* 0 - no SVE,
                                1-16 length of the SVE registers in vq
                                (128 bits).  */
  La_sve_regs *lr_sve_regs;  /* NULL - no SVE.  */
} La_aarch64_regs;

typedef struct La_aarch64_retval
{
  [...]
  uint8_t      lr_sve;       /* 0 - no SVE,
                                1-16 length of the SVE registers in vq
                                (128 bits).  */
  La_sve_regs *lr_sve_regs;  /* NULL - no SVE.  */
} La_aarch64_retval;
---

_dl_runtime_resolve would be responsible for allocating the space
required for La_sve_regs on the stack and for setting up the internal
pointers in La_aarch64_regs and La_aarch64_retval. This has the
advantage of allocating only the required space, and if we can tell
whether a symbol uses SVE, we can avoid the performance penalty for
symbols that do not. The downside is that it may require a large
amount of stack space.

>
> However, that would require toolchain support and another hint in st_other which separates SVE calls from other uses of STO_AARCH64_VARIANT_PCS like STO_AARCH64_VARIANT_SVE. Then the runtime linker could populate the lrv_sve with information from the lrv_vreg with the size of the vector registers from the processor’s registers.

I think it is feasible to assume for now that STO_AARCH64_VARIANT_PCS
means SVE, which means we can use it in _dl_runtime_resolve to skip the
save/restore if the symbol follows the default procedure call standard.

>
> There are at least two problems with that approach.
> 1) who allocates the lr_zreg pointers in the la_sve_regs and how long should they be? Do they always have to be allocated to be the max size 2048 bits?
> 2) I hadn’t worked out how to handle functions that return things in the SVE regs. Do we need two new flags in st_other? STO_AARCH64_VARIANT_SVE and STO_AARCH64_VARIANT_SVERET?

I don't think we need the two extra flags to handle SVE calls, but I am
also assuming that STO_AARCH64_VARIANT_PCS will be used solely for SVE.
Using two extra flags has the additional problem of not being backward
compatible. If ARM wants to push another procedure call variant on
linux-gnu, I would expect it to come with another flag.

>
> Then there was the question in the future could there be: big.LITTLE processors where some big processors had SVE registers of one length while the LITTLE processors had different ones?

Although I find this kind of setup unlikely, my expectation is that it
would be transparent to userland: either the kernel would emulate the
required instructions or it would bind processes that use
STO_AARCH64_VARIANT_PCS to cores with SVE.
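
To put rough numbers on the stack cost and on the st_other check
discussed above, here is a minimal sketch in C. It is not glibc code.
Every sketch_/SKETCH_ name is hypothetical. It assumes that 8 Z
registers and 4 P registers are saved (the AAPCS64 argument registers;
the quoted diff uses 3 predicate slots), uses plain byte pointers in
place of the uint16_t * and long double * fields, and treats
STO_AARCH64_VARIANT_PCS as "may use SVE", which is the assumption made
in this message rather than something the ABI guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as in <elf.h>; redefined here so the sketch builds
   without the glibc headers.  */
#ifndef STO_AARCH64_VARIANT_PCS
# define STO_AARCH64_VARIANT_PCS 0x80
#endif

/* Assumption: Z0-Z7 and P0-P3 are the registers worth saving.  */
enum { SKETCH_NZREG = 8, SKETCH_NPREG = 4 };

/* Hypothetical stand-in for the proposed La_sve_regs.  */
typedef struct sketch_sve_regs
{
  uint8_t *lr_preg;  /* SKETCH_NPREG predicate registers, 2 * vq bytes each.  */
  uint8_t *lr_zreg;  /* SKETCH_NZREG vector registers, 16 * vq bytes each.  */
} sketch_sve_regs;

/* Bytes needed to save one register set for a vector length of VQ
   quadwords (1..16, that is 128..2048 bits).  */
size_t
sketch_sve_save_size (unsigned int vq)
{
  assert (vq >= 1 && vq <= 16);
  size_t zbytes = (size_t) SKETCH_NZREG * 16 * vq;  /* Z register: 128 * vq bits.  */
  size_t pbytes = (size_t) SKETCH_NPREG * 2 * vq;   /* P register: 16 * vq bits.  */
  return zbytes + pbytes;
}

/* Point REGS into BUF, which must hold sketch_sve_save_size (vq) bytes.
   The Z block comes first so that it keeps BUF's alignment.  */
void
sketch_sve_regs_init (sketch_sve_regs *regs, uint8_t *buf, unsigned int vq)
{
  regs->lr_zreg = buf;
  regs->lr_preg = buf + (size_t) SKETCH_NZREG * 16 * vq;
}

/* Nonzero if the symbol's st_other carries the variant PCS marker,
   in which case the SVE state would have to be saved and restored.  */
int
sketch_needs_sve_save (unsigned char st_other)
{
  return (st_other & STO_AARCH64_VARIANT_PCS) != 0;
}
```

Under these assumptions one register set needs 136 bytes at vq = 1 and
2176 bytes at vq = 16, so sizing the area from the runtime vector
length instead of always reserving the 2048-bit maximum is where the
saving comes from. Symbols without the marker would skip the
allocation entirely.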