From: Remus Clearwater
Date: Sun, 20 May 2018 13:36:00 -0000
Subject: Re: Questions and possible optimizations about sysv/linux/[x86_64|i386]/swapcontext.S
To: libc-help@sourceware.org

Really sorry for the unexpected auto typesetting by the mail-list server.

----

Hi, I'm reading the sources of ucontext.

According to https://en.wikipedia.org/wiki/X86_calling_conventions#cdecl:

    Integer values and memory addresses are returned in the EAX register,
    floating point values in the ST0 x87 register. Registers EAX, ECX, and
    EDX are caller-saved, and the rest are callee-saved. The x87 floating
    point registers ST0 to ST7 must be empty (popped or freed) when calling
    a new function, and ST1 to ST7 must be empty on exiting a function. ST0
    must also be empty when not used for returning a value.

And https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI:

    If the callee wishes to use registers RBX, RBP, and R12–R15, it must
    restore their original values before returning control to the caller.
    All other registers must be saved by the caller if it wishes to
    preserve their values.

Questions:

1. Since `swapcontext` is a function call, it may not need to save/restore
the floating-point context (fnstenv,stmxcsr/fldenv,ldmxcsr). Is this
correct? If not, where can I find the detailed calling convention for the
floating-point context? Or is there an example of a bug that appears once
the (fnstenv,stmxcsr/fldenv,ldmxcsr) part is removed?

2. Could the save/restore (fnstenv,stmxcsr/fldenv,ldmxcsr) be eliminated if
all the functions that use the FPU are leaf functions (functions that do
not call any other function)? If so, where can I find a detailed reference?

3. Since `swapcontext` is a function call, it should only be necessary to
save/restore RBX, RBP, R12–R15, RSP and RIP. The save/restore of RDI, RSI,
RDX, RCX, R8 and R9 looks unnecessary and could be eliminated, which would
save at least 6*8*2 = 96 bytes of memory reads/writes in each swapcontext
operation (the same optimization also applies to i386/swapcontext.S). Is
this correct?

4. In i386/swapcontext.S:

    /* Restore the FS segment register.  We don't touch the GS register
       since it is used for threads.  */
    movl    oFS(%eax), %ecx
    movw    %cx, %fs

But there is no such save/restore of the fs register in
x86_64/swapcontext.S. Is this a bug, or is there no need to save/restore
the fs register at all? (The save/restore of fs in i386/swapcontext.S
could also be eliminated if the latter is correct.)

Thanks a lot :)
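P.S. To make question 1 concrete, here is a rough sketch of the kind of test I have in mind. It assumes x86-64 Linux with glibc and uses the SSE rounding-mode bits in MXCSR (via the `_MM_SET_ROUNDING_MODE` / `_MM_GET_ROUNDING_MODE` macros from `<xmmintrin.h>`) as the piece of floating-point environment to watch. The names `coroutine`, `rounding_mode_after_swap` and `mode_set_in_coroutine` are only mine for illustration; nothing here is taken from glibc itself.

```c
#include <assert.h>
#include <ucontext.h>
#include <xmmintrin.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];
static int mode_set_in_coroutine = -1;

/* Runs on its own stack: changes the MXCSR rounding mode, then yields. */
static void coroutine(void)
{
    _MM_SET_ROUNDING_MODE(_MM_ROUND_DOWN);
    mode_set_in_coroutine = (int)_MM_GET_ROUNDING_MODE();
    swapcontext(&co_ctx, &main_ctx);   /* back to the caller */
}

/* Returns the rounding mode seen by the caller after the coroutine has
   changed its own rounding mode and switched back.  Assumption: the
   ucontext functions save and restore MXCSR per context, as the
   stmxcsr/ldmxcsr pair in x86_64/swapcontext.S suggests. */
int rounding_mode_after_swap(void)
{
    unsigned int saved_csr = _mm_getcsr();
    int rc, mode;

    rc = getcontext(&co_ctx);          /* captures the current MXCSR */
    assert(rc == 0);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, coroutine, 0);

    _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);   /* caller's own mode */
    rc = swapcontext(&main_ctx, &co_ctx);  /* run the coroutine */
    assert(rc == 0);
    (void)rc;

    mode = (int)_MM_GET_ROUNDING_MODE();
    _mm_setcsr(saved_csr);                 /* leave the process as found */
    return mode;
}
```

With the current swapcontext.S I would expect `rounding_mode_after_swap()` to return `_MM_ROUND_UP` even though the coroutine switched to `_MM_ROUND_DOWN`. My guess is that the caller would see `_MM_ROUND_DOWN` instead if the stmxcsr/ldmxcsr part were removed, but I have not tried this against a patched glibc.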