Message-ID: <15461c7f-08eb-40a4-b24e-15df25b744e9@suse.com>
Date: Thu, 11 Jan 2024 09:13:50 +0100
Subject: Re: [PATCH,V4 10/14] gas: synthesize CFI for hand-written asm
To: Indu Bhagat
Cc: binutils@sourceware.org
From: Jan Beulich

On 10.01.2024 20:43, Indu Bhagat wrote:
> On 1/10/24 06:15, Jan Beulich wrote:
>> On 10.01.2024 12:26, Indu Bhagat wrote:
>>> On 1/10/24 01:44, Jan Beulich wrote:
>>>> On 10.01.2024 07:10, Indu Bhagat wrote:
>>>>> On 1/9/24 01:30, Jan Beulich wrote:
>>>>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>>>>> +   machine instruction.
>>>>>>>>> +
>>>>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>>>>> +   if failure.
>>>>>>>>> +
>>>>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>>>>> +   to ensure correctness of SCFI:
>>>>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>>>>> +       now.
>>>>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>>>>> +       call and return from functions.
>>>>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>>>>> +       operation.
>>>>>>>>
>>>>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>>>>> set than yours.
>>>>>>>
>>>>>>> I would like to understand more on this comment, especially the "my take
>>>>>>> on this is a much wider set than yours". I see its being hinted at in
>>>>>>> different flavors in the current review.
>>>>>>>
>>>>>>> I see some issues pointed out in this review (addressing modes of mov
>>>>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>>>>> concerns are wider than this.
>>>>>>
>>>>>> In an earlier version review I mentioned that even vector or mask registers
>>>>>> could in principle be used to hold preserved GPR values. I seem to recall
>>>>>> that you said you wouldn't want to deal with such. Hence my use of
>>>>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>>>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>>>>> register save / restore operation".
>>>>>>
>>>>>
>>>>> Hmm. I will need to understand them on a case to case basis. For the
>>>>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>>>>> save/restore to/from stack ?
>>>>
>>>> Maybe I'm still not having a clear enough picture of what forms of insns
>>>> you want to fully track. Said insn forms don't access the stack. But they
>>>> could in principle be used to preserve a certain register. Such preserving
>>>> of registers is part of what needs encoding in CFI, isn't it?
>>>>
>>>
>>> The kind of preserving is usually on stack. It can also be in another
>>> callee-saved register, in theory, but the latter defeats the purpose of
>>> state saving across calls.
>>
>> Callee-preserved registers, when they have a special purpose in the
>> architecture (like %rsi, %rdi, and %rbx have) may be cheaper to
>> preserve by moving to a call-clobbered register that isn't otherwise
>> used in the function. In the SysV ABI this only affects %rbx, the
>> special purpose of which is extremely limited in the ISA (xlatb). In
>> the Windows ABI, otoh, %rsi and %rdi are callee-preserved, and those
>> have very common uses in the string insns.
>>
>
> I am not sure I follow completely. Call-clobbered registers are not of
> interest for SCFI...

Well, what's x86_scfi_callee_saved_p() about if the distinction isn't
relevant?

>>>>>>>>> +    case 0xc2:
>>>>>>>>> +    case 0xc3:
>>>>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>>>>> +        break;
>>>>>>>>> +      /* Near ret.  */
>>>>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>>>>> +      ginsn_set_where (ginsn);
>>>>>>>>> +      break;
>>>>>>>>
>>>>>>>> No tracking of the stack pointer adjustment?
>>>>>>>
>>>>>>> No stack unwind information for a function is relevant after the
>>>>>>> function has returned. So, tracking of stack pointer adjustment by
>>>>>>> return is not necessary.
>>>>>>
>>>>>> What information does the "return" insn then carry, beyond it being
>>>>>> an unconditional branch (which you have a different insn for)?
>>>>>>
>>>>>
>>>>> "return" does not carry any more information than just the
>>>>> GINSN_TYPE_RETURN as ginsn->type.
>>>>>
>>>>> So then why support both "return" and an unconditional branch: The
>>>>> intention is to carry the semantic difference between ret and
>>>>> unconditional jump. Unconditional jumps may be to a label within
>>>>> function, and in those cases, we use it for some validation and BB
>>>>> linking when creating CFG. Return, OTOH, always indicates exit from
>>>>> function.
>>>>>
>>>>> For SCFI purposes, above is the one use. Future analyses may find other
>>>>> use-cases for an explicit return ginsn. But IMO, keeping
>>>>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
>>>>
>>>> Okay. And here you don't bother decoding operands. Hence why I'm
>>>> asking the same to be the case for (e.g.) CALL.
>>>>
>>>
>>> It seems I will need to deal with operands of RETURN insn soon. For
>>> implementing "Warn if imbalanced stack at return", we will need this info.
>>
>> Will you? Isn't stack state _before_ the RET what matters (and hence
>> the optional immediate still doesn't matter)?
>>
>
> RET with operand makes this tricky.
>
> My initial thought was:
> "Balanced stack at function return" will check that the RSP at the entry
> of the function (after the call instruction) is the same as that at the
> return from the function (before the return instruction).
>
> Now if RET with operand (which tells how much stack to pop before an
> eventual return) is in effect, I do need to check the RSP value right
> before the RETURN (RETURN being the microOP/ginsn equivalent).

No, that's not how it works. RET with operand discards arguments passed to
the function (see Windows' __stdcall calling convention for an example use).
Naturally arguments are pushed _before_ the return address, so the immediate
is applied to the stack pointer only after the return address has been
popped. The stack pointer right before the RET is unaffected by it.

Jan
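
For illustration, the stack pointer arithmetic behind the RET-with-operand
point can be sketched as below. This is a toy model only, not code from the
patch: every name in it (toy_cpu, toy_push, toy_ret, toy_stdcall_roundtrip) is
made up for this sketch, and the 8-byte slot size is an arbitrary assumption
(real __stdcall is a 32-bit convention with 4-byte slots; the slot size doesn't
change the argument).

```c
#include <stdint.h>

/* Toy model of a downward-growing stack.  Only the stack pointer is
   tracked; all names are hypothetical and not part of gas.  */
#define TOY_SLOT 8

typedef struct { int64_t sp; } toy_cpu;

static void toy_push (toy_cpu *cpu) { cpu->sp -= TOY_SLOT; }
static void toy_pop (toy_cpu *cpu) { cpu->sp += TOY_SLOT; }

/* RET imm16: pop the return address first, then discard IMM16 bytes
   of caller-pushed arguments.  */
static void toy_ret (toy_cpu *cpu, unsigned imm16)
{
  cpu->sp += TOY_SLOT;  /* return address */
  cpu->sp += imm16;     /* arguments */
}

/* Simulate a caller pushing NARGS arguments and calling a callee that
   cleans up its own arguments.  Stores in *DELTA_BEFORE_RET the callee's
   SP right before its RET relative to SP at callee entry, and returns
   the caller's SP afterwards relative to its starting value.  */
static int64_t toy_stdcall_roundtrip (unsigned nargs,
                                      int64_t *delta_before_ret)
{
  toy_cpu cpu = { 0 };

  for (unsigned i = 0; i < nargs; i++)
    toy_push (&cpu);    /* arguments go in before the return address */
  toy_push (&cpu);      /* CALL pushes the return address */

  int64_t sp_at_entry = cpu.sp;
  toy_push (&cpu);      /* e.g. push %rbp */
  toy_pop (&cpu);       /* e.g. pop %rbp */
  *delta_before_ret = cpu.sp - sp_at_entry;

  toy_ret (&cpu, nargs * TOY_SLOT);
  return cpu.sp;
}
```

In this model the value stored through delta_before_ret is zero for any
argument count, i.e. a "balanced stack at return" check done right before the
RET needs no knowledge of the immediate; the immediate only affects what the
caller sees afterwards.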