From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 7638D3858029 for ; Sat, 19 Feb 2022 10:54:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7638D3858029 Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-323-7UYo_KVWPSyc23XFfzIpkQ-1; Sat, 19 Feb 2022 05:54:16 -0500 X-MC-Unique: 7UYo_KVWPSyc23XFfzIpkQ-1 Received: by mail-wm1-f71.google.com with SMTP id t2-20020a7bc3c2000000b003528fe59cb9so3693400wmj.5 for ; Sat, 19 Feb 2022 02:54:16 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version; bh=ZfFCTRzyt2+6wQiMGWbAq7NVE2hz9idC0Ah+OrPtjWI=; b=svC6muY+8bYPBBowXKH6g1DoYHuHvnWvEwX/IAE1EQWdLV184LxEPQHUSRRK44d6kX AjJoneKdS3vxbAQfQR5ehqxBI66Z2AlD01gmJrfCvnpJUPxpgRE8F/cudiJ5GsYz1sot an11n8MxTLiSAq3q8yAUvZEooc4IlwGbsJU8VnNItDEsSqaRuUaEC8m4liO3e+/W4qYW TfjFaD8uxoOr0PaRy9x7C8KndZy+1bUqTFvgwwillfSbqrFqDmzesZuG4ilGZhrLUg5/ Gh+zBfsbf3acatZorH2yvauYu6/DpL2fYSiSwQc33z1RQk1O348dQsl4JatKhFXp6Y21 VLhg== X-Gm-Message-State: AOAM531TObsCuxMrHgEU1ggqxKNcnA1K8zlm200uH84WS186LKwoTwJU 3u6fHdoTOhCVaWafIitci7kOeUlxgGC1c5QbFIyfCltrTa8JFAhxa+EY5jRveimFQAH0oKhc0Jl OG3HlO4de2yNl32k5yw== X-Received: by 2002:a5d:6a4e:0:b0:1e4:b619:52b0 with SMTP id t14-20020a5d6a4e000000b001e4b61952b0mr8602721wrw.504.1645268055267; Sat, 19 Feb 2022 02:54:15 -0800 (PST) X-Google-Smtp-Source: ABdhPJwXo59N9G1le0LI8y77KbvImUco7UDuw7V29hhLWvvP53qNSJqFISK/NxHus6sx07T6gfhUsA== X-Received: by 2002:a5d:6a4e:0:b0:1e4:b619:52b0 with SMTP id t14-20020a5d6a4e000000b001e4b61952b0mr8602704wrw.504.1645268054805; Sat, 19 Feb 2022 02:54:14 -0800 (PST) Received: from localhost (host86-134-151-224.range86-134.btcentralplus.com. [86.134.151.224]) by smtp.gmail.com with ESMTPSA id j5-20020a05600c410500b0037bc3e4b526sm1986642wmi.7.2022.02.19.02.54.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Feb 2022 02:54:14 -0800 (PST) From: Andrew Burgess To: Jan Beulich Cc: binutils@sourceware.org Subject: Re: [PATCH 3/3] opcodes/i386: partially implement disassembler style support In-Reply-To: <15b09dbb-613f-a7fe-b58e-27317d0852dd@suse.com> References: <11996f886e69218629abf81f8041269e8740a60e.1645043588.git.aburgess@redhat.com> <37ba6375-00cf-3041-ee79-21377558402c@suse.com> <87o835eaf9.fsf@redhat.com> <2723bccf-3591-d1a7-2c94-741fcd26b8e7@suse.com> <87bkz5dsqu.fsf@redhat.com> <15b09dbb-613f-a7fe-b58e-27317d0852dd@suse.com> Date: Sat, 19 Feb 2022 10:54:13 +0000 Message-ID: <87sfsfceju.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Feb 2022 10:54:20 -0000 Jan Beulich via Binutils writes: > On 17.02.2022 23:37, Andrew Burgess wrote: >> Jan Beulich via Binutils writes: >>> On 17.02.2022 17:15, Andrew Burgess wrote: >>>> Jan Beulich via Binutils writes: >>>>> On 16.02.2022 21:53, Andrew Burgess via Binutils wrote: >>>>>> + (*ins->info->fprintf_styled_func) >>>>>> + (ins->info->stream, dis_style_text, " "); >>>>>> + (*ins->info->fprintf_styled_func) >>>>>> + (ins->info->stream, dis_style_immediate, "0x%x", >>>>>> + (unsigned int) priv.the_buffer[0]); >>>>> >>>>> I wonder if the naming (dis_style_immediate) isn't misleading. As per >>>>> the comment next to its definition it really appears to mean any kind >>>>> of number (like is the case here), not just immediate operands of >>>>> instructions. Hence maybe dis_style_number (as replacement for or in >>>>> addition to dis_style_immediate)? >>>> >>>> You mentioned this before in the previous thread, and I didn't really >>>> understand then either. >>>> >>>> Can you give an example of something that's a number, but not an >>>> immediate? e.g. I wonder (given the instruction/directive distinction >>>> you draw above), I wonder if you're conserned about: '.byte 0x4', maybe >>>> you don't like referring to this 0x4 here as an immediate? >>> >>> Well, an operand to a directive for example is not an immediate imo, >>> yes. A "load offset" (as your comment calls it) may also not be an >>> immediate. E.g. in x86 memory access instructions: >>> >>> mov 0x10(%rbx), %eax >>> >>> the 0x10 isn't an immediate, but a displacement. The difference may >>> be more relevant in something like >>> >>> mov $0, 0x10(%rbx) >>> >>> where the $0 is an immediate operand, but the 0x10 isn't (and you >>> wouldn't want to mix the two). >>> >>> From that comment it's not clear to me where else you would think >>> "immediate" applies (or not), but in RISC-V's >>> >>> lw x0, 0x10(x0) >>> >>> I wouldn't consider the 0x10 an immediate either, albeit this may >>> be a result of my x86 bias. >> >> I wonder if there's a name we could come up with that would allow me to >> classify the '$0' and '0x10' (in your example above) as the same style? >> >> I've kind-of lost the thread a bit, but maybe that's what the 'number' >> you suggested original was for? If I replaced dis_style_immediate with >> dis_style_number, and just replaced thoughout, would that be less >> problematic? > > Yes, "number" was meant to be a possible replacement. Whether it's > helpful to style all forms of numbers the same is questionable > though. Note that to me in "$0" the '$' would not be covered by > "number" then, while it would be covered by "immediate". > >> Another possibility would be to have some aliases either in the original >> enum, as in: >> >> dis_style_displacement = dis_style_immediate, >> >> or even at the top of i386-dis.c, as in: >> >> #define dis_style_displacement dis_style_immediate > > But that's wrong. I'm not primarily after the naming in the sources, > but after the output not showing distinct things as similar. > >> I really think we should avoid adding too many distinct styles if we >> can. My concern is less about disassembler users handling the different >> styles, and more about consistency between the disassemblers. I figure >> it's easier to be consistent if we only have a small number of styles. >> If displacement is a different style to other immediate (yes, I'm still >> going to call numbers in instruction immediates!) then we end up with >> some architectures going one way, and others another. > > I agree with the goal of limiting the number of styles. But instead > of marking distinct items with similar non-basic (text) styles, I'd > rather see items left alone then. IOW with dis_style_immediate I'd > prefer (in the x86 example) to see only true immediate insn operands > be tagged that way, and all other numbers to remain dis_style_text. Having reviewed the thread, I've tried to come up with the complete list of styles that I believe your are arguing for. These incude a new directive style, and an address_offset style. I've extended the text in several places to cover how to handle prefixes, as well as how styles should be used for directives. I'd be grateful if you could read through this list, and give any examples of things which, if styled using the rules below, you feel would be unacceptable. /* This is the default style, use this for any additional syntax (e.g. commas between operands, brackets, etc), or just as a default if no other style seems appropriate. */ dis_style_text, /* Use this for all instruction mnemonics, or aliases for mnemonics. These should be things that correspond to real machine instructions. */ dis_style_mnemonic, /* For things that aren't real machine instructions, but rather assembler directives, e.g. .byte, etc. */ dis_style_assembler_directive, /* Use this for any register names. This may or may-not include any register prefix, e.g. '$', '%', at the discretion of the target, though within each target the choice to include prefixes for not should be kept consistent. If the prefix is not printed with this style, then dis_style_text should be used. */ dis_style_register, /* Use this for any constant values used within instructions or directives, unless the value is an absolute address, or an offset that will be added to an address (no matter where the address comes from) before use. This style may, or may-not be used for any prefix to the immediate value, e.g. '$', at the discretion of the target, though within each target the choice to include these prefixes should be kept consistent. */ dis_style_immediate, /* The style for the numerical representation of an absolute address. Anything that is an address offset should use the immediate style. This style may, or may-not be used for any prefix to the immediate value, e.g. '$', at the discretion of the target, though within each target the choice to include these prefixes should be kept consistent. */ dis_style_address, /* The style for any constant value within an instruction or directive that represents an offset that will be added to an address before use. This style may, or may-not be used for any prefix to the immediate value, e.g. '$', at the discretion of the target, though within each target the choice to include these prefixes should be kept consistent. */ dis_style_address_offset, /* The style for a symbol's name. The numerical address of a symbol should use the address style above, this style is reserved for the name. */ dis_style_symbol, /* The start of a comment that runs to the end of the line. Anything printed after a comment start might be styled differently, e.g. everything might be styled as a comment, regardless of the actual style used. The disassembler itself should not try to adjust the style emitted for comment content, e.g. an address emitted within a comment should still be given dis_style_address, in this way it is up to the user of the disassembler to decide how comments should be styled. */ dis_style_comment_start Thanks, Andrew