Date: Tue, 8 Aug 2023 14:13:01 +0000 (UTC)
From: Michael Matz
To: MegaIng
cc: binutils@sourceware.org
Subject: Re: Problems with relocations for a custom ISA

Hello,

On Tue, 8 Aug 2023, MegaIng via Binutils wrote:

> I am currently in the process of porting binutils to a custom architecture I
> design with a few others (Spec [1], Start of our Port [2]). An interesting
> quirk of this ISA is that it's highly modular, starting with fixed-size 16bit
> opcodes, but with extensions supporting variable length instructions similar
> in power to what x86 has with its addressing modes. The base ISA is fixed
> 16bit word, but there are extensions for 32 and 64bit words.
>
> Most of the basics I already managed to implement, i.e. I can generate simple
> workable ELF files. However, I am running into problems with relocations for
> "load immediate" instructions. Without extensions, we want to potentially emit
> long chains of instructions (3 to 8 instructions is realistic), but with proper
> extensions it can get down to only 1 instruction of 3 or 4 bytes. I am unsure
> how to best represent such variable length relocations in BFD and ELF.

The normal way would be to not do that. It seems the assembler will
already see either a long chain of small insns, or a single large insn,
right? So at that point you can already emit the correct relocs.
For example, if I have three insns: setlo, sethi and setall, setting the
low 16 bits, the high 16 bits, or all 32 bits of a 32bit immediate, then I
also would have three reloc types: LOW16, HIGH16 and ABS32, which the
assembler would appropriately emit:

  setlo  %r1, lo(sym)   --> RELOC_LOW16,  symbol 'sym'
  sethi  %r1, hi(sym)   --> RELOC_HIGH16, symbol 'sym'
  setall %r1, sym       --> RELOC_ABS32,  symbol 'sym'

(obviously details will differ, your 16bit insns won't be able to quite
set all 16 bits :) ).

If you really want to optimize these sequences also at link time (but
why?) then all of this becomes more complicated, but remains essentially
the same. The secret will then be in linking from one of the small relocs
(say, the high16 one) to the other, so that the linker can easily
recognize the whole insn pair and appropriately do something about those
byte sequences. In that scheme you need to distinguish between relocations
applied to relaxable code and relocations applied to random non-relaxable
data. E.g. you probably need two variants of the RELOC_LOW16 relocation.

Some bfd targets chose to limit themselves to only simple sequences of
relaxable instructions, e.g. if the low16/high16 setters always come
directly after each other (the compiler or asm author will need to ensure
this if they want to benefit from relaxation), then one reloc doesn't need
to link to the other.

I wouldn't go that way if I were you: it seems the assembler/compiler
needs to know whether it is targeting the extended ISA or not anyway, so
generating the right instructions and relocations from the start in the
assembler seems the right choice, and then you don't need any relax
complications at link time.


Ciao,
Michael.
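
To make the three-reloc example above concrete, here is a minimal sketch in C of the value a linker would write into each immediate field once the symbol's final address is known. It is an illustration only, not BFD code: the enum and the function reloc_field_value are hypothetical names, and it assumes that setlo and sethi simply overwrite their half of the register (so no carry adjustment between the halves is needed) and that each insn can hold a full 16-bit field, which the real 16bit insns cannot.

```c
#include <stdint.h>

/* Hypothetical reloc types, named after the ones in the mail above.
   These are not existing BFD or ELF reloc codes. */
enum reloc_type { RELOC_LOW16, RELOC_HIGH16, RELOC_ABS32 };

/* Compute the value to store in the immediate field of the insn the
   reloc points at.  sym_value is the final address of the symbol and
   addend is the reloc's addend.
   Assumption: setlo/sethi overwrite their 16-bit half, so HIGH16 is a
   plain shift with no rounding for a sign-extended low part. */
static uint32_t reloc_field_value(enum reloc_type type,
                                  uint32_t sym_value, uint32_t addend)
{
    uint32_t v = sym_value + addend;

    switch (type) {
    case RELOC_LOW16:
        return v & 0xffffu;          /* low half, for setlo */
    case RELOC_HIGH16:
        return (v >> 16) & 0xffffu;  /* high half, for sethi */
    case RELOC_ABS32:
        return v;                    /* whole value, for setall */
    }
    return 0;                        /* unknown reloc type */
}
```

With this split the assembler decides the sequence (setlo plus sethi, or one setall) and the linker only fills in fields; no reloc ever has to change the length of the code.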