From: Alexandre Oliva
To: binutils@sourceware.org
Subject: impredictable alignment on ARM
Date: Tue, 18 Feb 2020 01:21:00 -0000

Consider the following asm input:

        .thumb
        .text
        ldr     r1, 0f
0:      .word   0x12345678

In this case, we
report that the word is misaligned and fail, even though the section is
only aligned to a 2-byte boundary, so the word *might* turn out to be
properly aligned after all, if the previously linked section happens to
make the text section above start 2 bytes past a 4-byte aligned address.
Anyway, we probably don't want to worry about this case.

However, I think we should be concerned about the converse case:

        .thumb
        .text
        ldr     r1, 0f
        ldr     r2, 1f
0:      .word   0x01234567
1:      .word   0x89abcdef
        nop

We do NOT report an error here, but if this text segment gets placed at
a 2-byte offset from a 4-byte aligned address (e.g., link the object
file in twice), the pair of words in the second copy will be misaligned,
and the PC-relative offsets will resolve to aligned words that contain
only part of the word to be loaded.

The patchlet below arranges for us to complain when the segment holding
the target of such an ldr doesn't ensure the expected alignment.
However, it's not quite enough to solve the general problem.  Consider:

        .thumb
        .text
        ldr     sp, 0f
0:      .word   0x80000000

This extended form of ldr takes 4 bytes, and it neither requires nor
ensures that the target word is aligned to a 4-byte boundary.  It just
so happens that, if it's not aligned, the value loaded into the register
is a rotated version of the word containing the misaligned address.

I'm not sure it would be appropriate for us to reject potentially
misaligned words: there might be (obfuscated) code intended to detect
and behave differently depending on whether it ends up at an even or odd
half-word.  However, I think it would be nice for us to at least warn
that the code might behave differently depending on the actual alignment
it gets.  I'm thinking of something as simple as tracking the max
natural alignment used in each segment, and warning of potential
linker-induced behavior changes if that alignment is not recorded for
the segment.
Tracking symbols with their natural alignments, and maybe even
references to them that expect a certain alignment, might be going too
far on the one hand, while still missing relevant cases of separate
compilation or complex address computations on the other.

Is this something we might want to pursue, so as to warn even for e.g.:

        .text
        .word   0

but limited to once per segment?

Or should we track and warn about PC-relative addressing requirements,
so as to warn about segments containing PC-relative addressing (in
whatever form) whose expected alignment exceeds the section's?  (This
could miss e.g. setting a register to PC + offset, and then loading a
word at the address stored in the register.)

A combination of these?

Thoughts?

Here's the patchlet that covers only the PCrel-load-to-low-reg case:

--- gas/config/tc-arm.c 2020-01-28 12:50:34.000000000 +0100
+++ gas/config/tc-arm.c 2020-02-18 00:13:11.486184639 +0100
@@ -28755,6 +28755,9 @@
                           (((unsigned long) fixP->fx_frag->fr_address
                             + (unsigned long) fixP->fx_where) & ~3)
                           + (unsigned long) value);
+          else if (get_recorded_alignment (seg) < 2)
+            as_warn_where (fixP->fx_file, fixP->fx_line,
+                           _("segment does not ensure enough alignment for target word"));
 
           if (value & ~0x3fc)
             as_bad_where (fixP->fx_file, fixP->fx_line,

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist           Stallman was right, but he's left :(
GNU Toolchain Engineer    FSMatrix: It was he who freed the first of us
FSF & FSFLA board member                The Savior shall return (true);