Subject: Re: exec-file-mismatch and native-gdbserver testing
To: Philippe Waroquiers, "Metzger, Markus T", GDB
References: <40713d32-0785-253a-bcde-c6969e12ed6a@redhat.com> <8725565f1879f78a1c37600819a354a47e6d492a.camel@skynet.be>
From: Pedro Alves
Message-ID: <7bf4097d-88ac-7016-bf0d-c1648ac8126b@redhat.com>
Date: Sun, 17 May 2020 22:19:38 +0100
In-Reply-To: <8725565f1879f78a1c37600819a354a47e6d492a.camel@skynet.be>

On 5/17/20 9:11 PM, Philippe Waroquiers wrote:
> On Sun, 2020-05-17 at 20:50 +0100, Pedro Alves wrote:
>>> E.g.
>>> I am wondering if the below will be visible and cause
>>> an (understandable) warning/error/behaviour for the user:
>>> If the user has debugged a first process with orig_exe,
>>> then the user copied orig_exe to copy_orig_exe, and then GDB is
>>> attached to a process that runs copy_orig_exe, the user does not expect
>>> to have orig_exe protected/accessed anymore, and so might change it
>>> or remove it or ..., while GDB still use orig_exe instead of copy_orig_exe.
>>
>> But this seems like a pretty benign problem?  But I'm not sure
>> I understood it.  What exactly goes wrong in this scenario?
> The user expects orig_exe to not be 'busy' anymore, and so
> expects to be able to freely modify it, without e.g. impacting
> the GDB session debugging the executable running copy_orig_exe.
> (I guess that orig_exe will not cause 'Text busy' error, as no
> process is still executing it from the kernel point of view).

Do you really see these "Text busy" errors nowadays?  I don't think
I ever saw those on GNU/Linux.

Still, I'm not seeing the same kind of problem that ending up with
the wrong binary loaded in GDB causes.  If you end up with the wrong
binary loaded in GDB, then GDB may for example install breakpoints at
the wrong addresses, and that may even cause the inferior to crash,
because the breakpoint address may fall in the middle of instructions,
resulting in the inferior potentially executing invalid instructions,
or worse, executing valid instructions with disastrous side effects.

The type of problem you're describing seems more like an annoyance,
which will be detected some other way ("Text busy" or some other
side effect), and the user can still fix it, with e.g., the "file"
command.

>
>>
>>> So, I was wondering if such a case of equal build ID
>>> but different (local?) file names are not worth a warning.
>>
>> IMO it isn't, because it is very common to have different
>> filenames (if you consider the whole path) for executable
>> loaded in gdb compared to the executable that the process is
>> running when you consider remote debugging.
>>
>>>> I'm thinking, if we support build ID validation, do we really want
>>>> to fallback to filename validation?  It seems to me that it causes
>>>> more false positives than desirable.
>>> You mean that the filename comparison is useless (or even harmful)
>>> if we found the build ID in the files ?
>>> Effectively, if build ID are different but filenames are equal,
>>> that is likely a false positive 'file are matching'
>>> (only possible in remote debugging setup I suppose).
>>
>> No, I mean, let's consider the feature from scratch again.
>> I'm saying that IMHO filename comparison on its own is pretty
>> weak and annoyingly chatty.  I'd think e.g., a basename
>> match + segments match (compare addresses and sizes
>> of text, data, etc, segments) would already be much better.
>> But that's a path that's been considered in all other scenarios
>> where we have to match binaries, and ultimately, build ID
>> was invented to fix this kind of scenario without heuristics,
>> because heuristics can always fail.
>>
>> So given that we can do buildid matching, shouldn't we just forget
>> all other kinds of matching, and just stick with build id matching,
>> with no fallback?  I.e., add build id matching, remove the filename
>> matching, and raise the bar for any fallback matching -- as in if
>> you want some fallback, it has to be better than just filenames.
>>
>> IIRC, the main motivation for the feature is when you attach to
>> a process running bar, while you have foo (completely unrelated to bar)
>> loaded in gdb.  GDB previously would assume that foo is the symbol file
>> for bar, so it gladly continued debugging bar with the foo binary.
>> Buildid detects this, and also detects the scenario of attaching to
>> a process that is running an older version of bar than the version
>> you have loaded in gdb (because you rebuilt the program before
>> attaching, for example).
>>
>> More contrived use cases can be imagined, but it seems to me like
>> if you want to catch them, then you're better off making sure your
>> binaries include build ids.  Which is true by default on modern
>> GNU/Linux OSs at least.
> At my work, objdump -h some_exe does not show a build ID, not clear
> why (RHEL 7.8, but using gold linker from Adacore gnatpro).
>
> So, my main original use case needs filename comparison :(.

According to:

  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/compiling-build-id

 "Each executable or shared library built with Red Hat Enterprise
 Linux Server 6 or later is assigned a unique identification 160-bit
 SHA-1 string, generated as a checksum of selected parts of the
 binary."

Maybe older gold versions didn't emit the build id by default, while
GNU ld did.  I tried it with master gold, and it emits the build id
by default.

Does explicitly specifying --build-id on the link work?  Since you're
already not using the default tools, you could tweak your build
system to explicitly request a build id?

> So, my main original use case needs filename comparison :(.

I think that doesn't follow -- you could say that the build id isn't
sufficient for you, and that you need a fallback, but that doesn't
mean that the fallback must be the straight full path filename
comparison as it is today.

Thanks,
Pedro Alves
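
For illustration only, here is a minimal Python sketch of the build-id-only
matching idea discussed above.  It is not GDB's implementation.  It assumes
the raw bytes of a little-endian ELF note section or segment have already
been read from each binary, and the function names build_id_from_notes and
check_exec_file are hypothetical.  It looks for the GNU build-ID note
(owner name "GNU", note type 3) and reports "unknown" rather than falling
back to filenames when either side has no build ID.

```python
import struct

NT_GNU_BUILD_ID = 3  # ELF note type that carries the GNU build ID


def build_id_from_notes(notes: bytes):
    """Return the build ID as a hex string, or None if no such note exists.

    `notes` is assumed to hold the raw contents of a little-endian ELF
    note section/segment (for example .note.gnu.build-id).
    """
    offset = 0
    while offset + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, note_type = struct.unpack_from("<III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = notes[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None


def check_exec_file(loaded_id, running_id):
    """Compare two build IDs; never fall back to comparing filenames."""
    if loaded_id is None or running_id is None:
        return "unknown"  # one binary has no build ID, so nothing is proven
    return "match" if loaded_id == running_id else "mismatch"
```

With this policy, a copied executable (orig_exe vs. copy_orig_exe) yields
"match" regardless of its path, while a rebuilt or unrelated binary yields
"mismatch".  A binary linked without a build ID yields "unknown", which is
where any stronger fallback heuristic would have to plug in.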