From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 70CA93896C3A for ; Fri, 21 Jun 2024 12:06:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 70CA93896C3A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 70CA93896C3A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718971571; cv=none; b=kfYeeZa+mUBqPilX8CckTIRDwUrWNO9zIdMCG9E95paiZ7klD6x4jPLXw+ZktvXwo8P1nK7FqUud1O/6v82aFdaE3NIDjD4hovc3j+SCXqeb0JT2Lxv6RjKq57/KPu1mXl0c1RxmfNE2SwWM9NMsb1Mgm2XbBZf65ZqHPPrlges= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718971571; c=relaxed/simple; bh=nuoL2UswS1KpQXtiwhasoCcN8JHGrtJj5VvNOLb9gYQ=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=Ps18ysB6WO3ID8PSJvS2xeDUQjZZMWYshdvR5NZSt6z4N4sXWDNZaLUWR86CVH84mQx85VMMdkVSRigQhgF0UCnHM/6uHh50KabIa/Q/ZGjJ47NHBiApLq3s5+74+uz1C0eJk9QvA2A/70zEt3xwjrVD9Y8ubHWSc1zcRZa4pJY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1718971569; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=CdXnMZMxDN5+n4axujtgjkDKMeMhvYMuZsFqi97O87w=; b=O/WjcDMRrALlOF8mE7fOeQavy0iUdsm82mSM71ZgZDj7Vp82DT8NFolC6tDsuAHOKlP59O rePFJX5XqegdwJqeh5rHmI+k9kia3SI+iLeNvdGHzUiYDIvWXYSbZcyH8CHBKbgey53poU vJ7Ww66biBIROV4UYwnC3en5CWp5a7w= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-680-afkhrl0BP3-6kVAwZB7ZwQ-1; Fri, 21 Jun 2024 08:06:07 -0400 X-MC-Unique: afkhrl0BP3-6kVAwZB7ZwQ-1 Received: by mail-qk1-f197.google.com with SMTP id af79cd13be357-79bbc561cadso277915285a.1 for ; Fri, 21 Jun 2024 05:06:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718971567; x=1719576367; h=mime-version:user-agent:message-id:date:cc:organization:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=CdXnMZMxDN5+n4axujtgjkDKMeMhvYMuZsFqi97O87w=; b=FVyHGBfZB+2fMPeQWiEQfXzJwYWAhGdLPYchSF553ULRoOR2M/xpjGltBRMPzJlPsb uGKDlNId6K3reOq5TqWWuJpluKLQFfSOSRvrMLOTRlB1vzR4EyJ6mRAE9xF9T0g6uHqH rgbtdC7Un0JUCjxU+b+52eY5rNomlyakPb0CHRGB7yHvHalsbZBefY1u6iN19YYVzJUd kwFniX4xlZI3aHbWtErvDgj48+58PfteE97+NT+ZFom6LvdHleW+5LxJ159wTSng4Svq 6Y134Og5cmgLLqrLIRav4+IS0NrBiL+JmvUofRAJHKHqhPuQ5RGXyqSVTIBznKFIC4Ze WkeA== X-Gm-Message-State: AOJu0Yzb0M37kN7S/wgYkczPs/wDaUPFrteyR5ldYPw7ub1dIZmLjJa2 F8jWcsipXOgDEHH0GOto+JbZ8JCt8Ohm43wNyI1fL3CA3NoTxMViFf8C16LVHKnndpoODEB2o8/ lPTqi1aDtB06aEBTo9HJx0Rps84XIGdrs6OzIlZ/U9mJV4EaKWY426lpOs1zWppdIJA== X-Received: by 2002:a05:6214:91:b0:6b0:8fb7:cd0 with SMTP id 6a1803df08f44-6b51be6b80amr28330616d6.43.1718971567154; Fri, 21 Jun 2024 05:06:07 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGLsLBX4GXxXJDj9SYaz1YFovBUpzmPEGqgqarE13VyxQGkDN+YHUahMSf06hP695dzhZQlcA== X-Received: by 2002:a05:6214:91:b0:6b0:8fb7:cd0 with SMTP id 6a1803df08f44-6b51be6b80amr28330376d6.43.1718971566799; Fri, 21 Jun 2024 05:06:06 -0700 (PDT) Received: from localhost (88-120-130-27.subs.proxad.net. [88.120.130.27]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b51ecfec84sm8207926d6.19.2024.06.21.05.06.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 21 Jun 2024 05:06:06 -0700 (PDT) Received: by localhost (Postfix, from userid 1000) id 49EEFC1B7841; Fri, 21 Jun 2024 14:06:04 +0200 (CEST) From: Dodji Seketeli To: claudiu.zissulescu-ianculescu@oracle.com Subject: [PATCH 1/2] ctf-reader: Fix re-initialization of the CTF reader Organization: Red Hat / France cc: libabigail@sourceware.org X-Operating-System: AlmaLinux 9.4 X-URL: http://www.redhat.com Date: Fri, 21 Jun 2024 14:06:04 +0200 Message-ID: <87r0cqxzyr.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hello, When analyzing a Linux Kernel tree made of vmlinux and loadable module binaries, the same (CTF) binary reader is re-used to load every single binaries in a loop. That can be seen, for instance, by reading the code of load_vmlinux_corpus in abg-tools-utils.cc. As part of that process, abigail::ctf::reader::initialize is invoked prior to using the reader to read type information from each binary. But then, looking at things a bit closer, I realized that ctf::reader::initialize is failing to reset the reader::ctfa data member. That leads to a memory leak that was making things grow out of proportion. Also, the resetting code fails to actually clear out the map of types that are to be sorted and canonicalized. That leads to unnecessarily sorting huge amount of types. The patch address the two points above. Apart from the obvious gain in code integrity, this patch significantly reduces the time taken to analyze a Linux Kernel tree. The patch has been successfully tested in the CI on sourceware. OK to apply to the mainline? * src/abg-ctf-reader.cc (reader::reader): Initialize ctfa, ctf_sect, symtab_sect and strtab_sect data members. (reader::initialize): In the overload taking no argument, make sure to free the ctfa data member before setting it to nullptr. In the overload that takes argument, make sure to invoke reader::initialize() to really free the data used. Also, improve the comments. Signed-off-by: Dodji Seketeli --- src/abg-ctf-reader.cc | 48 ++++++++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 21 deletions(-) diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc index f4fb91ed..8810e138 100644 --- a/src/abg-ctf-reader.cc +++ b/src/abg-ctf-reader.cc @@ -238,7 +238,8 @@ public: reader(const string& elf_path, const vector& debug_info_root_paths, environment& env) - : elf_based_reader(elf_path, debug_info_root_paths, env) + : elf_based_reader(elf_path, debug_info_root_paths, env), + ctfa(), ctf_sect(), symtab_sect(), strtab_sect() { initialize(); } @@ -248,17 +249,21 @@ public: /// This is useful to clear out the data used by the reader and get /// it ready to be used again. /// - /// Note that the reader eeps the same environment it has been - /// originally created with. + /// Note that the reader keeps (doesn't clear) the same environment + /// it has been originally created with. /// /// Please also note that the life time of this environment object - /// must be greater than the life time of the resulting @ref - /// reader the context uses resources that are allocated in - /// the environment. + /// must be greater than the life time of the resulting @ref reader + /// the context uses resources that are allocated in the + /// environment. void initialize() { - ctfa = nullptr; + if (ctfa) + { + ctf_close(ctfa); + ctfa = nullptr; + } types_map.clear(); cur_tu_.reset(); corpus_group().reset(); @@ -266,6 +271,16 @@ public: /// Initializer of the reader. /// + /// This first makes sure the data used by the reader is cleared. + /// And then it initlizes it with the information passed in + /// argument. + /// + /// This is useful to clear out the data used by the reader and get + /// it ready to be used again. + /// + /// Note that the reader keeps the same environment it has been + /// originally created with. + /// /// @param elf_path the new path to the new ELF file to use. /// /// @param debug_info_root_paths a vector of paths to use to look @@ -275,22 +290,13 @@ public: /// /// @param linux_kernel_mode currently not used. /// - /// This is useful to clear out the data used by the reader and get - /// it ready to be used again. - /// - /// Note that the reader eeps the same environment it has been - /// originally created with. - /// - /// Please also note that the life time of this environment object - /// must be greater than the life time of the resulting @ref - /// reader the context uses resources that are allocated in - /// the environment. void - initialize(const string& elf_path, - const vector& debug_info_root_paths, - bool load_all_types = false, - bool linux_kernel_mode = false) + initialize(const string& elf_path, + const vector& debug_info_root_paths, + bool load_all_types = false, + bool linux_kernel_mode = false) { + initialize(); load_all_types = load_all_types; linux_kernel_mode = linux_kernel_mode; elf_based_reader::initialize(elf_path, debug_info_root_paths); -- 2.43.0 -- Dodji