From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 71E5A3858402 for ; Tue, 14 Sep 2021 15:21:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 71E5A3858402 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-588-lmiFYO_FNGGa3gcRdbvocw-1; Tue, 14 Sep 2021 11:21:01 -0400 X-MC-Unique: lmiFYO_FNGGa3gcRdbvocw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3C3EB180830C for ; Tue, 14 Sep 2021 15:21:00 +0000 (UTC) Received: from redhat.com (ovpn-112-63.phx2.redhat.com [10.3.112.63]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0C4366608B for ; Tue, 14 Sep 2021 15:21:00 +0000 (UTC) Received: from fche by redhat.com with local (Exim 4.94.2) (envelope-from ) id 1mQAF0-0000a0-OF for elfutils-devel@sourceware.org; Tue, 14 Sep 2021 11:20:58 -0400 Date: Tue, 14 Sep 2021 11:20:58 -0400 From: "Frank Ch. Eigler" To: elfutils-devel@sourceware.org Subject: [patch] PR28339: debuginfod groom/scan concurrency fix Message-ID: <20210914152058.GG19490@redhat.com> MIME-Version: 1.0 User-Agent: Mutt/1.12.0 (2019-05-25) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Sep 2021 15:21:06 -0000 commit ce695bedf073127883bbbaab528dd1f2b0e955f1 (HEAD -> master) Author: Frank Ch. Eigler Date: Tue Sep 14 08:15:23 2021 -0400 PR28339: debuginfod: fix groom/scan race condition on just-emptied queue debuginfod's scan and groom operations (thread_main_scanner, thread_main_fts_source_paths) are intended to be mutually exclusive, as a substitute for more complicated sql transaction batching. (This is because scanning / grooming involves inserting or deleting data from multiple related tables.) The workq class that governs this in debuginfod.cxx has a problem: if the workq just becomes empty, its sole entry pulled by a scanner thread in response to a wait_front(), an 'idler' groomer thread is ALSO permitted to run, because there is no indication as to when the scanner thread operation finishes, only when it starts. Extending the workq with a counter ("fronters") to track any active scanning activity (even if the workq is empty) lets us block idlers groomers a little longer. Signed-off-by: Frank Ch. Eigler diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index c54598239c82..07ddad3019d9 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -1,3 +1,11 @@ +2021-09-14 Frank Ch. Eigler + + PRPR28339 + * debuginfod.cxx (waitq::fronters): New field. + (waitq::wait_idle): Respect it. + (waitq::done_front): New function. + (thread_main_scanner): Call it to match wait_front(). + 2021-08-28 Mark Wielaard * debuginfod.cxx (parse_opt): Turn the -d arg ":memory:" into diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx index 3269f657cc8d..6387a27ff820 100644 --- a/debuginfod/debuginfod.cxx +++ b/debuginfod/debuginfod.cxx @@ -663,10 +663,11 @@ class workq mutex mtx; condition_variable cv; bool dead; - unsigned idlers; + unsigned idlers; // number of threads busy with wait_idle / done_idle + unsigned fronters; // number of threads busy with wait_front / done_front public: - workq() { dead = false; idlers = 0; } + workq() { dead = false; idlers = 0; fronters = 0; } ~workq() {} void push_back(const Payload& p) @@ -690,10 +691,11 @@ class workq unique_lock lock(mtx); q.clear(); set_metric("thread_work_pending","role","scan", q.size()); + // NB: there may still be some live fronters cv.notify_all(); // maybe wake up waiting idlers } - // block this scanner thread until there is work to do and no active + // block this scanner thread until there is work to do and no active idler bool wait_front (Payload& p) { unique_lock lock(mtx); @@ -705,19 +707,29 @@ class workq { p = * q.begin(); q.erase (q.begin()); + fronters ++; // prevent idlers from starting awhile, even if empty q set_metric("thread_work_pending","role","scan", q.size()); - if (q.size() == 0) - cv.notify_all(); // maybe wake up waiting idlers + // NB: don't wake up idlers yet! The consumer is busy + // processing this element until it calls done_front(). return true; } } + // notify waitq that scanner thread is done with that last item + void done_front () + { + unique_lock lock(mtx); + fronters --; + if (q.size() == 0 && fronters == 0) + cv.notify_all(); // maybe wake up waiting idlers + } + // block this idler thread until there is no work to do void wait_idle () { unique_lock lock(mtx); cv.notify_all(); // maybe wake up waiting scanners - while (!dead && (q.size() != 0)) + while (!dead && ((q.size() != 0) || fronters > 0)) cv.wait(lock); idlers ++; } @@ -3145,6 +3157,8 @@ thread_main_scanner (void* arg) e.report(cerr); } + scanq.done_front(); // let idlers run + if (fts_cached || fts_executable || fts_debuginfo || fts_sourcefiles || fts_sref || fts_sdef) {} // NB: not just if a successful scan - we might have encountered -ENOSPC & failed (void) statfs_free_enough_p(db_path, "database"); // report sqlite filesystem size