From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112]) by sourceware.org (Postfix) with ESMTPS id 3C49D385743A for ; Wed, 7 Jul 2021 15:28:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3C49D385743A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org Received: from tarox.wildebeest.org (83-87-18-245.cable.dynamic.v4.ziggo.nl [83.87.18.245]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id B6710302FBA6; Wed, 7 Jul 2021 17:28:17 +0200 (CEST) Received: by tarox.wildebeest.org (Postfix, from userid 1000) id 1CAB44003072; Wed, 7 Jul 2021 17:28:17 +0200 (CEST) Message-ID: <433ef20b71cd63b5f755e9f21523dee540724409.camel@klomp.org> Subject: Re: PR27711 From: Mark Wielaard To: Noah Sanci , elfutils-devel@sourceware.org Date: Wed, 07 Jul 2021 17:28:16 +0200 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Mailer: Evolution 3.28.5 (3.28.5-10.el7) Mime-Version: 1.0 X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_BADIPHTTP, KAM_DMARC_STATUS, KAM_NUMSUBJECT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 15:28:21 -0000 Hi Noah, If possible please add a commit message and sent the patch using git send-email or git format-patch. Feel free to use the description of=20 https://sourceware.org/bugzilla/show_bug.cgi?id=3D27711#c0 that Frank wrote as commit message. It clearly explains the intention. On Thu, 2021-07-01 at 16:38 -0400, Noah Sanci via Elfutils-devel wrote: > diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog > index 286c910a..5afbafcd 100644 > --- a/debuginfod/ChangeLog > +++ b/debuginfod/ChangeLog > @@ -1,3 +1,8 @@ > +2021-07-01 Noah Sanci > + PR27711 > + * debuginfod.cxx (groom): Allowed the use of regexes during the > + grooming process. Slightly better would be to describe the changes. e.g. (options): Add --regex-groon, -r option. (regex_groom): New static bool defaults to false. (parse_opt): Handle 'r' option by setting regex_groom to true. (groom): Introduce and use reg_include and reg_exclude. > 2021-06-03 Frank Ch. Eigler >=20 > PR27863 > diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx > index 543044c6..4f7fd2d5 100644 > --- a/debuginfod/debuginfod.cxx > +++ b/debuginfod/debuginfod.cxx > @@ -360,6 +360,7 @@ static const struct argp_option options[] =3D > { "database", 'd', "FILE", 0, "Path to sqlite database.", 0 }, > { "ddl", 'D', "SQL", 0, "Apply extra sqlite ddl/pragma to connection.= ", 0 }, > { "verbose", 'v', NULL, 0, "Increase verbosity.", 0 }, > + { "regex-groom", 'r', NULL, 0,"Uses regexes from -I and -X > arguments to groom the database.",0}, > #define ARGP_KEY_FDCACHE_FDS 0x1001 > { "fdcache-fds", ARGP_KEY_FDCACHE_FDS, "NUM", 0, "Maximum number > of archive files to keep in fdcache.", 0 }, > #define ARGP_KEY_FDCACHE_MBS 0x1002 > @@ -407,6 +408,7 @@ static map scan_archives; > static vector extra_ddl; > static regex_t file_include_regex; > static regex_t file_exclude_regex; > +static bool regex_groom =3D false; > static bool traverse_logical; > static long fdcache_fds; > static long fdcache_mbs; > @@ -527,6 +529,9 @@ parse_opt (int key, char *arg, > if (rc !=3D 0) > argp_failure(state, 1, EINVAL, "regular expression"); > break; > + case 'r': > + regex_groom =3D true; > + break; > case ARGP_KEY_FDCACHE_FDS: > fdcache_fds =3D atol (arg); > break; > @@ -3249,8 +3254,11 @@ void groom() > int64_t fileid =3D sqlite3_column_int64 (files, 1); > const char* filename =3D ((const char*) sqlite3_column_text > (files, 2) ?: ""); > struct stat s; > + bool reg_include =3D !regexec (&file_include_regex, filename, 0, 0= , 0); > + bool reg_exclude =3D !regexec (&file_exclude_regex, filename, 0, 0= , 0); > + > rc =3D stat(filename, &s); > - if (rc < 0 || (mtime !=3D (int64_t) s.st_mtime)) > + if ( (regex_groom && reg_exclude && !reg_include) || rc < 0 || > (mtime !=3D (int64_t) s.st_mtime) ) > { OK, so we groom the file as before rc < 0 || (mtime !=3D (int64_t) s.st_mtime) ) But also (if -r is given) if the file matches the exclude regexp, but not the include one. So if I read this right, an exclude regexp match means groom that file, but an include rexexp match means, don't groom (except if it disappeared on itself). > if (verbose > 2) > obatched(clog) << "groom: forgetting file=3D" << filename > << " mtime=3D" << mtime << endl; > @@ -3261,7 +3269,6 @@ void groom() > } > else > inc_metric("groomed_total", "decision", "fresh"); > - > if (sigusr1 !=3D forced_rescan_count) // stop early if scan trigge= red > break; > } Spurious line removed? > diff --git a/doc/debuginfod.8 b/doc/debuginfod.8 > index 1ba42cf6..1adf703a 100644 > --- a/doc/debuginfod.8 > +++ b/doc/debuginfod.8 > @@ -159,6 +159,9 @@ scan, independent of the rescan time (including if > it was zero), > interrupting a groom pass (if any). >=20 > .TP > +.B "\-r" > +Apply the -I and -X during groom cycles, so that files excluded by > the regexes are removed from the index. These parameters are in > addition to what normally qualifies a file for grooming, not a > replacement. > + OK. That matches my reading of the code. Good. > .B "\-g SECONDS" "\-\-groom\-time=3DSECONDS" > Set the groom time for the index database. This is the amount of time > the grooming thread will wait after finishing a grooming pass before > diff --git a/tests/ChangeLog b/tests/ChangeLog > index d8fa97fa..346b9e6e 100644 > --- a/tests/ChangeLog > +++ b/tests/ChangeLog > @@ -1,3 +1,8 @@ > +2021-07-01 Noah Sanci > + PR2711 > + * run-debuginfod-find.sh: Added test case for grooming the databa= se > + using regexes. > + > 2021-06-16 Frank Ch. Eigler >=20 > * run-debuginfod-find.sh: Fix intermittent groom/stale failure, > diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh > index 456dc2f8..bd78bf46 100755 > --- a/tests/run-debuginfod-find.sh > +++ b/tests/run-debuginfod-find.sh > @@ -36,13 +36,14 @@ export DEBUGINFOD_CACHE_PATH=3D${PWD}/.client_cache > PID1=3D0 > PID2=3D0 > PID3=3D0 > +PID4=3D0 >=20 > cleanup() > { > - if [ $PID1 -ne 0 ]; then kill $PID1 || true; wait $PID1; fi > - if [ $PID2 -ne 0 ]; then kill $PID2 || true; wait $PID2; fi > - if [ $PID3 -ne 0 ]; then kill $PID3 || true; wait $PID3; fi > - > + if [ $PID1 -ne 0 ]; then kill $PID1; wait $PID1; fi > + if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi > + if [ $PID3 -ne 0 ]; then kill $PID3; wait $PID3; fi > + if [ $PID4 -ne 0 ]; then kill $PID4; wait $PID4; fi > rm -rf F R D L Z ${PWD}/foobar ${PWD}/mocktree > ${PWD}/.client_cache* ${PWD}/tmp* > exit_cleanup > } > @@ -293,7 +294,7 @@ kill -USR1 $PID1 > wait_ready $PORT1 'thread_work_total{role=3D"traverse"}' 3 > wait_ready $PORT1 'thread_work_pending{role=3D"scan"}' 0 > wait_ready $PORT1 'thread_busy{role=3D"scan"}' 0 > - > +cp $DB $DB.backup > # Rerun same tests for the prog2 binary > filename=3D`testrun ${abs_top_builddir}/debuginfod/debuginfod-find -v > debuginfo $BUILDID2 2>vlog` > cmp $filename F/prog2 > @@ -705,4 +706,29 @@ DEBUGINFOD_URLS=3D"file://${PWD}/mocktree/" > filename=3D`testrun ${abs_top_builddir}/debuginfod/debuginfod-find > source aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd /my/path/main.c` > cmp $filename ${local_dir}/main.c >=20 > -exit 0 > +######################################################################## > +## PR27711 > +# Test to ensure the -A removes files from the index using a given regex > +while true; do > + PORT3=3D`expr '(' $RANDOM % 1000 ')' + 9000` > + ss -atn | fgrep ":$PORT3" || break > +done > +env LD_LIBRARY_PATH=3D$ldpath > DEBUGINFOD_URLS=3D"http://127.0.0.1:$PORT3/" > ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -p $PORT3 -t0 -g0 > --regex-groom --include=3D"^$" --exclude=3D".*" -d $DB.backup > > vlog$PORT3 2>&1 & > +PID4=3D$! > +wait_ready $PORT3 'ready' 1 > +tempfiles vlog$PORT3 > +errfiles vlog$PORT3 > + > +kill -USR2 $PID4 > +wait_ready $PORT3 'thread_work_total{role=3D"groom"}' 1 > +wait_ready $PORT3 'groom{statistic=3D"archive d/e"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"archive sdef"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"archive sref"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"buildids"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"file d/e"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"file s"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"files scanned (#)"}' 0 > +wait_ready $PORT3 'groom{statistic=3D"files scanned (mb)"}' 0 > + > +kill $PID4 > +exit 0; OK, so this checks that nothing ^$ gets included (kept) and everything .* gets excluded (removed/groomed)? Thanks, Mark