From: ASSI
To: "Frank Ch. Eigler"
Cc: Jon Turney, overseers@sourceware.org
Subject: Re: [Cygwin] package-grep on Sourceware
Date: Mon, 25 Nov 2019 16:48:00 -0000
Message-ID: <1810471.5kLSPG0ym2@gertrud>
In-Reply-To: <20191125003701.GB1093554@elastic.org>
References: <24997709.IERGFR1AUN@gertrud> <20191125003701.GB1093554@elastic.org>

Hi Frank,

On Monday, November 25, 2019 1:37:01 AM CET Frank Ch. Eigler wrote:
> > you've made changes to the package-grep CGI script on sourceware
> > that pretty much remove its originally intended functionality. The
> > original was searching the index of the package _contents_ as it
> > would get installed on the disk, while the JSON file you search now
> > only has the metadata for the packages. That's not even
> > superficially the same thing. [...]
>
> OK. As you may be aware, this simplification was done because the
> previous code's workload was too high (grepping through 600MB+ of text
> per cgi call), and we at overseers@ weren't told otherwise.

I wasn't questioning your motives; that part was pretty clear from the
change you made. I would like to understand which characteristic of the
workload you were trying to avoid, since to me it looks like avoiding IO
comes at a significant cost in memory or CPU consumption (or both). I
understand that the server is a fairly well-equipped machine, but I don't
know it in enough detail to make that call.

> > I've looked into what could be done to do that in a more
> > server-friendly way, but it turns out that there either isn't
> > anything that's ready-made for this or I can't find it. [...]
>
> Thanks for looking into it. A few other possibilities to
> ameliorate the impact of restoring the previous version:
>
> - using xargs grep to use our extra cores, to reduce latency

Then I'd suggest just using ripgrep, which automatically spreads the work
across the available cores (and can be limited to fewer than that if so
desired). Is that available on the machine? Would it be possible to
cache the list files in a tmpfs? Caching doesn't improve performance on
my machine, but I have an SSD and no other IO load.

> - to impose a mod_qos concurrency limit on package-grep.cgi

I can't really tell what that would mean, but I guess that queries would
get delayed if too many of them pile up? That would help with the load,
of course.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs
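
The guess above about a concurrency limit (queries get delayed when too
many of them arrive at once) can be sketched in a few lines of Python.
This only illustrates the general queueing idea; it is not a description
of how mod_qos behaves or is configured, and whether mod_qos delays or
rejects excess requests is not something the sketch claims. The names
LimitedSearch, slow_search and run_demo are hypothetical, and the sleep
merely stands in for the actual grep over the package listings.

```python
import threading
import time


class LimitedSearch:
    """Let at most `limit` searches run at once; further callers wait."""

    def __init__(self, search, limit):
        self._search = search
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0   # searches running right now
        self.peak = 0     # highest number seen running together

    def __call__(self, query):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return self._search(query)
            finally:
                with self._lock:
                    self.active -= 1


def slow_search(query):
    # Assumption: placeholder for the expensive search over list files.
    time.sleep(0.05)
    return f"results for {query}"


def run_demo(n_queries=8, limit=2):
    """Fire n_queries concurrent requests through a limit-sized gate."""
    limited = LimitedSearch(slow_search, limit)
    results = [None] * n_queries

    def worker(i):
        results[i] = limited(f"q{i}")

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_queries)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return limited.peak, results


if __name__ == "__main__":
    peak, results = run_demo()
    print("peak concurrent searches:", peak)
    print("answered queries:", len(results))
```

With a limit of 2, all eight queries are still answered, but never more
than two searches run together; the rest simply wait their turn, which
trades latency under load for a bounded IO and CPU footprint.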