From: ASSI <Stromeko@nexgo.de>
To: "Frank Ch. Eigler" <fche@elastic.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>, overseers@sourceware.org
Subject: Re: [Cygwin] package-grep on Sourceware
Date: Mon, 25 Nov 2019 16:48:00 -0000 [thread overview]
Message-ID: <1810471.5kLSPG0ym2@gertrud> (raw)
In-Reply-To: <20191125003701.GB1093554@elastic.org>
Hi Frank,
On Monday, November 25, 2019 1:37:01 AM CET Frank Ch. Eigler wrote:
> > you've made changes to the package-grep CGI script on sourceware
> > that pretty much remove its originally intended functionality. The
> > original was searching the index of the package _contents_ as it
> > would get installed on the disk, while the JSON file you search now
> > only has the metadata for the packages. That's not even
> > superficially the same thing. [...]
>
> OK. As you may be aware, this simplification was done because the
> previous code's workload was too high (grepping through 600MB+ of text
> per cgi call), and we at overseers@ weren't told otherwise.
I wasn't questioning your motives; that part was pretty clear from the change
you made. I would like to understand which characteristic of the workload you
were trying to avoid, because to me it looks like avoiding IO comes at a
significant cost in either memory or CPU consumption (or both). I understand
that the server is a fairly well-equipped machine, but not in enough detail to
make that call.
> > I've looked into what could be done to do that in a more
> > server-friendly way, but it turns out that there either isn't
> > anything that's ready-made for this or I can't find it. [...]
>
> Thanks for looking into it. A few other possibilities to
> ameliorate the impact of restoring the previous version:
>
> - using xargs grep to use our extra cores, to reduce latency
Then I'd suggest just using ripgrep, which automatically spreads the work
across the available cores (and can be limited to fewer than that if so
desired). Is that available on the machine? Would it be possible to cache the
list files in a tmpfs? This doesn't improve performance on my machine, but I
have an SSD and no other IO.
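For reference, the two approaches could be sketched roughly like this (the
directory path, core counts, and file names are made up for illustration, not
taken from the actual setup on sourceware):

```shell
# Demo: parallel grep over package list files.
# /tmp/pkglist-demo stands in for the real list-file directory (hypothetical).
mkdir -p /tmp/pkglist-demo
printf 'usr/bin/foo\nusr/share/doc/foo/README\n' > /tmp/pkglist-demo/foo.lst
printf 'usr/bin/bar\n' > /tmp/pkglist-demo/bar.lst

# xargs -P fans the files out over several grep processes (here 4):
find /tmp/pkglist-demo -name '*.lst' -print0 \
    | xargs -0 -P 4 -n 16 grep -l 'usr/bin/foo'

# ripgrep does the same with an internal thread pool, capped via -j:
#   rg -l -j 4 'usr/bin/foo' /tmp/pkglist-demo

# Keeping the lists on a tmpfs would avoid disk IO entirely (needs root):
#   mount -t tmpfs -o size=1g tmpfs /path/to/list-dir
```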
> - to impose a mod_qos concurrency limit on package-grep.cgi
I can't really tell what that would mean in practice, but I guess that queries
would get delayed if too many of them stack up on each other? That would help
with the load, of course.
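If I read the mod_qos documentation correctly, such a limit would presumably
look something like the following (a sketch only; I haven't tested this, and
the exact directive and limit value would need checking against the module's
docs):

```apache
# Sketch: cap concurrent requests to the CGI at 2; further requests
# are rejected with an error rather than piling up (assumed behaviour).
<IfModule qos_module>
    QS_LocRequestLimitMatch "^/cgi-bin/package-grep\.cgi$" 2
</IfModule>
```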
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs