From: ASSI <Stromeko@nexgo.de>
To: "Frank Ch. Eigler" <fche@elastic.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>, overseers@sourceware.org
Subject: Re: [Cygwin] package-grep on Sourceware
Date: Mon, 25 Nov 2019 16:48:00 -0000 [thread overview]
Message-ID: <1810471.5kLSPG0ym2@gertrud> (raw)
In-Reply-To: <20191125003701.GB1093554@elastic.org>
Hi Frank,
On Monday, November 25, 2019 1:37:01 AM CET Frank Ch. Eigler wrote:
> > you've made changes to the package-grep CGI script on sourceware
> > that pretty much remove its originally intended functionality. The
> > original was searching the index of the package _contents_ as it
> > would get installed on the disk, while the JSON file you search now
> > only has the metadata for the packages. That's not even
> > superficially the same thing. [...]
>
> OK. As you may be aware, this simplification was done because the
> previous code's workload was too high (grepping through 600MB+ of text
> per cgi call), and we at overseers@ weren't told otherwise.
I wasn't questioning your motives; that part was pretty clear from the change
you made. I would like to understand which characteristic of the workload you
were trying to avoid, because to me it looks like avoiding IO comes at a
significant cost in either memory or CPU consumption (or both). I understand
that the server is a fairly well-equipped machine, but not in enough detail to
make that call.
> > I've looked into what could be done to do that in a more
> > server-friendly way, but it turns out that there either isn't
> > anything that's ready-made for this or I can't find it. [...]
>
> Thanks for looking into it. A few other possibilities to
> ameliorate the impact of restoring the previous version:
>
> - using xargs grep to use our extra cores, to reduce latency
Then I'd suggest just using ripgrep, which automatically spreads the work
across the available cores (and can be limited to fewer than that if so
desired). Is that available on the machine? Would it be possible to cache the
list files in a tmpfs? This doesn't improve performance on my machine, but I
have an SSD and no other IO.
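For reference, the two approaches could be sketched roughly like this (the
directory path, core counts, and file names are made up for illustration, not
taken from the actual setup on sourceware):

```shell
# Demo: parallel grep over package list files.
# /tmp/pkglist-demo stands in for the real list-file directory (hypothetical).
mkdir -p /tmp/pkglist-demo
printf 'usr/bin/foo\nusr/share/doc/foo/README\n' > /tmp/pkglist-demo/foo.lst
printf 'usr/bin/bar\n' > /tmp/pkglist-demo/bar.lst

# xargs -P fans the files out over several grep processes (here 4):
find /tmp/pkglist-demo -name '*.lst' -print0 \
    | xargs -0 -P 4 -n 16 grep -l 'usr/bin/foo'

# ripgrep does the same with an internal thread pool, capped via -j:
#   rg -l -j 4 'usr/bin/foo' /tmp/pkglist-demo

# Keeping the lists on a tmpfs would avoid disk IO entirely (needs root):
#   mount -t tmpfs -o size=1g tmpfs /path/to/list-dir
```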
> - to impose a mod_qos concurrency limit on package-grep.cgi
I can't really tell what that would mean in practice, but I guess that queries
would get delayed if too many of them stack up on each other? That would help
with the load, of course.
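If I read the mod_qos documentation correctly, such a limit would presumably
look something like the following (a sketch only; I haven't tested this, and
the exact directive and limit value would need checking against the module's
docs):

```apache
# Sketch: cap concurrent requests to the CGI at 2; further requests
# are rejected with an error rather than piling up (assumed behaviour).
<IfModule qos_module>
    QS_LocRequestLimitMatch "^/cgi-bin/package-grep\.cgi$" 2
</IfModule>
```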
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs