From: "Serhei Makarov" <me@serhei.io>
To: Bunsen <bunsen@sourceware.org>
Subject: Re: bunsen (re)design discussion #2: Repository Layout, SQLite design idea + Django-esque API(?)
Date: Sat, 19 Mar 2022 22:59:13 -0400
Message-ID: <c7d926d8-1da9-4379-bf01-67ec8a478ae2@www.fastmail.com>
In-Reply-To: <1796a6e2-2b2a-49a7-b350-9d58700d3e30@www.fastmail.com>

> This is a subject of ongoing discussion with fche, because we have
> quite different opinions about what is convenient to keep in Git vs
> SQLite, how flexible the Git layout should be etc. In principle, the
> analysis and data representation will work more or less the same
> regardless of what solution we settle on. Therefore, the repository
> layout can be made configurable (and potentially the configurability
> can be reduced down the line as we settle on what makes sense and what
> doesn't).

Git+JSON (the current format)
- Good for cloning
   (git clone + git pull, options for shallow cloning,
    cloning subsets of branches)

SQLite
- Awful for cloning
   (essentially requires redoing the parse on the receiving end,
    or converting the data back to the Git+JSON format)
- Likely to be more efficient at certain queries, with caveats:
  - The 'sliding window' analyses that compare nearby points in a history
    would be possible but tedious to express
  - Designing a *stable* schema for analyses to query against would be tricky,
    requiring space-optimized tables to be munged into views
    following an unchanging schema.
    Otherwise any change to optimizations will require embedded SQL
    in analysis scripts to be rewritten.
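
To make the last two points concrete, here is a rough sketch using Python's
built-in sqlite3 module. The table and view names (names, outcomes, results,
testcase_results) and the run_seq column are made up for illustration and are
not an actual Bunsen schema. The storage tables intern strings to save space,
a view presents a flat schema to analyses, and a sliding-window comparison of
adjacent testruns is written as a self-join against that view:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Space-optimized storage (hypothetical): names and outcomes are interned.
    CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE outcomes (id INTEGER PRIMARY KEY, outcome TEXT UNIQUE);
    CREATE TABLE results (run_seq INTEGER, name_id INTEGER, outcome_id INTEGER);

    -- Stable schema for analyses: flat columns that hide the interning.
    CREATE VIEW testcase_results AS
        SELECT r.run_seq AS run_seq, n.name AS name, o.outcome AS outcome
        FROM results r
        JOIN names n ON n.id = r.name_id
        JOIN outcomes o ON o.id = r.outcome_id;
""")
conn.executemany("INSERT INTO names VALUES (?, ?)",
                 [(1, "a.exp"), (2, "b.exp")])
conn.executemany("INSERT INTO outcomes VALUES (?, ?)",
                 [(1, "PASS"), (2, "FAIL")])
# Three consecutive testruns; b.exp starts failing in run 2.
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 2, 1),
    (2, 1, 1), (2, 2, 2),
    (3, 1, 1), (3, 2, 2),
])

# Sliding window of size 2: pair each result with the same testcase's
# result in the immediately preceding run, keep only changed outcomes.
changes = conn.execute("""
    SELECT cur.run_seq, cur.name, prev.outcome, cur.outcome
    FROM testcase_results cur
    JOIN testcase_results prev
      ON prev.name = cur.name AND prev.run_seq = cur.run_seq - 1
    WHERE prev.outcome <> cur.outcome
    ORDER BY cur.run_seq, cur.name
""").fetchall()
print(changes)
```

If the storage tables are later reorganized, only the view definition has to
change; the query embedded in the analysis stays as it is.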

Given the toss-up and potential complementary strengths,
it would be best to have a way to support both formats and experiment
extensively.

The configuration file could allow options as follows:

[core]
git_repo = ... // stores both testlogs and testruns
testlogs_repo = ... // testlogs in Git format with properly named branches
testruns_repo = ... // testruns in JSON format
testruns_db = ... // enables data to be stored in SQLite DB
cache_db = ... // can be the same as testruns_db

(Obviously, not all at once. This is meant to capture the possible use cases.)

Then we could also add options per-project:

[project "systemtap"]
raw_testlogs_repo = ... // testlogs in Git format with any branches, for importing
parse_module = systemtap.parse_dejagnu

[project "systemtap-contrib"]
testlogs_repo = ... // archive testruns in a separate repo (or db) from the main repo
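
As a rough sketch of how such a file could be read, the following uses
Python's standard configparser. The paths are placeholders, the helper names
(load_config, project_option) are made up for illustration, and the rule that
a per-project option overrides the [core] value is my assumption rather than
settled behaviour:

```python
import configparser

# Placeholder paths; the '//' comments follow the notation used above.
SAMPLE = """
[core]
testlogs_repo = /srv/bunsen/testlogs.git // hypothetical path
testruns_db = /srv/bunsen/testruns.sqlite // hypothetical path

[project "systemtap"]
parse_module = systemtap.parse_dejagnu

[project "systemtap-contrib"]
testlogs_repo = /srv/bunsen/contrib.git // hypothetical path
"""

def load_config(text):
    # Treat ' //' as an inline comment marker; disable '%' interpolation.
    cp = configparser.ConfigParser(inline_comment_prefixes=("//",),
                                   interpolation=None)
    cp.read_string(text)
    return cp

def project_option(cp, project, key):
    """Return the project's value for key, else the [core] value, else None."""
    section = f'project "{project}"'
    if cp.has_option(section, key):
        return cp.get(section, key)
    return cp.get("core", key, fallback=None)

cfg = load_config(SAMPLE)
print(project_option(cfg, "systemtap", "testlogs_repo"))
print(project_option(cfg, "systemtap-contrib", "testlogs_repo"))
```

Here "systemtap" falls back to the repo named in [core], while
"systemtap-contrib" picks up its own testlogs_repo.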

I'll think a bit more about the configuration format in light of the 'Makefile' model.
