public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Mike Frysinger <vapier@gentoo.org>
To: Andrew Burgess <aburgess@redhat.com>
Cc: jaydeep.patil@imgtec.com, gdb-patches@sourceware.org,
	joseph.faulls@imgtec.com, bhushan.attarde@imgtec.com
Subject: Re: [PATCH v2 1/3] [sim/riscv] Add basic semi-hosting support
Date: Mon, 18 Dec 2023 18:06:19 -0500	[thread overview]
Message-ID: <ZYDQa8jVST6MuvUL@vapier> (raw)
In-Reply-To: <87sf3z4r4o.fsf@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 8304 bytes --]

On 18 Dec 2023 12:44, Andrew Burgess wrote:
> Mike Frysinger <vapier@gentoo.org> writes:
> > On 12 Dec 2023 17:24, Andrew Burgess wrote:
> >> Mike Frysinger <vapier@gentoo.org> writes:
> >> > On 30 Oct 2023 13:00, jaydeep.patil@imgtec.com wrote:
> >> >> Added support for basic semi-hosting calls OPEN, EXIT and GET_CMDLINE.
> >> >
> >> > what host environment are you implementing ?  none of the syscalls you've
> >> > defined match what have long been in the riscv libgloss port.  those are
> >> > the only ones i'd really expect at this point to be wired up.
> >> 
> >> This would be the RISC-V semihosting target, which is included in
> >> newlib (since 2020), but is separate to libgloss.
> >
> > included where exactly ?  newlib/libc/machine/riscv/ has no syscalls afaict.
> > the word "semi" doesn't really appear anywhere in the codebase.
> 
> I was referring to newlib the project rather than newlib the libc
> library.  Which you figured out once you did a grep ...

some ports put their syscalls under newlib/.  it was more common in the past,
but the point of libgloss was to pull such things out.  it's why i was precise
when i said "libgloss" and not generically "newlib".  new ports should not be
putting things like syscalls into newlib, they should be in libgloss.

the exact path is important because we automate importing of syscalls from
libgloss & newlib (see sim/common/gennltvals.py), and handcoding the table in
the sim is not what we want at all.

> > if you're referring to this commit, it's in libgloss, not newlib.
> > https://sourceware.org/git/?p=newlib-cygwin.git;a=commit;h=865cd30dcc2f00c81c8b3624a9f3464138cd24a5
> 
> Yeah, it's a bit "yuck".  The commit lives in the libgloss directory,
> but actually adds a parallel set of build rules that create a
> libsemihost.a as a separate thing from libgloss.a, hence my "separate to
> libgloss" comment.  Which I'll argue is both correct and incorrect at
> the same time (correct: it's a separate library, incorrect: it's within
> the libgloss part of the newlib project tree).

the output of the libgloss/ tree is not standard by any means.  while riscv
creates a literal "libgloss.a" file, that is not the most common behavior.
considering the mess and complete lack of agreement between gcc specs and
the libgloss project outputs, better to be precise here.

> > where exactly is the riscv semihost standard defined such that people are
> > implementing the same API (source files) & ABI (the stub that processes the
> > ebreak calls) ?
> 
> This is where things get even more iffy.  As best I can tell the RISC-V
> semihosting spec is here:
> 
>   https://github.com/riscv-software-src/riscv-semihosting/blob/main/riscv-semihosting-spec.adoc
> 
> But this looks far from complete to me.  The commit that added
> semihosting to newlib (the project) can be found here:
> 
>   https://inbox.sourceware.org/newlib/694d497b-bc07-a3ba-2643-a7336927e9a7@embecosm.com/
> 
> Comments in that thread seem to indicate that the situation is more of a
> ad hoc standard, with (maybe) openocd being a first reference
> implementation?  However, it is supported by QEMU with commit
> a10b9d93ecea0a8 (though I think there are additional follow up commits
> relating to this topic) so there are definitely other simulators
> supporting this out there.

the rsicv-semihosting spec is real rough.  ignoring the opening paragraph:
> This is a Discussion Document: Assume everything can change. This document is not complete yet and was created only for the purpose of conversation outside of the document. See http://riscv.org/spec-state for more details.

it merely defines part of the calling standard.  basically how to trigger the
semihost.  it doesn't cover the actual calling convention (where do extra args
go), the syscall ABI (numbers/arguments/etc...), or how errors are passed back
(in the return register ?  sep location ?  what are the error numbers ?).  or
how are non-basic types defined & passed (e.g. is off_t always 32-bit ?  how
are 64-bit values passed on a 32-bit system ?).  padding ?  alignment ?

if there's disagreements/bugs in implementations, how do we decide which is the
"correct" behavior ?  if someone wants to add another syscall, who decides ?

the libgloss port seems like "written to work against whatever reference
implementation they had available" rather than to/with a spec.

the qemu commit also lacks details beyond the bare min spec you linked.  it
makes it sound like riscv is just reusing the semihost syscall numbers that
arm already has ?  is it just copying & pasting the entire arm ABI then ?
if that's the case, it screams even more "don't put it in riscv/".  we have
arm & aarch64 ports which could also benefit.

to compare, QEMU links to this page for ARM:
https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst
that's an actual spec -- 1500 lines of detailed content.

> >> Where libgloss syscalls use ecall, the semihosting uses ebreak with two
> >> specific nop instructions (one before, one after the ebreak).
> >
> > the calling convention doesn't really matter to the sim.  it can do either.
> >
> > the question is whether we want to support them simultaneously, or only one
> > at a time.  what are other stubs doing ?
> 
> I haven't looked.  I'll leave that as an exercise for Jaydeep.  I'd be
> open to supporting both simultaneously as a first cut ... but I guess if
> you feel strongly then I'm not against making it a requirement for this
> to be switchable... it feels like that should be simple enough.

if they're both actually developed/used, i'm fine with both.  i'm not keen on
supporting the treadmill of "new group wants new feature, designs new thing in
isolation, adds another syscall ABI".  we're left holding the bag of dead ABIs
no one cares about.  so when someone shows up with an under-specified new thing,
i'm automatically suspicious.

> >> Do you see any reason why we can't support both of these syscall
> >> libraries?  Potentially we could have a switch to select between them,
> >> but I'm inclined to say we should just support both until someone comes
> >> with a use-case where supporting both is a bad idea...  But maybe you
> >> have some deeper insights here.
> >
> > supporting them both isn't a problem.  the port, as written, isn't following
> > our existing conventions for integrating with syscalls, but before i went down
> > that hole, i wanted to understand at a higher level the diff between the two.
> > the patch def needs a lot of work either way.
> 
> For my own education; the ECALL handling does make use of sim_syscall.
> When you say: "the port, as written, isn't following our existing
> conventions for integrating with syscalls", do you mean the new
> semihosting port?  Or are you saying the (pre-patch) existing code is
> also not using the standard sim syscall mechanism?
> 
> The problem I see (at a quick glance) with the semihosting API is that
> the syscall arguments are read from memory, while the existing syscall
> mechanism seems to assume syscall arguments are in registers.  My
> initial thoughts were that getting the semihosting support to use the
> existing mechanism might not be straight forward.

i'm referring to the new semihost code in this thread.

the sim_syscall is not perfect by any means, but i'd like to focus on improving
the existing core rather than each port & syscall ABI doing it themselves.  we
end up with each port working diff depending on the host OS, and having to copy
and paste fixes/improvements between them.  i'm a bit guilty of this myself with
the Blackfin port, but in my defense, i wasn't as familiar with the codebase.

sim_syscall itself is a convenience if you have arguments in registers.  if they
are in memory, then you can unpack before calling sim_syscall_multi yourself.

other problems at a glance (as i haven't gone too deep yet):
* isn't respecting --environment framework (existing code needs fixing too)
* hardcoding syscall numbers instead of using our framework to extract
* no error handling afaict
* casting from target addresses/ints to host uintptr_t looks fundamentally wrong
-mike

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-12-18 23:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-30 13:00 [PATCH v2 0/3] sim: riscv: Compressed instruction simulation and " jaydeep.patil
2023-10-30 13:00 ` [PATCH v2 1/3] [sim/riscv] Add basic " jaydeep.patil
2023-11-29  7:57   ` Mike Frysinger
2023-12-12 17:24     ` Andrew Burgess
2023-12-13  3:43       ` Mike Frysinger
2023-12-18 12:44         ` Andrew Burgess
2023-12-18 23:06           ` Mike Frysinger [this message]
2023-12-19  6:13     ` [EXTERNAL] " Jaydeep Patil
2023-12-20  1:45       ` Mike Frysinger
2023-12-20  8:52         ` Jaydeep Patil
2023-12-12 17:57   ` Andrew Burgess
2023-10-30 13:00 ` [PATCH v2 2/3] [sim/riscv] Add support for compressed integer instruction set jaydeep.patil
2023-11-29  7:58   ` Mike Frysinger
2023-12-19  6:11     ` [EXTERNAL] " Jaydeep Patil
2023-12-20  1:32       ` Mike Frysinger
2023-10-30 13:00 ` [PATCH v2 3/3] [sim/riscv] Add semi-hosting support jaydeep.patil
2023-11-13 12:07 ` [PATCH v2 0/3] sim: riscv: Compressed instruction simulation and " Jaydeep Patil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZYDQa8jVST6MuvUL@vapier \
    --to=vapier@gentoo.org \
    --cc=aburgess@redhat.com \
    --cc=bhushan.attarde@imgtec.com \
    --cc=gdb-patches@sourceware.org \
    --cc=jaydeep.patil@imgtec.com \
    --cc=joseph.faulls@imgtec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).