public inbox for newlib@sourceware.org
From: Orlando Arias <orlandoarias@gmail.com>
To: "Hans-Bernhard Bröker" <HBBroeker@t-online.de>, newlib@sourceware.org
Subject: Re: Help porting newlib to a new CPU architecture (sorta)
Date: Tue, 6 Jul 2021 16:46:33 -0400
Message-ID: <2bb2d181-36e4-1cc6-bdff-9eb0aea895ec@gmail.com>
In-Reply-To: <a9567e2f-33a8-4aa8-e6ab-4f589158a07f@t-online.de>



Greetings,

On 7/6/21 4:01 PM, Hans-Bernhard Bröker wrote:
> Am 06.07.2021 um 21:04 schrieb Orlando Arias:
> 
>> Right, went back and looked at the standard. There is no description of
>> what the abstract machine for the execution environment should be. I
>> guess my confusion came from the second paragraph in [1]. Harvard
>> architectures still have the thing that you have to define whether a
>> pointer refers to something in program space or data space, and standard
>> C has no way of signaling this. 
> 
> You're mixing things up there.  Standard C has a perfectly fine
> distinction between program space and data space, including pointers
> thereto.  Function pointers and data pointers _are_ distinct.
> 
> What Standard C does lack is a standardized distinction between pointers
> into ROM data and RAM data.  const-qualified pointers may seem like they
> offer that, but ultimately they don't.

I may well be explaining myself badly here, and I am likely mixing up
terminology or conflating things. If so, my apologies; please feel free
to correct my understanding and terminology. What I mean is best shown
through an example.

Consider the AVR architecture, where program and data spaces have
distinct address spaces. We have a pointer to a string literal that
resides in program memory. We wish to compare it to a string that
resides in data memory. We could use a [naive] comparison method along
the lines of strcmp().

const char str[] PROGMEM = "hello";	/* the array itself lives in program memory */

const char* a = str;
const char* b = data_memory_location;

while(*a != '\0' && *a == *b) {
	a++; b++;
}
return *a - *b;

The problem with this code is that we are treating a as a pointer into
data memory, so the compiler emits ordinary data loads for *a. Declaring
a to be PROGMEM does not help. We actually need to rewrite the code to
force the compiler to use the proper instruction:

char t;
while((t = pgm_read_byte(a)) != '\0' && t == *b) {
	a++; b++;
}

return t - *b;

We use the pgm_read_byte() macro to issue the LPM instruction instead
of a regular load instruction. In fact, avr-libc provides a collection
of functions [identifiable by their _P suffix] for the case where data
resides in program memory. For example, strcmp_P() takes as its second
argument a pointer to data in the program memory address space.
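
To make the above concrete, here is a minimal sketch of a hand-rolled
comparison in the spirit of strcmp_P(). The helper name
compare_ram_to_flash() and the array name greeting are made up for
illustration. The non-AVR branch is only an assumption so the snippet
can be built and tried on a host with a single flat address space; on a
real AVR build the definitions come from <avr/pgmspace.h>.

```c
#ifdef __AVR__
#include <avr/pgmspace.h>   /* real PROGMEM and pgm_read_byte() from avr-libc */
#else
/* Host fallback (assumption, for experimentation only): flat address space. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

/* The array itself, not a pointer to it, is placed in program memory. */
static const char greeting[] PROGMEM = "hello";

/*
 * Hypothetical helper: compare a string in data memory (ram_str) with a
 * string in program memory (flash_str). Returns <0, 0 or >0, strcmp() style.
 */
static int compare_ram_to_flash(const char *ram_str, const char *flash_str)
{
    unsigned char f;

    /* Every access to flash_str goes through pgm_read_byte(), never *flash_str. */
    while ((f = pgm_read_byte(flash_str)) != '\0' &&
           f == (unsigned char)*ram_str) {
        flash_str++;
        ram_str++;
    }
    return (int)(unsigned char)*ram_str - (int)f;
}
```

Note that nothing in the type of flash_str tells the compiler which
address space it refers to; passing the two arguments in the wrong order
compiles cleanly and reads the wrong memory on AVR. On a real target,
strcmp_P(ram_str, greeting) from avr-libc should do the same job.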

> 
>> This is what I meant by the von Neumann requirement: all pointers
>> dereference to the same address space. 
> 
> That's stated broadly enough to be wrong.  The C virtual machine is, in
> fact, a Harvard architecture.  It assumes that const and non-const data
> live in the same address space, but that doesn't make it von-Neumann.

Right, so herein lies a problem. A Harvard machine implies that program
and data are in different address spaces. Unless my understanding is
wrong, this means that there is one address bus for data, and one
address bus for instructions. Dereferencing a function pointer and
dereferencing a data pointer would result in dereferencing to different
address spaces. Now, I believe that doing something like (char*)fn_ptr
in C is either undefined behavior or implementation-defined behavior.
However, the implementations I have seen would treat this pointer as
something in data memory, rather than something in program memory.
Actually modifying what fn_ptr points to would require the use of an
extension to the language [which would be implied if the behavior was
indeed UB or implementation defined]. Please correct me on this one.
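
As a small illustration of where I believe the standard draws the line:
as far as I can tell, converting a function pointer to a different
function pointer type and back is guaranteed to give back the original
pointer, whereas no such guarantee is made for a round trip through
char* or void*. The sketch below stays on the well-defined side; the
names binary_op, generic_fn, store_op and call_through_generic are
invented for this example.

```c
/* Two distinct function pointer types; neither is a data pointer. */
typedef int (*binary_op)(int, int);
typedef void (*generic_fn)(void);

static int add(int a, int b)
{
    return a + b;
}

/* Store a function pointer "generically" as another function pointer type. */
static generic_fn store_op(binary_op op)
{
    return (generic_fn)op;
}

/* Convert back to the original type before calling through it. */
static int call_through_generic(generic_fn stored, int x, int y)
{
    binary_op op = (binary_op)stored;
    return op(x, y);
}
```

Replacing generic_fn with char* here is exactly the (char*)fn_ptr case
above: a compiler may accept it as an extension, but on a Harvard target
the resulting data pointer need not refer to the function's code at all.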

Cheers,
Orlando.


