public inbox for gcc@gcc.gnu.org
* s390 port
@ 2023-01-28 18:51 Paul Edwards
  2023-01-29 13:08 ` Joe Monk
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2023-01-28 18:51 UTC
  To: GCC Development, Joe Monk

Hi Joe.

Sorry for the delay (1 year and 4 months) in responding
to this. There's a long and sad story as to what caused
the delay, but we're here now.

First of all, Hercules is a very important target. Even
if gcc -m31 only allowed writing above 2 GiB on Hercules,
that would still be an extremely important result, and
justify changing the option to -m32, which is what it
inherently is. Just because some arbitrary hardware
masks bits at 24, or 31, or 32, or fails to even do a
wrap at 64, doesn't alter the inherent fact that GCC
is using 32-bit registers. Not 64. Not 31. Not 24.

They are general purpose registers being used, so both
address and data registers are 32 bits.

If you have poorly-written assembler that only works if
addresses are being masked to 24 bits, then there would
be some justification in referring to that as a 24-bit
program.

If you have poorly-written assembler that only works if
addresses are being masked to 31 bits, then there would
be some justification in referring to that as a 31-bit
program.

But if you have a program that works in both of those
AMODEs, ie what IBM calls "AMODE ANY", it would be a
bit odd to call it an ANY-bit program, but that would
be the exact name you need if you want to continue
along that path. And an ANY-including-32-bit program
if it is also capable of running as AM32 on any real,
emulated, or theoretical environment.

If you have a poorly-written operating system (like z/OS),
that doesn't provide address masking (via DAT) to 32 bits
for 32-bit programs, so your only option is to run them
as AM31, where negative indexes work, or only run programs
that don't use negative indexes (and ensure that the
high 32 bits of 64 bit registers are 0), then there would
be justification in calling this an AM64-intolerant
program or AM64-tolerant program, respectively.

z/OS has an additional problem that even in AM64, and
even with an AM64-tolerant 32-bit program, there is no
way to request memory in the 2 GiB - 4 GiB region other
than via crapshoot (use_2g_to_32g or whatever), and
even if you win the crapshoot, you can't have a nice
display of the 2 GiB boundary being crossed in a single
instruction. You could if you switched to supervisor
mode/key zero and didn't mind clobbering what was already
there, but you would probably still need to switch DAT off.
And then because you don't know what damage you have
done, you would need to freeze the system and re-IPL.

Instead of attempting that, what I did was use a
properly-written OS, z/PDOS, that uses DAT (virtual
memory) to map the 4 GiB to 8 GiB region to 0 to 4 GiB,
so that even in AM64, you effectively get AM32. This is
the proper way to handle memory when you run 32-bit
programs on a 64-bit system. 32 and 64-bit programs
can run transparently with no mode switching required.
The 4 GiB to 8 GiB virtual storage region is effectively
dead.

It is only used for negative indexes, which are a
fundamental part of indexed addressing. Even positive
indexes need wrapping. E.g. if you have an address at
the 3.5 GiB mark and you wish to access memory at the
0.5 GiB mark, you would use a positive index of 1 GiB
to get there. On an AM64 system, without a 32-bit mode
in effect, this would index to location 4.5 GiB without
an appropriate DAT mapping.

Note that the index that would do such a thing may be
in a variable (register) that is only known at runtime,
so it is not something that you can change GCC to stop
generating, and I was wrong to ask for that (for years).

So, with that said, I have been able to satisfy your
challenge, using real hardware. A real z114 using a
real 3270 terminal. You can see that beautiful terminal here:
https://groups.io/g/hercules-380/message/2391
https://groups.io/g/hercules-380/message/2392

The second photo of the first link shows the CPU (2818)

z114 = 2818-M05/M10

I can obtain a picture of the sticker if needed.

No Hercules in sight.

You could move the goal posts and say that running under
z/VM doesn't count either.

If you do that, I can run z/PDOS directly on an LPAR
and run the memory test (in fact, this has already
been done), but we don't know the procedure (and may
not have permission) to use the HMC to display memory.
z/PDOS can display its own memory, and this can show
that the memory at 80000000 is different from location 0,
if you accept z/PDOS reporting itself.

But z/VM is the more "independent" way of displaying
memory, so that there is no chance that z/PDOS can "cheat".

Here is the test code in z/PDOS:

        else if (memcmp(prog, "MEMTEST", 7) == 0)
        {
            printf("writing 4 bytes to address X'7FFFFFFE'\n");
            memcpy((char *)0x7ffffffe, "\x01\x02\x03\x04", 4);
            printf("done!\n");
            *pdos->context->postecb = 0;
            pdos->context->regs[15] = 0;
        }

and the memcpy generates a single MVC instruction:

         MVC   0(4,2),0(3)

Note that MVC is an instruction that has been available
since the S/360 (in the 1960s). I am actually using the
i370 target of GCC 3.2.3 for this test, but the principle
is the same for s390 (as opposed to s390x) on the latest
GCC. Both are 32-bit.

Note that the i370 target was written by Jan Stein in 1989
when he worked at Amdahl, long before AM64 existed.

It only used S/370 instructions, so runs on anything from
a S/370 up (thanks to upward compatibility).

That MVC instruction works perfectly fine on z/Arch, as it
does on S/370.

Other instructions generated by GCC, such as BALR, have
changed behavior slightly as they went from AM24 on S/370
to AM31 on S/370 XA, and AM64 on z/Arch (and for that
matter, AM32 on S/380 under Hercules/380, or I assume
AM32 on a 360/67).

The behavior changed in an upwardly-compatible way, so long
as the program was written in a reasonable manner - ie to
not be deliberately dependent on that AM24 or AM31 specific
behavior. The code GCC generates has indeed been written
in that "reasonable manner".

Other instructions, such as BXLE, that, for certain use
cases, break down at the top end of the lower half of the
32-bit address space, just as BXLEG breaks down at the
top end of the lower half of the 64-bit address space, are
not generated by GCC at all, so are not relevant.

Bottom line - GCC generates 32-bit clean code, and as such,
the option should be -m32, not -m31, not -m24, not -mANY.
Keeping -m31 for compatibility reasons is obviously fine,
as would be adding -m24. But both of those things obscure
the fact that this is 32-bit clean code.

Here is the rest of the context of the generated code:

         MVC   88(4,13),=A(@@LC33)
         LA    1,88(,13)
         L     15,=A(@@7)
         BALR  14,15
         L     3,=A(@@LC34)
         L     2,=F'2147483646'
         MVC   0(4,2),0(3)
         MVC   88(4,13),=A(@@LC35)
         LA    1,88(,13)
         L     15,=A(@@7)
         BALR  14,15


@@LC32   EQU   *
         DC    C'MEMTEST'
         DC    X'0'
@@LC33   EQU   *
         DC    C'writing 4 bytes to address X''7FFFFFFE'''
         DC    X'15'
         DC    X'0'
@@LC34   EQU   *
         DC    X'1'
         DC    X'2'
         DC    X'3'
         DC    X'4'
         DC    X'0'
@@LC35   EQU   *
         DC    C'done!'
         DC    X'15'
         DC    X'0'

As you can see from the photo of the real 3270 terminal,
that MVC instruction has successfully straddled the
2 GiB mark, even in a single instruction.

As you can see from the photo in the second link above,
the memory at location 0 is different (still contains
the IPL PSW!) from the memory at location x'80000000'.

Do you have any further objections, other than a logical
fallacy such as argumentum ad populum or argumentum ad
baculum, to oppose gcc having -m32 as an option for the
S/390 target, or if the i370 code is added back in, for
that too, given that that is the correct technical nature
of the GCC-generated code?

Thanks. Paul.




"Simply switching off optimization made the negative
indexes go away, allowing more than 2 GiB to be
addressed in standard z/Arch, with "-m31".

Prove it on real hardware, not Hercules. Hercules doesn't count.

Joe

On Wed, Sep 29, 2021 at 7:09 PM Paul Edwards via Gcc <gcc@gcc.gnu.org>
wrote:

> We have fait accompli now:
>
> https://gcc.gnu.org/pipermail/gcc/2021-September/237456.html
>
> Simply switching off optimization made the negative
> indexes go away, allowing more than 2 GiB to be
> addressed in standard z/Arch, with "-m31".
>
> The above request is to add "-m32" as an alias for
> "-m31", but I would like to add as a request for it to
> work with optimization on.
>
> BFN. Paul.
>
> -----Original Message-----
> From: Paul Edwards
> Sent: Friday, September 3, 2021 11:12 PM
> To: Jakub Jelinek
> Cc: Ulrich Weigand ; gcc@gcc.gnu.org <gcc@gcc.gnu.org> ; Ulrich Weigand
> Subject: Re: s390 port
>
> >> > This is not in one single place, but spread throughout the
> >> > compiler, both common code and back-end.  I do not think it will
> >> > be possible to get the compiler to generate correct code if
> >> > you do not specify the address size correctly.
>
> >> 1. Is there any way to put a constraint on index
> >> registers, to say that a particular machine can
> >> only index in the range of –512 to +512 or some
> >> other arbitrary set? If so, I can do 0 to 2 GiB.
>
> >> 2. Is there a way of saying a machine doesn’t
> >> support indexing at all?
>
> > There is a way to do that, but it isn't about changing a single or a
> > couple of spots, one needs to change a lot of *.md patterns, a lot of
> > macros, target hooks and as Ulrich said, most important is to use the
> > right Pmode which can differ from ptr_mode provided one e.g. defines
> > ptr_extend pattern etc.
>
> Pardon? All that is required just to put a constraint
> on an index register? If a range of a machine is
> limited to -512 to +512, it shouldn't be necessary
> to change md patterns etc etc.
>
> > Just look at the amount of work needed for the x32 or aarch64 ilp32
> > support,
>
> That's different. That's because Intel stuffed up.
> IBM didn't. IBM came within an ace of a perfect
> architecture. It's as if Intel had created an x32
> instead of an 80386 in 1986.
>
> IBM got it almost right in the 1960s.
>
> > and not just work spent one time on adding that support, but the
> > continuous amount of work on maintaining it.  The initial work is
> > certainly a few weeks if not months of work,
>
> I've been trying to figure out how to lift the 31-bit
> restriction on mainframes since around 1987.
>
> If I have to pay someone for 2 month of work, at
> this stage, I'm willing to do that, but:
>
> 1. I would like it done on GCC 3.2.3 plus maybe
> GCC 3.4.6.
>
> 2. How much will it cost in US$?
>
> > then there needs to be somebody who regularly
> > tests gcc trunk and branches in such configuration so that it doesn't
> > bitrot, and not just that but somebody who actually fixes bugs in it.
>
> I'll take responsibility for giving the GCC 3.X.X
> releases the TLC they deserve. And I'll encourage
> my daughter to maintain them after I've kicked
> the bucket.
>
> > If something doesn't fit into 2GB of address space,
> > isn't it likely it won't fit into 4GB of address space
> > in a year or two?
>
> Nope. 2 GiB is already a shitload of memory. It only
> takes something like 23 MB for GCC 3.2.3 to recompile
> itself, and I think 60 MB for GCC 3.4.6 to recompile
> itself. That's the heaviest real workload I do. A 4 GiB
> limitation instead of 2 GiB makes it just that much
> less likely I'll ever hit a real limit.
>
> Someone told me that the only non-scientific application
> they knew of that came close to hitting the 2 GiB limit
> was IBM's C compiler. I doubt that IBM's C compiler
> technology is evolving at such a rate that it only takes
> 1-2 years for them to subsequently hit 4 GiB. Quite
> apart from the fact that I don't really trust that even
> IBM C is hitting a 2 GiB limit for what GCC can do in
> 23 MiB. But it could be true - I'm not familiar with
> compiler internals.
>
> BFN. Paul.

* Re: s390 port
  2023-01-28 18:51 s390 port Paul Edwards
@ 2023-01-29 13:08 ` Joe Monk
  2023-01-29 14:30   ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Joe Monk @ 2023-01-29 13:08 UTC
  To: Paul Edwards; +Cc: GCC Development

"A 24-bit or 31-bit virtual address is expanded to 64 bits by appending 40
or 33 zeros, respectively, on the left before it is translated by means of
the DAT process, and a 24-bit or 31-bit real address is similarly expanded
to 64 bits before it is transformed by prefixing. A 24-bit or 31-bit
absolute address is expanded to 64 bits before main storage is accessed."
IBM z/Architecture Principles of Operation, page 3-6.

I don't see 32 bits anywhere in that process. Unless and until IBM changes
the architecture definition to include 32 bits in address sizes, there is
no need for a -m32 switch.

Joe


* Re: s390 port
  2023-01-29 13:08 ` Joe Monk
@ 2023-01-29 14:30   ` Paul Edwards
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2023-01-29 14:30 UTC
  To: Joe Monk; +Cc: GCC Development

Hi Joe.

They aren't 24 or 31 bit addresses.

All that code I showed was running in AM64. The very
first thing that z/PDOS does when it IPLs is activate
z/Arch mode and enable AM64. That's about 10
assembler instructions and then it's pure 64-bit for
eternity.

Only the lower 32 bits of the 64-bit registers are actually
populated though, because at least at the application
level I am using pure S/370 instructions (from a machine
definition in 1989). Even in the OS (z/PDOS) it's almost
all GCC-generated code, ie S/370 instructions also. All
32-bit registers. All 32-bit clean.

That is why the code ... WORKS.

As opposed to your irrelevant quoting from the POP,
which is ... irrelevant.

BFN. Paul.



On Sun, 29 Jan 2023 at 21:08, Joe Monk <joemonk64@gmail.com> wrote:

> "A 24-bit or 31-bit virtual address is expanded to 64 bits by appending
> 40 or 33 zeros, respectively, on the left before it is translated by means
> of the DAT process, and a 24-bit or 31-bit real address is similarly
> expanded to 64 bits before it is transformed by prefixing. A 24-bit or
> 31-bit absolute address is expanded to 64 bits before main storage is
> accessed." IBM z/Arch POO page 3-6.
>
> I dont see 32 bits anywhere in that process. Unless and until IBM changes
> the architecture definition to include 32 bits in address sizes, there is
> no need for a -m32 switch.
>
> Joe
>
> On Sat, Jan 28, 2023 at 12:51 PM Paul Edwards <mutazilah@gmail.com> wrote:
>
>> Hi Joe.
>>
>> Sorry for the delay (1 year and 4 months) in responding
>> to this. There's a long and sad story as to what caused
>> the delay, but we're here now.
>>
>> First of all, Hercules is a very important target. Even
>> if gcc -m31 only allowed writing above 2 GiB on Hercules,
>> that would still be an extremely important result, and
>> justify changing the option to -m32, which is what it
>> inherently is. Just because some arbitrary hardware
>> masks bits at 24, or 31, or 32, or fails to even do a
>> wrap at 64, doesn't alter the inherent fact that GCC
>> is using 32-bit registers. Not 64. Not 31. Not 24.
>>
>> They are general purpose registers being used, so both
>> address and data registers are 32 bits.
>>
>> If you have poorly-written assembler that only works if
>> addresses are being masked to 24 bits, then there would
>> be some justification in referring to that as a 24-bit
>> program.
>>
>> If you have poorly-written assembler that only works if
>> addresses are being masked to 31 bits, then there would
>> be some justification in referring to that as a 31-bit
>> program.
>>
>> But if you have a program that works in both of those
>> AMODEs, ie what IBM calls "AMODE ANY", it would be a
>> bit odd to call it an ANY-bit program, but that would
>> be the exact name you need if you want to continue
>> along that path. And an ANY-including-32-bit program
>> if it also capable of running as AM32 on any real,
>> emulated, or theoretical environment.
>>
>> If you have a poorly-written operating system (like z/OS),
>> that doesn't provide address masking (via DAT) to 32 bits
>> for 32-bit programs, so your only option is to run them
>> as AM31, where negative indexes work, or only run programs
>> that don't use negative indexes (and ensure that the
>> high 32 bits of 64 bit registers are 0), then there would
>> be justification in calling this an AM64-intolerant
>> program or AM64-tolerant program, respectively.
>>
>> z/OS has an additional problem that even in AM64, and
>> even with an AM64-tolerant 32-bit program, there is no
>> way to request memory in the 2 GiB - 4 GiB region other
>> than via crapshoot (use_2g_to_32g or whatever), and
>> even if you win the crapshoot, you can't have a nice
>> display of the 2 GiB boundary being crossed in a single
>> instruction. You could if you switched to supervisor
>> mode/key zero and didn't mind clobbering what was already
>> there, but you would probably still need to switch DAT off.
>> And then because you don't know what damage you have
>> done, you would need to freeze the system and re-IPL.
>>
>> Instead of attempting that, what I did was use a
>> properly-written OS, z/PDOS, that uses DAT (virtual
>> memory) to map the 4 GiB to 8 GiB region to 0 to 4 GiB,
>> so that even in AM64, you effectively get AM32. This is
>> the proper way to handle memory when you run 32-bit
>> programs on a 64-bit system. 32 and 64-bit programs
>> can run transparently with no mode switching required.
>> The 4 GiB to 8 GiB virtual storage region is effectively
>> dead.
>>
>> It is only used for negative indexes, which are a
>> fundmanental part of indexed addressing. Even positive
>> indexes need wrapping. E.g. if you have an address at
>> the 3.5 GiB mark and you wish to access memory at the
>> 0.5 GiB mark, you would use a positive index of 1 GiB
>> to get there. On an AM64 system, without a 32-bit mode
>> in effect, this would index to location 4.5 GiB without
>> an appropriate DAT mapping.
>>
>> Note that the index that would do such a thing may be
>> in a variable (register) that is only known at runtime,
>> so it is not something that you can change GCC to stop
>> generating, and I was wrong to ask for that (for years).
>>
>> So, with that said, I have been able to satisfy your
>> challenge, using real hardware. A real z114 using a
>> real 3270 terminal. You can see that beautiful terminal here:
>> https://groups.io/g/hercules-380/message/2391
>> https://groups.io/g/hercules-380/message/2392
>>
>> The second photo of the first link shows the CPU (2818)
>>
>> z114 = 2818-M05/M10
>>
>> I can obtain a picture of the sticker if needed.
>>
>> No Hercules in sight.
>>
>> You could move the goal posts and say that running under
>> z/VM doesn't count either.
>>
>> If you do that, I can run z/PDOS directly on an LPAR
>> and run the memory test (in fact, this has already
>> been done), but we don't know the procedure (and may
>> not have permission) to use the HMC to display memory.
>> z/PDOS can display its own memory, and this can show
>> that the memory at 80000000 is different from location 0,
>> if you accept z/PDOS reporting itself.
>>
>> But z/VM is the more "independent" way of displaying
>> memory, so that there is no chance that z/PDOS can "cheat".
>>
>> Here is the test code in z/PDOS:
>>
>>         else if (memcmp(prog, "MEMTEST", 7) == 0)
>>         {
>>             printf("writing 4 bytes to address X'7FFFFFFE'\n");
>>             memcpy((char *)0x7ffffffe, "\x01\x02\x03\x04", 4);
>>             printf("done!\n");
>>             *pdos->context->postecb = 0;
>>             pdos->context->regs[15] = 0;
>>         }
>>
>> and the memcpy generates a single MVC instruction:
>>
>>          MVC   0(4,2),0(3)
>>
>> Note that MVC is an instruction that has been available
>> since the S/360 (in the 1960s). I am actually using the
>> i370 target of GCC 3.2.3 for this test, but the principle
>> is the same for s390 (as opposed to s390x) on the latest
>> GCC. Both are 32-bit.
>>
>> Note that the i370 target was written by Jan Stein in 1989
>> when he worked at Amdahl, long before AM64 existed.
>>
>> It only used S/370 instructions, so runs on anything from
>> a S/370 up (thanks to upward compatibility).
>>
>> That MVC instruction works perfectly fine on z/Arch, as it
>> does on S/370.
>>
>> Other instructions generated by GCC, such as BALR, have
>> changed behavior slightly as they went from AM24 on S/370
>> to AM31 on S/370 XA, and AM64 on z/Arch (and for that
>> matter, AM32 on S/380 under Hercules/380, or I assume
>> AM32 on a 360/67).
>>
>> The behavior changed in an upwardly-compatible way, so long
>> as the program was written in a reasonable manner - ie to
>> not be deliberately dependent on that AM24 or AM31 specific
>> behavior. The code GCC generates has indeed been written
>> in that "reasonable manner".
>>
>> Other instructions, such as BXLE, that, for certain use
>> cases, break down at the top end of the lower half of the
>> 32-bit address space, just as BXLEG breaks down at the
>> top end of the lower half of the 64-bit address space, are
>> not generated by GCC at all, so are not relevant.
>>
>> Bottom line - GCC generates 32-bit clean code, and as such,
>> the option should be -m32, not -m31, not -m24, not -mANY.
>> Keeping -m31 for compatibility reasons is obviously fine,
>> as would be adding -m24. But both of those things obscure
>> the fact that this is 32-bit clean code.
>>
>> Here is the rest of the context of the generated code:
>>
>>          MVC   88(4,13),=A(@@LC33)
>>          LA    1,88(,13)
>>          L     15,=A(@@7)
>>          BALR  14,15
>>          L     3,=A(@@LC34)
>>          L     2,=F'2147483646'
>>          MVC   0(4,2),0(3)
>>          MVC   88(4,13),=A(@@LC35)
>>          LA    1,88(,13)
>>          L     15,=A(@@7)
>>          BALR  14,15
>>
>>
>> @@LC32   EQU   *
>>          DC    C'MEMTEST'
>>          DC    X'0'
>> @@LC33   EQU   *
>>          DC    C'writing 4 bytes to address X''7FFFFFFE'''
>>          DC    X'15'
>>          DC    X'0'
>> @@LC34   EQU   *
>>          DC    X'1'
>>          DC    X'2'
>>          DC    X'3'
>>          DC    X'4'
>>          DC    X'0'
>> @@LC35   EQU   *
>>          DC    C'done!'
>>          DC    X'15'
>>          DC    X'0'
>>
>> As you can see from the photo of the real 3270 terminal,
>> that MVC instruction has successfully straddled the
>> 2 GiB mark, even in a single instruction.
>>
>> As you can see from the photo in the second link above,
>> the memory at location 0 is different (still contains
>> the IPL PSW!) from the memory at location x'80000000'.
>>
>> Do you have any further objections, other than a logical
>> fallacy such as argumentum ad populum or argumentum ad
>> baculum, to oppose gcc having -m32 as an option for the
>> S/390 target, or if the i370 code is added back in, for
>> that too, given that that is the correct technical nature
>> of the GCC-generated code?
>>
>> Thanks. Paul.
>>
>>
>>
>>
>> "Simply switching off optimization made the negative
>> indexes go away, allowing more than 2 GiB to be
>> addressed in standard z/Arch, with "-m31".
>>
>> Prove it on real hardware, not Hercules. Hercules doesn't count.
>>
>> Joe
>>
>> On Wed, Sep 29, 2021 at 7:09 PM Paul Edwards via Gcc <gcc@gcc.gnu.org>
>> wrote:
>>
>> > We have fait accompli now:
>> >
>> > https://gcc.gnu.org/pipermail/gcc/2021-September/237456.html
>> >
>> > Simply switching off optimization made the negative
>> > indexes go away, allowing more than 2 GiB to be
>> > addressed in standard z/Arch, with "-m31".
>> >
>> > The above request is to add "-m32" as an alias for
>> > "-m31", but I would like to add as a request for it to
>> > work with optimization on.
>> >
>> > BFN. Paul.
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Paul Edwards
>> > Sent: Friday, September 3, 2021 11:12 PM
>> > To: Jakub Jelinek
>> > Cc: Ulrich Weigand ; gcc@gcc.gnu.org ; Ulrich Weigand
>> > Subject: Re: s390 port
>> >
>> > >> > This is not in one single place, but spread throughout the
>> > >> > compiler, both common code and back-end.  I do not think it will
>> > >> > be possible to get the compiler to generate correct code if
>> > >> > you do not specify the address size correctly.
>> >
>> > >> 1. Is there any way to put a constraint on index
>> > >> registers, to say that a particular machine can
>> > >> only index in the range of –512 to +512 or some
>> > >> other arbitrary set? If so, I can do 0 to 2 GiB.
>> >
>> > >> 2. Is there a way of saying a machine doesn’t
>> > >> support indexing at all?
>> >
>> > > There is a way to do that, but it isn't about changing a single or a
>> > > couple
>> > > of spots, one needs to change a lot of *.md patterns, a lot of macros,
>> > > target hooks and as Ulrich said, most important is to use the right Pmode
>> > > which can differ from ptr_mode provided one e.g. defines ptr_extend
>> > > pattern
>> > > etc.
>> >
>> > Pardon? All that is required just to put a constraint
>> > on an index register? If a range of a machine is
>> > limited to -512 to +512, it shouldn't be necessary
>> > to change md patterns etc etc.
>> >
>> > > Just look at the amount of work needed for the x32 or aarch64 ilp32
>> > > support,
>> >
>> > That's different. That's because Intel stuffed up.
>> > IBM didn't. IBM came within an ace of a perfect
>> > architecture. It's as if Intel had created an x32
>> > instead of an 80386 in 1986.
>> >
>> > IBM got it almost right in the 1960s.
>> >
>> > > and not just work spent one time on adding that support, but the
>> > > continuous
>> > > amount of work on maintaining it.  The initial work is certainly a few
>> > > weeks if not months of work,
>> >
>> > I've been trying to figure out how to lift the 31-bit
>> > restriction on mainframes since around 1987.
>> >
>> > If I have to pay someone for 2 month of work, at
>> > this stage, I'm willing to do that, but:
>> >
>> > 1. I would like it done on GCC 3.2.3 plus maybe
>> > GCC 3.4.6.
>> >
>> > 2. How much will it cost in US$?
>> >
>> > > then there needs to be somebody who regularly
>> > > tests gcc trunk and branches in such configuration so that it doesn't
>> > > bitrot, and not just that but somebody who actually fixes bugs in it.
>> >
>> > I'll take responsibility for giving the GCC 3.X.X
>> > releases the TLC they deserve. And I'll encourage
>> > my daughter to maintain them after I've kicked
>> > the bucket.
>> >
>> > > If something doesn't fit into 2GB of address space,
>> > > isn't it likely it won't fit into 4GB of address space
>> > > in a year or two?
>> >
>> > Nope. 2 GiB is already a shitload of memory. It only
>> > takes something like 23 MB for GCC 3.2.3 to recompile
>> > itself, and I think 60 MB for GCC 3.4.6 to recompile
>> > itself. That's the heaviest real workload I do. A 4 GiB
>> > limitation instead of 2 GiB makes it just that much
>> > less likely I'll ever hit a real limit.
>> >
>> > Someone told me that the only non-scientific application
>> > they knew of that came close to hitting the 2 GiB limit
>> > was IBM's C compiler. I doubt that IBM's C compiler
>> > technology is evolving at such a rate that it only takes
>> > 1-2 years for them to subsequently hit 4 GiB. Quite
>> > apart from the fact that I don't really trust that even
>> > IBM C is hitting a 2 GiB limit for what GCC can do in
>> > 23 MiB. But it could be true - I'm not familiar with
>> > compiler internals.
>> >
>> > BFN. Paul.
>> >
>> >
>>
>>
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 12:12                       ` Ulrich Weigand
  2021-09-03 12:38                         ` Paul Edwards
@ 2022-12-20  4:27                         ` Paul Edwards
  1 sibling, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2022-12-20  4:27 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

[-- Attachment #1: Type: text/plain, Size: 1931 bytes --]

On Fri, 3 Sept 2021 at 20:12, Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
wrote:

> "Paul Edwards" <mutazilah@gmail.com> wrote on 03.09.2021 13:35:10:
> > >  Specifically, if you try to run AMODE64 with Pmode equals
> > >  SImode, the compiler will not be aware that the hardware
> > >  uses the high 32 bits of base and index registers, and
> > >  will not necessarily keep them zero.
> > The compiler naturally keeps them zero. The
> > instructions that are used to load registers
> > do not pollute the high-order 32 bits.
>
> While this is true for most instructions, the compiler will not
> restrict itself to using only those.  (As just one obvious
> example, the compiler may use "lay" with a negative displacement,
> which will set the high bits of a GPR in AMODE64.)
>
> (And, b.t.w. not the -m31 DImode, which is a pair of 32-bit
> GPRs, but rather the -m64 DImode, which is a single 64-bit GPR.)
>

Hi all.

Turns out I have been asking the wrong question for several years.

I was going to generate a peephole (an idea from the author of
UDOS, now KinnowOS) to detect when a negative index was
being used, and force an addition instead of an index, when I
realized that it wasn't just literals that could use a negative
value.

That is when I realized that negative numbers were perfectly
valid/normal for indexing, and that it is the OS/hardware that
needs to adapt to this reality when transitioning from 32-bit
hardware to 64-bit hardware.

As such, I have updated z/PDOS-32 to use DAT to map the
4 GiB to 8 GiB region to 0 to 4 GiB, so that negative indexing
works fine.
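
To make the arithmetic concrete, here is a minimal C sketch of the idea.
It is not the z/PDOS code. The names am64_effective and dat_alias are
hypothetical, and the model assumes the high 32 bits of the base and
index registers are zero, which is the case under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address as AM64 hardware would form it: a full 64-bit add.
   A 32-bit program leaves the high halves of both registers zero, so a
   "negative" index such as -4 arrives as 0x00000000FFFFFFFC. */
static uint64_t am64_effective(uint32_t base, uint32_t index)
{
    return (uint64_t)base + (uint64_t)index;
}

/* Hypothetical model of the DAT alias: virtual addresses in the
   4 GiB to 8 GiB window resolve to the same storage as 0 to 4 GiB. */
static uint64_t dat_alias(uint64_t vaddr)
{
    const uint64_t four_gib = UINT64_C(1) << 32;

    /* Two zero-extended 32-bit values cannot sum to 8 GiB or more. */
    assert(vaddr < 2 * four_gib);
    return vaddr >= four_gib ? vaddr - four_gib : vaddr;
}
```

With base X'1000' and index -4, the 64-bit sum is X'100000FFC', just
above 4 GiB, and the alias folds it back to X'FFC', which is what a
32-bit add would have produced. The 3.5 GiB plus 1 GiB case lands at
4.5 GiB and folds back to 0.5 GiB in the same way.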

You can download this from http://pdos.org (down the bottom).

So would it be possible now to update gcc to make -m32 and
-m31 and -m24 all work, as they all generate the exact same
code, regardless of whether you are running as AM24 on
S/370, AM31 on S/390, AM32 on Hercules/380 or AM64
with DAT set appropriately on z/Arch?

Thanks. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
@ 2021-09-30 21:39 Paul Edwards
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2021-09-30 21:39 UTC (permalink / raw)
  To: gcc

>> Simply switching off optimization made the negative
>> indexes go away, allowing more than 2 GiB to be
>> addressed in standard z/Arch, with "-m31".

> Prove it on real hardware, not Hercules. Hercules doesn't count.

Real mainframe hardware is not easily accessible.
Hercules is the most convenient way people have
of accessing a mainframe. Do you have any reason
to suggest that Hercules doesn't emulate real
hardware in this respect?

BFN. Paul.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-30  0:08 Paul Edwards
@ 2021-09-30  0:59 ` Joe Monk
  0 siblings, 0 replies; 26+ messages in thread
From: Joe Monk @ 2021-09-30  0:59 UTC (permalink / raw)
  To: gcc

"Simply switching off optimization made the negative
indexes go away, allowing more than 2 GiB to be
addressed in standard z/Arch, with "-m31".

Prove it on real hardware, not Hercules. Hercules doesn't count.

Joe

On Wed, Sep 29, 2021 at 7:09 PM Paul Edwards via Gcc <gcc@gcc.gnu.org>
wrote:

> We have fait accompli now:
>
> https://gcc.gnu.org/pipermail/gcc/2021-September/237456.html
>
> Simply switching off optimization made the negative
> indexes go away, allowing more than 2 GiB to be
> addressed in standard z/Arch, with "-m31".
>
> The above request is to add "-m32" as an alias for
> "-m31", but I would like to add as a request for it to
> work with optimization on.
>
> BFN. Paul.
>
>
>
>
> -----Original Message-----
> From: Paul Edwards
> Sent: Friday, September 3, 2021 11:12 PM
> To: Jakub Jelinek
> Cc: Ulrich Weigand ; gcc@gcc.gnu.org ; Ulrich Weigand
> Subject: Re: s390 port
>
> >> > This is not in one single place, but spread throughout the
> >> > compiler, both common code and back-end.  I do not think it will
> >> > be possible to get the compiler to generate correct code if
> >> > you do not specify the address size correctly.
>
> >> 1. Is there any way to put a constraint on index
> >> registers, to say that a particular machine can
> >> only index in the range of –512 to +512 or some
> >> other arbitrary set? If so, I can do 0 to 2 GiB.
>
> >> 2. Is there a way of saying a machine doesn’t
> >> support indexing at all?
>
> > There is a way to do that, but it isn't about changing a single or a
> > couple
> > of spots, one needs to change a lot of *.md patterns, a lot of macros,
> > target hooks and as Ulrich said, most important is to use the right Pmode
> > which can differ from ptr_mode provided one e.g. defines ptr_extend
> > pattern
> > etc.
>
> Pardon? All that is required just to put a constraint
> on an index register? If a range of a machine is
> limited to -512 to +512, it shouldn't be necessary
> to change md patterns etc etc.
>
> > Just look at the amount of work needed for the x32 or aarch64 ilp32
> > support,
>
> That's different. That's because Intel stuffed up.
> IBM didn't. IBM came within an ace of a perfect
> architecture. It's as if Intel had created an x32
> instead of an 80386 in 1986.
>
> IBM got it almost right in the 1960s.
>
> > and not just work spent one time on adding that support, but the
> > continuous
> > amount of work on maintaining it.  The initial work is certainly a few
> > weeks if not months of work,
>
> I've been trying to figure out how to lift the 31-bit
> restriction on mainframes since around 1987.
>
> If I have to pay someone for 2 months of work, at
> this stage, I'm willing to do that, but:
>
> 1. I would like it done on GCC 3.2.3 plus maybe
> GCC 3.4.6.
>
> 2. How much will it cost in US$?
>
> > then there needs to be somebody who regularly
> > tests gcc trunk and branches in such configuration so that it doesn't
> > bitrot, and not just that but somebody who actually fixes bugs in it.
>
> I'll take responsibility for giving the GCC 3.X.X
> releases the TLC they deserve. And I'll encourage
> my daughter to maintain them after I've kicked
> the bucket.
>
> > If something doesn't fit into 2GB of address space,
> > isn't it likely it won't fit into 4GB of address space
> > in a year or two?
>
> Nope. 2 GiB is already a shitload of memory. It only
> takes something like 23 MB for GCC 3.2.3 to recompile
> itself, and I think 60 MB for GCC 3.4.6 to recompile
> itself. That's the heaviest real workload I do. A 4 GiB
> limitation instead of 2 GiB makes it just that much
> less likely I'll ever hit a real limit.
>
> Someone told me that the only non-scientific application
> they knew of that came close to hitting the 2 GiB limit
> was IBM's C compiler. I doubt that IBM's C compiler
> technology is evolving at such a rate that it only takes
> 1-2 years for them to subsequently hit 4 GiB. Quite
> apart from the fact that I don't really trust that even
> IBM C is hitting a 2 GiB limit for what GCC can do in
> 23 MiB. But it could be true - I'm not familiar with
> compiler internals.
>
> BFN. Paul.
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
@ 2021-09-30  0:08 Paul Edwards
  2021-09-30  0:59 ` Joe Monk
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-30  0:08 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Ulrich Weigand, gcc, Ulrich Weigand

We have fait accompli now:

https://gcc.gnu.org/pipermail/gcc/2021-September/237456.html

Simply switching off optimization made the negative
indexes go away, allowing more than 2 GiB to be
addressed in standard z/Arch, with "-m31".

The above request is to add "-m32" as an alias for
"-m31", but I would like to add as a request for it to
work with optimization on.

BFN. Paul.




-----Original Message----- 
From: Paul Edwards
Sent: Friday, September 3, 2021 11:12 PM
To: Jakub Jelinek
Cc: Ulrich Weigand ; gcc@gcc.gnu.org ; Ulrich Weigand
Subject: Re: s390 port

>> > This is not in one single place, but spread throughout the
>> > compiler, both common code and back-end.  I do not think it will
>> > be possible to get the compiler to generate correct code if
>> > you do not specify the address size correctly.

>> 1. Is there any way to put a constraint on index
>> registers, to say that a particular machine can
>> only index in the range of –512 to +512 or some
>> other arbitrary set? If so, I can do 0 to 2 GiB.

>> 2. Is there a way of saying a machine doesn’t
>> support indexing at all?

> There is a way to do that, but it isn't about changing a single or a 
> couple
> of spots, one needs to change a lot of *.md patterns, a lot of macros,
> target hooks and as Ulrich said, most important is to use the right Pmode
> which can differ from ptr_mode provided one e.g. defines ptr_extend 
> pattern
> etc.

Pardon? All that is required just to put a constraint
on an index register? If a range of a machine is
limited to -512 to +512, it shouldn't be necessary
to change md patterns etc etc.

> Just look at the amount of work needed for the x32 or aarch64 ilp32 
> support,

That's different. That's because Intel stuffed up.
IBM didn't. IBM came within an ace of a perfect
architecture. It's as if Intel had created an x32
instead of an 80386 in 1986.

IBM got it almost right in the 1960s.

> and not just work spent one time on adding that support, but the 
> continuous
> amount of work on maintaining it.  The initial work is certainly a few
> weeks if not months of work,

I've been trying to figure out how to lift the 31-bit
restriction on mainframes since around 1987.

If I have to pay someone for 2 months of work, at
this stage, I'm willing to do that, but:

1. I would like it done on GCC 3.2.3 plus maybe
GCC 3.4.6.

2. How much will it cost in US$?

> then there needs to be somebody who regularly
> tests gcc trunk and branches in such configuration so that it doesn't
> bitrot, and not just that but somebody who actually fixes bugs in it.

I'll take responsibility for giving the GCC 3.X.X
releases the TLC they deserve. And I'll encourage
my daughter to maintain them after I've kicked
the bucket.

> If something doesn't fit into 2GB of address space,
> isn't it likely it won't fit into 4GB of address space
> in a year or two?

Nope. 2 GiB is already a shitload of memory. It only
takes something like 23 MB for GCC 3.2.3 to recompile
itself, and I think 60 MB for GCC 3.4.6 to recompile
itself. That's the heaviest real workload I do. A 4 GiB
limitation instead of 2 GiB makes it just that much
less likely I'll ever hit a real limit.

Someone told me that the only non-scientific application
they knew of that came close to hitting the 2 GiB limit
was IBM's C compiler. I doubt that IBM's C compiler
technology is evolving at such a rate that it only takes
1-2 years for them to subsequently hit 4 GiB. Quite
apart from the fact that I don't really trust that even
IBM C is hitting a 2 GiB limit for what GCC can do in
23 MiB. But it could be true - I'm not familiar with
compiler internals.

BFN. Paul. 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-07  7:21 ` s390 port Joe Monk
@ 2021-09-08  3:46   ` Paul Edwards
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2021-09-08  3:46 UTC (permalink / raw)
  To: Joe Monk; +Cc: gcc

Hi Joe.

Thanks for your comments.

> It is unclear how this would even work. 

> For instance, the LA instruction clears the top bit.

In AM64, LA does not clear any bits.
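
As a rough illustration, here is a simplified C model of the address
that LA forms in each addressing mode. It is a sketch under my own
assumptions rather than text from the POP: la_address is a hypothetical
name, and it models only the address, not what happens to the rest of
the register.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed by LA from base + index + displacement.
   amode is 24, 31 or 64; only the bits of that mode survive. */
static uint64_t la_address(unsigned amode, uint64_t base,
                           uint64_t index, uint32_t disp)
{
    assert(amode == 24 || amode == 31 || amode == 64);
    assert(disp < 4096);                 /* LA has a 12-bit unsigned displacement */

    uint64_t sum = base + index + disp;  /* wraps modulo 2^64 */
    if (amode == 64)
        return sum;                      /* nothing is masked off */
    return sum & ((UINT64_C(1) << amode) - 1);
}
```

So la_address(31, 0x7FFFFFFF, 0, 1) gives 0, while
la_address(64, 0x7FFFFFFF, 0, 1) gives 0x80000000.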

> Also, instructions like LPR, LNR,

These operate on data registers, not addresses,
and will continue to work unchanged.

> BXLE, BXH all treat the value in the register as signed,
> so the top bit is not available.

These are already a problem if you are putting
addresses in them and it is approaching the 2 GiB
mark. The POP has a special mention of that.

Fun fact: The z/Arch POP has the same problem with
the G version of those instructions, when it hits the
63-bit mark, but the POP incorrectly states that the
problem occurs near the 64-bit mark. I reported the
problem with the POP but nothing seems to have
been done.

The solution is to drop these instructions from the
repertoire. C-generated assembler for both i370 and
s390 targets does not use these.
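
For anyone who wants to see the 2 GiB problem in isolation, here is a
small C sketch of one BXLE step on 32-bit registers. It is a model
written for this discussion, not generated code: bxle_step is a
hypothetical name, and the three operands are passed as plain values
rather than as a register pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One BXLE step: add the increment to the index, then report whether
   the branch is taken, i.e. whether the new index is low or equal to
   the comparand. The comparison is signed. */
static bool bxle_step(uint32_t *index, uint32_t increment, uint32_t comparand)
{
    assert(index != NULL);
    *index += increment;   /* 32-bit add, wraps on overflow */

    /* Flipping the sign bit maps signed order onto unsigned order,
       which avoids implementation-defined casts to int32_t. */
    return (*index ^ 0x80000000u) <= (comparand ^ 0x80000000u);
}
```

With an index of X'7FFFFF00', an increment of X'10' and a limit of
X'80000020', the branch is not taken even though the address is still
below the limit, because the limit looks negative. Below the 2 GiB
mark the same loop behaves as expected.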

BFN. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-06 22:44 Build gcc question Gary Oblock
@ 2021-09-07  7:21 ` Joe Monk
  2021-09-08  3:46   ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Joe Monk @ 2021-09-07  7:21 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc

It is unclear how this would even work.

For instance, the LA instruction clears the top bit.

Also, instructions like LPR, LNR, BXLE, BXH all treat the value in the
register as signed, so the top bit is not available.

Joe

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 12:53                           ` Jakub Jelinek
@ 2021-09-03 13:12                             ` Paul Edwards
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2021-09-03 13:12 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Ulrich Weigand, gcc, Ulrich Weigand

>> > This is not in one single place, but spread throughout the
>> > compiler, both common code and back-end.  I do not think it will
>> > be possible to get the compiler to generate correct code if
>> > you do not specify the address size correctly.

>> 1. Is there any way to put a constraint on index
>> registers, to say that a particular machine can
>> only index in the range of –512 to +512 or some
>> other arbitrary set? If so, I can do 0 to 2 GiB.

>> 2. Is there a way of saying a machine doesn’t
>> support indexing at all?

> There is a way to do that, but it isn't about changing a single or a 
> couple
> of spots, one needs to change a lot of *.md patterns, a lot of macros,
> target hooks and as Ulrich said, most important is to use the right Pmode
> which can differ from ptr_mode provided one e.g. defines ptr_extend 
> pattern
> etc.

Pardon? All that is required just to put a constraint
on an index register? If a range of a machine is
limited to -512 to +512, it shouldn't be necessary
to change md patterns etc etc.

> Just look at the amount of work needed for the x32 or aarch64 ilp32 
> support,

That's different. That's because Intel stuffed up.
IBM didn't. IBM came within an ace of a perfect
architecture. It's as if Intel had created an x32
instead of an 80386 in 1986.

IBM got it almost right in the 1960s.

> and not just work spent one time on adding that support, but the 
> continuous
> amount of work on maintaining it.  The initial work is certainly a few
> weeks if not months of work,

I've been trying to figure out how to lift the 31-bit
restriction on mainframes since around 1987.

If I have to pay someone for 2 months of work, at
this stage, I'm willing to do that, but:

1. I would like it done on GCC 3.2.3 plus maybe
GCC 3.4.6.

2. How much will it cost in US$?

> then there needs to be somebody who regularly
> tests gcc trunk and branches in such configuration so that it doesn't
> bitrot, and not just that but somebody who actually fixes bugs in it.

I'll take responsibility for giving the GCC 3.X.X
releases the TLC they deserve. And I'll encourage
my daughter to maintain them after I've kicked
the bucket.

> If something doesn't fit into 2GB of address space,
> isn't it likely it won't fit into 4GB of address space
> in a year or two?

Nope. 2 GiB is already a shitload of memory. It only
takes something like 23 MB for GCC 3.2.3 to recompile
itself, and I think 60 MB for GCC 3.4.6 to recompile
itself. That's the heaviest real workload I do. A 4 GiB
limitation instead of 2 GiB makes it just that much
less likely I'll ever hit a real limit.

Someone told me that the only non-scientific application
they knew of that came close to hitting the 2 GiB limit
was IBM's C compiler. I doubt that IBM's C compiler
technology is evolving at such a rate that it only takes
1-2 years for them to subsequently hit 4 GiB. Quite
apart from the fact that I don't really trust that even
IBM C is hitting a 2 GiB limit for what GCC can do in
23 MiB. But it could be true - I'm not familiar with
compiler internals.

BFN. Paul.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 12:38                         ` Paul Edwards
@ 2021-09-03 12:53                           ` Jakub Jelinek
  2021-09-03 13:12                             ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Jelinek @ 2021-09-03 12:53 UTC (permalink / raw)
  To: Paul Edwards; +Cc: Ulrich Weigand, gcc, Ulrich Weigand

On Fri, Sep 03, 2021 at 10:38:36PM +1000, Paul Edwards via Gcc wrote:
> > This is not in one single place, but spread throughout the
> > compiler, both common code and back-end.  I do not think it will
> > be possible to get the compiler to generate correct code if
> > you do not specify the address size correctly.
> 1. Is there any way to put a constraint on index
> registers, to say that a particular machine can
> only index in the range of –512 to +512 or some
> other arbitrary set? If so, I can do 0 to 2 GiB.
> 2. Is there a way of saying a machine doesn’t
> support indexing at all?

There is a way to do that, but it isn't about changing a single or a couple
of spots, one needs to change a lot of *.md patterns, a lot of macros,
target hooks and as Ulrich said, most important is to use the right Pmode
which can differ from ptr_mode provided one e.g. defines ptr_extend pattern
etc.
Just look at the amount of work needed for the x32 or aarch64 ilp32 support,
and not just work spent one time on adding that support, but the continuous
amount of work on maintaining it.  The initial work is certainly a few
weeks if not months of work, then there needs to be somebody who regularly
tests gcc trunk and branches in such configuration so that it doesn't
bitrot, and not just that but somebody who actually fixes bugs in it.

If something doesn't fit into 2GB of address space, isn't it likely it won't
fit into 4GB of address space in a year or two?

	Jakub


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 12:12                       ` Ulrich Weigand
@ 2021-09-03 12:38                         ` Paul Edwards
  2021-09-03 12:53                           ` Jakub Jelinek
  2022-12-20  4:27                         ` Paul Edwards
  1 sibling, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-03 12:38 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

>> >  Also, the compiler
>> >  will assume the base + index (+ displacement) arithmetic
>> >  will operate in 32 bits -- I'm pretty sure this is
>> >  actually the root cause of your "negative index" problem.

>> Where is this logic please? Can I do a #if 0 or similar
>> to disable it?

> This is not in one single place, but spread throughout the
> compiler, both common code and back-end.  I do not think it will
> be possible to get the compiler to generate correct code if
> you do not specify the address size correctly.

1. Is there any way to put a constraint on index
registers, to say that a particular machine can
only index in the range of –512 to +512 or some
other arbitrary set? If so, I can do 0 to 2 GiB.

2. Is there a way of saying a machine doesn’t
support indexing at all?

>> > If you want to go for an "x32" like mode, I think this
>> > is wrong approach.  The right approach would be to
>> > start from "-m64", and simply modify the pointer size
>> > to be 32 bits.
>> > This would work by setting POINTER_SIZE to 32, while
>> > leaving everything else like for -m64.
>  
>> That will generate 64-bit z/Arch instructions.
>> I wish to generate ESA/390 instructions.

> Why? AMODE64 exists only in z/Arch, so of course there
> will be z/Arch instructions available ...

For the same reason people constructed Babbage’s
invention, I wish to demonstrate the minor changes
that would have been required to the S/360 so that
we would never have arrived at a 31-bit black hole,
and we could have in fact had the perfect 32-bit
machine. Almost identical to the 31-bit machine.
A S/360+, a S/370+ and a S/390+. 

>> > We've thought about implementing this mode for Linux,
>> > but decided not to do it, since it would only provide
>> > marginal performance improvements, and has the drawback
>> > of being another new ABI that would be incompatible to
>> > the whole existing software ecosystem.

>> Shouldn’t the end user be able to decide this
>> for themselves?

> It's open source, of course everybody can decide what they
> want to work on themselves.  But we decide what we spend
> our own time on based on what we think is useful ...

Sure.

>> No-one at all is interested in 32-bit mainframes?

> Not any more, at least not in Linux.  Linux is pretty much
> 64-bit only at this point.

I think z/OS is pretty much still 31-bit only,
as far as apps are concerned, right? I’d like to
bump that up to 32-bit.

BFN. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 11:35                     ` Paul Edwards
@ 2021-09-03 12:12                       ` Ulrich Weigand
  2021-09-03 12:38                         ` Paul Edwards
  2022-12-20  4:27                         ` Paul Edwards
  0 siblings, 2 replies; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-03 12:12 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



"Paul Edwards" <mutazilah@gmail.com> wrote on 03.09.2021 13:35:10:
> >  Specifically, if you try to run AMODE64 with Pmode equals
> >  SImode, the compiler will not be aware that the hardware
> >  uses the high 32 bits of base and index registers, and
> >  will not necessarily keep them zero.
> The compiler naturally keeps them zero. The
> instructions that are used to load registers
> do not pollute the high-order 32 bits.

While this is true for most instructions, the compiler will not
restrict itself to using only those.  (As just one obvious
example, the compiler may use "lay" with a negative displacement,
which will set the high bits of a GPR in AMODE64.)
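
A minimal C sketch of that "lay" case, for illustration only. The
function name lay_am64 is hypothetical, and the model covers just the
base-plus-displacement form with no index register.

```c
#include <assert.h>
#include <stdint.h>

/* Register value left by "lay r,disp(base)" in AMODE64: the full
   64-bit sum of the base register and the sign-extended 20-bit
   displacement. */
static uint64_t lay_am64(uint64_t base, int32_t disp)
{
    assert(disp >= -524288 && disp <= 524287);   /* signed 20-bit range */
    return base + (uint64_t)(int64_t)disp;       /* wraps modulo 2^64 */
}
```

With a base of 4 and a displacement of -8, the result is
0xFFFFFFFFFFFFFFFC, so the high 32 bits of the GPR are no longer zero.
With a base of 0x1000 the same displacement gives 0xFF8 and the high
bits stay clear, which is why the problem does not show up in every
use.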

It is of course possible to change the back-end to ensure that
SImode operations always leave the high part unmodified; for
example LLVM does that, because it wants to allocate the high
parts separately for use with the high-word facility instructions.
But GCC currently does not do so.

> >  Also, the compiler
> >  will assume the base + index (+ displacement) arithmetic
> >  will operate in 32 bits -- I'm pretty sure this is
> >  actually the root cause of your "negative index" problem.
> Where is this logic please? Can I do a #if 0 or similar
> to disable it?

This is not in one single place, but spread throughout the
compiler, both common code and back-end.  I do not think it will
be possible to get the compiler to generate correct code if
you do not specify the address size correctly.  AMODE64 will
require Pmode == DImode.

(And, b.t.w. not the -m31 DImode, which is a pair of 32-bit
GPRs, but rather the -m64 DImode, which is a single 64-bit GPR.)

> > If you want to go for an "x32" like mode, I think this
> > is the wrong approach.  The right approach would be to
> > start from "-m64", and simply modify the pointer size
> > to be 32 bits.
> > This would work by setting POINTER_SIZE to 32, while
> > leaving everything else like for -m64.
>
> That will generate 64-bit z/Arch instructions.
> I wish to generate ESA/390 instructions.

Why? AMODE64 exists only in z/Arch, so of course there
will be z/Arch instructions available ...

> > We've thought about implementing this mode for Linux,
> > but decided not to do it, since it would only provide
> > marginal performance improvements, and has the drawback
> > of being another new ABI that would be incompatible to
> > the whole existing software ecosystem.
> Shouldn’t the end user be able to decide this
> for themselves?

It's open source, of course everybody can decide what they
want to work on themselves.  But we decide what we spend
our own time on based on what we think is useful ...

> No-one at all is interested in 32-bit mainframes?

Not any more, at least not in Linux.  Linux is pretty much
64-bit only at this point.


Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-03 11:18                   ` Ulrich Weigand
@ 2021-09-03 11:35                     ` Paul Edwards
  2021-09-03 12:12                       ` Ulrich Weigand
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-03 11:35 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

> - AMODE64 means the native address size is 64 bits.  This
>  implies that Pmode has to be DImode, since Pmode tells
>  the compiler what the native address size is.

>  Specifically, if you try to run AMODE64 with Pmode equals
>  SImode, the compiler will not be aware that the hardware
>  uses the high 32 bits of base and index registers, and
>  will not necessarily keep them zero.

The compiler naturally keeps them zero. The
instructions that are used to load registers
do not pollute the high-order 32 bits.



>  Also, the compiler
>  will assume the base + index (+ displacement) arithmetic
>  will operate in 32 bits -- I'm pretty sure this is
>  actually the root cause of your "negative index" problem.


Where is this logic please? Can I do a #if 0 or similar
to disable it?


> Note that even if Pmode == DImode, you can still use 32-bit
> *pointer* sizes.  This is exactly what e.g. the Intel x32
> mode does (as was mentioned by Andreas).


I’m happy to try the approach from BOTH directions
and see which one hits “-m32” first.


>> I’d like to approach the problem from the other
>> direction – what modifications are required to
>> be made to “-m31” so that it does “-m32” instead?
>> I’m happy to simply retire “-m31”, but I don’t care
>> if both exist.

> If you want to go for an "x32" like mode, I think this
> is the wrong approach.  The right approach would be to
> start from "-m64", and simply modify the pointer size
> to be 32 bits.


> This would work by setting POINTER_SIZE to 32, while
> leaving everything else like for -m64.



That will generate 64-bit z/Arch instructions.
I wish to generate ESA/390 instructions.



> I'm sure there
> will be a few other places that need adaptation, but
> it should be pretty straightforward.

No, modifying GCC is beyond my ability. I
need 20 lines of code from someone who is
familiar with the system.



>  You can also
> check the Intel back-end where they're using the
> TARGET_X32 macro.


See above about beyond my ability.

> We've thought about implementing this mode for Linux,
> but decided not to do it, since it would only provide
> marginal performance improvements, and has the drawback
> of being another new ABI that would be incompatible to
> the whole existing software ecosystem.


Shouldn’t the end user be able to decide this
for themselves? No-one at all is interested in
32-bit mainframes?


> (The latter point may not be an issue for you if you're
> looking into a completely new OS anyway.)


Correct.

Thanks. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 20:05                 ` Paul Edwards
  2021-09-02 20:16                   ` Andreas Schwab
@ 2021-09-03 11:18                   ` Ulrich Weigand
  2021-09-03 11:35                     ` Paul Edwards
  1 sibling, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-03 11:18 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 22:05:39:
> > Is this about supporting a 4GB address space instead
> > of a 2GB space?
>
> Yes, correct.

OK, that makes things clearer.  This implies in particular:

- 4GB address space means you need to run in AMODE64

- AMODE64 means the native address size is 64 bits.  This
  implies that Pmode has to be DImode, since Pmode tells
  the compiler what the native address size is.

  Specifically, if you try to run AMODE64 with Pmode equals
  SImode, the compiler will not be aware that the hardware
  uses the high 32 bits of base and index registers, and
  will not necessarily keep them zero.  Also, the compiler
  will assume the base + index (+ displacement) arithmetic
  will operate in 32 bits -- I'm pretty sure this is
  actually the root cause of your "negative index" problem.

> > Is it about supporting a 32-bit pointer type in an
> > otherwise AM64 environment?  (This is already used
> > by the TPF target, but the 32-bit pointer will still
> > refer to a 2GB address space.)
> Yes, all pointers will be 32-bit – a normal 32-bit system.

Note that even if Pmode == DImode, you can still use 32-bit
*pointer* sizes.  This is exactly what e.g. the Intel x32
mode does (as was mentioned by Andreas).

> I’d like to approach the problem from the other
> direction – what modifications are required to
> be made to “-m31” so that it does “-m32” instead?
> I’m happy to simply retire “-m31”, but I don’t care
> if both exist.

If you want to go for an "x32" like mode, I think this
is the wrong approach.  The right approach would be to
start from "-m64", and simply modify the pointer size
to be 32 bits.

This would work by setting POINTER_SIZE to 32, while
leaving everything else like for -m64.  I'm sure there
will be a few other places that need adaptation, but
it should be pretty straightforward.  You can also
check the Intel back-end where they're using the
TARGET_X32 macro.


We've thought about implementing this mode for Linux,
but decided not to do it, since it would only provide
marginal performance improvements, and has the drawback
of being another new ABI that would be incompatible to
the whole existing software ecosystem.

(The latter point may not be an issue for you if you're
looking into a completely new OS anyway.)

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 20:05                 ` Paul Edwards
@ 2021-09-02 20:16                   ` Andreas Schwab
  2021-09-03 11:18                   ` Ulrich Weigand
  1 sibling, 0 replies; 26+ messages in thread
From: Andreas Schwab @ 2021-09-02 20:16 UTC (permalink / raw)
  To: Paul Edwards via Gcc; +Cc: Ulrich Weigand, Paul Edwards, Ulrich Weigand

On Sep 03 2021, Paul Edwards via Gcc wrote:

> The “legacy” environment of z/Linux etc would be 32-bit
> instead of 31-bit. IBM’s reputation will be restored. IBM
> will have the best architecture on the planet. Better than
> x64 because no mode switch is required shifting between
> 32-bit and 64-bit applications. All run as AM64 = AM-infinity.

That looks like -mabi=ilp32 on aarch64, or -mx32 on x86_64.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 19:46               ` Ulrich Weigand
@ 2021-09-02 20:05                 ` Paul Edwards
  2021-09-02 20:16                   ` Andreas Schwab
  2021-09-03 11:18                   ` Ulrich Weigand
  0 siblings, 2 replies; 26+ messages in thread
From: Paul Edwards @ 2021-09-02 20:05 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

Hi Ulrich. Thanks for your detailed reply.
>> > Therefore again my question, what is the actual goal
>> > you want to achieve?   I'm still not sure I understand
>> > that ...

>> I would like to know what is required to implement
>> “-m32” in the S/390 target. I realize that z/Arch
>> doesn’t have a specific AM32, but I don’t need a
>> specific AM32. What would actually happen if you
>> coded a “-m32” and then ran it in an AM64
>> environment?

> That depends on what that would actually do.  I'm still not
> quite sure what the actual requirements are.

> Is this about supporting a 4GB address space instead
> of a 2GB space?
Yes, correct.
> (I'm not aware of that being used anywhere currently.)
I’m about to use it. I just need to get past
the problem with negative indexes being used,
and I need your help.

> Is it about supporting a 32-bit pointer type in an
> otherwise AM64 environment?  (This is already used
> by the TPF target, but the 32-bit pointer will still
> refer to a 2GB address space.)

Yes, all pointers will be 32-bit – a normal 32-bit system.

> Is it something else?

Nope, you got it.

> In either case, what is the actual benefit of that mode?
> (I.e. what benefit would justify the effort to implement it?)

The “legacy” environment of z/Linux etc would be 32-bit
instead of 31-bit. IBM’s reputation will be restored. IBM
will have the best architecture on the planet. Better than
x64 because no mode switch is required shifting between
32-bit and 64-bit applications. All run as AM64 = AM-infinity.

>> >> Also, I just realized – if GCC is using LA for maths
>> >> for 32-bit registers, then values will be limited to
>> >> 2 GiB instead of 4 GiB for unsigned, but that is not
>> >> the case.
> 
>> > That's why GCC makes sure to only use the instruction
>> > when a 31-bit addition is wanted.  This can be the
>> > case either when GCC can prove that the involved
>> > operands are pointer values (which are by definition
>> > restricted to 31-bit values in -m31 mode)
>  
>> The compiler doesn’t create a restriction there.
>> It just generates a simple LA and it works
>> differently depending on whether it is AM24/31/64.

> It is the other way around.  The compiler knows
> exactly how the LA instruction behaves in hardware,
> and will use the instruction whenever that behavior
> matches the semantics of (a part of) the program.
> Since the behavior of the instruction differs based
> on the addressing mode, the compiler will have to
> know which mode the executable will be running in.

The i370 port produces code that works in AM24, AM31,
AM32 and AM64 (except for negative indexes). I’m surprised
the s390 port doesn’t too. As far as I can remember from
using IBM C, it supports execution in any AMODE too.

> Currently, the -m31/-m64 switch basically changes several
> things (at the same time)
> - the assumption on which AM the executable will run in 
> - the (used) size of a general-purpose register
> - the (default) size of a pointer type
> - ABI (function calling convention) details

> In theory, it would be possible to split this apart
> into distinct features, so that it would be possible
> to implement a mode where you can have code that uses
> 32-bit pointers but is running in AM64 (which would
> then support a 4 GB address space).

> Is this what you mean by an "-m32" mode?

Yes, correct.

> Basically, this would involve looking at all uses of
> the TARGET_64BIT macro in the back-end and determine
> which of them actually depend on which of the above
> features, and disentangle it accordingly.

> I guess that would be possible, but it requires a
> nontrivial effort.

I’d like to approach the problem from the other
direction – what modifications are required to
be made to “-m31” so that it does “-m32” instead?
I’m happy to simply retire “-m31”, but I don’t care
if both exist.

If “-m31” is retired, and made an alias for “-m32”,
my guess is that 20 lines of code need to be changed.

The most important thing is to stop generating
negative indexes.

ie if you have “char *p” and you go p[-1] I don’t
want 0xFFFFFFFF generated as an index. I instead
want a subtraction done.

I was under the impression that this was governed
by the Pmode – whether it was set to DImode or
SImode. But I tried forcing Pmode to DImode, even
for “–m31”, but it gave an internal error, which I
showed you already.

What am I missing?

Thanks. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 15:26             ` Paul Edwards
@ 2021-09-02 19:46               ` Ulrich Weigand
  2021-09-02 20:05                 ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-02 19:46 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 17:26:25:

> > Therefore again my question, what is the actual goal
> > you want to achieve?   I'm still not sure I understand
> > that ...
> I would like to know what is required to implement
> “-m32” in the S/390 target. I realize that z/Arch
> doesn’t have a specific AM32, but I don’t need a
> specific AM32. What would actually happen if you
> coded a “-m32” and then ran it in an AM64
> environment?

That depends on what that would actually do.  I'm still not
quite sure what the actual requirements are.

Is this about supporting a 4GB address space instead
of a 2GB space?  (I'm not aware of that being used
anywhere currently.)

Is it about supporting a 32-bit pointer type in an
otherwise AM64 environment?  (This is already used
by the TPF target, but the 32-bit pointer will still
refer to a 2GB address space.)

Is it something else?

In either case, what is the actual benefit of that mode?
(I.e. what benefit would justify the effort to implement it?)


> >> Also, I just realized – if GCC is using LA for maths
> >> for 32-bit registers, then values will be limited to
> >> 2 GiB instead of 4 GiB for unsigned, but that is not
> >> the case.
>
> > That's why GCC makes sure to only use the instruction
> > when a 31-bit addition is wanted.  This can be the
> > case either when GCC can prove that the involved
> > operands are pointer values (which are by definition
> > restricted to 31-bit values in -m31 mode)
>
> The compiler doesn’t create a restriction there.
> It just generates a simple LA and it works
> differently depending on whether it is AM24/31/64.

It is the other way around.  The compiler knows
exactly how the LA instruction behaves in hardware,
and will use the instruction whenever that behavior
matches the semantics of (a part of) the program.
Since the behavior of the instruction differs based
on the addressing mode, the compiler will have to
know which mode the executable will be running in.


Currently, the -m31/-m64 switch basically changes several
things (at the same time):
- the assumption on which AM the executable will run in
- the (used) size of a general-purpose register
- the (default) size of a pointer type
- ABI (function calling convention) details

In theory, it would be possible to split this apart
into distinct features, so that it would be possible
to implement a mode where you can have code that uses
32-bit pointers but is running in AM64 (which would
then support a 4 GB address space).

Is this what you mean by an "-m32" mode?


Basically, this would involve looking at all uses of
the TARGET_64BIT macro in the back-end and determine
which of them actually depend on which of the above
features, and disentangle it accordingly.

I guess that would be possible, but it requires a
nontrivial effort.


Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 15:13           ` Ulrich Weigand
@ 2021-09-02 15:26             ` Paul Edwards
  2021-09-02 19:46               ` Ulrich Weigand
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-02 15:26 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

>> I just checked my copy of s390.md and I don’t see
>> LA being used for arithmetic.

> This would be the "*la_31" and "*la_31_and" patterns.
Sorry, I did a grep for “LA”, forgetting that
s390.md doesn’t use uppercase instructions.

> (Note that the addition is implicit in the use of
> the "address_operand" constraint.)

If it is an address we are talking about, then that LA
instruction is going to work perfectly fine in AM24,
AM31 and AM64, and in the AM64 case it is going
to be the equivalent of AM32, so maybe the s390
port could have a “-m32” option for use when
running 32-bit applications as AM64?

>> If your copy of s390.md is using LA for arithmetic
>> then would it be possible to have an option to
>> use a normal mathematics instruction instead of
>> LA?

> LA was just an example.  It doesn't usually make sense
> to reason on a "use instruction X" basis, that's not
> how compiler optimizations work.  You rather start with
> a set of semantic invariants and then make sure those
> are preserved through all transformations.

Ok, that’s above my head.

> Therefore again my question, what is the actual goal
> you want to achieve?   I'm still not sure I understand
> that ...

I would like to know what is required to implement
“-m32” in the S/390 target. I realize that z/Arch
doesn’t have a specific AM32, but I don’t need a
specific AM32. What would actually happen if you
coded a “-m32” and then ran it in an AM64
environment?

My experiments show “with one single problem
discovered so far, actually –m31 and –m32 are
identical and work fine under AM64”.

>> Also, I just realized – if GCC is using LA for maths
>> for 32-bit registers, then values will be limited to
>> 2 GiB instead of 4 GiB for unsigned, but that is not
>> the case.

> That's why GCC makes sure to only use the instruction
> when a 31-bit addition is wanted.  This can be the
> case either when GCC can prove that the involved
> operands are pointer values (which are by definition
> restricted to 31-bit values in -m31 mode)

The compiler doesn’t create a restriction there.
It just generates a simple LA and it works
differently depending on whether it is AM24/31/64.

> or when
> there is an explicit 31-bit addition (using e.g. an
> & 0x7fffffff) in the source code.

Ok, thank you, this is what I needed to know.
I believe I would like to have a –m32 that
drops this test. I don’t want GCC to assume
that such an AND instruction can be implemented
with the use of the “LA” instruction. I want
to see an explicit “N” instruction used. Can
I have this as part of “-m32”?

Thanks. Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 15:01         ` Paul Edwards
@ 2021-09-02 15:13           ` Ulrich Weigand
  2021-09-02 15:26             ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-02 15:13 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



Hi Paul,

> I just checked my copy of s390.md and I don’t see
> LA being used for arithmetic.

This would be the "*la_31" and "*la_31_and" patterns.
(Note that the addition is implicit in the use of
the "address_operand" constraint.)

> If your copy of s390.md is using LA for arithmetic
> then would it be possible to have an option to
> use a normal mathematics instruction instead of
> LA?

LA was just an example.  It doesn't usually make sense
to reason on a "use instruction X" basis, that's not
how compiler optimizations work.  You rather start with
a set of semantic invariants and then make sure those
are preserved through all transformations.

Therefore again my question, what is the actual goal
you want to achieve?   I'm still not sure I understand
that ...

> Also, I just realized – if GCC is using LA for maths
> for 32-bit registers, then values will be limited to
> 2 GiB instead of 4 GiB for unsigned, but that is not
> the case.

That's why GCC makes sure to only use the instruction
when a 31-bit addition is wanted.  This can be the
case either when GCC can prove that the involved
operands are pointer values (which are by definition
restricted to 31-bit values in -m31 mode), or when
there is an explicit 31-bit addition (using e.g. an
& 0x7fffffff) in the source code.

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 14:53       ` Ulrich Weigand
@ 2021-09-02 15:01         ` Paul Edwards
  2021-09-02 15:13           ` Ulrich Weigand
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-02 15:01 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

Hi Ulrich.

I just checked my copy of s390.md and I don’t see
LA being used for arithmetic.

If your copy of s390.md is using LA for arithmetic
then would it be possible to have an option to
use a normal mathematics instruction instead of
LA?

Do you have any more examples besides LA being
used for maths instead of a proper maths instruction?

Also, I just realized – if GCC is using LA for maths
for 32-bit registers, then values will be limited to
2 GiB instead of 4 GiB for unsigned, but that is not
the case.

BFN. Paul.




From: Ulrich Weigand 
Sent: Friday, September 3, 2021 12:53 AM
To: Paul Edwards 
Cc: gcc@gcc.gnu.org ; Ulrich Weigand 
Subject: Re: s390 port

"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 16:50:35:

> Could you give me an example of an instruction
> generated by –m31 that is not expected to work
> on an AM64 system?

Well, everything related to address computation, of course.

For example, GCC may use LA on -m31 to implement a
31-bit addition, while it may use LA on -m64 to
implement a 64-bit addition.

Bye,
Ulrich




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 14:50     ` Paul Edwards
@ 2021-09-02 14:53       ` Ulrich Weigand
  2021-09-02 15:01         ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-02 14:53 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 16:50:35:

> Could you give me an example of an instruction
> generated by –m31 that is not expected to work
> on an AM64 system?

Well, everything related to address computation, of course.

For example, GCC may use LA on -m31 to implement a
31-bit addition, while it may use LA on -m64 to
implement a 64-bit addition.

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02 14:34   ` Ulrich Weigand
@ 2021-09-02 14:50     ` Paul Edwards
  2021-09-02 14:53       ` Ulrich Weigand
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-02 14:50 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc, Ulrich Weigand

Hi Ulrich.

Thanks a lot for your reply.

Could you give me an example of an instruction
generated by –m31 that is not expected to work
on an AM64 system?

E.g. the 32-bit

LR R2,R3

will definitely work on AM64.

So what specifically won’t work? How many different
things won’t work?

Thanks. Paul.




From: Ulrich Weigand 
Sent: Friday, September 3, 2021 12:34 AM
To: Paul Edwards 
Cc: gcc@gcc.gnu.org ; Ulrich Weigand 
Subject: Re: s390 port

Hi Paul,

"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 10:15:44:

> We got the IPL process in place on ESA/390, and then
> I decided that the next thing to do would be to switch
> to z/Arch so that we could get rid of the AMODE 31
> architectural limit on 32-bit programs.
> 
> It all worked fine, and we were able to use GCC 11 to
> target S/390 and use the -m31 to generate 32-bit code,
> run it under z/Arch as AM64, sort of making it the
> equivalent of AM32. Really it is the equivalent of
> AM-infinity, and there's the rub - GCC 11 is generating
> negative indexes, which cause memory above 4 GiB
> to be accessed (instead of wrapping at 2/4 GiB), which
> of course fails.

Can you elaborate what exactly your goals are?  The point
of the -m31 vs. -m64 option is exactly to match the
AMODE 31 vs. AMODE 64 hardware distinction, so trying to
run -m31 code in AMODE 64 is not supposed to work.

Bye,
Ulrich




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
  2021-09-02  8:15 ` s390 port Paul Edwards
@ 2021-09-02 14:34   ` Ulrich Weigand
  2021-09-02 14:50     ` Paul Edwards
  0 siblings, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2021-09-02 14:34 UTC (permalink / raw)
  To: Paul Edwards; +Cc: gcc, Ulrich Weigand



Hi Paul,

"Paul Edwards" <mutazilah@gmail.com> wrote on 02.09.2021 10:15:44:

> We got the IPL process in place on ESA/390, and then
> I decided that the next thing to do would be to switch
> to z/Arch so that we could get rid of the AMODE 31
> architectural limit on 32-bit programs.
>
> It all worked fine, and we were able to use GCC 11 to
> target S/390 and use the -m31 to generate 32-bit code,
> run it under z/Arch as AM64, sort of making it the
> equivalent of AM32. Really it is the equivalent of
> AM-infinity, and there's the rub - GCC 11 is generating
> negative indexes, which cause memory above 4 GiB
> to be accessed (instead of wrapping at 2/4 GiB), which
> of course fails.

Can you elaborate what exactly your goals are?  The point
of the -m31 vs. -m64 option is exactly to match the
AMODE 31 vs. AMODE 64 hardware distinction, so trying to
run -m31 code in AMODE 64 is not supposed to work.

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: s390 port
@ 2021-09-02 10:56 Paul Edwards
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Edwards @ 2021-09-02 10:56 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc

Hi Ulrich.

Below is the output as text.

Thanks. Paul.



make  all-recursive
make[1]: Entering directory '/home/robertapengelly/Desktop/UDOS'
Making all in kernel
make[2]: Entering directory '/home/robertapengelly/Desktop/UDOS/kernel'
depbase=`echo irq.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
s390-linux-gcc -DHAVE_CONFIG_H -I. -I..     -ffreestanding -fno-stack-protector 
 -pipe -Wall -Wextra -pedantic -Wshadow -Wpointer-arith -Wcast-align -Wwrite-strings 
 -Wstrict-prototypes -Wmissing-declarations -Wdouble-promotion -Wredundant-decls 
 -Wnested-externs -Winline -Wconversion -fexec-charset=IBM-1047 -O2 -m31 -g  
-MT irq.o -MD -MP -MF $depbase.Tpo -c -o irq.o irq.c &&\
mv -f $depbase.Tpo $depbase.Po
during RTL pass: reload
In file included from irq.c:3:
./panic.h: In function ‘kpanic’:
./panic.h:21:1: internal compiler error: maximum number of generated reload 
insns per insn achieved (90)
   21 | }
      | ^
0xce67fc lra_constraints(bool)
    ../../gcc/gcc/lra-constraints.c:5091
0xcd2fa2 lra(_IO_FILE*)
    ../../gcc/gcc/lra.c:2336
0xc8a2f9 do_reload
    ../../gcc/gcc/ira.c:5932
0xc8a2f9 execute
    ../../gcc/gcc/ira.c:6118
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[2]: *** [Makefile:418: irq.o] Error 1
make[2]: Leaving directory '/home/robertapengelly/Desktop/UDOS/kernel'
make[1]: *** [Makefile:406: all-recursive] Error 1
make[1]: Leaving directory '/home/robertapengelly/Desktop/UDOS'
make: *** [Makefile:326: all] Error 2




-----Original Message----- 
From: Paul Edwards
Sent: Thursday, September 2, 2021 6:15 PM
To: Ulrich Weigand
Cc: gcc@gcc.gnu.org
Subject: s390 port

Hi Ulrich.

Sorry for the necro - things happen slowly Down Under. :-)

Anyway, I am helping someone with their public domain
project, UDOS - https://github.com/udos-project/udos

(just a hobby, won't be big and professional like Linux)

We got the IPL process in place on ESA/390, and then
I decided that the next thing to do would be to switch
to z/Arch so that we could get rid of the AMODE 31
architectural limit on 32-bit programs.

It all worked fine, and we were able to use GCC 11 to
target S/390 and use the -m31 to generate 32-bit code,
run it under z/Arch as AM64, sort of making it the
equivalent of AM32. Really it is the equivalent of
AM-infinity, and there's the rub - GCC 11 is generating
negative indexes, which cause memory above 4 GiB
to be accessed (instead of wrapping at 2/4 GiB), which
of course fails.

Do you have any idea how to stop the S/390 target
from generating negative indexes? I thought the
solution might be to change the Pmode to DImode
even for non-TARGET64, but as you can see here:

http://www.pdos.org/gccfail.png

we got an internal compile error - maximum number
of generated reload insns per insn achieved (90).

I then tried changing the other SImode reference
(CASE_VECTOR_MODE) to DImode too, but that gave
the same internal error.

Here is what the failure looks like (see the large R4):

01:28:27 PSW=00042001 80000000 0000000000005870 INST=A73A0001     AHI   3,1 add_halfword_immediate
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B2
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 PSW=00042001 80000000 0000000000005874 INST=42142FFF     STC   1,4095(4,2)            store_character
01:28:27 R:000000010009E578: Translation exception 0005
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B3
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 HHCCP014I CPU0000: Addressing exception CODE=0005 ILC=4
01:28:27 PSW=00042001 80000000 0000000000005878 INST=42142FFF     STC   1,4095(4,2)            store_character
01:28:27 R:000000010009E578: Translation exception 0005
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B3
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 HHCCP043I Wait state PSW loaded: PSW=00060001 80000000 0000000000000444
01:28:40 quit
01:28:40 HHCIN900I Begin Hercules shutdown

Any idea what we can do?

Thanks. Paul.




-----Original Message----- 
From: Ulrich Weigand
Sent: Saturday, June 6, 2009 1:20 AM
To: Paul Edwards
Cc: gcc@gcc.gnu.org
Subject: Re: i370 port

Paul Edwards wrote:

> In addition, that code has been ported to GCC 3.4.6, which is now
> working as a cross-compiler at least.  It's still some months away
> from working natively though.  It takes a lot of effort to convert
> the Posix-expecting GCC compiler into C90 compliance.  This has
> been done though, in a way that has minimal code changes to the
> GCC mainline.

You're referring to building GCC for a non-Posix *host*, right?
I assume those changes are not (primarily) in the back-end, but
throughout GCC common code?

> Yes, I'm aware that there is an S/390 port, but it isn't EBCDIC, isn't
> HLASM, isn't 370, isn't C90, isn't MVS.  It may well be possible to
> change all those things, and I suspect that in a few years from now
> I may be sending another message asking what I need to do to get
> all my changes to the s390 target into the s390 target.  At that time,
> I suspect there will be a lot of objection to "polluting" the s390 target
> with all those "unnecessary" things.

Actually, I would really like to see the s390 target optionally support
the MVS ABI and HLASM assembler format, so I wouldn't have any objection
to patches that add these features ...

I understand current GCC supports various source and target character
sets a lot better out of the box, so it may be EBCDIC isn't even an
issue any more.   If there are other problems related to MVS host
support, I suppose those would need to be fixed in common code anyway,
no matter whether the s390 or i370 back-ends are used.

The only point in your list I'm sceptical about is 370 architecture
support -- I don't quite see why this is still useful today (the s390
port does require at a minimum a S/390 G2 with the branch relative
instructions ... but those have been around for nearly 15 years).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com 



* s390 port
  2009-06-05 15:21 i370 port Ulrich Weigand
@ 2021-09-02  8:15 ` Paul Edwards
  2021-09-02 14:34   ` Ulrich Weigand
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Edwards @ 2021-09-02  8:15 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc

Hi Ulrich.

Sorry for the necro - things happen slowly Down Under. :-)

Anyway, I am helping someone with their public domain
project, UDOS - https://github.com/udos-project/udos

(just a hobby, won't be big and professional like Linux)

We got the IPL process in place on ESA/390, and then
I decided that the next thing to do would be to switch
to z/Arch so that we could get rid of the AMODE 31
architectural limit on 32-bit programs.

It all worked fine, and we were able to use GCC 11 to
target S/390 and use -m31 to generate 32-bit code,
run it under z/Arch as AM64, sort of making it the
equivalent of AM32. Really it is the equivalent of
AM-infinity, and there's the rub - GCC 11 is generating
negative indexes, which cause memory above 4 GiB
to be accessed (instead of wrapping at 2/4 GiB), which
of course fails.

Do you have any idea how to stop the S/390 target
from generating negative indexes? I thought the
solution might be to change the Pmode to DImode
even for non-TARGET64, but as you can see here:

http://www.pdos.org/gccfail.png

we got an internal compiler error - maximum number
of generated reload insns per insn achieved (90).

I then tried changing the other SImode reference
(CASE_VECTOR_MODE) to DImode too, but that gave
the same internal error.

Here is what the failure looks like (see the large R4):

01:28:27 PSW=00042001 80000000 0000000000005870 INST=A73A0001     AHI   3,1 add_halfword_immediate
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B2
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 PSW=00042001 80000000 0000000000005874 INST=42142FFF     STC   1,4095(4,2)            store_character
01:28:27 R:000000010009E578: Translation exception 0005
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B3
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 HHCCP014I CPU0000: Addressing exception CODE=0005 ILC=4
01:28:27 PSW=00042001 80000000 0000000000005878 INST=42142FFF     STC   1,4095(4,2)            store_character
01:28:27 R:000000010009E578: Translation exception 0005
01:28:27 R0=00000000000001FD R1=00000000000000E2 R2=000000000009E579 R3=00000000000080B3
01:28:27 R4=00000000FFFFF000 R5=000000000001E5C8 R6=0000000000007FFF R7=0000000000002000
01:28:27 R8=000000000000201F R9=0000000000000000 RA=00000000000080B0 RB=00000000000080B2
01:28:27 RC=000000000009E580 RD=0000000000008138 RE=0000000000007B4C RF=000000000001E4E4
01:28:27 HHCCP043I Wait state PSW loaded: PSW=00060001 80000000 0000000000000444
01:28:40 quit
01:28:40 HHCIN900I Begin Hercules shutdown
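
To make the failure concrete, here is a rough sketch (not taken from
GCC or Hercules) that redoes the address arithmetic for the failing
STC 1,4095(4,2) using the R2 and R4 values from the trace above. The
function names are made up for illustration, and "AM32" is the
hypothetical 32-bit wrap discussed in this message, not an
architected addressing mode.

```python
# Register values copied from the Hercules trace above.
R2_BASE = 0x000000000009E579   # base register R2
R4_INDEX = 0x00000000FFFFF000  # index register R4 (low 32 bits hold -4096)
DISPLACEMENT = 4095            # 12-bit displacement in STC 1,4095(4,2)


def effective_address(base, index, disp, amode_bits):
    """Add base + index + displacement and keep the low amode_bits bits."""
    mask = (1 << amode_bits) - 1
    return (base + index + disp) & mask


def as_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    print("R4 as signed 32-bit:", as_signed32(R4_INDEX))
    # 24 and 31 wrap as on ESA/390; 32 is the hypothetical wrap;
    # 64 is what z/Arch AM64 actually does with these registers.
    for bits in (24, 31, 32, 64):
        addr = effective_address(R2_BASE, R4_INDEX, DISPLACEMENT, bits)
        print(f"AM{bits}: {addr:016X}")
```

With any wrap at or below 32 bits the store lands at 9E578, i.e. one
byte below R2, which is presumably what the compiler intended. Only
the 64-bit sum gives 000000010009E578, the address reported in the
translation exception.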

Any idea what we can do?

Thanks. Paul.




-----Original Message----- 
From: Ulrich Weigand
Sent: Saturday, June 6, 2009 1:20 AM
To: Paul Edwards
Cc: gcc@gcc.gnu.org
Subject: Re: i370 port

Paul Edwards wrote:

> In addition, that code has been ported to GCC 3.4.6, which is now
> working as a cross-compiler at least.  It's still some months away
> from working natively though.  It takes a lot of effort to convert
> the Posix-expecting GCC compiler into C90 compliance.  This has
> been done though, in a way that has minimal code changes to the
> GCC mainline.

You're referring to building GCC for a non-Posix *host*, right?
I assume those changes are not (primarily) in the back-end, but
throughout GCC common code?

> Yes, I'm aware that there is an S/390 port, but it isn't EBCDIC, isn't
> HLASM, isn't 370, isn't C90, isn't MVS.  It may well be possible to
> change all those things, and I suspect that in a few years from now
> I may be sending another message asking what I need to do to get
> all my changes to the s390 target into the s390 target.  At that time,
> I suspect there will be a lot of objection to "polluting" the s390 target
> with all those "unnecessary" things.

Actually, I would really like to see the s390 target optionally support
the MVS ABI and HLASM assembler format, so I wouldn't have any objection
to patches that add these features ...

I understand current GCC supports various source and target character
sets a lot better out of the box, so it may be EBCDIC isn't even an
issue any more.   If there are other problems related to MVS host
support, I suppose those would need to be fixed in common code anyway,
no matter whether the s390 or i370 back-ends are used.

The only point in your list I'm sceptical about is 370 architecture
support -- I don't quite see why this is still useful today (the s390
port does require at a minimum a S/390 G2 with the branch relative
instructions ... but those have been around for nearly 15 years).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com 



end of thread, other threads:[~2023-01-29 14:31 UTC | newest]

Thread overview: 26+ messages
2023-01-28 18:51 s390 port Paul Edwards
2023-01-29 13:08 ` Joe Monk
2023-01-29 14:30   ` Paul Edwards
  -- strict thread matches above, loose matches on Subject: below --
2021-09-30 21:39 Paul Edwards
2021-09-30  0:08 Paul Edwards
2021-09-30  0:59 ` Joe Monk
2021-09-06 22:44 Build gcc question Gary Oblock
2021-09-07  7:21 ` s390 port Joe Monk
2021-09-08  3:46   ` Paul Edwards
2021-09-02 10:56 Paul Edwards
2009-06-05 15:21 i370 port Ulrich Weigand
2021-09-02  8:15 ` s390 port Paul Edwards
2021-09-02 14:34   ` Ulrich Weigand
2021-09-02 14:50     ` Paul Edwards
2021-09-02 14:53       ` Ulrich Weigand
2021-09-02 15:01         ` Paul Edwards
2021-09-02 15:13           ` Ulrich Weigand
2021-09-02 15:26             ` Paul Edwards
2021-09-02 19:46               ` Ulrich Weigand
2021-09-02 20:05                 ` Paul Edwards
2021-09-02 20:16                   ` Andreas Schwab
2021-09-03 11:18                   ` Ulrich Weigand
2021-09-03 11:35                     ` Paul Edwards
2021-09-03 12:12                       ` Ulrich Weigand
2021-09-03 12:38                         ` Paul Edwards
2021-09-03 12:53                           ` Jakub Jelinek
2021-09-03 13:12                             ` Paul Edwards
2022-12-20  4:27                         ` Paul Edwards
