public inbox for cygwin@cygwin.com
* Request for an example x68 assembler portable Hello World script
@ 2019-04-26  7:25 Jesse Thompson
  2019-04-26 11:16 ` Eliot Moss
  2019-04-26 21:04 ` Jesse Thompson
  0 siblings, 2 replies; 7+ messages in thread
From: Jesse Thompson @ 2019-04-26  7:25 UTC (permalink / raw)
  To: cygwin

I would like to learn how to write assembly programs for the command line
that, with as little alteration as is feasible, will compile both in Cygwin
and in other flavors of Unix like Linux and/or FreeBSD.

I am targeting only x64 CPUs and I'm perfectly happy to use libc calls
instead of direct syscalls or interrupts. I'm hoping to use nasm+gcc, or
perhaps fasm to do the deed. Cross-compiling is not a concern, I'll build
cygwin binaries in cygwin and unix binaries in unix.

But I'm confused by the differences in calling convention/ABI between
Windows and/or Cygwin and Linux.

For example, I can get this to compile and run in Cygwin:

```
        global  main
        extern  puts
        section .text
main:
        sub     rsp, 20h                        ; Reserve the shadow space
        mov     rcx, message                    ; First argument is address of message
        call    puts                            ; puts(message)
        add     rsp, 20h                        ; Remove shadow space
        ret
message:
        db      'Hello', 0                      ; C strings need a zero byte at the end
```


but it segfaults in Linux (and complains about "Symbol `puts' causes
overflow in R_X86_64_PC32 relocation")

and I can get the following to compile and run in Linux:
```
    extern puts
    global main

section .text
main:
    mov rdi,message
    call puts
    ret

message:
    db  "Hello World",0
```

but *that* segfaults in cygwin.

TL;DR: I think I could get a lot more done if I could start from a single
Hello World asm file that can compile and run in both places, calling out
to puts or something simple like that.

Any help would be appreciated, I hope everything about my question makes
sense. :)

- - Jesse

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-26  7:25 Request for an example x68 assembler portable Hello World script Jesse Thompson
@ 2019-04-26 11:16 ` Eliot Moss
  2019-04-26 21:04 ` Jesse Thompson
  1 sibling, 0 replies; 7+ messages in thread
From: Eliot Moss @ 2019-04-26 11:16 UTC (permalink / raw)
  To: cygwin

Dear Jesse -- Someone else may be able to speak to the specifics, but
register use and calling conventions, and to some extent stack layout,
vary from platform to platform.  Roughly, platform = processor + OS.
So (to me anyway) it would not be at all surprising if you have to
write different code for each of Windows, Cygwin, and Linux.  Cygwin
tries to offer library level compatibility for programs designed to
run under Posix (there are some seams here and there, where Windows
differences are hard to hide).  But the programs have to be recompiled
to the Cygwin ABI.

Another thing you may be encountering is the difference between the
32-bit and 64-bit worlds.  Recent x86 processors support the x86_64
version as well.  Cygwin offers both 32 and 64 bit versions, but they
are distinct, and a program needs to be compiled for the one under
which you wish to run it.  (I have both 32 and 64 bit Cygwin on my
computer, and the programs can invoke one another, but the installations
need to be in separate file hierarchies.)  The same would tend to hold
under Linux and Windows, though the OS can determine automatically
for a given program whether it is 32 or 64 bit from details of the
first bytes of the executable file.

Regards - EM

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-26  7:25 Request for an example x68 assembler portable Hello World script Jesse Thompson
  2019-04-26 11:16 ` Eliot Moss
@ 2019-04-26 21:04 ` Jesse Thompson
  2019-04-27  1:56   ` Doug Henderson
                     ` (2 more replies)
  1 sibling, 3 replies; 7+ messages in thread
From: Jesse Thompson @ 2019-04-26 21:04 UTC (permalink / raw)
  To: cygwin

Hello Eliot, thank you for your reply.

On one point, yes I do appreciate the difference between 32-bit and 64-bit
installations of Cygwin. However for many years now I have been blessed by
being able to use only the 64 bit version, just as I only run 64 bit
versions of Windows and Linux/BSD operating systems on 64 bit processors.

To this end, I wish to be able to entirely leave behind any bonds to the 32
bit architectures. My software won't have to support them, and ideally
won't even have to acknowledge their existence. ;)

My software also won't have to support non-POSIX Windows libraries, and
should only need to relate to windows through the Cygwin POSIX libraries.

So while I can appreciate that C (and other higher level language)
compilers might hide some of the implementation details of moving from
one platform or calling convention to another, and that Cygwin's libraries
exist to help spackle over some of those other potential differences so
that at least a majority of C code written for Unix OSen can compile and
run in a windows environment, aren't there some pieces of Unix software
that include assembly? What process do developers have to go through to
port *those* bits of software that stray from the safety of higher level
compilers?

Ultimately what I am trying to research is how to begin building a simple
compilation system of my own, so how do the *makers* of compilers deal with
these differences in calling convention?

Thank you for your insight, I hope you are all having a great day and
enjoying this lovely spring weather. :)

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-26 21:04 ` Jesse Thompson
@ 2019-04-27  1:56   ` Doug Henderson
  2019-04-27 18:35   ` bzs
  2019-04-28 20:00   ` Eliot Moss
  2 siblings, 0 replies; 7+ messages in thread
From: Doug Henderson @ 2019-04-27  1:56 UTC (permalink / raw)
  To: cygwin

On Fri, 26 Apr 2019 at 15:04, Jesse Thompson wrote:

> Ultimately what I am trying to research is how to begin building a simple
> compilation system of my own, so how do the *makers* of compilers deal with
> these differences in calling convention?

For each hardware and OS combination there is usually a published ABI
(Application Binary Interface) specification that defines how to call
the OS services, and how applications should call each other and
vendor supplied libraries. Compiler authors will usually generate code
which adheres to these specifications. The C compiler will generate
code that will call the OS and external code according to the
specification. Authors of assembler code must abide by the ABI if they
want their code to inter-operate with the OS and external higher level
language code, however they are free to use any calling convention
they like within their own code.

This page: https://cs.lmu.edu/~ray/notes/gasexamples/ illustrates the
ABI for calling the OS and C-language functions from assembler on Linux
with an Intel or AMD 64-bit CPU. The calling convention used by Windows
on the same hardware is different, because the two OSs have different
ABIs.

The GNU Compiler Collection (GCC) supports a C-language extension that
allows you to embed assembler instructions in C and C++ code. In this
scenario, one writes otherwise normal C functions where the body of
the function is replaced by assembler instructions (or a mix of C and
assembler). The C compiler generates the correct code for the ABI. And
the assembler code can reference the function arguments by name,
regardless of how they were passed to the function. See:
https://gcc.gnu.org/onlinedocs/gcc/index.html the "Using the GNU
Compiler Collection (GCC)" document. See section 6.47 "How to Use
Inline Assembly Language in C Code" for details.

When you stray from the GNU path, though, things can get more chaotic
where platforms must be considered on a case by case basis.

Have you looked at VM based implementations? These use an intermediate
assembler-like language which is executed on a virtual machine. The VM
is specific to each platform, but aims at the "Write once, run
anywhere" goal of Java.

HTH,
Doug

-- 
Doug Henderson, Calgary, Alberta, Canada - from gmail.com

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-26 21:04 ` Jesse Thompson
  2019-04-27  1:56   ` Doug Henderson
@ 2019-04-27 18:35   ` bzs
  2019-04-28 20:00   ` Eliot Moss
  2 siblings, 0 replies; 7+ messages in thread
From: bzs @ 2019-04-27 18:35 UTC (permalink / raw)
  To: Jesse Thompson; +Cc: cygwin


Just two thoughts:

1. You probably know that 'cc -S foo.c' produces foo.s which is the
assembler output. Might be worthwhile examining how the experts who
wrote the C compiler handle all this. The output is usually quite
readable for someone prone to reading such things.

2. Rather than generating asm some developers generate C and run that
thru the C compiler. One advantage is you can leverage all the C code
optimization and debugging etc infrastructure and anything else you
can find on the C and ld etc man pages (e.g., PIC.)

But there's nothing wrong with learning assemblers and machine
languages. In the distant past I taught it for several years at Boston
University so, good luck!

-- 
        -Barry Shein

Software Tool & Die    | bzs@TheWorld.com             | http://www.TheWorld.com
Purveyors to the Trade | Voice: +1 617-STD-WRLD       | 800-THE-WRLD
The World: Since 1989  | A Public Information Utility | *oo*

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-26 21:04 ` Jesse Thompson
  2019-04-27  1:56   ` Doug Henderson
  2019-04-27 18:35   ` bzs
@ 2019-04-28 20:00   ` Eliot Moss
  2019-04-29 14:29     ` Sam Habiel
  2 siblings, 1 reply; 7+ messages in thread
From: Eliot Moss @ 2019-04-28 20:00 UTC (permalink / raw)
  To: cygwin

On 4/26/2019 5:04 PM, Jesse Thompson wrote:

> Ultimately what I am trying to research is how to begin building a simple
> compilation system of my own, so how do the *makers* of compilers deal with
> these differences in calling convention?

They make parts of the compilers conditional on the overall platform.
For example, if a compiler is written in C / C++, they use #define
and #if tests, and may include different modules in a build, etc.

They also try to code various algorithms, such as register allocation,
to be parameterized by a description of how things work on a given
platform.

There are whole swaths that are essentially target independent,
especially those having to do with higher level optimizations.
However, even there, platform differences may lead to different
parameter settings (e.g., default number of times to unroll a
loop) or strategies (presence / absence of vector units and
of predicated instructions (as on the ARM) affect how you want
to generate even the high-level target-independent code).

In the case that you are talking about, most of the code generation
and optimization strategies are the same -- there are just some
fine points different about calling sequences, register usage
conventions, etc.  I think those are mostly addressed by the kind
of parameterization-by-descriptions (or by #defines) that I have
described.

You may still see somewhat different code from different compilers,
even for the same platform, simply because the different designers
chose different base code sequences - which may be equivalent. For
example, to move a constant into a register, add-immediate (adding
to zero) and or-immediate (again, ORing with zero) give the same
result for many arguments, so the choice is arbitrary.  One can
come up with many such examples.

Supporting multiple target instruction sets, or even the range of
models of the x86 line, requires some amount of platform-specific
work, of course, and a lot of attention to how to build components
that are either independent of the ISA or retargetable in some way.

Regards - Eliot Moss

* Re: Request for an example x68 assembler portable Hello World script
  2019-04-28 20:00   ` Eliot Moss
@ 2019-04-29 14:29     ` Sam Habiel
  0 siblings, 0 replies; 7+ messages in thread
From: Sam Habiel @ 2019-04-29 14:29 UTC (permalink / raw)
  To: cygwin

I can't often contribute to discussions on Cygwin topics, but due to
my work porting a database (fis-gtm) to Cygwin, I can chime in here.

This is a good article to give you an overview of the different
calling conventions out there:
https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64.

Here's a summary of what I learned:
1. 32-bit Cygwin and 32-bit Linux use the same calling convention
(application binary interface, ABI).
2. 64-bit Cygwin uses the Windows x64 ABI. 64-bit Linux uses the
System V AMD64 ABI.

This tutorial article is a good place to learn about x64 ABI for
Linux: https://cs.lmu.edu/~ray/notes/gasexamples/.

--Sam