From: Ramiro Polla
Subject: Re: Re: pthreads-win32 2.8.0, stack alignment, and SSE code
Date: Sun, 05 Oct 2008 19:25:00 -0000

Sébastien Kunz-Jacques wrote:
> Ramiro Polla wrote:
>> Imagine that someone wants to use the ATLAS library but, instead of
>> starting a new thread, calls the function that needs SSE directly
>> (no, I haven't checked whether that is possible in this case, but it
>> could happen in theory). And imagine that the caller is built with
>> MSVC++, which only aligns the stack to 4 bytes (and that, again, is
>> valid). That function would also crash, regardless of your patch.
>> So in your specific case I think it is the ATLAS functions that should
>> realign the stack (which would also help when using the library with
>> other compilers).
> Actually, I have tried calling ATLAS from MSVC, and it appears to work.
> I suspect that the ATLAS interface functions already realign the stack,
> but I haven't checked (I am going to ask the ATLAS maintainer about it).
> The problem that made ATLAS crash without the above fix is that some
> internal ATLAS functions get started through pthreads, and those
> definitely do not realign the stack.

Then I suspect that only those functions need force_align.
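
As a rough illustration of that suggestion, here is a minimal sketch, assuming "force_align" refers to GCC's x86 function attribute force_align_arg_pointer. The names FORCE_ALIGN and sse_worker are hypothetical and do not come from ATLAS or pthreads-win32; the worker only stands in for an internal routine that would be handed to pthread_create.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC (or a compatible compiler) on x86. Elsewhere the macro
 * expands to nothing and no realignment is requested. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define FORCE_ALIGN __attribute__((force_align_arg_pointer))
#else
#define FORCE_ALIGN
#endif

/* Hypothetical thread entry point. Its signature matches the start routine
 * expected by pthread_create. With the attribute, the prologue realigns the
 * stack, so the function no longer relies on the caller's alignment. */
FORCE_ALIGN static void *sse_worker(void *arg)
{
    float *out = arg;

    /* A local that needs 16-byte alignment, as aligned SSE accesses would. */
    _Alignas(16) float v[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    assert(((uintptr_t)v & 15u) == 0);

    *out = v[0] + v[1] + v[2] + v[3];
    return arg;
}
```

Only the functions that are started as threads would carry the attribute; functions they call inherit an aligned stack as long as the compiler keeps it aligned across calls.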

>> Your patch can also be seen as a way to always sufficiently align the
>> stack so that any thread started by pthreads-win32 is ok for SSE
>> instructions (the same way glibc does I think). In that case I don't
>> have a strong opinion about it. The overhead really is negligible.
>> Starting the thread takes much longer.
> Regarding your last comment, do you mean that the stack realignment is
> slow? In the disassemblies I looked at, it saves %esp in another
> register, aligns %esp (andl $-16, %esp), and restores it in the function
> epilogue. The main performance penalty therefore comes from tying up one
> register, and this is a reason to do the alignment in a function
> like threadStart instead of in the called function, if the latter does
> some register-intensive work.

I didn't express myself very well, then. I meant to say: "The overhead
really is negligible. Starting the thread takes much longer, so the
cost of aligning the stack is hidden by the time it takes to start
the thread".

Ramiro Polla

Thread overview: 11+ messages
2008-10-05 12:32 Sébastien Kunz-Jacques
2008-10-05 13:41 ` Ramiro Polla
2008-10-05 14:47   ` Sébastien Kunz-Jacques
2008-10-05 18:27     ` Ramiro Polla
2008-10-05 18:52       ` Sébastien Kunz-Jacques
2008-10-05 19:25         ` Ramiro Polla [this message]
2008-10-05 20:12           ` Sébastien Kunz-Jacques
2008-10-05 22:42             ` Ramiro Polla
2008-10-09 13:14               ` Ross Johnson
2008-10-09 19:51                 ` Sébastien Kunz-Jacques
2008-10-23  5:57                 ` Sébastien Kunz-Jacques