From: Ramiro Polla
Date: Sun, 05 Oct 2008 19:25:00 -0000
To: pthreads-win32@sourceware.org
Subject: Re: Re: pthreads-win32 2.8.0, stack alignment, and SSE code
Message-ID: <48E91469.3060508@lisha.ufsc.br>
In-Reply-To: <48E90CB1.2040000@yahoo.fr>

Sébastien Kunz-Jacques wrote:
> Ramiro Polla wrote:

[...]

>> Imagine if someone wants to use that ATLAS library but instead of
>> starting a new thread it wants to call directly the function that
>> needs SSE (no I haven't checked if it is possible in this case but it
>> could happen theoretically). And imagine that someone is using MSVC++
>> to call that function. MSVC++ only aligns to 4-byte (and again it is
>> valid). That function would also crash, independent of your patch.
>>
>> So in your specific case I think it is the ATLAS functions that should
>> be aligned (= it would also help to use the library with other
>> compilers).

[...]

> Actually I have tried calling ATLAS from MSVC, and it (appears to) work.
> I suspect that ATLAS interface functions realign stack already, but I
> didn't check this (I am going to ask the ATLAS maintainer about this).
> The problem that made ATLAS crash without the above fix is that some
> internal ATLAS functions get started through pthreads, and these ones
> definitely do not realign the stack.

Then I suspect it is only those functions that need force_align.

[...]

>> Your patch can also be seen as a way to always sufficiently align the
>> stack so that any thread started by pthreads-win32 is ok for SSE
>> instructions (the same way glibc does I think). In that case I don't
>> have a strong opinion about it. The overhead really is negligible.
>> Starting the thread takes much longer.

[...]

> Regarding your last comment, do you imply that the stack realignment is
> slow? From disassemblies I saw, it stores %esp in another register,
> aligns esp (andl $-16, %esp), and restores it in the function
> epilogue. The main performance penalty therefore occurs because one
> register is used, and this is a reason to do the alignment in a function
> like threadStart instead of the called function, if the latter does some
> register-intensive task.

I didn't express myself very well then. I meant to say: "The overhead
really is negligible. Starting the thread takes much longer, so the
overhead of aligning the stack gets hidden in the delay of starting
the thread".

Ramiro Polla
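
The realignment discussed above can be sketched in C. This is only an
illustration under assumptions, not the pthreads-win32 patch itself:
align_down, REALIGN_STACK, start_args, aligned_thread_entry and echo_arg
are hypothetical names, and "force_align" is presumably GCC's x86
function attribute force_align_arg_pointer. The first helper shows the
arithmetic behind "andl $-16, %esp"; the second shows realigning once in
a thread entry wrapper so that the called routine does not have to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round an address down to a power-of-two boundary.
   With alignment == 16 this is the same masking as `andl $-16, %esp`,
   since -16 and ~15 have the same bit pattern. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/* force_align_arg_pointer is a GCC function attribute for x86 targets.
   On other compilers or targets this macro expands to nothing
   (assumption: no realignment is needed or available there). */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

typedef void *(*start_routine_t)(void *);

/* Hypothetical argument bundle handed to the wrapper. */
struct start_args {
    start_routine_t routine; /* user's thread function */
    void *arg;               /* argument passed through to it */
};

/* Hypothetical thread entry wrapper: the compiler realigns the stack in
   this function's prologue, so `routine` starts from an aligned frame
   and does not spend a register on realigning by itself. */
REALIGN_STACK
void *aligned_thread_entry(void *p)
{
    struct start_args *a = p;
    return a->routine(a->arg);
}

/* Trivial routine used only to exercise the wrapper. */
void *echo_arg(void *arg)
{
    return arg;
}
```

Passing a start_args that holds echo_arg to aligned_thread_entry returns
the argument unchanged; the only difference from a direct call is the
realigned frame on compilers that honour the attribute.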