public inbox for gcc@gcc.gnu.org
* naked functions on x86 architecture
@ 2009-06-12 16:20 Zachary Turner
From: Zachary Turner @ 2009-06-12 16:20 UTC
  To: gcc

Hi,

I know this has been discussed before; I have read through some of the
archives and some of the rationale.  I want to raise it again, however,
because I don't think anyone has ever presented a good example of where
naked functions are really useful on x86 architectures.

In general, naked functions are very useful for selecting different
versions of an instruction (byte, word, dword, qword) with a template
specialization.  I'll post some code that works under Visual C++ 9.0
to demonstrate what I mean.  The following function finds the index of
the first nonzero "element" of an arbitrarily sized array (or the first
zero, with similar template specializations replacing rep with repne),
and is the fastest way I know to do so.  Note that with SCAS the rep
prefix acts as repe, so the scan continues while each element equals
the zero in eax.

template<typename T> int __declspec(naked) scas();

template<> int __declspec(naked) scas<boost::uint8_t>()
{ __asm rep scasb  __asm mov eax, edi  __asm ret }
template<> int __declspec(naked) scas<boost::uint16_t>()
{ __asm rep scasw  __asm mov eax, edi  __asm ret }
template<> int __declspec(naked) scas<boost::uint32_t>()
{ __asm rep scasd  __asm mov eax, edi  __asm ret }
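// sizeof cannot be evaluated in #if, so test MSVC's x64 macro instead.
// Illustrative only: MSVC rejects __asm and naked for x64 targets.
#if defined(_M_X64)
template<> int __declspec(naked) scas<boost::uint64_t>()
{ __asm rep scasq  __asm mov rax, rdi  __asm ret }
#endif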

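template<typename T>
int find_first_nonzero_scas(T* x, int cnt)
{
    int result = 0;
    __asm {
        xor eax, eax        // value to compare against: zero
        mov edi, x          // edi = start of the array
        mov ecx, cnt        // ecx = maximum number of elements to scan
    }
    // Relies on eax, edi and ecx surviving until the call below.
    result = scas<T>();     // returns edi as left by the scan
    result -= reinterpret_cast<int>(x);  // byte offset (32-bit pointers)
    result /= sizeof(T);    // offset in elements
    return --result;        // step back onto the element that stopped the scan
}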


This is one example, but it illustrates a general concept that I think
is really useful and that I have personally used numerous times with
many instructions other than SCAS.  If there is a way to achieve this
without using a naked function, then please advise.  I'd rather not
resort to an if/then/else when the value of every test is known at
compile time.
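
For comparison, here is a minimal sketch of how the same compile-time
selection might be expressed with GCC's extended inline asm instead of
naked functions.  It is an illustration under assumptions, not code from
this post: the names scan_zeros and find_first_nonzero are hypothetical,
the asm is in AT&T syntax, it presumes the direction flag is clear (as
the usual x86 ABIs guarantee at function entry), and it returns cnt
rather than cnt - 1 when no nonzero element exists.  On non-x86 targets
the primary template falls back to a plain loop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: examine elements while they equal zero and return a
// pointer one past the last element examined. Portable fallback version.
template <typename T>
inline const T* scan_zeros(const T* p, std::size_t cnt) {
    while (cnt-- != 0) {
        if (*p++ != 0) break;
    }
    return p;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
// x86 specializations: "+D" ties p to (r/e)di, "+c" ties cnt to (r/e)cx,
// and "a" places the zero comparison value in the accumulator.
template <>
inline const std::uint8_t* scan_zeros<std::uint8_t>(const std::uint8_t* p,
                                                    std::size_t cnt) {
    asm("repe scasb" : "+D"(p), "+c"(cnt)
        : "a"(static_cast<std::uint8_t>(0)) : "cc", "memory");
    return p;
}

template <>
inline const std::uint16_t* scan_zeros<std::uint16_t>(const std::uint16_t* p,
                                                      std::size_t cnt) {
    asm("repe scasw" : "+D"(p), "+c"(cnt)
        : "a"(static_cast<std::uint16_t>(0)) : "cc", "memory");
    return p;
}

template <>
inline const std::uint32_t* scan_zeros<std::uint32_t>(const std::uint32_t* p,
                                                      std::size_t cnt) {
    asm("repe scasl" : "+D"(p), "+c"(cnt)
        : "a"(static_cast<std::uint32_t>(0)) : "cc", "memory");
    return p;
}

#if defined(__x86_64__)
template <>
inline const std::uint64_t* scan_zeros<std::uint64_t>(const std::uint64_t* p,
                                                      std::size_t cnt) {
    asm("repe scasq" : "+D"(p), "+c"(cnt)
        : "a"(static_cast<std::uint64_t>(0)) : "cc", "memory");
    return p;
}
#endif
#endif

// Index of the first nonzero element, or cnt if every element is zero.
template <typename T>
std::size_t find_first_nonzero(const T* x, std::size_t cnt) {
    if (cnt == 0) return 0;                 // the scan would not run at all
    const T* end = scan_zeros<T>(x, cnt);   // one past the last element examined
    std::size_t examined = static_cast<std::size_t>(end - x);
    // The last element examined is either the nonzero one or the array's end.
    return (end[-1] != 0) ? examined - 1 : cnt;
}
```

Because the operands are tied to registers through constraints, the
compiler knows which registers the instruction reads and writes, so
nothing depends on register contents surviving across a call, and the
specialization is still chosen at compile time without any if/then/else.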


Thread overview: 6+ messages
2009-06-12 16:20 naked functions on x86 architecture Zachary Turner
2009-06-12 16:32 ` Paolo Bonzini
2009-06-12 17:25   ` Zachary Turner
2009-06-12 17:39     ` Andrew Haley
2009-06-12 17:56       ` Zachary Turner
2009-06-12 18:47         ` Andrew Haley
