On Sep 16, 2005, Alexandre Oliva wrote:

> Over the past few months, I've been working on porting to IA32 and
> AMD64/EM64T the interesting bits of the TLS design I came up with for
> FR-V, achieving some impressive speedups along with slight code size
> reductions in the most common cases.

> Although the design is not set in stone yet, it's fully implemented
> and functional with patches I'm about to post for binutils, gcc and
> glibc mainline, as follow-ups to this message, except that the GCC
> patch will go to gcc-patches, as expected.

Here's the patch for binutils.  I'm not entirely happy with two
aspects of the patch:

- the way I managed to emit the `call *(%[er]ax)' instruction from
  `call *variable@TLSCALL(%[er]ax)', dropping the offset from the
  instruction but still emitting the relocation, seems fragile to me,
  but there were no additional bits available to do something cleaner.
  Any suggestions on a better approach?

- local_tlsdesc_gotent is probably too wasteful, since very few local
  symbols are going to require TLS descriptor entries.  I hope this is
  not too much of a problem, but I could introduce another data
  structure if people feel strongly about it.

Also note the several FIXMEs marking decisions yet to be made on the
exact instructions to be generated in several cases.  I have yet to
develop some means to better evaluate the performance of each
alternative, but even then, I have limited hardware to test on.  I'd
welcome feedback from people more familiar with the performance
features of various x86-compatible processors.  Anyone?

Thanks in advance,

Here's the patch.  Built and tested on x86_64-linux-gnu and
i686-pc-linux-gnu.  Ok to install?