* Performance of ld on GFS (Global File System)
@ 2008-09-24 20:03 Harald Anlauf
2008-09-25 2:54 ` Alan Modra
0 siblings, 1 reply; 11+ messages in thread
From: Harald Anlauf @ 2008-09-24 20:03 UTC (permalink / raw)
To: binutils
Dear binutils experts,
when linking on a Linux system based on SLES 10, which
uses binutils-2.16.x, I experience the following performance
problem:
Linking the files of a project where the main object and the
(static) libraries are placed on a local disk is quite fast,
with wall clock times typically of the order of 1 second
or less, since files will usually be cached by the operating
system. When the files reside on an NFS file system, things
are a bit slower, but then waiting just a few seconds longer
is not a problem.
When the files are placed on a state-of-the-art server running
the same(*) OS and utilities, but where the home file system
is a GFS (Global File System by NEC), performance breaks
down completely. System time goes up by some two orders of
magnitude (a factor of 100, really!), and wall time may
increase even more.
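To put numbers on the system time versus wall time comparison described above, a small timing helper can be used. This is only a sketch: `time_command` is a hypothetical helper name, and the trivial child process at the end stands in for the real link command.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (wall, user, system) seconds spent on the child."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, check=True)
    wall = time.monotonic() - start
    after = os.times()
    # children_user / children_system accumulate CPU time of waited-for children.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return wall, user, system


# Stand-in command; replace the list with the real link command line.
wall, user, system = time_command([sys.executable, "-c", "pass"])
print(f"wall {wall:.2f}s  user {user:.2f}s  sys {system:.2f}s")
```

Running the same link once with all files on local disk and once on GFS gives the two sets of figures to compare.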
The main reason is most likely the blocksize used by GFS,
which is tuned for high throughput on large files (the
blocksize is typically between 4 MB and 128 MB).
Operations like open and close are probably quite expensive.
(*) Tools like cp, ln, mkdir etc. are modified to use a
larger blocksize for better performance.
Subjectively, link times appear to increase even faster than
linearly with the number of libraries on the command line.
Does anybody know whether newer binutils address this performance
problem, or does anybody have any suggestions on how to work around
this issue?
Thanks in advance for any helpful pointers!
Cheers,
Harald
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-24 20:03 Performance of ld on GFS (Global File System) Harald Anlauf
@ 2008-09-25 2:54 ` Alan Modra
2008-09-25 13:48 ` Harald Anlauf
0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2008-09-25 2:54 UTC (permalink / raw)
To: Harald Anlauf; +Cc: binutils
On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
> Operations like open and close are probably quite expensive.
In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.
--
Alan Modra
Australia Development Lab, IBM
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-25 2:54 ` Alan Modra
@ 2008-09-25 13:48 ` Harald Anlauf
2008-09-26 14:47 ` Ian Lance Taylor
2008-10-13 15:16 ` Harald Anlauf
0 siblings, 2 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-25 13:48 UTC (permalink / raw)
To: Alan Modra; +Cc: binutils
Hi Alan,
> On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
> > Operations like open and close are probably quite expensive.
>
> In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.
I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
be large enough. On GFS, system time and wall time for linking went down
by 30 to 40%, which is good, but still far from a factor of 100 or so. :-(
(The change was neutral on the system with local disk.)
(Empirically increasing BFD_CACHE_MAX_OPEN to 200 did not improve
things further.)
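A toy model may help show why raising the limit beyond the number of input files stops helping. The sketch below is not the code in bfd/cache.c; it only assumes a least-recently-used cache of open files, and the archive names and access pattern are made up for illustration.

```python
from collections import OrderedDict


def count_opens(accesses, max_open):
    """Count open() calls needed when at most max_open files stay open (LRU)."""
    open_files = OrderedDict()  # file name -> None, least recently used first
    opens = 0
    for name in accesses:
        if name in open_files:
            open_files.move_to_end(name)  # cache hit: mark as most recent
            continue
        opens += 1  # cache miss: the file has to be (re)opened
        open_files[name] = None
        if len(open_files) > max_open:
            open_files.popitem(last=False)  # close the least recently used file
    return opens


# Hypothetical workload: 50 archives, each visited in 20 successive passes.
archives = [f"lib{i}.a" for i in range(50)]
accesses = archives * 20

for limit in (10, 100, 200):
    print(limit, count_opens(accesses, limit))
# prints:
# 10 1000
# 100 50
# 200 50
```

In this model every access misses while the limit is below the number of files, and once the limit exceeds it each file is opened exactly once, so a larger limit cannot reduce the count further. The model says nothing about the cost of reads, writes or close on GFS.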
Anything else I can try? Any other option that reduces the number of
filesystem related system calls may be helpful.
Cheers,
Harald
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-25 13:48 ` Harald Anlauf
@ 2008-09-26 14:47 ` Ian Lance Taylor
2008-09-29 9:56 ` Harald Anlauf
2008-10-13 15:16 ` Harald Anlauf
1 sibling, 1 reply; 11+ messages in thread
From: Ian Lance Taylor @ 2008-09-26 14:47 UTC (permalink / raw)
To: Harald Anlauf; +Cc: Alan Modra, binutils
"Harald Anlauf" <anlauf@gmx.de> writes:
>> On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
>> > Operations like open and close are probably quite expensive.
>>
>> In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.
>
> I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
> be large enough. On GFS, system time and wall time for linking went down
> by 30 to 40%, which is good, but still far from a factor of 100 or so. :-(
> (The change was neutral on the system with local disk.)
>
> (Empirically increasing BFD_CACHE_MAX_OPEN to 200 did not improve
> things further.)
>
> Anything else I can try? Any other option that reduces the number of
> filesystem related system calls may be helpful.
The newer gold linker tries pretty hard to minimize system calls. It
does expect to be able to mmap the input files for read access.
Ian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-26 14:47 ` Ian Lance Taylor
@ 2008-09-29 9:56 ` Harald Anlauf
2008-09-29 16:49 ` Ian Lance Taylor
0 siblings, 1 reply; 11+ messages in thread
From: Harald Anlauf @ 2008-09-29 9:56 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: binutils, amodra
Hi Ian,
> The newer gold linker tries pretty hard to minimize system calls. It
> does expect to be able to mmap the input files for read access.
I downloaded and compiled the latest snapshot of binutils:
GNU gold (GNU Binutils 2.19.50.20080929) 1.7
However, linking unfortunately failed, because the (Fortran) program
I work on needs OpenMPI which uses COMMON blocks:
./gold: lib/libbasic.a(mo_mpi.o): multiple definition of mpi_fortran_in_place_
./gold: lib/librttov7.a(RTTOV7_MPI.o): previous definition here
[...more similar messages deleted...]
./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
[...more similar messages deleted...]
At least these messages were emitted quickly... ;-)
Are there any chances that COMMON blocks will be supported?
Cheers,
Harald
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-29 9:56 ` Harald Anlauf
@ 2008-09-29 16:49 ` Ian Lance Taylor
2008-09-29 21:56 ` Harald Anlauf
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Ian Lance Taylor @ 2008-09-29 16:49 UTC (permalink / raw)
To: Harald Anlauf; +Cc: binutils, amodra
"Harald Anlauf" <anlauf@gmx.de> writes:
>> The newer gold linker tries pretty hard to minimize system calls. It
>> does expect to be able to mmap the input files for read access.
>
> I downloaded and compiled the latest snapshot of binutils:
>
> GNU gold (GNU Binutils 2.19.50.20080929) 1.7
>
> However, linking unfortunately failed, because the (Fortran) program
> I work on needs OpenMPI which uses COMMON blocks:
>
> ./gold: lib/libbasic.a(mo_mpi.o): multiple definition of mpi_fortran_in_place_
> ./gold: lib/librttov7.a(RTTOV7_MPI.o): previous definition here
> [...more similar messages deleted...]
> ./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
> [...more similar messages deleted...]
>
> At least these messages were emitted quickly... ;-)
>
> Are there any chances that COMMON blocks will be supported?
I'm not aware of any bugs in common support, so this is something new.
The section index 0xff02 is in the range reserved for processor
specific codes. The code for a common symbol is 0xfff2. Can you give
more details about your platform and compiler?
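The reserved section index ranges referred to here can be written down as a small classifier. The constants are the values from the generic ELF specification; `classify_shndx` is a hypothetical helper name used only for this sketch.

```python
# Reserved section index values from the generic ELF specification.
SHN_UNDEF = 0x0000
SHN_LORESERVE = 0xFF00
SHN_LOPROC = 0xFF00
SHN_HIPROC = 0xFF1F
SHN_LOOS = 0xFF20
SHN_HIOS = 0xFF3F
SHN_ABS = 0xFFF1
SHN_COMMON = 0xFFF2
SHN_XINDEX = 0xFFFF

NAMED = {SHN_ABS: "SHN_ABS", SHN_COMMON: "SHN_COMMON", SHN_XINDEX: "SHN_XINDEX"}


def classify_shndx(shndx):
    """Describe a 16-bit st_shndx value from an ELF symbol table entry."""
    if not 0 <= shndx <= 0xFFFF:
        raise ValueError("st_shndx is a 16-bit field")
    if shndx == SHN_UNDEF:
        return "SHN_UNDEF"
    if shndx < SHN_LORESERVE:
        return "ordinary section index"
    if shndx in NAMED:
        return NAMED[shndx]
    if SHN_LOPROC <= shndx <= SHN_HIPROC:
        return "processor-specific"
    if SHN_LOOS <= shndx <= SHN_HIOS:
        return "OS-specific"
    return "reserved"


print(hex(0xFF02), classify_shndx(0xFF02))  # 0xff02 processor-specific
print(hex(0xFFF2), classify_shndx(0xFFF2))  # 0xfff2 SHN_COMMON
```

So the index in the error message is not a plain common symbol; its meaning depends on the processor supplement of the ABI.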
Ian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-29 16:49 ` Ian Lance Taylor
@ 2008-09-29 21:56 ` Harald Anlauf
2008-09-30 16:46 ` Harald Anlauf
2008-10-03 22:34 ` Cary Coutant
2 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-29 21:56 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: binutils, amodra
[-- Attachment #1: Type: text/plain, Size: 854 bytes --]
Ian Lance Taylor wrote:
> I'm not aware of any bugs in common support, so this is something new.
> The section index 0xff02 is in the range reserved for processor
> specific codes. The code for a common symbol is 0xfff2. Can you give
> more details about your platform and compiler?
The system in question runs Linux (SLES 10) on an x86_64
processor. The problem occurs when using the Sun Studio 12
Fortran compiler; I will have to check whether I can
reproduce the problem with gfortran.
I am attaching the assembler code for a minimal program
  program test
    implicit none
    include "mpif.h"
    print *, MPI_BOTTOM
  end program test
when compiled with sunf95 and gfortran-4.3. mpif.h is a
Fortran header file from OpenMPI. I shall try to find an
even simpler example.
Anyway, maybe you already get some idea from this example.
Cheers,
Harald
[-- Attachment #2: commontest.s-gfortran --]
[-- Type: text/plain, Size: 1672 bytes --]
.file "commontest.f90"
.section .rodata
.align 16
.type options.0.924, @object
.size options.0.924, 28
options.0.924:
.long 68
.long 127
.long 0
.long 0
.long 0
.long 1
.long 0
.LC0:
.string "commontest.f90"
.text
.globl MAIN__
.type MAIN__, @function
MAIN__:
.LFB2:
pushq %rbp
.LCFI0:
movq %rsp, %rbp
.LCFI1:
subq $400, %rsp
.LCFI2:
movl $options.0.924, %esi
movl $7, %edi
call _gfortran_set_options
movq $.LC0, -392(%rbp)
movl $4, -384(%rbp)
movl $128, -400(%rbp)
movl $6, -396(%rbp)
leaq -400(%rbp), %rdi
call _gfortran_st_write
leaq -400(%rbp), %rdi
movl $4, %edx
movl $mpi_fortran_bottom_, %esi
call _gfortran_transfer_integer
leaq -400(%rbp), %rdi
call _gfortran_st_write_done
leave
ret
.LFE2:
.size MAIN__, .-MAIN__
.comm mpi_fortran_argv_null_,1,16
.comm mpi_fortran_argvs_null_,8,16
.comm mpi_fortran_bottom_,4,16
.comm mpi_fortran_errcodes_ignore_,4,16
.comm mpi_fortran_in_place_,4,16
.comm mpi_fortran_status_ignore_,20,16
.comm mpi_fortran_statuses_ignore_,8,16
.section .eh_frame,"a",@progbits
.Lframe1:
.long .LECIE1-.LSCIE1
.LSCIE1:
.long 0x0
.byte 0x1
.string "zR"
.uleb128 0x1
.sleb128 -8
.byte 0x10
.uleb128 0x1
.byte 0x3
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.align 8
.LECIE1:
.LSFDE1:
.long .LEFDE1-.LASFDE1
.LASFDE1:
.long .LASFDE1-.Lframe1
.long .LFB2
.long .LFE2-.LFB2
.uleb128 0x0
.byte 0x4
.long .LCFI0-.LFB2
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI1-.LCFI0
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.ident "GCC: (GNU) 4.3.3 20080923 (prerelease) [gcc-4_3-branch revision 138185]"
.section .note.GNU-stack,"",@progbits
[-- Attachment #3: commontest.s-sunf95 --]
[-- Type: text/plain, Size: 5170 bytes --]
.section .text,"ax"
.align 4
.globl main
.type main,@function
.align 16
main:
.L_y1:
pushq %rbp
.L_y2:
movq %rsp,%rbp
.L_y3:
subq $32,%rsp
.L2:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movq %rdx, -24(%rbp)
.L3:
leaq -24(%rbp), %rdx
leaq -16(%rbp), %rsi
leaq -4(%rbp), %rdi
movl $0, %eax
call f90_init
movq -24(%rbp), %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %edi
movl $0, %eax
call __f90_init
movl $0, %eax
call MAIN_
movl $0, -28(%rbp)
.L1:
movl -28(%rbp), %eax
leave
ret
.L4:
leave
ret
.L_y0:
.size main,.-main
.align 4
.globl MAIN_
.type MAIN_,@function
.align 16
MAIN_:
.L_y5:
pushq %rbp
.L_y6:
movq %rsp,%rbp
.L_y7:
subq $32,%rsp
.L8:
.L9:
/ File commontest.f90:
/ Line 4
leaq MAIN.SRC_LOC$1, %r8
movq %r8, -24(%rbp)
movl $8, %eax
movl %eax, -32(%rbp)
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_sslw
movl mpi_fortran_bottom_, %esi
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_slw_i4
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_eslw
/ Line 5
.L5:
.L6:
.L7:
.L10:
leave
ret
.L_y4:
.size MAIN_,.-MAIN_
.section .data,"aw"
.align 16
MAIN.SRC_LOC$1:
.4byte 0x13,0x0,0x4,0x0
.quad MAIN.STR$1
.type MAIN.SRC_LOC$1,@object
.size MAIN.SRC_LOC$1,24
.comm mpi_fortran_bottom_,4,16
.align 8
__f95__happiness:
.4byte 0x6f0
.type __f95__happiness,@object
.size __f95__happiness,4
.globl __f95_real_size
.align 8
__f95_real_size:
.4byte 0x4
.type __f95_real_size,@object
.size __f95_real_size,4
.globl __f95_double_size
.align 8
__f95_double_size:
.4byte 0x8
.type __f95_double_size,@object
.size __f95_double_size,4
.globl __f95_integer_size
.align 8
__f95_integer_size:
.4byte 0x4
.type __f95_integer_size,@object
.size __f95_integer_size,4
.comm mpi_fortran_argvs_null_,8,16
.comm mpi_fortran_argv_null_,1,16
.comm mpi_fortran_errcodes_ignore_,4,16
.comm mpi_fortran_in_place_,4,16
.comm mpi_fortran_statuses_ignore_,8,16
.comm mpi_fortran_status_ignore_,20,16
.section .rodata,"a"
MAIN.STR$1:
.byte 0x63,0x6f,0x6d,0x6d,0x6f,0x6e,0x74,0x65,0x73,0x74
.byte 0x2e,0x66,0x39,0x30,0x0
.type MAIN.STR$1,@object
.size MAIN.STR$1,15
.type f90_init,@function
.type __f90_init,@function
.type __f90_sslw,@function
.type __f90_slw_i4,@function
.type __f90_eslw,@function
.section .eh_frame,"a",@progbits
.align 8
.Lframe1:
.long .LECIE1-.LBCIE1
.LBCIE1:
.long 0x0
.byte 0x1
.string ""
.uleb128 0x1
.sleb128 -8
.byte 0x10
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.byte 0x8
.byte 0x3
.byte 0x8
.byte 0x6
.byte 0x8
.byte 0xc
.byte 0x8
.byte 0xd
.byte 0x8
.byte 0xe
.byte 0x8
.byte 0xf
.align 8
.LECIE1:
.long .LEFDE1-.LBFDE1
.LBFDE1:
.long .LBFDE1-.Lframe1
.quad .L_y1
.quad .L_y0-.L_y1
.cfa_advance_loc .L_y2-.L_y1
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y3-.L_y2
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.long .LEFDE2-.LBFDE2
.LBFDE2:
.long .LBFDE2-.Lframe1
.quad .L_y5
.quad .L_y4-.L_y5
.cfa_advance_loc .L_y6-.L_y5
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y7-.L_y6
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE2:
.file "commontest.f90"
.globl __fsr_init_value
__fsr_init_value = 0x34
/ Begin sdCreateSection : .debug_loc
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_loc
/ End sdCreateSection
/ Begin sdCreateSection : .debug_info
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
/ reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/ reloc[1]: knd=2, off=281, siz=8, lab1=.debug_line, lab2=, loff=0
.section .debug_info
.byte 0xff,0xff,0xff,0xff,0x18,0x01,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00
.8byte .debug_abbrev
.byte 0x08,0x01
.ascii "commontest.f90\0"
.byte 0x08
.ascii "/e/uhome/hanlauf/f90/\0"
.ascii "/opt/sun/sunstudio12/prod/bin/f90 -S -I/e/uhome/hanlauf/opt/sunf95/openmpi-1.2/include -qoption f90comp -h.XANaCGCkLU4ImXY. commontest.f90\0"
.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCkLU4ImXY.;backend;raw;\0"
.ascii "DBG_GEN 5.2.2\0"
.byte 0x03
.8byte .debug_line
.byte 0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_line
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_line
.byte 0xff,0xff,0xff,0xff,0x42,0x00,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x38,0x00
.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
.byte 0x2f,0x00,0x00,0x63,0x6f,0x6d,0x6d,0x6f
.byte 0x6e,0x74,0x65,0x73,0x74,0x2e,0x66,0x39
.byte 0x30,0x00,0x01,0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_abbrev
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_abbrev
.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/ End sdCreateSection
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-29 16:49 ` Ian Lance Taylor
2008-09-29 21:56 ` Harald Anlauf
@ 2008-09-30 16:46 ` Harald Anlauf
2008-10-03 22:34 ` Cary Coutant
2 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-30 16:46 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: amodra, binutils
[-- Attachment #1: Type: text/plain, Size: 962 bytes --]
Ian,
I have now reduced the problem to the following. Consider the Fortran program:
  program test
    implicit none
    integer MPI_BOTTOM
    common/mpi_fortran_bottom/MPI_BOTTOM
    print *, MPI_BOTTOM
  end program test
When compiling with sunf95 and with default flags, I can successfully link with gold.
Default flags imply a 'small memory model', i.e. -xmodel=small.
Compiling with "sunf95 -xmodel=medium", I get the error.
See
http://docs.sun.com/app/docs/doc/819-5263/aevkd?a=view
for an explanation of these options.
I shall attach the different assembler files for the above main program.
The Sun compiler uses the assembler "fbe" by default. I do not know
whether (and how) it is possible to use the GNU as instead.
The differences between both assembler versions are very small.
I hope this does give more insight.
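For readers who do not want to compare the two attachments by hand, the code-relevant differences can be shown with a short diff. The two lists below are excerpts copied from the attached main.s-small and main.s-medium (the remaining differences are in the debug sections); reading the excerpts from in-memory lists rather than from the files is a simplification for this sketch.

```python
import difflib

# Excerpts from main.s-small: direct load and an ordinary common symbol.
small = [
    "movl mpi_fortran_bottom_, %esi",
    ".comm mpi_fortran_bottom_,4,16",
]

# Excerpts from main.s-medium: 64-bit absolute load and a large common symbol.
medium = [
    "movabsl mpi_fortran_bottom_, %eax",
    "movl %eax, %esi",
    ".lbcomm mpi_fortran_bottom_,4,16",
]

diff = list(
    difflib.unified_diff(small, medium, "main.s-small", "main.s-medium", lineterm="")
)
print("\n".join(diff))
```

The `.comm` versus `.lbcomm` directive is the line that changes how the symbol is recorded in the object file.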
Cheers,
Harald
[-- Attachment #2: main.s-small --]
[-- Type: application/octet-stream, Size: 4862 bytes --]
.section .text,"ax"
.align 4
.globl main
.type main,@function
.align 16
main:
.L_y1:
pushq %rbp
.L_y2:
movq %rsp,%rbp
.L_y3:
subq $32,%rsp
.L2:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movq %rdx, -24(%rbp)
.L3:
leaq -24(%rbp), %rdx
leaq -16(%rbp), %rsi
leaq -4(%rbp), %rdi
movl $0, %eax
call f90_init
movq -24(%rbp), %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %edi
movl $0, %eax
call __f90_init
movl $0, %eax
call MAIN_
movl $0, -28(%rbp)
.L1:
movl -28(%rbp), %eax
leave
ret
.L4:
leave
ret
.L_y0:
.size main,.-main
.align 4
.globl MAIN_
.type MAIN_,@function
.align 16
MAIN_:
.L_y5:
pushq %rbp
.L_y6:
movq %rsp,%rbp
.L_y7:
subq $32,%rsp
.L8:
.L9:
/ File main.f90:
/ Line 7
leaq MAIN.SRC_LOC$1, %r8
movq %r8, -24(%rbp)
movl $8, %eax
movl %eax, -32(%rbp)
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_sslw
movl mpi_fortran_bottom_, %esi
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_slw_i4
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_eslw
/ Line 8
.L5:
.L6:
.L7:
.L10:
leave
ret
.L_y4:
.size MAIN_,.-MAIN_
.section .data,"aw"
.align 16
MAIN.SRC_LOC$1:
.4byte 0x13,0x0,0x7,0x0
.quad MAIN.STR$1
.type MAIN.SRC_LOC$1,@object
.size MAIN.SRC_LOC$1,24
.comm mpi_fortran_bottom_,4,16
.align 8
__f95__happiness:
.4byte 0x6f0
.type __f95__happiness,@object
.size __f95__happiness,4
.globl __f95_real_size
.align 8
__f95_real_size:
.4byte 0x4
.type __f95_real_size,@object
.size __f95_real_size,4
.globl __f95_double_size
.align 8
__f95_double_size:
.4byte 0x8
.type __f95_double_size,@object
.size __f95_double_size,4
.globl __f95_integer_size
.align 8
__f95_integer_size:
.4byte 0x4
.type __f95_integer_size,@object
.size __f95_integer_size,4
.section .rodata,"a"
MAIN.STR$1:
.byte 0x6d,0x61,0x69,0x6e,0x2e,0x66,0x39,0x30,0x0
.type MAIN.STR$1,@object
.size MAIN.STR$1,9
.type f90_init,@function
.type __f90_init,@function
.type __f90_sslw,@function
.type __f90_slw_i4,@function
.type __f90_eslw,@function
.section .eh_frame,"a",@progbits
.align 8
.Lframe1:
.long .LECIE1-.LBCIE1
.LBCIE1:
.long 0x0
.byte 0x1
.string ""
.uleb128 0x1
.sleb128 -8
.byte 0x10
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.byte 0x8
.byte 0x3
.byte 0x8
.byte 0x6
.byte 0x8
.byte 0xc
.byte 0x8
.byte 0xd
.byte 0x8
.byte 0xe
.byte 0x8
.byte 0xf
.align 8
.LECIE1:
.long .LEFDE1-.LBFDE1
.LBFDE1:
.long .LBFDE1-.Lframe1
.quad .L_y1
.quad .L_y0-.L_y1
.cfa_advance_loc .L_y2-.L_y1
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y3-.L_y2
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.long .LEFDE2-.LBFDE2
.LBFDE2:
.long .LBFDE2-.Lframe1
.quad .L_y5
.quad .L_y4-.L_y5
.cfa_advance_loc .L_y6-.L_y5
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y7-.L_y6
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE2:
.file "main.f90"
.globl __fsr_init_value
__fsr_init_value = 0x34
/ Begin sdCreateSection : .debug_loc
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_loc
/ End sdCreateSection
/ Begin sdCreateSection : .debug_info
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
/ reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/ reloc[1]: knd=2, off=240, siz=8, lab1=.debug_line, lab2=, loff=0
.section .debug_info
.byte 0xff,0xff,0xff,0xff,0xf0,0x00,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00
.8byte .debug_abbrev
.byte 0x08,0x01
.ascii "main.f90\0"
.byte 0x08
.ascii "/e/uhome/hanlauf/f90/ldtest/\0"
.ascii "/opt/sun/sunstudio12/prod/bin/f90 -xmodel=small -S -qoption f90comp -h.XANaCGCdmj4IWUY. main.f90\0"
.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCdmj4IWUY.;backend;raw;\0"
.ascii "DBG_GEN 5.2.2\0"
.byte 0x03
.8byte .debug_line
.byte 0x00,0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_line
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_line
.byte 0xff,0xff,0xff,0xff,0x43,0x00,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x39,0x00
.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
.byte 0x2f,0x6c,0x64,0x74,0x65,0x73,0x74,0x2f
.byte 0x00,0x00,0x6d,0x61,0x69,0x6e,0x2e,0x66
.byte 0x39,0x30,0x00,0x01,0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_abbrev
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_abbrev
.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/ End sdCreateSection
[-- Attachment #3: main.s-medium --]
[-- Type: application/octet-stream, Size: 4880 bytes --]
.section .text,"ax"
.align 4
.globl main
.type main,@function
.align 16
main:
.L_y1:
pushq %rbp
.L_y2:
movq %rsp,%rbp
.L_y3:
subq $32,%rsp
.L2:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movq %rdx, -24(%rbp)
.L3:
leaq -24(%rbp), %rdx
leaq -16(%rbp), %rsi
leaq -4(%rbp), %rdi
movl $0, %eax
call f90_init
movq -24(%rbp), %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %edi
movl $0, %eax
call __f90_init
movl $0, %eax
call MAIN_
movl $0, -28(%rbp)
.L1:
movl -28(%rbp), %eax
leave
ret
.L4:
leave
ret
.L_y0:
.size main,.-main
.align 4
.globl MAIN_
.type MAIN_,@function
.align 16
MAIN_:
.L_y5:
pushq %rbp
.L_y6:
movq %rsp,%rbp
.L_y7:
subq $32,%rsp
.L8:
.L9:
/ File main.f90:
/ Line 7
leaq MAIN.SRC_LOC$1, %r8
movq %r8, -24(%rbp)
movl $8, %eax
movl %eax, -32(%rbp)
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_sslw
movabsl mpi_fortran_bottom_, %eax
movl %eax, %esi
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_slw_i4
leaq -32(%rbp), %rdi
movl $0, %eax
call __f90_eslw
/ Line 8
.L5:
.L6:
.L7:
.L10:
leave
ret
.L_y4:
.size MAIN_,.-MAIN_
.section .data,"aw"
.align 16
MAIN.SRC_LOC$1:
.4byte 0x13,0x0,0x7,0x0
.quad MAIN.STR$1
.type MAIN.SRC_LOC$1,@object
.size MAIN.SRC_LOC$1,24
.lbcomm mpi_fortran_bottom_,4,16
.align 8
__f95__happiness:
.4byte 0x6f0
.type __f95__happiness,@object
.size __f95__happiness,4
.globl __f95_real_size
.align 8
__f95_real_size:
.4byte 0x4
.type __f95_real_size,@object
.size __f95_real_size,4
.globl __f95_double_size
.align 8
__f95_double_size:
.4byte 0x8
.type __f95_double_size,@object
.size __f95_double_size,4
.globl __f95_integer_size
.align 8
__f95_integer_size:
.4byte 0x4
.type __f95_integer_size,@object
.size __f95_integer_size,4
.section .rodata,"a"
MAIN.STR$1:
.byte 0x6d,0x61,0x69,0x6e,0x2e,0x66,0x39,0x30,0x0
.type MAIN.STR$1,@object
.size MAIN.STR$1,9
.type f90_init,@function
.type __f90_init,@function
.type __f90_sslw,@function
.type __f90_slw_i4,@function
.type __f90_eslw,@function
.section .eh_frame,"a",@progbits
.align 8
.Lframe1:
.long .LECIE1-.LBCIE1
.LBCIE1:
.long 0x0
.byte 0x1
.string ""
.uleb128 0x1
.sleb128 -8
.byte 0x10
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.byte 0x8
.byte 0x3
.byte 0x8
.byte 0x6
.byte 0x8
.byte 0xc
.byte 0x8
.byte 0xd
.byte 0x8
.byte 0xe
.byte 0x8
.byte 0xf
.align 8
.LECIE1:
.long .LEFDE1-.LBFDE1
.LBFDE1:
.long .LBFDE1-.Lframe1
.quad .L_y1
.quad .L_y0-.L_y1
.cfa_advance_loc .L_y2-.L_y1
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y3-.L_y2
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.long .LEFDE2-.LBFDE2
.LBFDE2:
.long .LBFDE2-.Lframe1
.quad .L_y5
.quad .L_y4-.L_y5
.cfa_advance_loc .L_y6-.L_y5
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.cfa_advance_loc .L_y7-.L_y6
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE2:
.file "main.f90"
.globl __fsr_init_value
__fsr_init_value = 0x34
/ Begin sdCreateSection : .debug_loc
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_loc
/ End sdCreateSection
/ Begin sdCreateSection : .debug_info
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
/ reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/ reloc[1]: knd=2, off=241, siz=8, lab1=.debug_line, lab2=, loff=0
.section .debug_info
.byte 0xff,0xff,0xff,0xff,0xf0,0x00,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00
.8byte .debug_abbrev
.byte 0x08,0x01
.ascii "main.f90\0"
.byte 0x08
.ascii "/e/uhome/hanlauf/f90/ldtest/\0"
.ascii "/opt/sun/sunstudio12/prod/bin/f90 -xmodel=medium -S -qoption f90comp -h.XANaCGCemj4I2UY. main.f90\0"
.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCemj4I2UY.;backend;raw;\0"
.ascii "DBG_GEN 5.2.2\0"
.byte 0x03
.8byte .debug_line
.byte 0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_line
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_line
.byte 0xff,0xff,0xff,0xff,0x43,0x00,0x00,0x00
.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x39,0x00
.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
.byte 0x2f,0x6c,0x64,0x74,0x65,0x73,0x74,0x2f
.byte 0x00,0x00,0x6d,0x61,0x69,0x6e,0x2e,0x66
.byte 0x39,0x30,0x00,0x01,0x00,0x00,0x00
/ End sdCreateSection
/ Begin sdCreateSection : .debug_abbrev
/ Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/ Section Data Blocks:
.section .debug_abbrev
.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/ End sdCreateSection
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-29 16:49 ` Ian Lance Taylor
2008-09-29 21:56 ` Harald Anlauf
2008-09-30 16:46 ` Harald Anlauf
@ 2008-10-03 22:34 ` Cary Coutant
2008-10-06 8:02 ` Harald Anlauf
2 siblings, 1 reply; 11+ messages in thread
From: Cary Coutant @ 2008-10-03 22:34 UTC (permalink / raw)
To: Ian Lance Taylor; +Cc: Harald Anlauf, binutils, amodra
>> ./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
>> [...more similar messages deleted...]
>>
>> At least these messages were emitted quickly... ;-)
>>
>> Are there any chances that COMMON blocks will be supported?
>
> I'm not aware of any bugs in common support, so this is something new.
> The section index 0xff02 is in the range reserved for processor
> specific codes. The code for a common symbol is 0xfff2. Can you give
> more details about your platform and compiler?
It's probably this (from the Sun Linker and Libraries manual):
SHN_AMD64_LCOMMON
x64 specific common block label. This label is similar to
SHN_COMMON, but provides for identifying a large common block.
64-bit PA-RISC also has a special section index for huge common, which
tells the linker to put the common block in something other than .bss
(perhaps .lbss, but my memory is hazy). Presumably, SHN_AMD64_LCOMMON
has a similar purpose, and a simple readelf of your successful link
output might tell us what the right section name is, but I don't know
if the linker needs to take any additional special actions for symbols
defined this way.
-cary
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-10-03 22:34 ` Cary Coutant
@ 2008-10-06 8:02 ` Harald Anlauf
0 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-10-06 8:02 UTC (permalink / raw)
To: Cary Coutant, iant; +Cc: amodra, binutils
[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]
> It's probably this (from the Sun Linker and Libraries manual):
>
> SHN_AMD64_LCOMMON
>
> x64 specific common block label. This label is similar to
> SHN_COMMON, but provides for identifying a large common block.
>
> 64-bit PA-RISC also has a special section index for huge common, which
> tells the linker to put the common block in something other than .bss
> (perhaps .lbss, but my memory is hazy). Presumably, SHN_AMD64_LCOMMON
> has a similar purpose, and a simple readelf of your successful link
> output might tell us what the right section name is, but I don't know
> if the linker needs to take any additional special actions for symbols
> defined this way.
I am not familiar with using readelf, so I ran "readelf -a" on the objects
and on the a.out files for both "models", see the attached .tar file.
Looking at the differences in the readelf output for the object files,
I find in the symbol table:
19: 0000000000000010 4 OBJECT GLOBAL DEFAULT COM mpi_fortran_bottom_
vs.
19: 0000000000000010 4 OBJECT GLOBAL DEFAULT LARGE_COM mpi_fortran_bottom_
In the linked program I find a section .lbss (model=medium) where
the other has .bss (model=small).
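The two symbol table rows can also be picked apart mechanically. The sketch below assumes the eight-column layout shown above (Num, Value, Size, Type, Bind, Vis, Ndx, Name) and symbol names without spaces; `parse_symbol_line` is a hypothetical helper, and the section mapping only records what was observed in this thread.

```python
def parse_symbol_line(line):
    """Split one `readelf -s` style symbol row into a dict of its columns."""
    num, value, size, typ, bind, vis, ndx, name = line.split()
    return {
        "num": int(num.rstrip(":")),
        "value": int(value, 16),  # for common symbols this is the alignment
        "size": int(size, 0),
        "type": typ,
        "bind": bind,
        "vis": vis,
        "ndx": ndx,
        "name": name,
    }


# Output section observed in the linked programs for each kind of common symbol.
OBSERVED_SECTION = {"COM": ".bss", "LARGE_COM": ".lbss"}

rows = [
    "19: 0000000000000010 4 OBJECT GLOBAL DEFAULT COM mpi_fortran_bottom_",
    "19: 0000000000000010 4 OBJECT GLOBAL DEFAULT LARGE_COM mpi_fortran_bottom_",
]
for row in rows:
    sym = parse_symbol_line(row)
    print(sym["name"], sym["ndx"], OBSERVED_SECTION[sym["ndx"]])
```

The value 0x10 and size 4 match the `4,16` arguments of the `.comm` and `.lbcomm` directives in the assembler files.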
BTW: the linker used in the successful link step is:
GNU ld version 2.16.91.0.5 20051219 (SUSE Linux)
I also get a successful link with a self-compiled:
GNU ld (GNU Binutils) 2.18
Harald
[-- Attachment #2: suncommon.tar.gz --]
[-- Type: application/x-gzip, Size: 9649 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Performance of ld on GFS (Global File System)
2008-09-25 13:48 ` Harald Anlauf
2008-09-26 14:47 ` Ian Lance Taylor
@ 2008-10-13 15:16 ` Harald Anlauf
1 sibling, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-10-13 15:16 UTC (permalink / raw)
To: Harald Anlauf, amodra; +Cc: binutils
> I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
> be large enough. On GFS, system time and wall time for linking went down
> by 30 to 40%, which is good, but still far from a factor of 100 or so.
> :-(
> (The change was neutral on the system with local disk.)
In the meantime it was pointed out to me off-list that linking is much faster
when the resulting binary is placed on a local disk instead of on GFS.
It is not necessary to copy input files or libraries to the local filesystem.
So a workaround is to rename ld and call it from a suitable wrapper script,
which generates the binary on a temporary filesystem and copies it back.
This actually reduces system and wall clock time by a factor of about 100,
and I am almost back to normal.
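A minimal sketch of such a wrapper follows. It is not the script actually in use: the path `/usr/bin/ld.real` for the renamed linker is a placeholder, the default temporary directory is assumed to be on local disk, and only the `-o FILE`, `-oFILE`, `--output FILE` and `--output=FILE` spellings are handled.

```python
import os
import shutil
import subprocess
import tempfile

REAL_LD = "/usr/bin/ld.real"  # placeholder path of the renamed linker


def redirect_output(args, tmp_path):
    """Return (new_args, final_path) with the output file replaced by tmp_path."""
    new_args, final_path, found = [], "a.out", False  # ld defaults to a.out
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-o", "--output") and i + 1 < len(args):
            final_path, found = args[i + 1], True
            new_args += ["-o", tmp_path]
            i += 2
            continue
        if arg.startswith("--output="):
            final_path, found = arg[len("--output="):], True
            new_args += ["-o", tmp_path]
        elif arg.startswith("-o") and not arg.startswith("--") and len(arg) > 2:
            final_path, found = arg[2:], True
            new_args += ["-o", tmp_path]
        else:
            new_args.append(arg)
        i += 1
    if not found:
        new_args += ["-o", tmp_path]
    return new_args, final_path


def link_via_local_disk(args, real_ld=REAL_LD, tmp_dir=None):
    """Link into a local scratch directory, then copy the result into place."""
    with tempfile.TemporaryDirectory(dir=tmp_dir) as scratch:
        tmp_out = os.path.join(scratch, "ld-output")
        new_args, final_path = redirect_output(args, tmp_out)
        status = subprocess.run([real_ld] + new_args).returncode
        if status == 0:
            shutil.copy2(tmp_out, final_path)  # keeps the executable bit
        return status
```

Installed under the name `ld`, the script would end with `sys.exit(link_via_local_disk(sys.argv[1:]))` after importing `sys`. Input files and libraries stay where they are; only the output is written locally.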
Cheers,
Harald
^ permalink raw reply [flat|nested] 11+ messages in thread