public inbox for binutils@sourceware.org
* Performance of ld on GFS (Global File System)
@ 2008-09-24 20:03 Harald Anlauf
  2008-09-25  2:54 ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Harald Anlauf @ 2008-09-24 20:03 UTC
  To: binutils

Dear binutils experts,

When linking on a Linux system based on SLES 10, which
uses binutils-2.16.x, I experience the following performance
problem:

Linking the files of a project where the main object and the
(static) libraries are placed on a local disk is quite fast,
with wall clock times typically of the order of 1 second
or less, since files will usually be cached by the operating
system.  When the files reside on an NFS file system, things
are a bit slower, but then waiting just a few seconds longer
is not a problem.

When the files are placed on a state-of-the-art server running
the same(*) OS and utilities, but where the home file system
is GFS (Global File System by NEC), performance breaks
down completely.  System time goes up by some two orders of
magnitude (a factor of 100, really!), and wall time may
increase even more.

The main reason is most likely the block size used by GFS,
which is tuned for high throughput on large files (the block
size typically being between, say, 4 MB and 128 MB).
Operations like open and close are probably quite expensive.

(*) Tools like cp, ln, mkdir etc. are modified to use a
     larger blocksize for better performance.

Subjectively, link times appear to increase even faster than
linearly with the number of libraries on the command line.

Does anybody know whether newer binutils address this performance
problem, or does anybody have any suggestions on how to work around
this issue?

Thanks in advance for any helpful pointers!

Cheers,
Harald


* Re: Performance of ld on GFS (Global File System)
  2008-09-24 20:03 Performance of ld on GFS (Global File System) Harald Anlauf
@ 2008-09-25  2:54 ` Alan Modra
  2008-09-25 13:48   ` Harald Anlauf
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2008-09-25  2:54 UTC
  To: Harald Anlauf; +Cc: binutils

On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
> Operations like open and close are probably quite expensive.

In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.

-- 
Alan Modra
Australia Development Lab, IBM


* Re: Performance of ld on GFS (Global File System)
  2008-09-25  2:54 ` Alan Modra
@ 2008-09-25 13:48   ` Harald Anlauf
  2008-09-26 14:47     ` Ian Lance Taylor
  2008-10-13 15:16     ` Harald Anlauf
  0 siblings, 2 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-25 13:48 UTC
  To: Alan Modra; +Cc: binutils

Hi Alan,

> On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
> > Operations like open and close are probably quite expensive.
> 
> In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.

I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
be large enough.  On GFS, system time and wall time for linking went down
by 30 to 40%, which is good, but still far from a factor of 100 or so.  :-(
(The change was neutral on the system with local disk.)

(Empirically increasing BFD_CACHE_MAX_OPEN to 200 did not improve
things further.)
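
As a rough illustration of why raising a limit on simultaneously open
files helps only up to a point, here is a toy LRU model in Python.  It
is a sketch under stated assumptions, not BFD's actual cache code: the
function name count_opens, the 50 archive names, and the round-robin
access pattern are all hypothetical.

```python
from collections import OrderedDict


def count_opens(accesses, max_open):
    """Count simulated open() calls for an LRU cache holding at most max_open files."""
    cache = OrderedDict()  # keys are file names, ordered from least to most recently used
    opens = 0
    for name in accesses:
        if name in cache:
            cache.move_to_end(name)  # cache hit: the file is still open, no system call
            continue
        if len(cache) >= max_open:
            cache.popitem(last=False)  # evict the least recently used file (a simulated close)
        cache[name] = None
        opens += 1  # cache miss: the file has to be (re)opened
    return opens


# Hypothetical workload: 50 archives, each visited 4 times in round-robin order.
files = [f"lib{i}.a" for i in range(50)]
accesses = files * 4

for limit in (10, 100, 200):
    print(limit, count_opens(accesses, limit))
```

In this model, once the limit exceeds the number of distinct files, every
file is opened exactly once and a larger limit cannot reduce the count
any further.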

Anything else I can try?  Any other option that reduces the number of
filesystem-related system calls may be helpful.

Cheers,
Harald


* Re: Performance of ld on GFS (Global File System)
  2008-09-25 13:48   ` Harald Anlauf
@ 2008-09-26 14:47     ` Ian Lance Taylor
  2008-09-29  9:56       ` Harald Anlauf
  2008-10-13 15:16     ` Harald Anlauf
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Lance Taylor @ 2008-09-26 14:47 UTC
  To: Harald Anlauf; +Cc: Alan Modra, binutils

"Harald Anlauf" <anlauf@gmx.de> writes:

>> On Wed, Sep 24, 2008 at 10:02:45PM +0200, Harald Anlauf wrote:
>> > Operations like open and close are probably quite expensive.
>> 
>> In that case you might want to tweak bfd/cache.c BFD_CACHE_MAX_OPEN.
>
> I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
> be large enough.  On GFS, system time and wall time for linking went down
> between 30 to 40%, which is good, but still far from a factor 100 or so.  :-(
> (The change was neutral on the system with local disk.)
>
> (Empirically increasing BFD_CACHE_MAX_OPEN to 200 did not improve
> things further.)
>
> Anything else I can try?  Any other option that reduces the number of
> filesystem related system calls may be helpful.

The newer gold linker tries pretty hard to minimize system calls.  It
does expect to be able to mmap the input files for read access.

Ian


* Re: Performance of ld on GFS (Global File System)
  2008-09-26 14:47     ` Ian Lance Taylor
@ 2008-09-29  9:56       ` Harald Anlauf
  2008-09-29 16:49         ` Ian Lance Taylor
  0 siblings, 1 reply; 11+ messages in thread
From: Harald Anlauf @ 2008-09-29  9:56 UTC
  To: Ian Lance Taylor; +Cc: binutils, amodra

Hi Ian,

> The newer gold linker tries pretty hard to minimize system calls.  It
> does expect to be able to mmap the input files for read access.

I downloaded and compiled the latest snapshot of binutils:

GNU gold (GNU Binutils 2.19.50.20080929) 1.7

However, linking unfortunately failed, because the (Fortran) program
I work on needs OpenMPI which uses COMMON blocks:

./gold: lib/libbasic.a(mo_mpi.o): multiple definition of mpi_fortran_in_place_
./gold: lib/librttov7.a(RTTOV7_MPI.o): previous definition here
[...more similar messages deleted...]
./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
[...more similar messages deleted...]

At least these messages were emitted quickly... ;-)

Is there any chance that COMMON blocks will be supported?

Cheers,
Harald


* Re: Performance of ld on GFS (Global File System)
  2008-09-29  9:56       ` Harald Anlauf
@ 2008-09-29 16:49         ` Ian Lance Taylor
  2008-09-29 21:56           ` Harald Anlauf
                             ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ian Lance Taylor @ 2008-09-29 16:49 UTC
  To: Harald Anlauf; +Cc: binutils, amodra

"Harald Anlauf" <anlauf@gmx.de> writes:

>> The newer gold linker tries pretty hard to minimize system calls.  It
>> does expect to be able to mmap the input files for read access.
>
> I downloaded and compiled the latest snapshot of binutils:
>
> GNU gold (GNU Binutils 2.19.50.20080929) 1.7
>
> However, linking unfortunately failed, because the (Fortran) program
> I work on needs OpenMPI which uses COMMON blocks:
>
> ./gold: lib/libbasic.a(mo_mpi.o): multiple definition of mpi_fortran_in_place_
> ./gold: lib/librttov7.a(RTTOV7_MPI.o): previous definition here
> [...more similar messages deleted...]
> ./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
> [...more similar messages deleted...]
>
> At least these messages were emitted quickly... ;-)
>
> Are there any chances that COMMON blocks will be supported?

I'm not aware of any bugs in common support, so this is something new.
The section index 0xff02 is in the range reserved for processor
specific codes.  The code for a common symbol is 0xfff2.  Can you give
more details about your platform and compiler?
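
The ranges mentioned above can be sketched in Python as follows.  The
constant values are the reserved section indices from the generic ELF
specification; the helper name classify_shndx is hypothetical and is
not part of gold or BFD.

```python
# Reserved ELF section indices (values from the generic ELF specification).
SHN_UNDEF = 0x0000
SHN_LORESERVE = 0xFF00
SHN_LOPROC = 0xFF00
SHN_HIPROC = 0xFF1F
SHN_LOOS = 0xFF20
SHN_HIOS = 0xFF3F
SHN_ABS = 0xFFF1
SHN_COMMON = 0xFFF2
SHN_XINDEX = 0xFFFF


def classify_shndx(shndx):
    """Describe what a symbol's section index (st_shndx) refers to."""
    if shndx == SHN_UNDEF:
        return "undefined"
    if shndx < SHN_LORESERVE:
        return "ordinary section"
    if SHN_LOPROC <= shndx <= SHN_HIPROC:
        return "processor-specific"
    if SHN_LOOS <= shndx <= SHN_HIOS:
        return "OS-specific"
    if shndx == SHN_ABS:
        return "SHN_ABS"
    if shndx == SHN_COMMON:
        return "SHN_COMMON"
    if shndx == SHN_XINDEX:
        return "SHN_XINDEX"
    return "reserved"


print(hex(0xFF02), classify_shndx(0xFF02))
print(hex(0xFFF2), classify_shndx(0xFFF2))
```

The index from the error message falls in the processor-specific range,
so a linker that only knows the generic codes has no rule for it.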

Ian


* Re: Performance of ld on GFS (Global File System)
  2008-09-29 16:49         ` Ian Lance Taylor
@ 2008-09-29 21:56           ` Harald Anlauf
  2008-09-30 16:46           ` Harald Anlauf
  2008-10-03 22:34           ` Cary Coutant
  2 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-29 21:56 UTC
  To: Ian Lance Taylor; +Cc: binutils, amodra

[-- Attachment #1: Type: text/plain, Size: 854 bytes --]

Ian Lance Taylor wrote:
> I'm not aware of any bugs in common support, so this is something new.
> The section index 0xff02 is in the range reserved for processor
> specific codes.  The code for a common symbol is 0xfff2.  Can you give
> more details about your platform and compiler?

The system in question runs Linux (SLES 10) on an x86_64
processor.  The problem occurs when using the Sun Studio 12
Fortran compiler; I will have to check whether I can
reproduce the problem with gfortran.

I am attaching the assembler code for a minimal program

program test
   implicit none
   include "mpif.h"
   print *, MPI_BOTTOM
end program test

when compiled with sunf95 and gfortran-4.3.  mpif.h is a
Fortran header file from OpenMPI.  I shall try to find an
even simpler example.

Anyway, maybe you already get some idea from this example.

Cheers,
Harald

[-- Attachment #2: commontest.s-gfortran --]
[-- Type: text/plain, Size: 1672 bytes --]

	.file	"commontest.f90"
	.section	.rodata
	.align 16
	.type	options.0.924, @object
	.size	options.0.924, 28
options.0.924:
	.long	68
	.long	127
	.long	0
	.long	0
	.long	0
	.long	1
	.long	0
.LC0:
	.string	"commontest.f90"
	.text
.globl MAIN__
	.type	MAIN__, @function
MAIN__:
.LFB2:
	pushq	%rbp
.LCFI0:
	movq	%rsp, %rbp
.LCFI1:
	subq	$400, %rsp
.LCFI2:
	movl	$options.0.924, %esi
	movl	$7, %edi
	call	_gfortran_set_options
	movq	$.LC0, -392(%rbp)
	movl	$4, -384(%rbp)
	movl	$128, -400(%rbp)
	movl	$6, -396(%rbp)
	leaq	-400(%rbp), %rdi
	call	_gfortran_st_write
	leaq	-400(%rbp), %rdi
	movl	$4, %edx
	movl	$mpi_fortran_bottom_, %esi
	call	_gfortran_transfer_integer
	leaq	-400(%rbp), %rdi
	call	_gfortran_st_write_done
	leave
	ret
.LFE2:
	.size	MAIN__, .-MAIN__
	.comm	mpi_fortran_argv_null_,1,16
	.comm	mpi_fortran_argvs_null_,8,16
	.comm	mpi_fortran_bottom_,4,16
	.comm	mpi_fortran_errcodes_ignore_,4,16
	.comm	mpi_fortran_in_place_,4,16
	.comm	mpi_fortran_status_ignore_,20,16
	.comm	mpi_fortran_statuses_ignore_,8,16
	.section	.eh_frame,"a",@progbits
.Lframe1:
	.long	.LECIE1-.LSCIE1
.LSCIE1:
	.long	0x0
	.byte	0x1
	.string	"zR"
	.uleb128 0x1
	.sleb128 -8
	.byte	0x10
	.uleb128 0x1
	.byte	0x3
	.byte	0xc
	.uleb128 0x7
	.uleb128 0x8
	.byte	0x90
	.uleb128 0x1
	.align 8
.LECIE1:
.LSFDE1:
	.long	.LEFDE1-.LASFDE1
.LASFDE1:
	.long	.LASFDE1-.Lframe1
	.long	.LFB2
	.long	.LFE2-.LFB2
	.uleb128 0x0
	.byte	0x4
	.long	.LCFI0-.LFB2
	.byte	0xe
	.uleb128 0x10
	.byte	0x86
	.uleb128 0x2
	.byte	0x4
	.long	.LCFI1-.LCFI0
	.byte	0xd
	.uleb128 0x6
	.align 8
.LEFDE1:
	.ident	"GCC: (GNU) 4.3.3 20080923 (prerelease) [gcc-4_3-branch revision 138185]"
	.section	.note.GNU-stack,"",@progbits

[-- Attachment #3: commontest.s-sunf95 --]
[-- Type: text/plain, Size: 5170 bytes --]


	.section	.text,"ax"
	.align	4

	.globl	main
	.type	main,@function
	.align	16
main:
.L_y1:
	pushq	%rbp
.L_y2:
	movq	%rsp,%rbp
.L_y3:
	subq	$32,%rsp
.L2:
	movl	%edi, -4(%rbp)
	movq	%rsi, -16(%rbp)
	movq	%rdx, -24(%rbp)
.L3:
	leaq	-24(%rbp), %rdx
	leaq	-16(%rbp), %rsi
	leaq	-4(%rbp), %rdi
	movl	$0, %eax
	call	f90_init
	movq	-24(%rbp), %rdx
	movq	-16(%rbp), %rsi
	movl	-4(%rbp), %edi
	movl	$0, %eax
	call	__f90_init
	movl	$0, %eax
	call	MAIN_
	movl	$0, -28(%rbp)
.L1:
	movl	-28(%rbp), %eax
	leave
	ret
.L4:
	leave
	ret
.L_y0:
	.size	main,.-main
	.align	4

	.globl	MAIN_
	.type	MAIN_,@function
	.align	16
MAIN_:
.L_y5:
	pushq	%rbp
.L_y6:
	movq	%rsp,%rbp
.L_y7:
	subq	$32,%rsp
.L8:
.L9:

/ File commontest.f90:
/ Line 4
	leaq	MAIN.SRC_LOC$1, %r8
	movq	%r8, -24(%rbp)
	movl	$8, %eax
	movl	%eax, -32(%rbp)
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_sslw
	movl	mpi_fortran_bottom_, %esi
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_slw_i4
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_eslw
/ Line 5
.L5:
.L6:
.L7:
.L10:
	leave
	ret
.L_y4:
	.size	MAIN_,.-MAIN_

	.section	.data,"aw"
	.align	16
MAIN.SRC_LOC$1:
	.4byte	0x13,0x0,0x4,0x0
	.quad	MAIN.STR$1
	.type	MAIN.SRC_LOC$1,@object
	.size	MAIN.SRC_LOC$1,24
	.comm	mpi_fortran_bottom_,4,16
	.align	8
__f95__happiness:
	.4byte	0x6f0
	.type	__f95__happiness,@object
	.size	__f95__happiness,4
	.globl	__f95_real_size
	.align	8
__f95_real_size:
	.4byte	0x4
	.type	__f95_real_size,@object
	.size	__f95_real_size,4
	.globl	__f95_double_size
	.align	8
__f95_double_size:
	.4byte	0x8
	.type	__f95_double_size,@object
	.size	__f95_double_size,4
	.globl	__f95_integer_size
	.align	8
__f95_integer_size:
	.4byte	0x4
	.type	__f95_integer_size,@object
	.size	__f95_integer_size,4
	.comm	mpi_fortran_argvs_null_,8,16
	.comm	mpi_fortran_argv_null_,1,16
	.comm	mpi_fortran_errcodes_ignore_,4,16
	.comm	mpi_fortran_in_place_,4,16
	.comm	mpi_fortran_statuses_ignore_,8,16
	.comm	mpi_fortran_status_ignore_,20,16

	.section	.rodata,"a"
MAIN.STR$1:
	.byte	0x63,0x6f,0x6d,0x6d,0x6f,0x6e,0x74,0x65,0x73,0x74
	.byte	0x2e,0x66,0x39,0x30,0x0
	.type	MAIN.STR$1,@object
	.size	MAIN.STR$1,15
	.type	f90_init,@function
	.type	__f90_init,@function
	.type	__f90_sslw,@function
	.type	__f90_slw_i4,@function
	.type	__f90_eslw,@function

	.section	.eh_frame,"a",@progbits
	.align 8
.Lframe1:
	.long	.LECIE1-.LBCIE1
.LBCIE1:
	.long	0x0
	.byte	0x1
	.string	""
	.uleb128	0x1
	.sleb128	-8
	.byte	0x10
	.byte	0xc
	.uleb128	0x7
	.uleb128	0x8
	.byte	0x90
	.uleb128	0x1
	.byte	0x8
	.byte	0x3
	.byte	0x8
	.byte	0x6
	.byte	0x8
	.byte	0xc
	.byte	0x8
	.byte	0xd
	.byte	0x8
	.byte	0xe
	.byte	0x8
	.byte	0xf
	.align 8
.LECIE1:
	.long	.LEFDE1-.LBFDE1
.LBFDE1:
	.long	.LBFDE1-.Lframe1
	.quad	.L_y1
	.quad	.L_y0-.L_y1
	.cfa_advance_loc	.L_y2-.L_y1
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y3-.L_y2
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE1:
	.long	.LEFDE2-.LBFDE2
.LBFDE2:
	.long	.LBFDE2-.Lframe1
	.quad	.L_y5
	.quad	.L_y4-.L_y5
	.cfa_advance_loc	.L_y6-.L_y5
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y7-.L_y6
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE2:

	.file	"commontest.f90"

	.globl	__fsr_init_value
__fsr_init_value = 0x34
/  Begin sdCreateSection : .debug_loc
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_loc
/  End sdCreateSection
/  Begin sdCreateSection : .debug_info
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
/   reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/   reloc[1]: knd=2, off=281, siz=8, lab1=.debug_line, lab2=, loff=0
	.section .debug_info
	.byte 0xff,0xff,0xff,0xff,0x18,0x01,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00
	.8byte .debug_abbrev
	.byte 0x08,0x01
	.ascii "commontest.f90\0"
	.byte 0x08
	.ascii "/e/uhome/hanlauf/f90/\0"
	.ascii "/opt/sun/sunstudio12/prod/bin/f90 -S -I/e/uhome/hanlauf/opt/sunf95/openmpi-1.2/include -qoption f90comp -h.XANaCGCkLU4ImXY. commontest.f90\0"
	.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCkLU4ImXY.;backend;raw;\0"
	.ascii "DBG_GEN 5.2.2\0"
	.byte 0x03
	.8byte .debug_line
	.byte 0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_line
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_line
	.byte 0xff,0xff,0xff,0xff,0x42,0x00,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x38,0x00
	.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
	.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
	.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
	.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
	.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
	.byte 0x2f,0x00,0x00,0x63,0x6f,0x6d,0x6d,0x6f
	.byte 0x6e,0x74,0x65,0x73,0x74,0x2e,0x66,0x39
	.byte 0x30,0x00,0x01,0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_abbrev
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_abbrev
	.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
	.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
	.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/  End sdCreateSection


* Re: Performance of ld on GFS (Global File System)
  2008-09-29 16:49         ` Ian Lance Taylor
  2008-09-29 21:56           ` Harald Anlauf
@ 2008-09-30 16:46           ` Harald Anlauf
  2008-10-03 22:34           ` Cary Coutant
  2 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-09-30 16:46 UTC
  To: Ian Lance Taylor; +Cc: amodra, binutils

[-- Attachment #1: Type: text/plain, Size: 962 bytes --]

Ian,

I have now reduced the problem to the following.  Consider the Fortran program:

program test
  implicit none

  integer MPI_BOTTOM
  common/mpi_fortran_bottom/MPI_BOTTOM

  print *, MPI_BOTTOM
end program test

When compiling with sunf95 and with default flags, I can successfully link with gold.
Default flags imply a 'small memory model', i.e. -xmodel=small.
Compiling with "sunf95 -xmodel=medium", I get the error.

See
http://docs.sun.com/app/docs/doc/819-5263/aevkd?a=view
for an explanation of these options.

I shall attach the different assembler files for the above main program.
The Sun compiler uses the assembler "fbe" by default.  I do not know
whether (and how) it is possible to use the GNU as instead.

The differences between both assembler versions are very small.

I hope this gives more insight.

Cheers,
Harald


[-- Attachment #2: main.s-small --]
[-- Type: application/octet-stream, Size: 4862 bytes --]


	.section	.text,"ax"
	.align	4

	.globl	main
	.type	main,@function
	.align	16
main:
.L_y1:
	pushq	%rbp
.L_y2:
	movq	%rsp,%rbp
.L_y3:
	subq	$32,%rsp
.L2:
	movl	%edi, -4(%rbp)
	movq	%rsi, -16(%rbp)
	movq	%rdx, -24(%rbp)
.L3:
	leaq	-24(%rbp), %rdx
	leaq	-16(%rbp), %rsi
	leaq	-4(%rbp), %rdi
	movl	$0, %eax
	call	f90_init
	movq	-24(%rbp), %rdx
	movq	-16(%rbp), %rsi
	movl	-4(%rbp), %edi
	movl	$0, %eax
	call	__f90_init
	movl	$0, %eax
	call	MAIN_
	movl	$0, -28(%rbp)
.L1:
	movl	-28(%rbp), %eax
	leave
	ret
.L4:
	leave
	ret
.L_y0:
	.size	main,.-main
	.align	4

	.globl	MAIN_
	.type	MAIN_,@function
	.align	16
MAIN_:
.L_y5:
	pushq	%rbp
.L_y6:
	movq	%rsp,%rbp
.L_y7:
	subq	$32,%rsp
.L8:
.L9:

/ File main.f90:
/ Line 7
	leaq	MAIN.SRC_LOC$1, %r8
	movq	%r8, -24(%rbp)
	movl	$8, %eax
	movl	%eax, -32(%rbp)
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_sslw
	movl	mpi_fortran_bottom_, %esi
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_slw_i4
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_eslw
/ Line 8
.L5:
.L6:
.L7:
.L10:
	leave
	ret
.L_y4:
	.size	MAIN_,.-MAIN_

	.section	.data,"aw"
	.align	16
MAIN.SRC_LOC$1:
	.4byte	0x13,0x0,0x7,0x0
	.quad	MAIN.STR$1
	.type	MAIN.SRC_LOC$1,@object
	.size	MAIN.SRC_LOC$1,24
	.comm	mpi_fortran_bottom_,4,16
	.align	8
__f95__happiness:
	.4byte	0x6f0
	.type	__f95__happiness,@object
	.size	__f95__happiness,4
	.globl	__f95_real_size
	.align	8
__f95_real_size:
	.4byte	0x4
	.type	__f95_real_size,@object
	.size	__f95_real_size,4
	.globl	__f95_double_size
	.align	8
__f95_double_size:
	.4byte	0x8
	.type	__f95_double_size,@object
	.size	__f95_double_size,4
	.globl	__f95_integer_size
	.align	8
__f95_integer_size:
	.4byte	0x4
	.type	__f95_integer_size,@object
	.size	__f95_integer_size,4

	.section	.rodata,"a"
MAIN.STR$1:
	.byte	0x6d,0x61,0x69,0x6e,0x2e,0x66,0x39,0x30,0x0
	.type	MAIN.STR$1,@object
	.size	MAIN.STR$1,9
	.type	f90_init,@function
	.type	__f90_init,@function
	.type	__f90_sslw,@function
	.type	__f90_slw_i4,@function
	.type	__f90_eslw,@function

	.section	.eh_frame,"a",@progbits
	.align 8
.Lframe1:
	.long	.LECIE1-.LBCIE1
.LBCIE1:
	.long	0x0
	.byte	0x1
	.string	""
	.uleb128	0x1
	.sleb128	-8
	.byte	0x10
	.byte	0xc
	.uleb128	0x7
	.uleb128	0x8
	.byte	0x90
	.uleb128	0x1
	.byte	0x8
	.byte	0x3
	.byte	0x8
	.byte	0x6
	.byte	0x8
	.byte	0xc
	.byte	0x8
	.byte	0xd
	.byte	0x8
	.byte	0xe
	.byte	0x8
	.byte	0xf
	.align 8
.LECIE1:
	.long	.LEFDE1-.LBFDE1
.LBFDE1:
	.long	.LBFDE1-.Lframe1
	.quad	.L_y1
	.quad	.L_y0-.L_y1
	.cfa_advance_loc	.L_y2-.L_y1
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y3-.L_y2
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE1:
	.long	.LEFDE2-.LBFDE2
.LBFDE2:
	.long	.LBFDE2-.Lframe1
	.quad	.L_y5
	.quad	.L_y4-.L_y5
	.cfa_advance_loc	.L_y6-.L_y5
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y7-.L_y6
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE2:

	.file	"main.f90"

	.globl	__fsr_init_value
__fsr_init_value = 0x34
/  Begin sdCreateSection : .debug_loc
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_loc
/  End sdCreateSection
/  Begin sdCreateSection : .debug_info
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
/   reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/   reloc[1]: knd=2, off=240, siz=8, lab1=.debug_line, lab2=, loff=0
	.section .debug_info
	.byte 0xff,0xff,0xff,0xff,0xf0,0x00,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00
	.8byte .debug_abbrev
	.byte 0x08,0x01
	.ascii "main.f90\0"
	.byte 0x08
	.ascii "/e/uhome/hanlauf/f90/ldtest/\0"
	.ascii "/opt/sun/sunstudio12/prod/bin/f90 -xmodel=small -S -qoption f90comp -h.XANaCGCdmj4IWUY. main.f90\0"
	.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCdmj4IWUY.;backend;raw;\0"
	.ascii "DBG_GEN 5.2.2\0"
	.byte 0x03
	.8byte .debug_line
	.byte 0x00,0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_line
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_line
	.byte 0xff,0xff,0xff,0xff,0x43,0x00,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x39,0x00
	.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
	.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
	.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
	.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
	.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
	.byte 0x2f,0x6c,0x64,0x74,0x65,0x73,0x74,0x2f
	.byte 0x00,0x00,0x6d,0x61,0x69,0x6e,0x2e,0x66
	.byte 0x39,0x30,0x00,0x01,0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_abbrev
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_abbrev
	.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
	.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
	.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/  End sdCreateSection

[-- Attachment #3: main.s-medium --]
[-- Type: application/octet-stream, Size: 4880 bytes --]


	.section	.text,"ax"
	.align	4

	.globl	main
	.type	main,@function
	.align	16
main:
.L_y1:
	pushq	%rbp
.L_y2:
	movq	%rsp,%rbp
.L_y3:
	subq	$32,%rsp
.L2:
	movl	%edi, -4(%rbp)
	movq	%rsi, -16(%rbp)
	movq	%rdx, -24(%rbp)
.L3:
	leaq	-24(%rbp), %rdx
	leaq	-16(%rbp), %rsi
	leaq	-4(%rbp), %rdi
	movl	$0, %eax
	call	f90_init
	movq	-24(%rbp), %rdx
	movq	-16(%rbp), %rsi
	movl	-4(%rbp), %edi
	movl	$0, %eax
	call	__f90_init
	movl	$0, %eax
	call	MAIN_
	movl	$0, -28(%rbp)
.L1:
	movl	-28(%rbp), %eax
	leave
	ret
.L4:
	leave
	ret
.L_y0:
	.size	main,.-main
	.align	4

	.globl	MAIN_
	.type	MAIN_,@function
	.align	16
MAIN_:
.L_y5:
	pushq	%rbp
.L_y6:
	movq	%rsp,%rbp
.L_y7:
	subq	$32,%rsp
.L8:
.L9:

/ File main.f90:
/ Line 7
	leaq	MAIN.SRC_LOC$1, %r8
	movq	%r8, -24(%rbp)
	movl	$8, %eax
	movl	%eax, -32(%rbp)
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_sslw
	movabsl	mpi_fortran_bottom_, %eax
	movl	%eax, %esi
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_slw_i4
	leaq	-32(%rbp), %rdi
	movl	$0, %eax
	call	__f90_eslw
/ Line 8
.L5:
.L6:
.L7:
.L10:
	leave
	ret
.L_y4:
	.size	MAIN_,.-MAIN_

	.section	.data,"aw"
	.align	16
MAIN.SRC_LOC$1:
	.4byte	0x13,0x0,0x7,0x0
	.quad	MAIN.STR$1
	.type	MAIN.SRC_LOC$1,@object
	.size	MAIN.SRC_LOC$1,24
	.lbcomm	mpi_fortran_bottom_,4,16
	.align	8
__f95__happiness:
	.4byte	0x6f0
	.type	__f95__happiness,@object
	.size	__f95__happiness,4
	.globl	__f95_real_size
	.align	8
__f95_real_size:
	.4byte	0x4
	.type	__f95_real_size,@object
	.size	__f95_real_size,4
	.globl	__f95_double_size
	.align	8
__f95_double_size:
	.4byte	0x8
	.type	__f95_double_size,@object
	.size	__f95_double_size,4
	.globl	__f95_integer_size
	.align	8
__f95_integer_size:
	.4byte	0x4
	.type	__f95_integer_size,@object
	.size	__f95_integer_size,4

	.section	.rodata,"a"
MAIN.STR$1:
	.byte	0x6d,0x61,0x69,0x6e,0x2e,0x66,0x39,0x30,0x0
	.type	MAIN.STR$1,@object
	.size	MAIN.STR$1,9
	.type	f90_init,@function
	.type	__f90_init,@function
	.type	__f90_sslw,@function
	.type	__f90_slw_i4,@function
	.type	__f90_eslw,@function

	.section	.eh_frame,"a",@progbits
	.align 8
.Lframe1:
	.long	.LECIE1-.LBCIE1
.LBCIE1:
	.long	0x0
	.byte	0x1
	.string	""
	.uleb128	0x1
	.sleb128	-8
	.byte	0x10
	.byte	0xc
	.uleb128	0x7
	.uleb128	0x8
	.byte	0x90
	.uleb128	0x1
	.byte	0x8
	.byte	0x3
	.byte	0x8
	.byte	0x6
	.byte	0x8
	.byte	0xc
	.byte	0x8
	.byte	0xd
	.byte	0x8
	.byte	0xe
	.byte	0x8
	.byte	0xf
	.align 8
.LECIE1:
	.long	.LEFDE1-.LBFDE1
.LBFDE1:
	.long	.LBFDE1-.Lframe1
	.quad	.L_y1
	.quad	.L_y0-.L_y1
	.cfa_advance_loc	.L_y2-.L_y1
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y3-.L_y2
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE1:
	.long	.LEFDE2-.LBFDE2
.LBFDE2:
	.long	.LBFDE2-.Lframe1
	.quad	.L_y5
	.quad	.L_y4-.L_y5
	.cfa_advance_loc	.L_y6-.L_y5
	.byte	0xe
	.uleb128	0x10
	.byte	0x86
	.uleb128	0x2
	.cfa_advance_loc	.L_y7-.L_y6
	.byte	0xd
	.uleb128	0x6
	.align	8
.LEFDE2:

	.file	"main.f90"

	.globl	__fsr_init_value
__fsr_init_value = 0x34
/  Begin sdCreateSection : .debug_loc
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_loc
/  End sdCreateSection
/  Begin sdCreateSection : .debug_info
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
/   reloc[0]: knd=2, off=14, siz=8, lab1=.debug_abbrev, lab2=, loff=0
/   reloc[1]: knd=2, off=241, siz=8, lab1=.debug_line, lab2=, loff=0
	.section .debug_info
	.byte 0xff,0xff,0xff,0xff,0xf0,0x00,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00
	.8byte .debug_abbrev
	.byte 0x08,0x01
	.ascii "main.f90\0"
	.byte 0x08
	.ascii "/e/uhome/hanlauf/f90/ldtest/\0"
	.ascii "/opt/sun/sunstudio12/prod/bin/f90 -xmodel=medium -S -qoption f90comp -h.XANaCGCemj4I2UY. main.f90\0"
	.ascii "R=Sun Fortran 95 8.3 Linux_i386;G=.XANaCGCemj4I2UY.;backend;raw;\0"
	.ascii "DBG_GEN 5.2.2\0"
	.byte 0x03
	.8byte .debug_line
	.byte 0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_line
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_line
	.byte 0xff,0xff,0xff,0xff,0x43,0x00,0x00,0x00
	.byte 0x00,0x00,0x00,0x00,0x02,0x00,0x39,0x00
	.byte 0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00
	.byte 0xff,0x04,0x0a,0x00,0x01,0x01,0x01,0x01
	.byte 0x00,0x00,0x00,0x01,0x2f,0x65,0x2f,0x75
	.byte 0x68,0x6f,0x6d,0x65,0x2f,0x68,0x61,0x6e
	.byte 0x6c,0x61,0x75,0x66,0x2f,0x66,0x39,0x30
	.byte 0x2f,0x6c,0x64,0x74,0x65,0x73,0x74,0x2f
	.byte 0x00,0x00,0x6d,0x61,0x69,0x6e,0x2e,0x66
	.byte 0x39,0x30,0x00,0x01,0x00,0x00,0x00
/  End sdCreateSection
/  Begin sdCreateSection : .debug_abbrev
/  Section Info: link_name/strtab=, entsize=0x1, adralign=0x1, flags=0x0
/  Section Data Blocks:
	.section .debug_abbrev
	.byte 0x01,0x11,0x00,0x03,0x08,0x13,0x0b,0x1b
	.byte 0x08,0x85,0x44,0x08,0x87,0x44,0x08,0x25
	.byte 0x08,0x42,0x0b,0x10,0x07,0x00,0x00,0x00
/  End sdCreateSection


* Re: Performance of ld on GFS (Global File System)
  2008-09-29 16:49         ` Ian Lance Taylor
  2008-09-29 21:56           ` Harald Anlauf
  2008-09-30 16:46           ` Harald Anlauf
@ 2008-10-03 22:34           ` Cary Coutant
  2008-10-06  8:02             ` Harald Anlauf
  2 siblings, 1 reply; 11+ messages in thread
From: Cary Coutant @ 2008-10-03 22:34 UTC
  To: Ian Lance Taylor; +Cc: Harald Anlauf, binutils, amodra

>> ./gold: mpi_fortran_in_place_: unsupported symbol section 0xff02
>> [...more similar messages deleted...]
>>
>> At least these messages were emitted quickly... ;-)
>>
>> Are there any chances that COMMON blocks will be supported?
>
> I'm not aware of any bugs in common support, so this is something new.
> The section index 0xff02 is in the range reserved for processor
> specific codes.  The code for a common symbol is 0xfff2.  Can you give
> more details about your platform and compiler?

It's probably this (from the Sun Linker and Libraries manual):

SHN_AMD64_LCOMMON

    x64 specific common block label. This label is similar to
SHN_COMMON, but provides for identifying a large common block.

64-bit PA-RISC also has a special section index for huge common, which
tells the linker to put the common block in something other than .bss
(perhaps .lbss, but my memory is hazy). Presumably, SHN_AMD64_LCOMMON
has a similar purpose, and a simple readelf of your successful link
output might tell us what the right section name is, but I don't know
if the linker needs to take any additional special actions for symbols
defined this way.

-cary


* Re: Performance of ld on GFS (Global File System)
  2008-10-03 22:34           ` Cary Coutant
@ 2008-10-06  8:02             ` Harald Anlauf
  0 siblings, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-10-06  8:02 UTC
  To: Cary Coutant, iant; +Cc: amodra, binutils

[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]

> It's probably this (from the Sun Linker and Libraries manual):
> 
> SHN_AMD64_LCOMMON
> 
>     x64 specific common block label. This label is similar to
> SHN_COMMON, but provides for identifying a large common block.
> 
> 64-bit PA-RISC also has a special section index for huge common, which
> tells the linker to put the common block in something other than .bss
> (perhaps .lbss, but my memory is hazy). Presumably, SHN_AMD64_LCOMMON
> has a similar purpose, and a simple readelf of your successful link
> output might tell us what the right section name is, but I don't know
> if the linker needs to take any additional special actions for symbols
> defined this way.

I am not familiar with using readelf, so I ran "readelf -a" on the objects
and on the a.out files for both "models", see the attached .tar file.

Looking at the differences in the readelf output for the object files,
I find in the symbol table:
19: 0000000000000010     4 OBJECT  GLOBAL DEFAULT  COM mpi_fortran_bottom_
vs.
19: 0000000000000010     4 OBJECT  GLOBAL DEFAULT LARGE_COM mpi_fortran_bottom_

In the linked program I find a section .lbss (model=medium) where
the other has .bss (model=small).
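
For anyone who wants to scan larger symbol tables for this difference,
a minimal Python sketch follows.  It assumes symbol rows in the column
layout of the two lines quoted above (Num, Value, Size, Type, Bind,
Vis, Ndx, Name); the helper names are hypothetical.

```python
def parse_symbol_line(line):
    """Split one readelf symbol-table row into (ndx, name)."""
    fields = line.split()
    # Columns: Num, Value, Size, Type, Bind, Vis, Ndx, Name
    return fields[6], fields[7]


def large_common_symbols(lines):
    """Return the names of symbols whose Ndx column reads LARGE_COM."""
    return [name for ndx, name in map(parse_symbol_line, lines) if ndx == "LARGE_COM"]


# The two rows quoted above (small model first, then medium model).
small = "19: 0000000000000010     4 OBJECT  GLOBAL DEFAULT  COM mpi_fortran_bottom_"
medium = "19: 0000000000000010     4 OBJECT  GLOBAL DEFAULT LARGE_COM mpi_fortran_bottom_"

print(large_common_symbols([small]))
print(large_common_symbols([medium]))
```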

BTW: the linker used in the successful link step is:
GNU ld version 2.16.91.0.5 20051219 (SUSE Linux)
I also get a successful link with a self-compiled:
GNU ld (GNU Binutils) 2.18

Harald


[-- Attachment #2: suncommon.tar.gz --]
[-- Type: application/x-gzip, Size: 9649 bytes --]


* Re: Performance of ld on GFS (Global File System)
  2008-09-25 13:48   ` Harald Anlauf
  2008-09-26 14:47     ` Ian Lance Taylor
@ 2008-10-13 15:16     ` Harald Anlauf
  1 sibling, 0 replies; 11+ messages in thread
From: Harald Anlauf @ 2008-10-13 15:16 UTC
  To: Harald Anlauf, amodra; +Cc: binutils

> I increased BFD_CACHE_MAX_OPEN from 10 to 100, which I presumed to
> be large enough.  On GFS, system time and wall time for linking went down
> between 30 to 40%, which is good, but still far from a factor 100 or so. 
> :-(
> (The change was neutral on the system with local disk.)

In the meantime it was pointed out to me off-list that linking is much faster
when the resulting binary is placed on a local disk instead of on GFS.
It is not necessary to copy input files or libraries to the local filesystem.

So a workaround is to rename ld and call it via a suitable wrapper script,
which generates the binary on a temporary filesystem and copies it back.

This actually reduces system and wall clock time by a factor of about 100,
and I am almost back to normal.
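
A minimal sketch of such a wrapper, written in Python, might look like
this.  It is not the script actually used: the path in REAL_LD and the
function names are hypothetical, and only the -o FILE, -oFILE,
--output FILE and --output=FILE spellings of the output option are
handled.

```python
import os
import shutil
import subprocess
import tempfile

REAL_LD = "/usr/bin/ld.real"  # hypothetical location of the renamed linker


def redirect_output(args, tmp_out):
    """Return (new_args, final_out): the arguments with the output redirected to tmp_out."""
    new_args = []
    final_out = "a.out"  # ld's default output name when no -o is given
    it = iter(args)
    for arg in it:
        if arg in ("-o", "--output"):
            final_out = next(it, final_out)  # the output name is the next argument
        elif arg.startswith("--output="):
            final_out = arg[len("--output="):]
        elif arg.startswith("-o"):
            final_out = arg[2:]  # joined form, e.g. -oprog
        else:
            new_args.append(arg)
    return new_args + ["-o", tmp_out], final_out


def main(argv):
    """Link into a local temporary directory, then copy the result to its real destination."""
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_out = os.path.join(tmpdir, "ld-output")
        args, final_out = redirect_output(argv, tmp_out)
        status = subprocess.call([REAL_LD] + args)
        if status == 0:
            shutil.copy(tmp_out, final_out)  # copies file contents and permission bits
        return status


# Demonstration of the argument rewriting only; the linker itself is not run here.
print(redirect_output(["main.o", "-o", "/gfs/home/prog", "-lm"], "/tmp/ld-output"))
```

When installed under the name ld, the script would end with
sys.exit(main(sys.argv[1:])) (plus an import of sys) so that the
linker's exit status is passed on to the caller.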

Cheers,
Harald
