public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* Mmap and performance of gold
@ 2009-08-20 18:18 Mikolaj Zalewski
  2009-08-21 10:41 ` Ian Lance Taylor
  0 siblings, 1 reply; 2+ messages in thread
From: Mikolaj Zalewski @ 2009-08-20 18:18 UTC (permalink / raw)
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 1404 bytes --]

  Hi,
  I have seen a lot of CPU time spent inside mmap on my workstation.
To avoid blowing up the memory in 32-bit mode, gold tries to mmap only
the parts of the files that are needed, what generates lots of calls
to mmap small areas. Today, I've made a test by writing a quick patch
mmapping the whole file and not caring about alignment (it doesn't
make a difference on x86_64?) and made some measurements. Mmapping
once seems to speed up linking a 100MB, 64-bit C++ binary with gold
(values varied by some +/-0.1s between runs. I've taken the median of
three runs):

64-bit build -O2:
Mmapping all files:
real	0m3.640s
user	0m2.884s
sys	0m0.716s

Mmaping selected regions:
real	0m5.318s
user	0m2.984s
sys	0m2.312s

32-bit build -O2:
Mmapping all files:
real	0m3.787s
user	0m3.120s
sys	0m0.608s

Mmaping selected regions:
real	0m5.422s
user	0m3.124s
sys	0m2.240s

Does the attached patch make a similar difference on other systems, or
is there something strange with my kernel? Maybe we should mmap whole
files on 64-bit systems or add a command line options with a default
depending on whether the system is 64-bit. My current patch works on
architectures that don't care about alignment, but this could be
improved (the benchmark was done on a build using thin archives, so
probably the alignment change didn't affect the results in either
way).

Mikołaj

[-- Attachment #2: full-mmap.patch --]
[-- Type: text/x-patch, Size: 4747 bytes --]

diff --git a/gold/fileread.cc b/gold/fileread.cc
index bb10aa9..2c6cabd 100644
--- a/gold/fileread.cc
+++ b/gold/fileread.cc
@@ -50,13 +50,13 @@ File_read::View::~View()
   gold_assert(!this->is_locked());
   if (!this->mapped_)
     delete[] this->data_;
-  else
+/*  else
     {
       if (::munmap(const_cast<unsigned char*>(this->data_), this->size_) != 0)
         gold_warning(_("munmap failed: %s"), strerror(errno));
 
       File_read::current_mapped_bytes -= this->size_;
-    }
+    }*/
 }
 
 void
@@ -88,6 +88,11 @@ unsigned long long File_read::maximum_mapped_bytes;
 File_read::~File_read()
 {
   gold_assert(this->token_.is_writable());
+  if (contents_mmapped_) {
+    if (::munmap(const_cast<unsigned char*>(this->contents_), this->size_) != 0)
+      gold_warning(_("munmap failed: %s"), strerror(errno));
+    File_read::current_mapped_bytes -= this->size_;
+  }
   if (this->is_descriptor_opened_)
     {
       release_descriptor(this->descriptor_, true);
@@ -122,6 +127,15 @@ File_read::open(const Task* task, const std::string& name)
       this->size_ = s.st_size;
       gold_debug(DEBUG_FILES, "Attempt to open %s succeeded",
                  this->name_.c_str());
+      this->contents_ = static_cast<const unsigned char*>(
+          ::mmap(NULL, this->size_, PROT_READ, MAP_PRIVATE,
+                 this->descriptor_, 0));
+      if (this->contents_ == MAP_FAILED)
+        gold_fatal(_("%s: mmap failed: %s"),
+                   this->filename().c_str(),
+                   strerror(errno));
+      this->contents_mmapped_ = true;
+      this->mapped_bytes_ += this->size_;
 
       this->token_.add_writer(task);
     }
@@ -231,13 +245,13 @@ File_read::is_locked() const
 // not for the requested BYTESHIFT.
 
 inline File_read::View*
-File_read::find_view(off_t start, section_size_type size,
-		     unsigned int byteshift, File_read::View** vshifted) const
+File_read::find_view(off_t /*start*/, section_size_type /*size*/,
+		     unsigned int /*byteshift*/, File_read::View** vshifted) const
 {
   if (vshifted != NULL)
     *vshifted = NULL;
 
-  off_t page = File_read::page_offset(start);
+/*  off_t page = File_read::page_offset(start);
 
   unsigned int bszero = 0;
   Views::const_iterator p = this->views_.upper_bound(std::make_pair(page - 1,
@@ -260,7 +274,7 @@ File_read::find_view(off_t start, section_size_type size,
 	}
 
       ++p;
-    }
+    }*/
 
   return NULL;
 }
@@ -352,7 +366,7 @@ File_read::add_view(File_read::View* v)
 
 File_read::View*
 File_read::make_view(off_t start, section_size_type size,
-		     unsigned int byteshift, bool cache)
+		     unsigned int /*byteshift*/, bool cache)
 {
   gold_assert(size > 0);
 
@@ -366,7 +380,7 @@ File_read::make_view(off_t start, section_size_type size,
 		   static_cast<long long>(size),
 		   static_cast<long long>(start));
 
-  off_t poff = File_read::page_offset(start);
+/*  off_t poff = File_read::page_offset(start);
 
   section_size_type psize = File_read::pages(size + (start - poff));
 
@@ -404,7 +418,10 @@ File_read::make_view(off_t start, section_size_type size,
 
   this->add_view(v);
 
-  return v;
+  return v;*/
+  gold_assert(contents_ != NULL);
+  return new File_read::View(0, this->size_, this->contents_, 0,
+                             cache, true);
 }
 
 // Find a View or make a new one, shifted as required by the file
@@ -776,7 +793,7 @@ File_read::get_mtime()
 {
   struct stat file_stat;
   this->reopen_descriptor();
-  
+
   if (fstat(this->descriptor_, &file_stat) < 0)
     gold_fatal(_("%s: stat failed: %s"), this->name_.c_str(),
 	       strerror(errno));
diff --git a/gold/fileread.h b/gold/fileread.h
index 920a4da..e63ebb1 100644
--- a/gold/fileread.h
+++ b/gold/fileread.h
@@ -66,7 +66,7 @@ class File_read
   File_read()
     : name_(), descriptor_(-1), is_descriptor_opened_(false), object_count_(0),
       size_(0), token_(false), views_(), saved_views_(), contents_(NULL),
-      mapped_bytes_(0), released_(true)
+      contents_mmapped_(false), mapped_bytes_(0), released_(true)
   { }
 
   ~File_read();
@@ -207,7 +207,7 @@ class File_read
     this->reopen_descriptor();
     return this->descriptor_;
   }
-  
+
   // Return the file last modification time.  Calls gold_fatal if the stat
   // system call failed.
   Timespec
@@ -396,6 +396,8 @@ class File_read
   Saved_views saved_views_;
   // Specified file contents.  Used only for testing purposes.
   const unsigned char* contents_;
+  // Whether the CONTENTS_ is mapped and should be unmapped.
+  bool contents_mmapped_;
   // Total amount of space mapped into memory.  This is only changed
   // while the file is locked.  When we unlock the file, we transfer
   // the total to total_mapped_bytes, and reset this to zero.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Mmap and performance of gold
  2009-08-20 18:18 Mmap and performance of gold Mikolaj Zalewski
@ 2009-08-21 10:41 ` Ian Lance Taylor
  0 siblings, 0 replies; 2+ messages in thread
From: Ian Lance Taylor @ 2009-08-21 10:41 UTC (permalink / raw)
  To: Mikolaj Zalewski; +Cc: binutils

Mikolaj Zalewski <mikolajz@google.com> writes:

> 64-bit build -O2:
> Mmapping all files:
> real	0m3.640s
> user	0m2.884s
> sys	0m0.716s
>
> Mmaping selected regions:
> real	0m5.318s
> user	0m2.984s
> sys	0m2.312s
>
> 32-bit build -O2:
> Mmapping all files:
> real	0m3.787s
> user	0m3.120s
> sys	0m0.608s
>
> Mmaping selected regions:
> real	0m5.422s
> user	0m3.124s
> sys	0m2.240s

Those results are certainly compelling.  Thanks for trying this.

As you know, we can't mmap entire files on a 32-bit host: in a large
link, we will run out of address space.  However, we clearly should mmap
entire files on a 64-bit host.  Your suggestion of a command line option
with different defaults makes sense.

We do have to worry about alignment, though, so that archives work
correctly on processors which do not permit misaligned accesses.  Even
on x86 it will make some difference, as misaligned accesses are slower.
This just means that in some cases we will have to memcpy the data from
the mapped memory into an aligned buffer.

Ian

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2009-08-21  3:49 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-20 18:18 Mmap and performance of gold Mikolaj Zalewski
2009-08-21 10:41 ` Ian Lance Taylor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).