public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Fangrui Song <maskray@google.com>
Cc: Florian Weimer <fweimer@redhat.com>,
	GNU C Library <libc-alpha@sourceware.org>,
	 Binutils <binutils@sourceware.org>
Subject: Re: PT_GNU_RELRO is somewhat broken
Date: Wed, 11 May 2022 12:27:04 -0700	[thread overview]
Message-ID: <CAMe9rOrB2w45O8JC20WWDDeuXjkZRcoDpeinjBDQG1myYNTJBw@mail.gmail.com> (raw)
In-Reply-To: <20220511181704.y4pldvlqnbix3p53@google.com>

On Wed, May 11, 2022 at 11:17 AM Fangrui Song <maskray@google.com> wrote:
>
> On 2022-05-11, H.J. Lu via Libc-alpha wrote:
> >On Wed, May 11, 2022 at 9:59 AM Florian Weimer via Libc-alpha
> ><libc-alpha@sourceware.org> wrote:
> >>
> >> PT_GNU_RELRO is supposed to identify a region in the process image which
> >> has to be flipped to PROT_READ (only) permission after relocation
> >> (“Read-Only after RELocation”).
> >>
> >> glibc has this code in the dynamic loader in elf/dl-reloc.c:
> >>
> >> | void
> >> | _dl_protect_relro (struct link_map *l)
> >> | {
> >> |   ElfW(Addr) start = ALIGN_DOWN((l->l_addr
> >> |                                  + l->l_relro_addr),
> >> |                                 GLRO(dl_pagesize));
> >> |   ElfW(Addr) end = ALIGN_DOWN((l->l_addr
> >> |                                + l->l_relro_addr
> >> |                                + l->l_relro_size),
> >> |                               GLRO(dl_pagesize));
> >> |   if (start != end
> >> |       && __mprotect ((void *) start, end - start, PROT_READ) < 0)
> >> |     {
> >> |       static const char errstring[] = N_("\
> >> | cannot apply additional memory protection after relocation");
> >> |       _dl_signal_error (errno, l->l_name, NULL, errstring);
> >> |     }
> >> | }
> >>
> >> I assume the intent is to conservatively apply the largest possible
> >> RELRO region given GLRO(dl_pagesize), the run-time page size reported by
> >> the kernel.  If the binary is built to a smaller page size (to save disk
> >> space), glibc can still load it, but apply only some RELRO protection.
> >> But _dl_relocate_object has a bug: to be conservative, it would have to
> >> use ALGIN_UP for the start (lower) address of the range.
> >>
> >> But it turns out we can't make this change without incurring a loss of
> >> hardening: BFD ld does not align the start address to a page boundary.
> >> For example, /bin/true in Fedora 35 x86-64 has this:
> >>
> >> | $ readelf -l /bin/true
> >> |
> >> | Elf file type is DYN (Position-Independent Executable file)
> >> | Entry point 0x1960
> >> | There are 13 program headers, starting at offset 64
> >> |
> >> | Program Headers:
> >> |   Type           Offset             VirtAddr           PhysAddr
> >> |                  FileSiz            MemSiz              Flags  Align
> >> |   PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
> >> |                  0x00000000000002d8 0x00000000000002d8  R      0x8
> >> |   INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
> >> |                  0x000000000000001c 0x000000000000001c  R      0x1
> >> |       [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
> >> |   LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
> >> |                  0x0000000000000ff8 0x0000000000000ff8  R      0x1000
> >> |   LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
> >> |                  0x00000000000029a1 0x00000000000029a1  R E    0x1000
> >> |   LOAD           0x0000000000004000 0x0000000000004000 0x0000000000004000
> >> |                  0x0000000000000d38 0x0000000000000d38  R      0x1000
> >> |   LOAD           0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
> >> |                  0x0000000000000390 0x00000000000003a0  RW     0x1000
> >> |   DYNAMIC        0x0000000000005c90 0x0000000000006c90 0x0000000000006c90
> >> |                  0x00000000000001f0 0x00000000000001f0  RW     0x8
> >> |   NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
> >> |   NOTE           0x0000000000000388 0x0000000000000388 0x0000000000000388
> >> |                  0x0000000000000044 0x0000000000000044  R      0x4
> >> |   GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
> >> |   GNU_EH_FRAME   0x00000000000049c4 0x00000000000049c4 0x00000000000049c4
> >> |                  0x000000000000007c 0x000000000000007c  R      0x4
> >> |   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
> >> |                  0x0000000000000000 0x0000000000000000  RW     0x10
> >> |   GNU_RELRO      0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
> >> |                  0x0000000000000388 0x0000000000000388  R      0x1
> >> | […]
> >>
> >> The virtual address for PT_GNU_RELRO is 0x388, which is definitely not
> >> aligned to a 4K page.  (0x388 + 0x6c78 == 0x7000, so at least the end
> >> address is aligned.)  In practice, this seems to work because the RELRO
> >> area seems to be at the start of the RW LOAD segment, so we can safely
> >> flip the slack space at the start of the page to RO.  It still looks
> >> like a major wart to me, though.
> >
> >After relocation, we change the end of the RO segment (aligned down from
> >the beginning of the RELRO area) to the end of the RELRO segment to RO.
> >Since the end of the RELRO segment must be aligned to the page size,
> >ALIGN_DOWN on the end of the RELRO segment doesn't lose any protection.
> >
> >> Any suggestions what should we do to fix this properly, mainly for
> >> targets that have varying page size in practice?
> >
> >The end of the RELRO segment should be aligned to the maximum page
> >size.
> >
>
> PT_GNU_RELRO is designed/implemented this way:
>
> * there can be at most one PT_GNU_RELRO
> * p_vaddr(PT_GNU_RELRO) = p_vaddr(first RW PT_LOAD); https://sourceware.org/binutils/docs/ld/Builtin-Functions.html DATA_SEGMENT_RELRO_END is designed this way
> * p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) is aligned by common-page-size. comon page size is chosen probably because of less waste

ld aligns DATA_SEGMENT_RELRO_END to the maximum page size.

> If the proposal is to align p_vaddr(PT_GNU_RELRO) +
> p_memsz(PT_GNU_RELRO) to max page size, that will penalize the size of
> many max-page-size>4096 ports with the current GNU ld section/segment
> layout.  See https://sourceware.org/bugzilla/show_bug.cgi?id=24490 and
> https://sourceware.org/bugzilla/show_bug.cgi?id=23704 for GNU ld's -z
> separate-code complaints.

Separate RW PT_LOAD for PT_GNU_RELRO can reduce file size
for -z no-separate-code.  ld implements -z separate-code in such a
way that not only executable sections are in separate RE pages in
memory, but also mapping the RE segment won't map in other contents
on disk.

> Note: ld.lld used (before 9.0.0) to place PT_GNU_RELRO in the middle of
> the RW PT_LOAD.  I changed it to the start in
> https://reviews.llvm.org/D58892 With the new scheme, it doesn't really
> matter whether p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) is aligned
> to max-page-size or common-page-size: the file size does not change.

This layout is generated by lld:

  LOAD           0x000000 0x0000000000200000 0x0000000000200000
0x00055c 0x00055c R   0x1000
  LOAD           0x000560 0x0000000000201560 0x0000000000201560
0x000160 0x000160 R E 0x1000
  LOAD           0x0006c0 0x00000000002026c0 0x00000000002026c0
0x0001a0 0x0001a0 RW  0x1000
  LOAD           0x000860 0x0000000000203860 0x0000000000203860
0x000028 0x000029 RW  0x1000
  DYNAMIC        0x0006d0 0x00000000002026d0 0x00000000002026d0
0x000180 0x000180 RW  0x8
  GNU_RELRO      0x0006c0 0x00000000002026c0 0x00000000002026c0
0x0001a0 0x000940 R   0x1

The beginning of the first RE page in memory also includes the end
of the previous R segment and the end of the last RE page also
includes the beginning of the next RW segment.  --rosegment still
leaves non-instructions bytes in the RE pages.

-- 
H.J.

  reply	other threads:[~2022-05-11 19:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-11 16:59 Florian Weimer
2022-05-11 17:50 ` H.J. Lu
2022-05-11 18:17   ` Fangrui Song
2022-05-11 19:27     ` H.J. Lu [this message]
2022-05-11 19:33       ` Fangrui Song
2022-05-11 19:42       ` Fangrui Song
2022-05-11 19:55         ` H.J. Lu
2022-05-11 20:50           ` Fangrui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMe9rOrB2w45O8JC20WWDDeuXjkZRcoDpeinjBDQG1myYNTJBw@mail.gmail.com \
    --to=hjl.tools@gmail.com \
    --cc=binutils@sourceware.org \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=maskray@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).