From: "H.J. Lu"
Date: Wed, 11 May 2022 12:27:04 -0700
Subject: Re: PT_GNU_RELRO is somewhat broken
To: Fangrui Song
Cc: Florian Weimer, GNU C Library, Binutils
References: <871qx0dmz5.fsf@oldenburg.str.redhat.com> <20220511181704.y4pldvlqnbix3p53@google.com>
In-Reply-To: <20220511181704.y4pldvlqnbix3p53@google.com>
List-Id: Binutils mailing list

On Wed, May 11, 2022 at 11:17 AM Fangrui Song wrote:
>
> On 2022-05-11, H.J. Lu via Libc-alpha wrote:
> >On Wed, May 11, 2022 at 9:59 AM Florian Weimer via Libc-alpha wrote:
> >>
> >> PT_GNU_RELRO is supposed to identify a region in the process image which
> >> has to be flipped to PROT_READ (only) permission after relocation
> >> (“Read-Only after RELocation”).
> >>
> >> glibc has this code in the dynamic loader in elf/dl-reloc.c:
> >>
> >> | void
> >> | _dl_protect_relro (struct link_map *l)
> >> | {
> >> |   ElfW(Addr) start = ALIGN_DOWN((l->l_addr
> >> |                                  + l->l_relro_addr),
> >> |                                 GLRO(dl_pagesize));
> >> |   ElfW(Addr) end = ALIGN_DOWN((l->l_addr
> >> |                                + l->l_relro_addr
> >> |                                + l->l_relro_size),
> >> |                               GLRO(dl_pagesize));
> >> |   if (start != end
> >> |       && __mprotect ((void *) start, end - start, PROT_READ) < 0)
> >> |     {
> >> |       static const char errstring[] = N_("\
> >> | cannot apply additional memory protection after relocation");
> >> |       _dl_signal_error (errno, l->l_name, NULL, errstring);
> >> |     }
> >> | }
> >>
> >> I assume the intent is to conservatively apply the largest possible
> >> RELRO region given GLRO(dl_pagesize), the run-time page size reported by
> >> the kernel.  If the binary is built to a smaller page size (to save disk
> >> space), glibc can still load it, but apply only some RELRO protection.
> >> But _dl_relocate_object has a bug: to be conservative, it would have to
> >> use ALIGN_UP for the start (lower) address of the range.
> >>
> >> But it turns out we can't make this change without incurring a loss of
> >> hardening: BFD ld does not align the start address to a page boundary.
> >> For example, /bin/true in Fedora 35 x86-64 has this:
> >>
> >> | $ readelf -l /bin/true
> >> |
> >> | Elf file type is DYN (Position-Independent Executable file)
> >> | Entry point 0x1960
> >> | There are 13 program headers, starting at offset 64
> >> |
> >> | Program Headers:
> >> |   Type           Offset             VirtAddr           PhysAddr
> >> |                  FileSiz            MemSiz              Flags  Align
> >> |   PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
> >> |                  0x00000000000002d8 0x00000000000002d8  R      0x8
> >> |   INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
> >> |                  0x000000000000001c 0x000000000000001c  R      0x1
> >> |       [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
> >> |   LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
> >> |                  0x0000000000000ff8 0x0000000000000ff8  R      0x1000
> >> |   LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
> >> |                  0x00000000000029a1 0x00000000000029a1  R E    0x1000
> >> |   LOAD           0x0000000000004000 0x0000000000004000 0x0000000000004000
> >> |                  0x0000000000000d38 0x0000000000000d38  R      0x1000
> >> |   LOAD           0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
> >> |                  0x0000000000000390 0x00000000000003a0  RW     0x1000
> >> |   DYNAMIC        0x0000000000005c90 0x0000000000006c90 0x0000000000006c90
> >> |                  0x00000000000001f0 0x00000000000001f0  RW     0x8
> >> |   NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
> >> |   NOTE           0x0000000000000388 0x0000000000000388 0x0000000000000388
> >> |                  0x0000000000000044 0x0000000000000044  R      0x4
> >> |   GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
> >> |   GNU_EH_FRAME   0x00000000000049c4 0x00000000000049c4 0x00000000000049c4
> >> |                  0x000000000000007c 0x000000000000007c  R      0x4
> >> |   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
> >> |                  0x0000000000000000 0x0000000000000000  RW     0x10
> >> |   GNU_RELRO      0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
> >> |                  0x0000000000000388 0x0000000000000388  R      0x1
> >> | […]
> >>
> >> The virtual address for PT_GNU_RELRO is 0x6c78, which is definitely not
> >> aligned to a 4K page.  (0x388 + 0x6c78 == 0x7000, so at least the end
> >> address is aligned.)  In practice, this seems to work because the RELRO
> >> area seems to be at the start of the RW LOAD segment, so we can safely
> >> flip the slack space at the start of the page to RO.  It still looks
> >> like a major wart to me, though.
> >
> >After relocation, we change everything from the end of the RO segment
> >(the beginning of the RELRO area, aligned down) to the end of the RELRO
> >segment to RO.  Since the end of the RELRO segment must be aligned to the
> >page size, ALIGN_DOWN on the end of the RELRO segment doesn't lose any
> >protection.
> >
> >> Any suggestions for what we should do to fix this properly, mainly for
> >> targets that have varying page size in practice?
> >
> >The end of the RELRO segment should be aligned to the maximum page
> >size.
> >
>
> PT_GNU_RELRO is designed/implemented this way:
>
> * there can be at most one PT_GNU_RELRO
> * p_vaddr(PT_GNU_RELRO) = p_vaddr(first RW PT_LOAD);
>   https://sourceware.org/binutils/docs/ld/Builtin-Functions.html
>   DATA_SEGMENT_RELRO_END is designed this way
> * p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) is aligned by
>   common-page-size.  The common page size was probably chosen because
>   it wastes less space

ld aligns DATA_SEGMENT_RELRO_END to the maximum page size.

> If the proposal is to align p_vaddr(PT_GNU_RELRO) +
> p_memsz(PT_GNU_RELRO) to max page size, that will penalize the size of
> many max-page-size>4096 ports with the current GNU ld section/segment
> layout.  See https://sourceware.org/bugzilla/show_bug.cgi?id=24490 and
> https://sourceware.org/bugzilla/show_bug.cgi?id=23704 for GNU ld's -z
> separate-code complaints.

Separate RW PT_LOAD for PT_GNU_RELRO can reduce file size for -z
no-separate-code.
ld implements -z separate-code in such a way that not only are executable
sections placed in separate RE pages in memory, but mapping the RE segment
also does not map in other contents from disk.

> Note: ld.lld (before 9.0.0) used to place PT_GNU_RELRO in the middle of
> the RW PT_LOAD.  I changed it to the start
> (https://reviews.llvm.org/D58892).  With the new scheme, it doesn't really
> matter whether p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) is aligned
> to max-page-size or common-page-size: the file size does not change.

This layout is generated by lld:

  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x00055c 0x00055c R   0x1000
  LOAD           0x000560 0x0000000000201560 0x0000000000201560 0x000160 0x000160 R E 0x1000
  LOAD           0x0006c0 0x00000000002026c0 0x00000000002026c0 0x0001a0 0x0001a0 RW  0x1000
  LOAD           0x000860 0x0000000000203860 0x0000000000203860 0x000028 0x000029 RW  0x1000
  DYNAMIC        0x0006d0 0x00000000002026d0 0x00000000002026d0 0x000180 0x000180 RW  0x8
  GNU_RELRO      0x0006c0 0x00000000002026c0 0x00000000002026c0 0x0001a0 0x000940 R   0x1

The beginning of the first RE page in memory also includes the end of the
previous R segment, and the end of the last RE page also includes the
beginning of the next RW segment.  --rosegment still leaves non-instruction
bytes in the RE pages.

-- 
H.J.
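
For reference, the page arithmetic discussed in this thread can be sketched
in C as below.  This is only an illustration, not glibc code: align_down,
align_up, relro_current, relro_conservative and struct relro_range are
hypothetical names.  relro_current mirrors the ALIGN_DOWN logic quoted from
_dl_protect_relro, relro_conservative is the ALIGN_UP alternative for the
start address, and a load bias (l_addr) of 0 is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not glibc's macros): round to a power-of-two
   boundary in the same way ALIGN_DOWN / ALIGN_UP do. */
static uint64_t align_down(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

static uint64_t align_up(uint64_t value, uint64_t align)
{
    return align_down(value + align - 1, align);
}

/* Half-open address range [start, end); empty when start == end. */
struct relro_range {
    uint64_t start;
    uint64_t end;
};

/* Mirrors the quoted logic: both ends are rounded down to the run-time
   page size.  Assumes a load bias (l_addr) of 0 for simplicity. */
static struct relro_range relro_current(uint64_t vaddr, uint64_t memsz,
                                        uint64_t pagesize)
{
    struct relro_range r;
    r.start = align_down(vaddr, pagesize);
    r.end = align_down(vaddr + memsz, pagesize);
    return r;
}

/* Conservative alternative: the start is rounded up, so only whole pages
   that lie entirely inside PT_GNU_RELRO are covered. */
static struct relro_range relro_conservative(uint64_t vaddr, uint64_t memsz,
                                             uint64_t pagesize)
{
    struct relro_range r;
    r.start = align_up(vaddr, pagesize);
    r.end = align_down(vaddr + memsz, pagesize);
    if (r.end < r.start)
        r.end = r.start; /* no whole page fits: empty range */
    return r;
}
```

With the /bin/true values (p_vaddr 0x6c78, p_memsz 0x388) and a 4K page,
relro_current yields [0x6000, 0x7000), which includes the slack before
0x6c78, while relro_conservative yields an empty range.  With a 64K page,
both ranges are empty, so no RELRO protection would be applied under these
assumptions.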