ELF function override
While reading up on the ELF spec (I have a fun life) I found myself
wondering whether it were possible to override dynamically loaded library
functions in some strange ways. For those wondering, yes, I do know the
LD_PRELOAD trick, and I do realise that what I talk about here has
no real use case. It's just fun.
To recap some of the characteristics of dynamic linking:
- Library functions are called via indirection.
- The main code calls a linkage stub in .plt.
- Library function addresses are stored in a table called .got.plt.
- This table is lazily filled in at run-time.
Direct Patch of Function Table
This first attempt was based on noticing how the program achieved the
lazy loading of function addresses.
When the linkage stub is called, it acts as if the address of the library
function is already correct in the .got.plt.
It reads this value and unconditionally jumps to that address.
In order to allow the linker to lazily fill the table in, each position is
initialised pointing at the code that calls the dynamic linker to set-up
that particular function table entry.
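To make the mechanism concrete, here is a toy model of lazy binding in plain C. It is only a sketch: `got_slot`, `plt_stub`, `lazy_resolver` and `always_equal` are made-up names, and an ordinary function pointer stands in for the real .got.plt entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of lazy binding: one "GOT" slot holding a function pointer. */
typedef int (*cmp_fn)(const char *, const char *);

static int resolver_calls;   /* counts trips through the pretend dynamic linker */
static int lazy_resolver(const char *a, const char *b);

/* Like a .got.plt entry, the slot starts out pointing at the resolver. */
static cmp_fn got_slot = lazy_resolver;

/* Stands in for the dynamic linker: fix up the slot, then finish the call. */
static int lazy_resolver(const char *a, const char *b)
{
    resolver_calls += 1;
    got_slot = strcmp;
    return got_slot(a, b);
}

/* Stands in for the .plt stub: read the slot and jump, unconditionally. */
static int plt_stub(const char *a, const char *b)
{
    return got_slot(a, b);
}

/* Hypothetical replacement function (not the one from the test program). */
static int always_equal(const char *a, const char *b)
{
    (void)a;
    (void)b;
    return 0;
}
```

The first call through `plt_stub` goes via the resolver and every later call goes straight to `strcmp`. If `got_slot` is set to `always_equal` before the first call, the resolver is never reached for that slot.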
With this information, we can see that if we patch the binary after compilation so that the function table initially points to our alternate function, the dynamic linker will never be called to modify that entry and our function will be called instead. To do this, we need two things: the position of our alternate function in memory, and the position of the relevant function table entry in the file.
The position of the alternate function may be found after compilation with
vshcmd: > readelf -s testprog | grep altstrcmp
48: 0000000000601038 8 OBJECT GLOBAL DEFAULT 24 altstrcmp
fake_plt [22:19:05] $
Similarly, the position of the function table entry in the file may be
found by looking up its memory address in the dynamic relocations, and
subtracting the difference between the memory address and file offset of
the .got.plt section.
vshcmd: > readelf -r testprog
Relocation section '.rela.dyn' at offset 0x3a0 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000600ff0 000200000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
000000600ff8 000400000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
Relocation section '.rela.plt' at offset 0x3d0 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000601018 000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000601020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 strcmp@GLIBC_2.2.5 + 0
fake_plt [22:16:47] $
vshcmd: > readelf --sections testprog | grep -A 1 .got.plt
[23] .got.plt PROGBITS 0000000000601000 00001000
0000000000000028 0000000000000008 WA 0 0 8
elf [18:19:50] $
vshcmd: > offset=$(python -c 'print(0x601020 - (0x601000 - 0x1000))')
elf [18:21:42] $
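The same arithmetic can be written as a small C helper. This is just a sketch of the calculation: the function name and parameters are made up, and the numbers in the usage note are the ones from the readelf output above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a virtual address that lies inside a section into an offset in
 * the file, given the section's address, file offset and size.
 */
static uint64_t vaddr_to_file_offset(uint64_t vaddr, uint64_t sec_addr,
                                     uint64_t sec_offset, uint64_t sec_size)
{
    /* The address must actually fall inside the section. */
    assert(vaddr >= sec_addr && vaddr - sec_addr < sec_size);
    return vaddr - sec_addr + sec_offset;
}
```

With the .got.plt values shown above (address 0x601000, file offset 0x1000, size 0x28), the strcmp entry at 0x601020 comes out at file offset 0x1020.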
Using this information, we can patch the binary with
vshcmd: > python -c 'import sys; sys.stdout.buffer.write(b"\x38\x10\x60")' | dd of=testprog bs=1 seek=$offset conv=notrunc
3+0 records in
3+0 records out
3 bytes copied, 0.0732023 s, 0.0 kB/s
elf [22:19:10] $
and it will use altstrcmp instead of strcmp.
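The three bytes written by dd are the low bytes of 0x601038 in little-endian order. The sketch below shows that encoding, and why three bytes can be enough when the value already in the slot has the same upper bytes. The helper names are made up, and the "old" value 0x400496 used in the usage note is only an assumed example of what such a slot might hold before patching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value into out least significant byte first, as x86-64 stores it. */
static void encode_le64(uint64_t value, uint8_t out[8])
{
    for (size_t i = 0; i < 8; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* How many low-order bytes must be rewritten to turn from into to. */
static size_t bytes_to_patch(uint64_t from, uint64_t to)
{
    uint8_t a[8], b[8];
    size_t n = 8;
    encode_le64(from, a);
    encode_le64(to, b);
    /* Trailing (high-order) bytes that already match can be left alone. */
    while (n > 0 && a[n - 1] == b[n - 1]) {
        n -= 1;
    }
    return n;
}
```

Here `encode_le64(0x601038, buf)` starts with the bytes 38 10 60, and `bytes_to_patch(0x400496, 0x601038)` is 3 because the remaining five bytes of both values are zero.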
That worked nicely, but the code had no control over whether it was using the default or my alternate function. I wanted more control over when the function was overridden, to allow the user to tell the program to switch between functions.
This would allow hypothetical use cases such as the user turning on
debug mode by sending a SIGUSR1 or some message over a
communication channel.
Patch a Data Variable
For my second attempt, I made a global variable in the code to hold the
position in program memory of the .got.plt entry for the
interesting function.
// Need to initialise this so it's defined in .data instead of .bss.
// That means we can modify it in the file with `dd`.
int (**strcmpgot) (const char *, const char *) = (int (**) (const char *, const char *))100;
...
if (check_password(argv[1])) {
    puts("Congratulations!!!");
} else {
    int (*origstrcmp) (const char *, const char *) = *strcmpgot;
    *strcmpgot = mystrcmp;
    if (check_password(argv[1])) {
        puts("So close ... !!");
    } else {
        puts("Sorry, that's the wrong password");
    }
    *strcmpgot = origstrcmp;
    // Double check we've reset it.
    assert(strcmp("hello", "hello") == 0);
}
I then found the position of that initialised variable in the program file
by getting the position in memory with
readelf -s testprog | grep strcmpgot
and adjusting by the difference between the file offset and memory address
of the .data section.
With this address, I patched the value of that variable in the compiled
binary with the position of the relevant function table entry in memory.
vshcmd: > python -c 'import sys; sys.stdout.buffer.write(b"\x20\x10\x60")' | dd of=testprog bs=1 seek=$offset conv=notrunc
3+0 records in
3+0 records out
3 bytes copied, 0.0466237 s, 0.1 kB/s
elf [18:30:05] $
vshcmd: > # Check the patched binary behaves as expected
vshcmd: > ./testprog 'etmrhdr '
Congratulations!!!
elf [18:30:06] $
vshcmd: > ./testprog 'funsies!'
So close ... !!
elf [18:30:07] $
vshcmd: > ./testprog 'hello'
Sorry, that's the wrong password
elf [18:30:08] $
That was much better, with the code having full control over when the alternate function was used, but it still required patching the program after it had been compiled.
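The save, swap, call, restore pattern from the test program can be pulled out into a helper. The sketch below uses an ordinary function pointer variable (`compare_slot`) as a stand-in for the real .got.plt entry, and `with_override`, `check_word` and `first_char_only` are made-up names for illustration.

```c
#include <assert.h>
#include <string.h>

typedef int (*cmp_fn)(const char *, const char *);

/* Run check(arg) with *slot temporarily pointing at replacement. */
static int with_override(cmp_fn *slot, cmp_fn replacement,
                         int (*check)(const char *), const char *arg)
{
    cmp_fn saved = *slot;
    *slot = replacement;
    int result = check(arg);
    *slot = saved;              /* always restore before returning */
    return result;
}

/* Stand-in for the function table entry. */
static cmp_fn compare_slot = strcmp;

/* Calls through the slot, much as the main code calls through the stub. */
static int check_word(const char *guess)
{
    return compare_slot(guess, "hello") == 0;
}

/* Hypothetical lenient comparison: only looks at the first character. */
static int first_char_only(const char *a, const char *b)
{
    return a[0] - b[0];
}
```

For example, `with_override(&compare_slot, first_char_only, check_word, "help")` accepts the guess, while a plain `check_word("help")` before or after it does not.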
Without Patching the Binary
All we need to know in the code is where in memory the strcmp
function table entry is.
We know this is encoded in the linker stub that our main code uses: that's
where the linker stub reads its destination address from.
If we can find the linker stub in the program, then we should be able to
decode it and find where we need to read from ourselves.
It turns out that finding the position of the relevant linker stub is
pretty easy.
Recalling that all of our main code uses that linker stub instead of the
actual function we want, we can simply ask for the address with
&strcmp in our main program.
uint8_t *pltaddr = (uint8_t *)&strcmp;
Finding the position of the function table entry from the instructions we see there is a little trickier. In order to do this, we disassemble the linker stub, and view the instruction opcodes it uses.
vshcmd: > objdump -d testprog -j .plt
testprog: file format elf64-x86-64
Disassembly of section .plt:
0000000000400460 <.plt>:
400460: ff 35 a2 0b 20 00 pushq 0x200ba2(%rip) # 601008 <_GLOBAL_OFFSET_TABLE_+0x8>
400466: ff 25 a4 0b 20 00 jmpq *0x200ba4(%rip) # 601010 <_GLOBAL_OFFSET_TABLE_+0x10>
40046c: 0f 1f 40 00 nopl 0x0(%rax)
0000000000400470 <puts@plt>:
400470: ff 25 a2 0b 20 00 jmpq *0x200ba2(%rip) # 601018 <puts@GLIBC_2.2.5>
400476: 68 00 00 00 00 pushq $0x0
40047b: e9 e0 ff ff ff jmpq 400460 <.plt>
0000000000400480 <__assert_fail@plt>:
400480: ff 25 9a 0b 20 00 jmpq *0x200b9a(%rip) # 601020 <__assert_fail@GLIBC_2.2.5>
400486: 68 01 00 00 00 pushq $0x1
40048b: e9 d0 ff ff ff jmpq 400460 <.plt>
0000000000400490 <strcmp@plt>:
400490: ff 25 92 0b 20 00 jmpq *0x200b92(%rip) # 601028 <strcmp@GLIBC_2.2.5>
400496: 68 02 00 00 00 pushq $0x2
40049b: e9 c0 ff ff ff jmpq 400460 <.plt>
playing_with_elf [11:33:08] $
Reading the opcodes, and comparing to the instruction specification,
we can see this uses the absolute indirect form of the jmp
instruction, with its operand addressed relative to the instruction pointer.
There's more detail in the test program comments, but essentially
this means we need to read the 32 bit offset two bytes forwards from the
&strcmp stub, and add to it &strcmp + 6 (the address of the next
instruction).
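typedef int (**strcmpptr) (const char *, const char *);
strcmpptr getgot(void)
{
    uint8_t *pltaddr = (uint8_t *)&strcmp;
    // Expect `jmpq *disp32(%rip)`: opcode 0xff, ModRM 0x25.
    assert(pltaddr[0] == 0xff);
    assert(pltaddr[1] == 0x25);
    // The displacement is a signed 32 bit value and may be unaligned.
    int32_t disp;
    memcpy(&disp, pltaddr + 2, sizeof disp);
    // It is relative to the address of the next instruction.
    return (strcmpptr)((uintptr_t)(pltaddr + 6) + (intptr_t)disp);
}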
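As a sanity check, the same decoding can be run over the bytes from the objdump listing above rather than over live code. This is a sketch with a made-up helper name, and it assumes a little-endian host so that the displacement bytes can be copied directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Decode `jmpq *disp32(%rip)` whose first byte lives at stub_addr.
 * Returns the address the jump reads its target from, or 0 if the bytes
 * are not the instruction we expect.
 */
static uint64_t decode_plt_slot(const uint8_t *code, uint64_t stub_addr)
{
    if (code[0] != 0xff || code[1] != 0x25) {
        return 0;
    }
    int32_t disp;
    memcpy(&disp, code + 2, sizeof disp);   /* little-endian host assumed */
    /* The displacement is relative to the end of the 6 byte instruction. */
    return stub_addr + 6 + (uint64_t)(int64_t)disp;
}
```

Feeding it the strcmp@plt bytes `ff 25 92 0b 20 00` at 0x400490 gives 0x400496 + 0x200b92 = 0x601028, which matches the address objdump prints in its comment.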
After compilation and test, we see we now have a working example that doesn't require patching the binary! It is still very brittle, as it relies on decoding an instruction (which is clearly processor specific). Can we do anything about that?
Reading the In-Memory PHDR
If we want to find the address of the strcmp function
pointer entry without relying on architecture specifics, then we're going
to have to use the ELF data structures that were made specifically for that
purpose.
We could do this by reading our own file, but the ELF man page
mentions the PT_PHDR header that we can use instead.
This header tells the loader where to put the program header table in memory, and with that information in memory, we have easy access for reading and parsing.
vshcmd: > readelf -d -l testprog
Elf file type is EXEC (Executable file)
Entry point 0x400570
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 0x8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000120c 0x000000000000120c R E 0x200000
LOAD 0x0000000000001e08 0x0000000000601e08 0x0000000000601e08
0x0000000000000248 0x0000000000000268 RW 0x200000
DYNAMIC 0x0000000000001e20 0x0000000000601e20 0x0000000000601e20
0x00000000000001d0 0x00000000000001d0 RW 0x8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 0x4
GNU_EH_FRAME 0x0000000000001018 0x0000000000401018 0x0000000000401018
0x000000000000005c 0x000000000000005c R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x0000000000001e08 0x0000000000601e08 0x0000000000601e08
0x00000000000001f8 0x00000000000001f8 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
Dynamic section at offset 0x1e20 contains 24 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000c (INIT) 0x4004f8
0x000000000000000d (FINI) 0x400d84
0x0000000000000019 (INIT_ARRAY) 0x601e08
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001a (FINI_ARRAY) 0x601e10
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x400298
0x0000000000000005 (STRTAB) 0x400398
0x0000000000000006 (SYMTAB) 0x4002c0
0x000000000000000a (STRSZ) 104 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000015 (DEBUG) 0x0
0x0000000000000003 (PLTGOT) 0x602000
0x0000000000000002 (PLTRELSZ) 120 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0x400480
0x0000000000000007 (RELA) 0x400438
0x0000000000000008 (RELASZ) 72 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffffe (VERNEED) 0x400418
0x000000006fffffff (VERNEEDNUM) 1
0x000000006ffffff0 (VERSYM) 0x400400
0x0000000000000000 (NULL) 0x0
elf_override [15:46:10] $
We read the program header table to find where the _DYNAMIC array is in
memory (its DYNAMIC entry above), and it is this array that contains all
the information we need to find the position in memory of the strcmp
pointer.
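A minimal sketch of that first step, assuming the program header table is already available as an array of `Elf64_Phdr` from `<elf.h>`. The helper names are made up, and `base` is an assumed load offset that would be 0 for a non-relocated executable like the one above.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Return the first program header of the given type, or NULL. */
static const Elf64_Phdr *find_phdr(const Elf64_Phdr *phdrs, size_t count,
                                   Elf64_Word type)
{
    for (size_t i = 0; i < count; i++) {
        if (phdrs[i].p_type == type) {
            return &phdrs[i];
        }
    }
    return NULL;
}

/* Run-time address of the _DYNAMIC array, or 0 if there is none. */
static uintptr_t dynamic_address(const Elf64_Phdr *phdrs, size_t count,
                                 uintptr_t base)
{
    const Elf64_Phdr *dyn = find_phdr(phdrs, count, PT_DYNAMIC);
    return dyn == NULL ? 0 : base + (uintptr_t)dyn->p_vaddr;
}
```

For the program above this would be called with the nine headers found at 0x400040, and should come back with 0x601e20, the VirtAddr of the DYNAMIC entry.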
Each element in the _DYNAMIC array contains a tag that
tells us what information this element contains (shown without their
DT_ prefixes above).
The JMPREL tag shows us where the relocations can be
found, and we can associate each relocation with a symbol name using
information in the STRTAB and SYMTAB elements.
uint16_t find_sym_index(const char * const target,
        const Elf64_Sym * const symtab, const size_t num_symbols,
        const char * const strtab, const size_t strsz)
{
    const Elf64_Sym *cursym;
    for (cursym = symtab;
         cursym < symtab + num_symbols;
         cursym += 1) {
        // The name must start inside the string table.
        assert(cursym->st_name < strsz);
        if (strcmp(strtab + cursym->st_name, target) == 0) {
            break;
        }
    }
    // Not found == num_symbols, found == index
    return cursym - symtab;
}

strcmpptr find_rela_addr(const uint16_t sym_index,
        const Elf64_Rela * const rela, const size_t relasz)
{
    // relasz is in bytes, so compare byte pointers (arithmetic on void *
    // is a GNU extension).
    for (const Elf64_Rela * currel = rela;
         (const uint8_t *)currel < (const uint8_t *)rela + relasz;
         currel += 1) {
        if (ELF64_R_SYM(currel->r_info) == sym_index) {
            // XXX If this addend isn't 0 then the .got.plt is structured in a
            // way I don't understand, fail and alert me so I can investigate.
            assert(currel->r_addend == 0);
            return (strcmpptr) currel->r_offset;
        }
    }
    return NULL;
}
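The functions above take pointers and sizes that have already been pulled out of the _DYNAMIC array. A sketch of that lookup step might look like the following, using the `Elf64_Dyn` type and `DT_*` constants from `<elf.h>`. The helper names are made up, and this is not necessarily how the test program does it.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Return the entry with the given tag, or NULL; the array ends at DT_NULL. */
static const Elf64_Dyn *find_dyn(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag) {
            return dyn;
        }
    }
    return NULL;
}

/* Number of Elf64_Rela records in the PLT relocation table, or 0. */
static size_t plt_rela_count(const Elf64_Dyn *dyn)
{
    const Elf64_Dyn *sz = find_dyn(dyn, DT_PLTRELSZ);
    const Elf64_Dyn *kind = find_dyn(dyn, DT_PLTREL);
    /* Only handle the RELA flavour, which is what the output above shows. */
    if (sz == NULL || kind == NULL || kind->d_un.d_val != DT_RELA) {
        return 0;
    }
    return sz->d_un.d_val / sizeof(Elf64_Rela);
}
```

With the values from the readelf output above (PLTRELSZ of 120 bytes and 24 byte RELA entries) this gives 5 relocations, and `find_dyn(dyn, DT_JMPREL)->d_un.d_ptr` is where they start.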
The only piece of information we still need is where in memory the
program header table (the PHDR entry) is located.
From observation it appears that it always starts at 0x400040, and this
value works when hard-coded into the above
program, but I haven't found that specified anywhere.
I believe it's possible to specify that value with a linker script
by using the PHDRS command,
but according to its documentation once you specify one program header you
have to specify them all.
That loses a lot of flexibility, and I refuse to resort to
reading the file, so we'll try another tack...
When working on a GNU system, we have some extra niceties
that can come in handy sometimes; one of these is the
dl_iterate_phdr() function.
This iterates over each of the program's currently loaded shared objects and
calls a user specified callback with a structure specifying, among other
things, the position of the program headers in memory.
Using this function, and checking for the object that specifies "us" by
comparing info->dlpi_name to "", we finally
have all the information we need for a processor independent ability to
overwrite our dynamic library functions.
struct callback_data {
    uintptr_t addr;
    void *phdr;
};
int get_object_phdr(struct dl_phdr_info *info, size_t size, void *data)
{
    struct callback_data *c = (struct callback_data *)data;
    if (info->dlpi_name[0] == '\0') {
        c->phdr = info->dlpi_phdr;
        c->addr = info->dlpi_addr;
        return 1;
    }
    return 0;
}
...
struct callback_data cbdata;
dl_iterate_phdr(get_object_phdr, &cbdata);
... thinking on that dl_iterate_phdr() function ... it
iterates over all shared objects ... can we override a function
that one of our libraries uses?
Other Object Files
Seeing as dl_iterate_phdr() iterates over all loaded shared
objects, I thought it might be possible to read the dynamic sections of
other libraries.
This would open up some actual applications, like a temporary
LD_PRELOAD to replace commands with introspective counterparts
when calling specific library functions.
This appears to work, though there are a few places that my code doesn't
match my interpretation of the manual, so I'm not about to use this in
production :-)
There are a bunch of minor adjustments to be made to the program
in the previous section that amount to accounting
for offsets from the base address of the dynamic library.
The changes also remove the addition of a base offset to pointers
in the Elf64_Dyn structures.
On reading the manual it appears that this addition should be there, but
the manual refers to the format of the file, not the format of the
program header in memory.
From this insight I believe the fact the base offset is always 0 in my
previous program is hiding a bug.
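To illustrate the base address adjustment for program headers, here is a sketch that visits every loaded object and works out where its dynamic array sits at run time. It relies on the relation given in the dl_iterate_phdr() man page (segment address = dlpi_addr + p_vaddr). The names `dyn_stats`, `dynamic_addr`, `visit` and `survey_objects` are made up, and it says nothing about how the pointers inside the Elf64_Dyn entries themselves should be treated.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct dyn_stats {
    size_t objects;        /* how many objects were visited */
    size_t with_dynamic;   /* how many of them have a PT_DYNAMIC segment */
};

/* Run-time address of an object's dynamic array, or 0 if it has none. */
static uintptr_t dynamic_addr(const struct dl_phdr_info *info)
{
    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type == PT_DYNAMIC) {
            /* Load base of the object plus the address from its header. */
            return (uintptr_t)info->dlpi_addr + (uintptr_t)ph->p_vaddr;
        }
    }
    return 0;
}

/* Callback for dl_iterate_phdr(); returning 0 keeps the iteration going. */
static int visit(struct dl_phdr_info *info, size_t size, void *data)
{
    struct dyn_stats *stats = data;
    (void)size;
    stats->objects += 1;
    if (dynamic_addr(info) != 0) {
        stats->with_dynamic += 1;
    }
    return 0;
}

static struct dyn_stats survey_objects(void)
{
    struct dyn_stats stats = {0, 0};
    dl_iterate_phdr(visit, &stats);
    return stats;
}
```

For a non-relocated executable dlpi_addr is 0, which is why a missing base offset goes unnoticed there; for a shared library it is not, so the addition matters.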
If you want to know the full extent of the differences you can diff the test programs from this section and the previous one. Suffice to say that with a little tweaking of the previous program, we get to temporarily overwrite what external functions a dynamically loaded library calls.
I think that's pretty cool.