I was called in to handle an incident in which a malicious IP address was contacted every time the system booted. They couldn't find out which process was making the connection.
Using tcpconnect.py, one of the BCC eBPF tools, I was able to locate the malicious process, which was disguised as a system service. This post mainly discusses how I reversed it and extracted its obfuscated config data.
First of all, it's a Go program; more specifically, it's a stripped Go program. Even though the symbol table has been stripped, the function names can still be recovered, because the Go runtime keeps its own function metadata (the pclntab) inside the binary. To do that, we need a helper extension in Ghidra.
Typically, a statically linked Go program is non-PIE. We can confirm this with pwndbg, or simply by looking at the start address when importing the file into Ghidra.
This will save us a lot of time when debugging without symbols, because the addresses shown in Ghidra are the same ones the process uses at run time. We will get there later.
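If neither tool is at hand, the same check can be done by reading the ELF header directly. The following is a minimal sketch, not part of the original analysis: the helper names `elf_type` and `is_non_pie` are made up for illustration. It relies only on the standard ELF layout, where `e_type` is `ET_EXEC` (2) for a fixed-address executable and `ET_DYN` (3) for a PIE or shared object.

```python
import struct

ET_EXEC = 2  # fixed-address executable (non-PIE)
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def is_non_pie(path: str) -> bool:
    """True if the file at `path` is a fixed-address (ET_EXEC) ELF."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_EXEC
```

Calling `is_non_pie("./sample")` (the path is a placeholder) returns `True` for a binary whose code addresses will not move between Ghidra and the debugger.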
With Ghidra_GolangAnalyzerExtension installed, let's open the malware ELF file, enable the Go analyzer, and wait for it to finish.
When it finishes, you will see that all function names (as well as file names and data types) have been recovered.
Now everything is set up. Let's try running the program without network connectivity.
We can see that, without being given any input, it tries to connect to port 443 of a public IPv4 address over TCP, which means there has to be some config data somewhere inside the ELF file.
However, I couldn't find anything that looks like an IP address in the binary. Usually this kind of data is encrypted or obfuscated to hinder analysis, which is why you are reading this post in the first place.
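The kind of static search that comes up empty here can be sketched as follows. This is an illustrative helper (`find_ipv4_strings` is a made-up name), not the exact search I ran: it looks for dotted-quad ASCII strings in raw bytes. A real Go binary will also produce false positives such as version-like strings, and an address stored in binary or obfuscated form will not match at all, which is the case here.

```python
import re

# Four dot-separated groups of 1-3 ASCII digits, not adjacent to other digits or dots.
IPV4_RE = re.compile(rb"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def find_ipv4_strings(data: bytes) -> list[str]:
    """Return dotted-quad strings found in `data` whose octets are all <= 255."""
    hits = []
    for match in IPV4_RE.finditer(data):
        text = match.group().decode("ascii")
        if all(int(part) <= 255 for part in text.split(".")):
            hits.append(text)
    return hits
```

Running it over the file contents, for example `find_ipv4_strings(open("./sample", "rb").read())` with a placeholder path, lists candidate addresses. Once the bytes have been XORed with a key, the digits and dots are gone and nothing matches.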
We always start by looking at the function names. For Go binaries, that specifically means the main.* functions, since they belong to the main package and hold the program's own logic.
We can see a function called main.xor, which I bet is highly unlikely to be a normal, harmless function. Let's keep it in mind and keep looking.
main.main just sets up the runtime, extracts whatever config data there is, and calls main.start. What about the init function? It does exactly what its name implies: it initializes config data for the program!
Unsurprisingly, it calls main.xor to do something with a chunk of data.
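What main.xor does internally is not spelled out here. As a rough illustration of the general technique only, assuming a repeating-key XOR (the real routine, its key, and the config format may all differ), the obfuscation looks like this. The name `xor_bytes`, the key, and the sample config are hypothetical.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, repeating the key as often as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical config and key, for illustration only.
plain = b'{"server": "203.0.113.7:443"}'
key = b"k3y"

blob = xor_bytes(plain, key)      # what would sit in the binary
restored = xor_bytes(blob, key)   # applying the same operation again undoes it
```

Because XOR is its own inverse, the same function both hides and reveals the data, so whatever the program stores in obfuscated form has to appear in clear text in memory once the routine has run.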
Now, we are not going to write a script that reimplements its xor to extract the data from the static binary. Instead, we are going to let the program emit the deobfuscated config data by itself.
Remember that this ELF is non-PIE. We can copy the address of the instruction where main.xor is called and set a breakpoint on it in GDB.
Hit run, and we can observe main.xor being called.
Looking at its function arguments, we can easily spot the data chunk.
There's nothing to be excited about, yet.
Let main.xor finish its job, then look at the data chunk again:
Looks promising. Let's dig out more:
Everything is there: the cert, the key, and the config, including the malicious IP address.
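The GDB steps above can be collected into a command file for repeat runs. The sketch below only generates that file; `build_gdb_script` is a made-up helper, and every address in the usage note is a placeholder rather than a value from this sample. It assumes the routine decodes the buffer in place, as the before/after inspection suggests, and that you have already read the buffer pointer and length from the arguments at the breakpoint.

```python
def build_gdb_script(call_addr: int, buf_addr: int, length: int,
                     out_file: str = "config.bin") -> str:
    """Return GDB commands that show a buffer before and after a call, then dump it."""
    lines = [
        f"break *{call_addr:#x}",       # stop on the call instruction itself
        "run",
        f"x/{length}xb {buf_addr:#x}",  # buffer before the call: still obfuscated
        "nexti",                        # step over the call; the callee runs to completion
        f"x/{length}xb {buf_addr:#x}",  # same buffer after the call
        # Save the decoded bytes to a file for offline inspection.
        f"dump binary memory {out_file} {buf_addr:#x} {buf_addr + length:#x}",
    ]
    return "\n".join(lines) + "\n"
```

For example, writing `build_gdb_script(0x401000, 0xc000010000, 64)` to `xor.gdb` and starting `gdb -x xor.gdb ./sample` replays the session. As before, this should only be run where the sample has no network connectivity.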
Take a look back at Ghidra: