The current hook method will block the module's unloading till the last sleeping proce


Use stub functions for the un/hook process, about alexandernst/monks

Comments (39)

alexandernst commented on September 8, 2024

Hi @milabs ! Have you managed to get some free time and work on this? How is it going?

from monks.

milabs commented on September 8, 2024

@alexandernst Not really, I'm sorry man :(

alexandernst commented on September 8, 2024

@milabs Don't worry 😉 Do you plan to work on it, or should I take it? If the latter, can you give me some tips (maybe links) to some docs about how to implement this?

milabs commented on September 8, 2024

@alexandernst That may be best in that case. Feel free to ask questions ;)

alexandernst commented on September 8, 2024

@milabs Ok :) So, I did some reading, but a lot of things are still unclear to me. A stub function is basically an empty function, or a function that returns a known value, which is useful for debugging.
I'm not sure how I'd use a stub function here.

alexandernst commented on September 8, 2024

Maybe http://stackoverflow.com/questions/10405436/anonymous-functions-using-gcc-statement-expressions ?

milabs commented on September 8, 2024

@alexandernst The stub is the piece of code that you need to cook up. I think you'll need to write some pattern in assembly (like call 0 // call 0 // call 0 // ret, see #17). Next, you'll need to make a copy of the stub for each syscall and replace the zeroes with the proper values... udis86 is very useful, as you know :)
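
Editor's sketch of the "template with zeroes" idea described above, written as plain user-space C so it can be compiled and checked outside the kernel. It is only an illustration under assumptions: the names `stub_template`, `patch_call` and `build_stub` are hypothetical, the template uses `E8` (near relative call) for the three slots, and the "addresses" are ordinary integers rather than real kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical template: three "call rel32" slots followed by "ret".
 * The zeroed operands are placeholders to be patched per syscall. */
static const uint8_t stub_template[] = {
    0xE8, 0x00, 0x00, 0x00, 0x00,   /* call <pre hook>     */
    0xE8, 0x00, 0x00, 0x00, 0x00,   /* call <real syscall> */
    0xE8, 0x00, 0x00, 0x00, 0x00,   /* call <post hook>    */
    0xC3                            /* ret                 */
};

#define STUB_SIZE sizeof(stub_template)
#define CALL_LEN  5

/* Patch the call at byte offset `off`. `stub_addr` is the address the stub
 * will run from; rel32 is counted from the end of the call instruction.
 * Returns 0 on success, -1 if the target is outside the rel32 range. */
static int patch_call(uint8_t *stub, uint64_t stub_addr, size_t off,
                      uint64_t target)
{
    assert(off + CALL_LEN <= STUB_SIZE && stub[off] == 0xE8);

    int64_t disp = (int64_t)(target - (stub_addr + off + CALL_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)disp;
    for (int i = 0; i < 4; i++)              /* little-endian operand */
        stub[off + 1 + i] = (uint8_t)(rel >> (8 * i));
    return 0;
}

/* Make a private copy of the template and fill in all three targets. */
static int build_stub(uint8_t *stub, uint64_t stub_addr,
                      uint64_t pre, uint64_t real, uint64_t post)
{
    memcpy(stub, stub_template, STUB_SIZE);
    if (patch_call(stub, stub_addr, 0 * CALL_LEN, pre))
        return -1;
    if (patch_call(stub, stub_addr, 1 * CALL_LEN, real))
        return -1;
    if (patch_call(stub, stub_addr, 2 * CALL_LEN, post))
        return -1;
    return 0;
}
```

In a real module the buffer would be executable kernel memory and the targets would be the hook functions and the original syscall; one copy of the template would be built per hooked syscall.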

alexandernst commented on September 8, 2024

@milabs Oh, ok. I think I'm starting to understand. Next question: who allocates the memory that will hold the stub? And what happens when the module is unloaded? Will that memory stay "occupied" forever? (until next reboot, ofc)

alexandernst commented on September 8, 2024

@milabs Hmm, and yet another thing. That stub is just some ASM calling the functions from my module (from #17, sys_read_post_hook_action), which won't exist anymore when I unload the module.

Perhaps I should create a stub (as in executable memory area) and place there the entire sys_read_post_hook_action function, right?

milabs commented on September 8, 2024

When unloading, you'll need to replace the calls with NOPs. That prevents the system from jumping into the unloaded function. And you are right, the memory won't be freed =) One more thing: take a look at the stop_machine interface. It helps us do big things like NOP'ing the stubs atomically.
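
Editor's sketch of the NOP'ing step, again as user-space C on a plain byte buffer. The helper name `nop_out_call` is hypothetical, and the sketch only shows the byte rewrite: it assumes a 5-byte `E8 rel32` call and does not show the stop_machine part that would make the change safe while other CPUs may be executing the stub.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CALL_LEN 5   /* size of "call rel32": opcode E8 + 4-byte operand */

/* Overwrite the 5-byte call at offset `off` with five one-byte NOPs (0x90),
 * so execution falls through to the next instruction in the stub.
 * Returns 0 on success, -1 if there is no call at that offset. */
static int nop_out_call(uint8_t *stub, size_t stub_len, size_t off)
{
    assert(stub != NULL);

    if (off + CALL_LEN > stub_len || stub[off] != 0xE8)
        return -1;

    memset(stub + off, 0x90, CALL_LEN);
    return 0;
}
```

After this, a stub of the form "call hook; call real; ret" behaves like "call real; ret" and no longer references code inside the unloaded module.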

alexandernst commented on September 8, 2024

Ok, more questions :) How can I create an executable memory area? Is there anything in the kernel that will allow me to (remotely) do that?

milabs commented on September 8, 2024

@alexandernst I used the module_alloc function some time ago. It's not exported, but that doesn't stop me from hacking :) You can start by reading that function, and if you come up with a simple way to create executable memory I'll be happy :)

http://lxr.free-electrons.com/source/arch/x86/kernel/module.c?v=3.8#L46

alexandernst commented on September 8, 2024

Yikes, an un-exported function :/ Maybe I'll have to write my own function (copying module_alloc) to avoid old/future changes in the kernel. Ok, I think that's all for now; I'll try to write a POC. If I get stuck again (most probably) I'll ask you :)
Thank you!

alexandernst commented on September 8, 2024

Wait... I think... wouldn't just kmalloc with `GFP_KERNEL | PAGE_KERNEL_EXEC` be enough?

alexandernst commented on September 8, 2024

Ah, no, sorry, no such flag in kmalloc, instead __vmalloc(byte_size, GFP_KERNEL, PAGE_KERNEL_EXEC); should mimic perfectly what that function is doing.
Anyways, I should get some sleep now (1am here). I'll play with that tomorrow and let you know if I get stuck 😄

alexandernst commented on September 8, 2024

After thinking about it for a few hours I think I have all the steps:

1) Load module

2) Create a stub like the following one:

stub: 
CALL sys_read_pre_hook_action
CALL real_sys_read
IF <some conditions>
    call sys_read_post_hook_action
IF <counter for remaining syscalls calls == 0>
    restore original syscall address in the syscall table
    free <this stub>
RET

3) Replace the original syscall address in the syscall table with the address of the stub we just created.
4) Do some stuff.
5) Replace with NOPs lines 1, 3 and 4 from the stub.
6) Unload the module without free-ing the stub.

I'd need to create some kind of macro/template for creating those stubs, as I'll have one for each syscall.
What do you think? Am I missing something?

alexandernst commented on September 8, 2024

@milabs Ok, I've hit another mental block. Can you help me?

So, let's say I create the stub like this:

CALL real_sys_read
CALL sys_read_post_hook

This will work perfectly, as when I unload the module, I'll just change the stub to:

CALL real_sys_read
NOP NOP NOP NOP NOP

Then that stub will stay in memory till the next reboot.
So far so good. But now, I'd like to improve it. I'd like to make the stub free itself. For that to happen I need to keep the current __INCR and __DECR macros and create the stub as I already said in my last comment.

The first line of the stub will call the __INCR macro, then the second line will call the real syscall, and then I'd do some checks to see if I should call the fake syscall or free the stub itself.

Let's have a look at the __INCR macro:

#define __INCR(F) atomic_inc(&__syscall_info___NR_##F.counter);

That's pretty clear. A single line that will make an atomic increase of the value of the syscall struct.
For that to keep working I need to

a) allocate __syscall_info___NR_##F in memory (which is really easy)
and
b) allocate the macro in memory, which I have no idea how to do.

My question is: How can I allocate the macro __INCR in memory (in a stub, like the one I'm already creating) ?
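
Editor's sketch related to the question above. A macro is expanded by the preprocessor at compile time, so there is nothing called `__INCR` to place in memory; what remains at run time is an atomic add on a struct field, and a runtime-generated stub would only need that field's address. The example below uses user-space C11 atomics instead of the kernel's `atomic_inc`, and the struct layout and the helpers `enter_read`, `leave_read` and `counter_addr_read` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-syscall bookkeeping, modelled on the thread's naming. */
struct syscall_info {
    atomic_int counter;   /* callers currently inside the hooked syscall */
};

/* One instance per hooked syscall; read and write are only examples. */
static struct syscall_info __syscall_info___NR_read;
static struct syscall_info __syscall_info___NR_write;

/* ## pastes F into the identifier at compile time. Both macros evaluate
 * to the counter value *before* the update. */
#define __INCR(F) atomic_fetch_add(&__syscall_info___NR_##F.counter, 1)
#define __DECR(F) atomic_fetch_sub(&__syscall_info___NR_##F.counter, 1)

/* Returns the number of callers inside read() after entering. */
static int enter_read(void)
{
    return __INCR(read) + 1;
}

/* Returns the number of callers inside read() after leaving. */
static int leave_read(void)
{
    int before = __DECR(read);
    assert(before > 0);   /* leaving without a matching enter is a bug */
    return before - 1;
}

/* The only thing a generated stub would need to know: where the counter is. */
static atomic_int *counter_addr_read(void)
{
    return &__syscall_info___NR_read.counter;
}
```

With this view, "allocating the macro" reduces to embedding the counter's address in the stub's machine code.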

alexandernst commented on September 8, 2024

Oh, I think I just found a way (and my question wasn't that smart anyways!) 😄

alexandernst commented on September 8, 2024

@milabs Hi again! Do you know any library/thing that will let me generate binary code out of ASM at runtime? (so I can gen that code and memcpy it to the stub)

milabs commented on September 8, 2024

@alexandernst Do you really need this??

alexandernst commented on September 8, 2024

@milabs Hmmm... Maybe I'm not asking for the right tool. But then, I'd like to be able to let the stub know about the address of the atomic counter from here 1db42c3 so the stub can know when to free itself. How could I do that?

milabs commented on September 8, 2024

@alexandernst Write stub in assembly and then use udis86 to fixup the refs.

alexandernst commented on September 8, 2024

@milabs Ok, I think I'll manage to do that. 😄

alexandernst commented on September 8, 2024

@milabs I thought it would be easier, but even a simple "Hello world" with opcode won't run and it will just trigger a kernel oops. I wrote a simple demo and asked in SO: http://stackoverflow.com/questions/20430835/running-code-inside-executable-memory
Can you give me a hint, please?

milabs commented on September 8, 2024

@alexandernst Still have no answer?

alexandernst commented on September 8, 2024

@milabs I'm almost there. The only missing thing is how to do an indirect call (E8 xx xx xx xx holds only a 4-byte relative displacement, which means not every address can be reached).
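
Editor's sketch of the range problem mentioned above, as user-space C that only produces bytes. `E8` takes a signed 32-bit displacement counted from the end of the call instruction, so it reaches roughly +/-2 GiB; when the target is further away, one common alternative is to load the absolute address into a register and call through it (`mov rax, imm64; call rax`, 12 bytes, clobbering rax). The function names are hypothetical and the addresses used with them are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Does `target` fit in the rel32 operand of a call placed at `site`? */
static int fits_rel32(uint64_t site, uint64_t target)
{
    int64_t disp = (int64_t)(target - (site + 5));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Emit a call to `target` into `buf`, which will execute at address `site`.
 * Uses the 5-byte direct form when possible, otherwise a 12-byte
 * "mov rax, imm64; call rax". Returns the number of bytes written. */
static size_t emit_call(uint8_t *buf, size_t cap, uint64_t site,
                        uint64_t target)
{
    size_t n = 0;

    assert(buf != NULL && cap >= 12);

    if (fits_rel32(site, target)) {
        uint32_t rel = (uint32_t)(target - (site + 5));
        buf[n++] = 0xE8;                          /* call rel32     */
        for (int i = 0; i < 4; i++)
            buf[n++] = (uint8_t)(rel >> (8 * i));
    } else {
        buf[n++] = 0x48;                          /* REX.W          */
        buf[n++] = 0xB8;                          /* mov rax, imm64 */
        for (int i = 0; i < 8; i++)
            buf[n++] = (uint8_t)(target >> (8 * i));
        buf[n++] = 0xFF;                          /* call rax       */
        buf[n++] = 0xD0;
    }
    return n;
}
```

The indirect form is position independent, which is convenient for a stub that is copied to wherever the allocator happens to place it.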

alexandernst commented on September 8, 2024

@milabs Ok, I finished the POC code to generate some opcode, http://pastebin.com/CWNhruDG Anyways, after loading the module, it generates this opcode: 48 bf 24 00 18 a0 ff ff ff ff 48 bf 2d 00 18 a0 ff ff ff ff 48 c7 c0 02 00 00 00 48 ba ab 05 6c 81 ff ff ff ff ff d2 c3 (addresses may vary, of course), which udcli disassembles as:

mov rdi, 0xffffffffa0180024
mov rdi, 0xffffffffa018002d
mov rax, 0x2
mov rdx, 0xffffffff816c05ab
call rdx
ret

which is correct. That's exactly what my original code looked like. Anyway, it won't work. It won't do anything at all. I mean, the entire output caused by the module is:

[  704.004855] hello: module license 'unspecified' taints kernel.
[  704.005315] &printk: ffffffff816c05ab
[  704.005320] Bytecode: 
[  704.005323] 48bf240018a0ffffffff48bf2d0018a0ffffffff48c7c00200000048baab056c81ffffffffffd2c3
[  704.005323] End

The "Hello world!" message is missing! Why? Why isn't my code running? Or maybe it's running but it isn't causing any output?

milabs commented on September 8, 2024

@alexandernst The x86_64 calling convention expects the function arguments in the registers RDI, RSI, RDX and RCX (in that order). Your code must look like this:

// printk("\n\n\n%s\n\n\n", "hello world");
mov rdi, offset of ("\n\n\n%s\n\n\n")
mov rsi, offset of ("hello world")
mov rax, &printk
call rax

http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions
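
Editor's sketch of a byte generator for the sequence above, in user-space C. Note that the dump earlier in the thread contains `48 bf` twice, i.e. both loads go to rdi, so the second argument never reaches rsi; the encoding for `mov rsi, imm64` is `48 be`. The function names are hypothetical, a trailing `ret` is added so the sequence can be called as a function, and treating the two addresses from the dump as format string and argument is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as used in x86-64 instruction encodings. */
enum reg64 { RAX = 0, RDX = 2, RSI = 6, RDI = 7 };

/* mov r64, imm64: REX.W (0x48), opcode 0xB8 + register, 8-byte immediate. */
static size_t emit_mov_imm64(uint8_t *p, enum reg64 r, uint64_t imm)
{
    p[0] = 0x48;
    p[1] = (uint8_t)(0xB8 + r);
    for (int i = 0; i < 8; i++)
        p[2 + i] = (uint8_t)(imm >> (8 * i));
    return 10;
}

/* call r64: opcode 0xFF, ModRM byte 0xD0 + register. */
static size_t emit_call_reg(uint8_t *p, enum reg64 r)
{
    p[0] = 0xFF;
    p[1] = (uint8_t)(0xD0 + r);
    return 2;
}

/* Emit: mov rdi, fmt; mov rsi, arg; mov rax, fn; call rax; ret.
 * Returns the number of bytes written (33). */
static size_t emit_printk_call(uint8_t *buf, size_t cap,
                               uint64_t fmt, uint64_t arg, uint64_t fn)
{
    size_t n = 0;

    assert(buf != NULL && cap >= 33);

    n += emit_mov_imm64(buf + n, RDI, fmt);   /* 1st argument */
    n += emit_mov_imm64(buf + n, RSI, arg);   /* 2nd argument */
    n += emit_mov_imm64(buf + n, RAX, fn);    /* call target  */
    n += emit_call_reg(buf + n, RAX);
    buf[n++] = 0xC3;                          /* ret          */
    return n;
}
```

Called with the values from the dump (0xffffffffa0180024, 0xffffffffa018002d, 0xffffffff816c05ab), this produces a sequence starting with `48 bf 24 00 18 a0 ...` followed by `48 be 2d 00 18 a0 ...`.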

alexandernst commented on September 8, 2024

@milabs What exactly is the offset in your example code?

offset = <variable addr> - (<current address> + 5) ?

milabs commented on September 8, 2024

@alexandernst No, as RDI and RSI are 64-bit registers, the offset is not relative. The offset is the variable's address. Think about the CPU. It fetches instructions one by one. RIP-relative addressing means that the address is relative to the RIP pointer. But the CPU doesn't know anything about an instruction before it has fetched it. After that, RIP points to the next instruction, and all relative offsets are relative to that RIP.

alexandernst commented on September 8, 2024

@milabs Hmmm, ok, I'll try it as soon as I get home (at the office now) 😄

alexandernst commented on September 8, 2024

@milabs It works !!!!!!!! 😄
Now I need to get a simple "Hello world" for x86 (which I don't think will be any different) and then start coding the real part.

milabs commented on September 8, 2024

@alexandernst Excellent :) Tell that to all the SO people :)

alexandernst commented on September 8, 2024

@milabs I was re-reading the calling conventions and I have 2 questions.

First question:

In x64:
Userland uses RDI, RSI, RDX, RCX, R8, R9, if there are more than 6 arguments, the stack is used too.
Syscalls use RDI, RSI, RDX, R10, R8, R9, if there are more than 6 arguments, the stack is used too.

In x86:
Userland uses stack for all arguments
Syscalls use EBX, ECX, EDX, ESI, EDI, EBP, if there are more than 6 arguments, the stack is used too.

Have I understood the docs right?

Second question: Are there syscalls with more than 6 arguments? Which ones?

alexandernst commented on September 8, 2024

@milabs Look at d50d2e1 I'm almost done!! 😄 😄 😄

I'm only missing the unhook part, which can be done in two different ways.

The first way, which is the less efficient, is to completely remove the fake syscall call and leave only the real syscall call. This way procmon will waste around 60 bytes for each syscall. Not much, but feels kind of dirty.

The second way is to check inside the stub if the atomic counter has reached 0, and if so, do 3 things.
a) restore the original syscall address
b) kfree itself
c) place the result of the last syscall in eax/rax.

"a" shouldn't be that hard to do, even in plain ASM. Problem is, how to kfree the stub itself, and also make it finish running itself so it can place the result of the syscall.

BTW: If I go with method 1, we won't need atomic inc/dec anymore! Right?

milabs commented on September 8, 2024

@alexandernst Great!

First way, I think. And you can use a single memory area for all the stubs, as you always know the number of hooked calls. Just preallocate the memory and split it later.
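
Editor's sketch of the "preallocate and split" idea, as user-space C. It swaps in `calloc` for whatever executable allocation the module would actually use, and the slot size of 64 bytes is an assumption chosen to be close to the "around 60 bytes" mentioned earlier. The `stub_pool` type and its functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STUB_SLOT 64   /* assumed fixed size reserved for each stub */

struct stub_pool {
    uint8_t *base;     /* start of the single preallocated area */
    size_t   nslots;   /* number of stubs the area can hold     */
    size_t   used;     /* slots handed out so far               */
};

/* Allocate one area big enough for `nslots` stubs. Returns 0 or -1. */
static int pool_init(struct stub_pool *p, size_t nslots)
{
    assert(p != NULL);
    p->base = calloc(nslots, STUB_SLOT);
    p->nslots = nslots;
    p->used = 0;
    return p->base ? 0 : -1;
}

/* Hand out the next unused slot, or NULL when the area is exhausted. */
static uint8_t *pool_take(struct stub_pool *p)
{
    assert(p != NULL && p->base != NULL);
    if (p->used == p->nslots)
        return NULL;
    return p->base + STUB_SLOT * p->used++;
}

/* Release the whole area at once; individual slots are never freed. */
static void pool_destroy(struct stub_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->nslots = p->used = 0;
}
```

Because the number of hooked syscalls is known up front, a single allocation sized as `count * STUB_SLOT` is enough, and each stub's address is just the base plus a multiple of the slot size.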

As for the second way, I think that it's too complex and doing kfree itself is not a good idea..

And one more thing. Why do we need a counter for each hooked syscall rather than a single generic one?

alexandernst commented on September 8, 2024

@milabs Hi! Sorry for taking so long to reply!

Ok, I'll reconsider this in a future version maybe. :)

Well... an individual counter (per syscall) is needed because there are some syscalls that can be "restored" immediately, but others can't (like __READ). Anyways, actually it doesn't matter if some of the syscalls can be restored earlier than others because right now the entire module is kept in memory until all of the syscalls are restored. And when I merge the new branch, I won't need any of the counters at all :)

alexandernst commented on September 8, 2024

@milabs It's done !!!!!!!!!!!!! I made it!!!!!!!!!!!! :D:D:D
It took me almost 2 months of work and 70+ commits, but I finally made it!
Thank you for all the tips and help 😄

milabs commented on September 8, 2024

@alexandernst Great work!
