Building a Hook Tool from Scratch: The Injector (Part 1)

n1ng · Posted 2026-3-16 13:34

Last edited by n1ng on 2026-3-16 17:23

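Preface

I have long been curious about how hook tools like Frida actually work. With some free time recently, I decided to write one myself ~~(really, borrowing from everywhere)~~ and keep notes here, partly to learn the subject and exchange ideas with everyone, and partly to share what I find.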

Project: https://github.com/x1aon1ng/Ninjector

Goals

The series is planned in three parts: Injector, Java Hook, and Native Hook. Each part may take several posts. This one starts with the injector's attach mode.

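Principles and implementation

A few basics make the rest of the post easier to follow.

Process isolation: On a modern operating system every process has its own virtual address space. Process A may have a function at 0x12345678 and process B may also have something at 0x12345678; the numbers match, but they do not refer to the same physical memory. In other words, A and B are isolated in their address spaces and do not affect each other.
The dlopen family: On Android/Linux a .so file (shared object) is a shared library. dlopen loads a .so into the current process at run time; after the call, the library is mapped into that process's own address space. (Trace Android's System.loadLibrary() and you will find that it ends up calling dlopen.)
ptrace: The classic use of ptrace is a debugger controlling the process being debugged; tools such as gdb rely on it heavily. ptrace lets one process observe, pause, and modify the execution state of another, where execution state covers registers, memory, and the running/stopped status. Those are exactly the capabilities an injector needs.
The maps file: /proc/<pid>/maps shows the current memory mappings of the process with that pid: which files are mapped, with what permissions, and over which address ranges. So if we want to call dlopen, for example, we need to know where libdl.so is loaded and where dlopen sits inside that module.
ARM64 calling convention: As the name suggests, a calling convention is the set of rules for how arguments are passed when a function is called, where the return value goes, who saves which registers, and so on. When dlopen is used normally in Android development, the compiler arranges all of this: whether an argument goes in a register or on the stack, where to fetch the return value, which registers must be preserved. In an injection scenario we are faking a function call by hand, so we have to know these rules ourselves. The key rule on ARM64 is that the first 8 integer/pointer arguments go in registers x0-x7. For a local call dlopen(path, flag), x0 holds path and x1 holds flag, and the return value comes back in x0.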

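Why start with attach

Simply because attach mode is easier to implement than spawn mode. First, what is injection? As noted above, process address spaces are isolated from each other: the addresses, variables, and function calls in your process belong to you alone by default. The plainest way to think about injection is therefore: getting "someone else's process" to perform an action you specify on your behalf.

How is that done? In attach mode the "action" is not complicated logic; a single statement is enough: dlopen("target.so"). Once the target process executes it, the dynamic linker loads target.so, and the constructors, JNI initialization, and hook initialization inside the .so all run. So the goal of attach injection is to find a way to make the target process call dlopen once itself. That makes the required steps clear:

Find the target process
Attach to it with ptrace
Resolve the function addresses inside the target process, i.e. the remote function addresses
Write the path of target.so into the target process's address space
Call dlopen remotely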

Implementation

Project layout

Ninjector/
  jni/
    common/
      log.h
    process/
      process.h
      process.cpp
    ptrace/
      ptrace_arm64.h
      ptrace_arm64.cpp
    injector/
      injector.h
      injector.cpp
    main.cpp
    Android.mk
    Application.mk

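The chain for one attach injection is:

The user runs the inject command
The injector attaches through ptrace
It resolves the remote addresses of malloc/dlopen/dlerror and similar functions
It calls malloc remotely
It writes the target.so path into the target process's memory
It calls dlopen remotely
detach

The core is just three lines of code:

char *p = (char *)malloc(strlen(path) + 1); // p points to a newly allocated buffer
strcpy(p, path);                            // copy the path string into that buffer
dlopen(p, RTLD_NOW | RTLD_GLOBAL);          // pass it to dlopen, which loads the .so at that path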

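The process module

This module mostly holds helper functions, such as:

Finding a pid by process name under /proc

ptrace needs a pid as its argument. To avoid looking the pid up by hand before every injection, a helper function does it for us. The implementation walks every /proc/<pid>/cmdline and compares each entry against process_name.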
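The post does not show this helper, so here is a minimal sketch of the lookup just described. The names find_pid_by_name and first_cmdline_field are hypothetical, and the exact-match rule on the first cmdline field is an assumption (an Android app's first field is normally its package name, while a native process may show a full path there).

```cpp
#include <cassert>
#include <cstdlib>
#include <dirent.h>
#include <fstream>
#include <iterator>
#include <string>
#include <sys/types.h>

// /proc/<pid>/cmdline stores argv joined by '\0'; keep only the first field.
static std::string first_cmdline_field(const std::string& raw) {
    return raw.substr(0, raw.find('\0'));
}

// Hypothetical helper: returns the first pid whose first cmdline field equals
// process_name, or -1 when nothing matches or /proc cannot be opened.
pid_t find_pid_by_name(const std::string& process_name) {
    DIR* proc = opendir("/proc");
    if (proc == nullptr) return -1;

    pid_t found = -1;
    while (dirent* entry = readdir(proc)) {
        // Only purely numeric directory names are pids.
        char* end = nullptr;
        long pid = std::strtol(entry->d_name, &end, 10);
        if (end == entry->d_name || *end != '\0' || pid <= 0) continue;

        std::ifstream in("/proc/" + std::string(entry->d_name) + "/cmdline",
                         std::ios::binary);
        std::string raw((std::istreambuf_iterator<char>(in)),
                        std::istreambuf_iterator<char>());
        // Kernel threads have an empty cmdline and are skipped here.
        if (!raw.empty() && first_cmdline_field(raw) == process_name) {
            found = static_cast<pid_t>(pid);
            break;
        }
    }
    closedir(proc);
    return found;
}
```

A caller would pass the package name, for example find_pid_by_name("com.example.app"), and treat -1 as "process not running".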

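Computing a module's base address

A module's base address is where a given .so starts in the current process's address space. Why do we need it? As mentioned above, function addresses are not portable across processes: dlopen may be at 0x12345678 in our process and at 0x87654321 in the target. The two addresses differ, but the position of dlopen inside libdl.so stays the same as long as both processes map the same library file. So to find a function's address we first need the module's base address, and the maps file holds exactly the address ranges of each module. That gives us the following code: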

    FILE* fp = fopen(maps_path, "r");
    if (fp == nullptr) {
        LOGE("get_module_base: failed to open %s", maps_path);
        return 0;
    }

    char line[512] = {0};
    long base_addr = 0;
    while (fgets(line, sizeof(line), fp) != nullptr) {
        if (strstr(line, module_name) != nullptr) {
            char* start = strtok(line, "-");
            if (start != nullptr) {
                base_addr = strtoul(start, nullptr, 16);
                break;
            }
        }
    }
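One thing to watch in the fragment above: strstr matches any line that merely contains the module name, and the first match wins. Below is a hedged variation that matches on the end of the path and keeps the lowest start address. It reads from a stream so it can be tried on sample text; module_base_from_maps is a hypothetical name, not a function from the project.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Returns the lowest start address among mappings whose path ends with
// "/<module_name>" (or equals module_name), or 0 if the module is not mapped.
// `maps` is any stream holding text in /proc/<pid>/maps format.
uintptr_t module_base_from_maps(std::istream& maps, const std::string& module_name) {
    const std::string suffix = "/" + module_name;
    uintptr_t base = 0;
    std::string line;
    while (std::getline(maps, line)) {
        // Line layout: start-end perms offset dev inode path
        std::istringstream fields(line);
        std::string range, perms, offset, dev, inode, path;
        if (!(fields >> range >> perms >> offset >> dev >> inode >> path)) continue;

        // Match on the path's tail so "libdl.so" does not also match "mylibdl.so".
        bool matches = path == module_name ||
            (path.size() >= suffix.size() &&
             path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0);
        if (!matches) continue;

        // strtoull stops at the '-' that separates start and end.
        uintptr_t start = static_cast<uintptr_t>(std::strtoull(range.c_str(), nullptr, 16));
        if (base == 0 || start < base) base = start;
    }
    return base;
}
```

For a live process the stream would be an std::ifstream opened on /proc/<pid>/maps (or /proc/self/maps for the injector itself).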

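Computing the remote function address

With the difference between local and remote function addresses explained, and with a way to get a module's base address, the formula for a remote function address follows naturally:

remote_addr = local_func - local_module_base + remote_module_base

First take the function's offset inside the local module, local_func - local_module_base, then add that offset to the base address of the remote module. The result is the remote function address.

A concrete example: in the local process libdl.so is based at 0x70000000 and dlopen is at 0x70001234; in the target process libdl.so is based at 0x71000000. The offset of dlopen inside libdl.so is 0x70001234 - 0x70000000 = 0x1234, and adding the target process's libdl.so base gives 0x71000000 + 0x1234 = 0x71001234. That leads to the following code: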

long get_remote_addr(pid_t pid, void* local_func) {
    if (local_func == nullptr) {
        LOGE("get_remote_addr: local_func is null");
        return 0;
    }

    const char* module_path = get_module_name(-1, reinterpret_cast<uintptr_t>(local_func));
    if (module_path == nullptr) {
        LOGE("get_remote_addr: failed to get local module name");
        return 0;
    }

    long local_base = get_module_base(-1, module_path);
    long remote_base = get_module_base(pid, module_path);
    if (local_base == 0 || remote_base == 0) {
        LOGE("get_remote_addr: failed to get module base, module=%s local=%lx remote=%lx",
             module_path, local_base, remote_base);
        return 0;
    }

    long remote_addr = reinterpret_cast<long>(local_func) - local_base + remote_base;
    LOGD("get_remote_addr: module=%s local_func=%lx local_base=%lx remote_base=%lx remote_addr=%lx",
         module_path, reinterpret_cast<long>(local_func), local_base, remote_base, remote_addr);
    return remote_addr;
}
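The arithmetic can be checked in isolation with the numbers from the example. This is only a sketch of the formula: to_remote_addr is a hypothetical name, and it assumes both processes map the same build of the library, otherwise the offset would not carry over.

```cpp
#include <cassert>
#include <cstdint>

// remote_addr = local_func - local_module_base + remote_module_base
// Unsigned arithmetic keeps the result correct even when the remote base is
// lower than the local one.
constexpr uintptr_t to_remote_addr(uintptr_t local_func,
                                   uintptr_t local_base,
                                   uintptr_t remote_base) {
    return local_func - local_base + remote_base;
}

// The worked example from the text: offset 0x1234 rebased onto 0x71000000.
static_assert(to_remote_addr(0x70001234u, 0x70000000u, 0x71000000u) == 0x71001234u,
              "dlopen offset 0x1234 should land at 0x71001234");
```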

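That completes the design of the process module. Next comes a set of wrappers around the ptrace API to make later calls easier.

The ptrace module

First are the wrappers for ptrace attach and detach, which simply call ptrace: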

bool attach_process(pid_t pid) {
    if (pid <= 0) {
        LOGE("attach_process: invalid pid=%d", pid);
        return false;
    }

    if (xptrace(PTRACE_ATTACH, pid, nullptr, nullptr) == -1) {
        LOGE("attach_process: PTRACE_ATTACH failed, pid=%d", pid);
        return false;
    }

    int status = 0;
    if (waitpid(pid, &status, WUNTRACED) == -1) {
        LOGE("attach_process: waitpid failed, pid=%d errno=%d (%s)",
             pid, errno, strerror(errno));
        return false;
    }

    LOGI("attach_process: attached to pid=%d status=0x%x", pid, status);
    return true;
}
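The detach side is not shown in the post. A minimal sketch of what it might look like follows; it calls ptrace directly and logs with fprintf instead of the project's xptrace and LOGE helpers, whose definitions are not shown, so treat the body as an assumption rather than the project's code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sys/ptrace.h>
#include <sys/types.h>

// Hypothetical counterpart to attach_process.
bool detach_process(pid_t pid) {
    if (pid <= 0) {
        std::fprintf(stderr, "detach_process: invalid pid=%d\n", static_cast<int>(pid));
        return false;
    }

    // PTRACE_DETACH ends the tracing relationship and lets the tracee run
    // again. The last argument is the signal to deliver on resume; a null
    // value means no signal.
    if (ptrace(PTRACE_DETACH, pid, nullptr, nullptr) == -1) {
        std::fprintf(stderr, "detach_process: PTRACE_DETACH failed, pid=%d errno=%d (%s)\n",
                     static_cast<int>(pid), errno, std::strerror(errno));
        return false;
    }
    return true;
}
```

The tracee has to be in a ptrace stop when PTRACE_DETACH is issued, which is the case right after a remote call has been waited for.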

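Next comes memory read/write. The code here is longer than the attach wrapper because the data crosses a process boundary through ptrace. Unlike a direct memcpy, the ptrace read/write requests PTRACE_PEEKDATA/PTRACE_POKEDATA work on machine words. Roughly speaking, one ptrace write does not push an arbitrary-length byte stream; it writes exactly one machine word, which is 8 bytes on arm64.

The tail word also needs care. Suppose the last 8 bytes at the target address hold "AA BB CC DD EE FF 11 22" and we only want to change the first three bytes to "78 79 7A". If we bluntly write a full word without handling the remaining 5 bytes, we may overwrite memory that should not change. So we read the whole word at the target address first, replace only its first remain bytes, and then write the whole word back, as shown below: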

void ptrace_write(pid_t pid, long address, void* data, size_t size) {
    if (pid <= 0 || address == 0 || data == nullptr || size == 0) {
        LOGE("ptrace_write: invalid args pid=%d address=%lx size=%zu", pid, address, size);
        return;
    }

    const size_t word_size = sizeof(unsigned long);
    size_t full_words = size / word_size;
    size_t remain = size % word_size;

    auto* bytes = reinterpret_cast<unsigned char*>(data);
    for (size_t i = 0; i < full_words; ++i) {
        unsigned long word = *reinterpret_cast<unsigned long*>(bytes + i * word_size);
        xptrace(PTRACE_POKEDATA,
                pid,
                reinterpret_cast<void*>(address + i * word_size),
                reinterpret_cast<void*>(word));
    }

    if (remain > 0) {
        long tail_addr = address + full_words * word_size;
        unsigned long word = static_cast<unsigned long>(
            xptrace(PTRACE_PEEKDATA, pid, reinterpret_cast<void*>(tail_addr), nullptr)
        );
        memcpy(&word, bytes + full_words * word_size, remain);
        xptrace(PTRACE_POKEDATA,
                pid,
                reinterpret_cast<void*>(tail_addr),
                reinterpret_cast<void*>(word));
    }
}
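To see the tail handling work without attaching to anything, the same logic can be run against a local buffer. In this sketch FakeTracee is a made-up stand-in whose peek/poke move exactly one 8-byte word, the way PTRACE_PEEKDATA/PTRACE_POKEDATA do on arm64; it is an illustration, not part of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for the target's memory. peek/poke transfer one whole word only.
struct FakeTracee {
    std::vector<uint8_t> mem;

    uint64_t peek(size_t addr) const {
        uint64_t word;
        std::memcpy(&word, mem.data() + addr, sizeof(word));
        return word;
    }
    void poke(size_t addr, uint64_t word) {
        std::memcpy(mem.data() + addr, &word, sizeof(word));
    }
};

// Same structure as ptrace_write: full words first, then a read-modify-write
// of the last partial word.
void write_bytes(FakeTracee& tracee, size_t addr, const void* data, size_t size) {
    const size_t word_size = sizeof(uint64_t);
    const uint8_t* bytes = static_cast<const uint8_t*>(data);
    const size_t full_words = size / word_size;
    const size_t remain = size % word_size;

    for (size_t i = 0; i < full_words; ++i) {
        uint64_t word;
        std::memcpy(&word, bytes + i * word_size, word_size);
        tracee.poke(addr + i * word_size, word);
    }

    if (remain > 0) {
        const size_t tail = addr + full_words * word_size;
        uint64_t word = tracee.peek(tail);                               // read the old word
        std::memcpy(&word, bytes + full_words * word_size, remain);      // patch its first bytes
        tracee.poke(tail, word);                                         // write it back
    }
}
```

Running it on the bytes from the example replaces only the first three and leaves "DD EE FF 11 22" untouched. Because the patch is done with memcpy on the word's in-memory bytes, the result does not depend on endianness.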

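Then comes the wrapper for remote calls. By this point we have ptrace attach/detach, memory read/write, and the address of the target function inside the remote process. The remaining question is how to make the target process execute that function.

As mentioned earlier, a remote call is a function call faked by hand, and we have already covered the basics of the arm64 calling convention. call_remote_call() performs one such faked call.

Take a remote call to dlopen(remote_path, RTLD_NOW|RTLD_GLOBAL) as an example. Underneath, the call amounts to constructing roughly this state:

x0 = remote_path
x1 = RTLD_NOW | RTLD_GLOBAL
pc = remote_dlopen
sp = a currently valid stack in the target process
resume execution
read the return value from x0

So one call_remote_call has to put the arguments into x0-x7, point pc at the target function's entry, borrow the target process's own stack and execution flow to run the call, read the return value from x0, and restore the original register state.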
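The register setup described above can be written as a pure function and checked without a tracee. The sketch below uses a local struct that mirrors the layout of the kernel's arm64 user_pt_regs; prepare_call and kRegArgs are hypothetical names. The 16-byte stack alignment is my addition based on the AAPCS64 rule for sp at a call, and the lr = 0 comment describes the usual reason for that trick rather than something the post states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same field order as the kernel's arm64 user_pt_regs: x0-x30, sp, pc, pstate.
// Declared locally so the sketch builds on any host.
struct Arm64Regs {
    uint64_t regs[31];
    uint64_t sp;
    uint64_t pc;
    uint64_t pstate;
};

constexpr size_t kRegArgs = 8;  // x0-x7 carry the first eight integer/pointer arguments

// Returns the register state for a faked call to `func`. Arguments beyond the
// eighth are returned in `stack_words`; the caller must write them at the new sp.
Arm64Regs prepare_call(Arm64Regs regs, uint64_t func,
                       const std::vector<uint64_t>& args,
                       std::vector<uint64_t>* stack_words) {
    for (size_t i = 0; i < args.size() && i < kRegArgs; ++i) {
        regs.regs[i] = args[i];
    }

    stack_words->clear();
    if (args.size() > kRegArgs) {
        stack_words->assign(args.begin() + kRegArgs, args.end());
        regs.sp -= stack_words->size() * sizeof(uint64_t);
    }
    regs.sp &= ~uint64_t{0xF};  // keep sp 16-byte aligned at the call

    // x30 is the link register. With lr = 0 the function "returns" to address
    // 0, the target faults, and the tracer sees the stop in waitpid.
    regs.regs[30] = 0;
    regs.pc = func;
    return regs;
}
```

For dlopen(remote_path, flags) the argument vector would be {remote_path, flags} and stack_words stays empty. The saved copy of the original registers is what gets written back afterwards.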

template<typename Ret>
Ret call_remote_call(pid_t pid, long address, int argc, long* args) {
    pt_regs regs{};
    pt_regs backup_regs{};

    iovec regs_iov{
        .iov_base = &regs,
        .iov_len = sizeof(pt_regs)
    };
    iovec backup_iov{
        .iov_base = &backup_regs,
        .iov_len = sizeof(pt_regs)
    };

    xptrace(PTRACE_GETREGSET, pid, reinterpret_cast<void*>(NT_PRSTATUS), &regs_iov);
    backup_regs = regs;

    for (int i = 0; i < argc && i < REGS_ARG_NUM; ++i) {
        regs.uregs[i] = args[i];
    }

    if (argc > REGS_ARG_NUM) {
        size_t stack_size = (argc - REGS_ARG_NUM) * sizeof(long);
        regs.ARM_sp -= stack_size;
        ptrace_write(pid, regs.ARM_sp, args + REGS_ARG_NUM, stack_size);
    }

    regs.ARM_lr = 0;
    regs.ARM_pc = address;

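    // Thumb-bit handling carried over from 32-bit ARM. AArch64 function
    // addresses always have bit 0 clear, so only the else branch runs here.
#define CPSR_T_MASK (1u << 5)
    if (regs.ARM_pc & 1) {
        regs.ARM_pc &= (~1u);
        regs.ARM_cpsr |= CPSR_T_MASK;
    } else {
        regs.ARM_cpsr &= ~CPSR_T_MASK;
    }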

    xptrace(PTRACE_SETREGSET, pid, reinterpret_cast<void*>(NT_PRSTATUS), &regs_iov);
    xptrace(PTRACE_CONT, pid, nullptr, nullptr);

    int status = 0;
    waitpid(pid, &status, WUNTRACED);
    while ((status & 0xFF) != 0x7f) {
        xptrace(PTRACE_CONT, pid, nullptr, nullptr);
        waitpid(pid, &status, WUNTRACED);
    }

    xptrace(PTRACE_GETREGSET, pid, reinterpret_cast<void*>(NT_PRSTATUS), &regs_iov);
    xptrace(PTRACE_SETREGSET, pid, reinterpret_cast<void*>(NT_PRSTATUS), &backup_iov);

    if constexpr (std::is_void_v<Ret>) {
        return;
    } else {
        return reinterpret_cast<Ret>(regs.uregs[0]);
    }
}
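The test (status & 0xFF) != 0x7f in the wait loop is the hand-written form of the standard WIFSTOPPED macro on Linux. A small sketch of decoding the status with the macros from <sys/wait.h> follows. The helper names are hypothetical, and treating a SIGSEGV stop as "the faked call returned" is an assumption that follows from setting lr to 0.

```cpp
#include <cassert>
#include <csignal>
#include <sys/wait.h>

// True when `status` (as filled in by waitpid) reports a stopped tracee.
bool tracee_stopped(int status) {
    return WIFSTOPPED(status);
}

// Signal that caused the stop, or 0 when the tracee is not stopped.
int stop_signal(int status) {
    return WIFSTOPPED(status) ? WSTOPSIG(status) : 0;
}

// Hypothetical policy: after a faked call with lr = 0, the expected way for
// the call to end is a SIGSEGV stop at address 0.
bool remote_call_returned(int status) {
    return stop_signal(status) == SIGSEGV;
}
```

Checking WIFEXITED and WIFSIGNALED in the same place would also let the loop bail out if the target dies during the call, instead of issuing PTRACE_CONT again against a process that no longer exists.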

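The Injector module

The Injector part mainly takes the helper functions above and strings them together into a complete attach injection chain.

First is remote_alloc_string. It calls malloc remotely through call_remote_call and then writes the path of target.so into the returned buffer, to be used later as the argument to dlopen.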

    size_t len = strlen(str) + 1;
    void* remote_buf = call_remote_function<void*, size_t>(
        pid,
        reinterpret_cast<void*>(malloc),
        len
    );
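The fragment calls call_remote_function<Ret, Args...>, whose definition does not appear in the post. Presumably it is a typed wrapper that flattens its arguments into the long array taken by call_remote_call. A sketch of that flattening step, with hypothetical names to_reg and pack_args, could look like this:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

static_assert(sizeof(void*) <= sizeof(long), "assumes LP64, as on arm64 Android");

// Integer-like arguments are widened to one register-sized value.
template <typename T>
long to_reg(T value) {
    static_assert(std::is_integral<T>::value || std::is_enum<T>::value,
                  "only integer and pointer arguments fit in x0-x7");
    return static_cast<long>(value);
}

// Pointers are passed as their numeric address.
template <typename T>
long to_reg(T* value) {
    return reinterpret_cast<long>(value);
}

// Packs typed arguments into the array a call_remote_call-style function
// would take as `args`, with sizeof...(Args) playing the role of `argc`.
template <typename... Args>
std::array<long, sizeof...(Args)> pack_args(Args... args) {
    std::array<long, sizeof...(Args)> out = {to_reg(args)...};
    return out;
}
```

With this, a wrapper could forward pack_args(args...).data() and sizeof...(Args) to call_remote_call. Floating-point arguments are deliberately rejected, since they travel in separate registers under the ARM64 convention.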

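Then comes the key injection entry point, which does the following:

Argument check: confirm that pid and so_path are valid
Attach to the target process
Prepare the .so path string inside the target process
Call dlopen remotely (taking the path from the previous step as its argument)
If dlopen fails, call dlerror remotely and fetch the error string
Call free remotely to release the buffer
detach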

static void* inject_so_handle_by_pid(pid_t pid, const char* so_path) {
    if (pid <= 0 || so_path == nullptr || so_path[0] == '\0') {
        LOGE("inject_so_handle_by_pid: invalid args");
        return nullptr;
    }

    bool attached = false;
    void* remote_path = nullptr;
    void* handle = nullptr;

    if (!attach_process(pid)) {
        LOGE("inject_so_handle_by_pid: attach failed, pid=%d", pid);
        return nullptr;
    }
    attached = true;
    LOGI("inject_so_handle_by_pid: attached to pid=%d", pid);

    remote_path = remote_alloc_string(pid, so_path);
    if (remote_path == nullptr) {
        LOGE("inject_so_handle_by_pid: remote_alloc_string failed");
        goto fail;
    }

    handle = call_remote_function<void*, const char*, int>(
        pid,
        reinterpret_cast<void*>(dlopen),
        reinterpret_cast<const char*>(remote_path),
        RTLD_NOW | RTLD_GLOBAL
    );
    if (handle == nullptr) {
        LOGE("inject_so_handle_by_pid: remote dlopen failed");

        char* remote_error = call_remote_function<char*>(
            pid,
            reinterpret_cast<void*>(dlerror)
        );
        if (remote_error != nullptr) {
            char error_buf[512] = {0};
            ptrace_read(pid,
                        reinterpret_cast<long>(remote_error),
                        reinterpret_cast<uint8_t*>(error_buf),
                        sizeof(error_buf) - 1);
            LOGE("inject_so_handle_by_pid: dlerror=%s", error_buf);
        } else {
            LOGE("inject_so_handle_by_pid: dlerror returned null");
        }

        goto fail;
    }

    LOGI("inject_so_handle_by_pid: remote dlopen success, handle=%p", handle);

    call_remote_function<void, void*>(
        pid,
        reinterpret_cast<void*>(free),
        remote_path
    );
    remote_path = nullptr;

    if (!detach_process(pid)) {
        LOGE("inject_so_handle_by_pid: detach failed after success");
        return nullptr;
    }

    LOGI("inject_so_handle_by_pid: inject success");
    return handle;

fail:
    if (remote_path != nullptr) {
        call_remote_function<void, void*>(
            pid,
            reinterpret_cast<void*>(free),
            remote_path
        );
    }

    if (attached) {
        detach_process(pid);
    }
    return nullptr;
}
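The dlerror branch above reads a fixed 511 bytes from the returned pointer. A possible refinement is to stop at the first NUL so the read never runs far past the string. The sketch below takes the word reader as a callback so it can be tried locally; PeekWord and read_remote_cstring are hypothetical names, and in the injector the callback would wrap PTRACE_PEEKDATA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// Stand-in for a PTRACE_PEEKDATA wrapper: returns the 8-byte word stored at
// `addr` in the target process.
using PeekWord = std::function<uint64_t(uint64_t addr)>;

// Reads a NUL-terminated string word by word, stopping at the terminator or
// after max_len characters, whichever comes first.
std::string read_remote_cstring(const PeekWord& peek_word, uint64_t addr, size_t max_len) {
    std::string out;
    while (out.size() < max_len) {
        const uint64_t word = peek_word(addr + out.size());
        char bytes[sizeof(word)];
        std::memcpy(bytes, &word, sizeof(word));
        for (size_t i = 0; i < sizeof(bytes) && out.size() < max_len; ++i) {
            if (bytes[i] == '\0') return out;  // terminator found
            out.push_back(bytes[i]);
        }
    }
    return out;  // truncated at max_len
}
```

The last word read can still extend up to 7 bytes past the terminator, so a stricter version would align the reads to word boundaries.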
