52pojie (吾爱破解) - LCG - LSG | Android cracking | Virus analysis | www.52pojie.cn

[Other reposts] ARM Assembly 1. ELF, Memory, Registers

Posted by t7sqynt3 on 2020-7-22 15:00

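Preface: Reverse engineering requires a foundation in assembly, so this is an attempt to organize what I have learned so far about ARM assembly. Because my knowledge is limited, there may be omissions; corrections and discussion are welcome. This series is original work, so please credit the source if you quote it. However, I do not recommend citing it in academic research, because the content has not been peer reviewed.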

You should cite this article if you want to use it in your work.

Warning: this article may not be precise or professional, and it is not peer-reviewed.

Professionals are welcome to translate this article, because the author does not know the accurate translated terminology.

1. ELF, Memory, Registers

Virtual Address Space

For each process on Linux, the OS creates a private virtual address space. The range of virtual addresses is the same for all processes, but the physical memory that backs a given virtual address can be very different from one process to another. In addition, the relative positions of the segments in physical memory can differ from their positions in the virtual address space (for example, the memory mapped segment for dynamic libraries and other mappings).

The virtual address space of a 32-bit machine has a maximum size of 4GB (2^32 bytes). A program typically uses much less than 4GB of memory.

Linux running on 32-bit ARM machines has two main kinds of virtual address space: user mode virtual address space and kernel mode virtual address space. The layout is illustrated below:

1.png

In the figure, addresses run from high at the top to low at the bottom. A normal process uses the user mode virtual address space. The kernel mode virtual address space is not directly visible to a process.
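As a rough illustration of these numbers, the sketch below computes the size of a 32-bit address space and classifies an address as belonging to the user or the kernel part. The split point `KERNEL_START = 0xC0000000` (3GB user, 1GB kernel) is an assumption: it is a common default on 32-bit Linux, but it is configurable and is not stated in this article. The function name is also only illustrative.

```python
ADDRESS_BITS = 32
ADDRESS_SPACE_SIZE = 1 << ADDRESS_BITS  # 2**32 bytes = 4GB

# Assumed split: a common 3GB/1GB default, not taken from the article.
KERNEL_START = 0xC0000000


def classify_address(addr: int) -> str:
    """Return which part of the virtual address space an address falls in."""
    if not 0 <= addr < ADDRESS_SPACE_SIZE:
        raise ValueError("not a 32-bit address")
    return "kernel" if addr >= KERNEL_START else "user"


if __name__ == "__main__":
    print(ADDRESS_SPACE_SIZE // 2**30, "GB of virtual address space")
    print(hex(0x00010000), classify_address(0x00010000))
    print(hex(0xC0008000), classify_address(0xC0008000))
```

Both sample addresses are arbitrary values chosen to fall on either side of the assumed split.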

ELF and memory layout

ELF is the abbreviation for Executable and Linkable Format. It is the typical format for executable files on Linux. When you execute an ELF file, the OS allocates memory for the process and copies sections of the ELF file into memory, where they are placed at different locations in the user mode virtual address space.

The figure below shows the relationship:

2.png

Note that some parts of the ELF file are omitted here for simplicity.
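To make the file format a little more concrete, here is a minimal sketch that decodes the fixed 52-byte header at the start of a 32-bit little-endian ELF file. The field layout follows the ELF specification; the helper name `parse_elf32_header`, the returned dictionary keys, and the hand-built `SAMPLE_HEADER` (including its entry address) are hypothetical and exist only so the example runs without a real binary.

```python
import struct

# ELF32 header: 16-byte e_ident followed by fixed-width fields (52 bytes).
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
ET_EXEC = 2   # e_type value for an executable file
EM_ARM = 40   # e_machine value for 32-bit ARM


def parse_elf32_header(data: bytes) -> dict:
    """Decode a few fields of a 32-bit little-endian ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:  # EI_CLASS = 32-bit, EI_DATA = LSB
        raise ValueError("only 32-bit little-endian ELF is handled here")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF32_HEADER.unpack_from(data)
    return {
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "program_header_offset": e_phoff,
        "program_header_count": e_phnum,
        "section_header_offset": e_shoff,
        "section_header_count": e_shnum,
    }


# Hypothetical header built by hand; a real one would be read from a file.
SAMPLE_HEADER = ELF32_HEADER.pack(
    b"\x7fELF" + bytes([1, 1, 1]) + bytes(9),  # magic, class, data, version
    ET_EXEC, EM_ARM, 1, 0x00010000, 52, 0, 0, 52, 32, 1, 40, 0, 0,
)

if __name__ == "__main__":
    for key, value in parse_elf32_header(SAMPLE_HEADER).items():
        print(f"{key}: {value:#x}")
```

With a real executable, the first 52 bytes read from the file in binary mode could be passed in instead of `SAMPLE_HEADER`.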

Descriptions of the parts of an ELF file:

  • Program header table: used by the loader to copy sections of the ELF file into segments in memory.

  • .init: code that runs before the function main().

  • .text: code, containing functions including main().

  • .rodata: read-only data.

  • .data: initialized global variables. The initial values are stored in the file.

  • .bss: uninitialized global variables. Only the size is stored in the file.

  • .symtab: the symbol table, which holds linkage information.

  • Debug info: data for the debugger.

  • Section header table: used by the linker to combine files.

The .symtab section, the debug info, and the section header table are not loaded into memory.

Descriptions of the segments in the user mode address space layout (listed here from low addresses to high addresses, that is, from the bottom of the figure to the top):

  • Text segment: your code, startup code, and read-only data. Usually read-only.

  • Data segment: variables with initial values. Readable and writeable.

  • BSS segment: BSS stands for Block Started by Symbol. The size is read from the .bss section and the initial values are 0. Readable and writeable.

  • Heap: memory dynamically allocated by the program, typically with malloc(). Readable and writeable.

  • Memory mapped segment: dynamic libraries and other mappings. Permissions depend on the individual mapping.

  • Stack: stack frames for functions and their automatic variables.
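The relationship between the two lists above can be summarized as a small lookup table. This is only a sketch of my reading of the lists: placing .init in the text segment follows from "startup code" being listed there, and the names `SECTION_TO_SEGMENT`, `loaded_parts`, and `segment_contents` are made up for illustration.

```python
# None means the part is not loaded into memory.
SECTION_TO_SEGMENT = {
    ".init": "text",
    ".text": "text",
    ".rodata": "text",
    ".data": "data",
    ".bss": "bss",
    ".symtab": None,
    "debug info": None,
    "section header table": None,
}


def loaded_parts() -> list:
    """Names of the ELF parts that end up in the process's memory."""
    return [name for name, seg in SECTION_TO_SEGMENT.items() if seg is not None]


def segment_contents(segment: str) -> list:
    """Names of the ELF sections that are placed in the given segment."""
    return [name for name, seg in SECTION_TO_SEGMENT.items() if seg == segment]


if __name__ == "__main__":
    print("loaded:", loaded_parts())
    print("text segment:", segment_contents("text"))
```

The heap, the memory mapped segment, and the stack do not appear in the table because they are created at run time rather than copied from the ELF file.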

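Note: if you want to make the ELF file smaller, do not give global variables an explicit initial value of 0, so that they are placed in .bss rather than .data. (Many compilers, GCC included, already place zero-initialized globals in .bss by default.) This does not save memory at run time, though.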
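The effect described in this note can be estimated with a deliberately simplified model: a global stored in .data costs its full size in the file, while a global in .bss costs nothing in the file, and both cost their full size in memory. The function name and the sample variables are hypothetical, and the model ignores headers, alignment, and padding.

```python
def file_and_memory_cost(global_vars):
    """Estimate bytes used in the ELF file and in memory.

    global_vars: iterable of (name, size_in_bytes, stored_in_data), where
    stored_in_data is True for variables whose initial values are kept in
    .data and False for variables that only reserve space in .bss.
    """
    file_bytes = sum(size for _, size, in_data in global_vars if in_data)
    memory_bytes = sum(size for _, size, _ in global_vars)
    return file_bytes, memory_bytes


if __name__ == "__main__":
    # The same hypothetical 4096-byte table, placed in .data or in .bss.
    in_data = [("table", 4096, True), ("counter", 4, True)]
    in_bss = [("table", 4096, False), ("counter", 4, True)]
    print("table in .data:", file_and_memory_cost(in_data))
    print("table in .bss: ", file_and_memory_cost(in_bss))
```

In both cases the memory cost is the same; only the contribution to the file size changes.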

Registers

Registers are fast read/write storage within the CPU. They are much faster than even the fastest cache.

A 32-bit ARM CPU has a memory address register, a memory data register, an instruction register, the current program status register (CPSR), and 16 user-level registers. We focus on the 16 registers and the CPSR. Within the CPSR, we focus on the 4 flags: the Negative condition code flag (N), the Zero condition code flag (Z), the Carry condition code flag (C), and the Overflow condition code flag (V).
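The meaning of the four flags is not spelled out above, so here is a small sketch of how they would be set by a flag-setting 32-bit addition (such as an ARM `adds` instruction). This is a model written for illustration, not code from the article, and the function name `adds_flags` is made up.

```python
MASK32 = 0xFFFFFFFF


def adds_flags(a: int, b: int):
    """Model a 32-bit flag-setting add; return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # exact sum, may need 33 bits
    result = full & MASK32       # what fits in a 32-bit register
    flags = {
        "N": result >> 31,                 # result is negative when signed
        "Z": int(result == 0),             # result is zero
        "C": int(full > MASK32),           # unsigned carry out of bit 31
        # Signed overflow: both operands share a sign the result lacks.
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


if __name__ == "__main__":
    for a, b in [(1, 2), (0x7FFFFFFF, 1), (0xFFFFFFFF, 1)]:
        result, flags = adds_flags(a, b)
        print(f"{a:#010x} + {b:#010x} = {result:#010x} {flags}")
```

The second sample overflows as a signed addition (V set), while the third wraps around as an unsigned addition (C and Z set).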

The 16 registers are r0-r15. r0-r3 are temporary registers. They hold the first 4 arguments of a function call, and r0 holds the return value. According to the standard, you must always assume that the contents of r0-r3 have changed after any function call. r12 (ip) is also a temporary register: it is the intra-procedure-call scratch register, which the linker may use in the short code sequences it inserts between a caller and its callee, for example when calling into a dynamically linked library.

r4-r10 are preserved registers. According to the standard, a function must save these registers before modifying them and restore them before returning to the caller. It is therefore safe to keep values in these registers across function calls.

r11 and r13-r15 are special-use registers. You should not change their values arbitrarily. r11 is the frame pointer (fp), r13 is the stack pointer (sp), r14 is the link register (lr), and r15 is the program counter (pc). According to the standard, fp serves as a reference point for automatic variables on the stack, sp is used to manipulate the stack within the function's stack frame, lr holds the return address in the caller, and pc keeps track of the current instruction. (Because of the pipeline, the value an instruction reads from pc is usually not the address of that instruction. In ARM state it is typically 8 bytes past the current instruction address, that is, the address of the second instruction after the current one. The ARM pipeline will be covered later in the optimization topics.)
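The register roles and the pc offset described above can be condensed into a short sketch. It assumes ARM state with 4-byte instructions, which is what the "8 bytes" figure corresponds to; the function names and the sample address are hypothetical.

```python
ARM_INSTRUCTION_SIZE = 4  # bytes per instruction, assuming ARM state

SPECIAL_REGISTERS = {
    11: "frame pointer (fp)",
    13: "stack pointer (sp)",
    14: "link register (lr)",
    15: "program counter (pc)",
}


def register_role(n: int) -> str:
    """Return the role of register rN as described in the text."""
    if not 0 <= n <= 15:
        raise ValueError("ARM has r0-r15 only")
    if n in SPECIAL_REGISTERS:
        return SPECIAL_REGISTERS[n]
    if n <= 3 or n == 12:
        return "temporary"   # may change across any function call
    return "preserved"       # r4-r10: saved and restored by the callee


def pc_value_read(instruction_address: int) -> int:
    """Value an instruction sees when it reads pc: two instructions ahead."""
    return (instruction_address + 2 * ARM_INSTRUCTION_SIZE) & 0xFFFFFFFF


if __name__ == "__main__":
    for n in range(16):
        print(f"r{n}: {register_role(n)}")
    print(hex(pc_value_read(0x00008000)))
```

For an instruction at the sample address 0x8000, the value read from pc is 0x8008.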
