[V8] Setting Up a V8 Research Environment

This blog has been quiet for a long time. I recently started learning V8, and I hope I can stick with it and get some results.

References

https://www.anquanke.com/post/id/267518

https://tokameine.gitbook.io/chose-me-or-javascript-v8

Environment Setup

Runtime Environment

Clone depot_tools, then download the V8 source and its dependencies:

sudo apt install bison cdbs curl flex g++ git python vim pkg-config
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
echo 'export PATH=$PATH:"/path/to/depot_tools"' >> ~/.bashrc
# Replace /path/to/depot_tools with the directory where depot_tools was cloned
source ~/.bashrc  # reload so that fetch and gclient are on PATH in this shell
fetch v8
./v8/build/install-build-deps.sh --no-chromeos-fonts

The fetch v8 command downloads the V8 source. How long it takes depends on your network.

--no-chromeos-fonts skips the font dependencies. Drop the flag if you need them.

Next, build the source. To make it easy to build different versions, I use the following script:

#!/bin/bash
# Usage: ./build.sh <version> [name]
VER=$1
# Use the version as the output directory name unless a name is given
if [ -z "$2" ]; then
        NAME=$VER
else
        NAME=$2
fi
# Replace /path/depot_tools/v8 with the path to your own v8 checkout
cd /path/depot_tools/v8 || exit 1
git reset --hard "$VER"
gclient sync -D
gn gen "out/x64_$NAME.release" --args='v8_monolithic=true v8_use_external_startup_data=false is_component_build=false is_debug=false target_cpu="x64" use_goma=false goma_dir="None" v8_enable_backtrace=true v8_enable_disassembler=true v8_enable_object_print=true v8_enable_verify_heap=true'
ninja -C "out/x64_$NAME.release" d8

For example (just substitute the version number you want):

./build.sh "9.6.180.6"

Build time depends on the machine. I use WSL2, and a full build takes a little over ten minutes, which is acceptable.


Handling Build Errors

If your python3 is relatively new, the build may fail with an error like this:

FAILED: gen/src/inspector/protocol/Forward.h gen/src/inspector/protocol/Protocol.cpp gen/src/inspector/protocol/Protocol.h gen/src/inspector/protocol/Console.cpp gen/src/inspector/protocol/Console.h gen/src/inspector/protocol/Debugger.cpp gen/src/inspector/protocol/Debugger.h gen/src/inspector/protocol/HeapProfiler.cpp gen/src/inspector/protocol/HeapProfiler.h gen/src/inspector/protocol/Profiler.cpp gen/src/inspector/protocol/Profiler.h gen/src/inspector/protocol/Runtime.cpp gen/src/inspector/protocol/Runtime.h gen/src/inspector/protocol/Schema.cpp gen/src/inspector/protocol/Schema.h gen/include/inspector/Debugger.h gen/include/inspector/Runtime.h gen/include/inspector/Schema.h 
python3 ../../third_party/inspector_protocol/code_generator.py --jinja_dir ../../third_party/ --output_base gen/src/inspector --config ../../src/inspector/inspector_protocol_config.json --inspector_protocol_dir ///third_party/inspector_protocol
Traceback (most recent call last):
  File "/home/loorain/v8/v8/out/x64_9.6.180.6.release/../../third_party/inspector_protocol/code_generator.py", line 702, in <module>
    main()
  File "/home/loorain/v8/v8/out/x64_9.6.180.6.release/../../third_party/inspector_protocol/code_generator.py", line 584, in main
    jinja_env = initialize_jinja_env(jinja_dir, config.protocol.output, config)
  File "/home/loorain/v8/v8/out/x64_9.6.180.6.release/../../third_party/inspector_protocol/code_generator.py", line 190, in initialize_jinja_env
    import jinja2
  File "/home/loorain/v8/v8/third_party/jinja2/__init__.py", line 33, in <module>
    from jinja2.environment import Environment, Template
  File "/home/loorain/v8/v8/third_party/jinja2/environment.py", line 16, in <module>
    from jinja2.defaults import BLOCK_START_STRING, \
  File "/home/loorain/v8/v8/third_party/jinja2/defaults.py", line 32, in <module>
    from jinja2.tests import TESTS as DEFAULT_TESTS
  File "/home/loorain/v8/v8/third_party/jinja2/tests.py", line 13, in <module>
    from collections import Mapping
ImportError: cannot import name 'Mapping' from 'collections'

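This error occurs because the deprecated aliases for the abstract base classes in collections (such as collections.Mapping) were removed in Python 3.10. The classes now live only in the collections.abc module. To fix it:

  1. Locate the failing file: according to the traceback, it is jinja2/tests.py
  2. Edit the file: open jinja2/tests.py and find the line that imports Mapping
  3. Change the import: replace from collections import Mapping with from collections.abc import Mapping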
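
Step 3 amounts to a one-line import change. As a minimal sketch, the snippet below shows the corrected import together with a quick check. The try/except fallback is an optional extra for very old interpreters and is not something the vendored jinja2 contains; the plain collections.abc import from step 3 should be enough on any recent Python 3.

```python
# Sketch of the import fix; the fallback branch is optional.
try:
    # Python 3.3+: the abstract base classes live in collections.abc
    from collections.abc import Mapping
except ImportError:
    # Very old interpreters only
    from collections import Mapping

# Quick sanity check: a dict is a Mapping, a list is not.
print(isinstance({}, Mapping), isinstance([], Mapping))
```

If the same ImportError then shows up in another file under third_party/jinja2, the same change applies there.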

Debugging Environment

Add v8/tools/gdbinit to ~/.gdbinit by sourcing it:

┌─[loorain@LAPTOP-Loora1N] - [~/v8/workdir] - [10125]
└─[$] cat ~/.gdbinit                                                                                                               [9:59:35]
source /home/loorain/pwndbg/gdbinit.py 
#source ~/peda/peda.py
source ~/Pwngdb/pwngdb.py
source ~/Pwngdb/angelheap/gdbinit.py
source /home/loorain/v8/v8/tools/gdbinit #path to your own v8 checkout

define hook-run
python
import angelheap
angelheap.init_angelheap()
end
end

Testing the Engine

First, check that the freshly built engine works. Write test.js:

a = [1];
%DebugPrint(a);
%SystemBreak();
  • %DebugPrint(x); prints information about the variable x
  • %SystemBreak(); raises a trap so that gdb stops at this point

The d8 binary lives in v8/out/x64_$NAME.release. It is the shell that actually parses and executes JS code. Running test.js with d8 as-is produces the following error:

┌─[loorain@LAPTOP-Loora1N] - [~/v8/workdir] - [10126]
└─[$] ../v8/out/x64_9.6.180.6.release/d8 test.js                                                             [9:59:47]
test.js:2: SyntaxError: Unexpected token '%'
%DebugPrint(a);
^
SyntaxError: Unexpected token '%'

This is because the engine does not parse statements like %DebugPrint(a); by default. Pass the --allow-natives-syntax flag to enable them. The result:

┌─[loorain@LAPTOP-Loora1N] - [~/v8/workdir] - [10127]
└─[$] ../v8/out/x64_9.6.180.6.release/d8 test.js --allow-natives-syntax                                                           [10:01:35]
DebugPrint: 0x3b2608049921: [JSArray]
 - map: 0x3b2608203a41 <Map(PACKED_SMI_ELEMENTS)> [FastProperties]
 - prototype: 0x3b26081cc0e9 <JSArray[0]>
 - elements: 0x3b26081d3181 <FixedArray[1]> [PACKED_SMI_ELEMENTS (COW)]
 - length: 1
 - properties: 0x3b260800222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x3b26080048f1: [String] in ReadOnlySpace: #length: 0x3b260814215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x3b26081d3181 <FixedArray[1]> {
           0: 1
 }
0x3b2608203a41: [Map]
 - type: JS_ARRAY_TYPE
 - instance size: 16
 - inobject properties: 0
 - elements kind: PACKED_SMI_ELEMENTS
 - unused property fields: 0
 - enum length: invalid
 - back pointer: 0x3b26080023b5 <undefined>
 - prototype_validity cell: 0x3b2608142405 <Cell value= 1>
 - instance descriptors #1: 0x3b26081cc59d <DescriptorArray[1]>
 - transitions #1: 0x3b26081cc5b9 <TransitionArray[4]>Transition array #1:
     0x3b260800524d <Symbol: (elements_transition_symbol)>: (transition to HOLEY_SMI_ELEMENTS) -> 0x3b2608203ab9 <Map(HOLEY_SMI_ELEMENTS)>

 - prototype: 0x3b26081cc0e9 <JSArray[0]>
 - constructor: 0x3b26081cbe85 <JSFunction Array (sfi = 0x3b260814adc9)>
 - dependent code: 0x3b26080021b9 <Other heap object (WEAK_FIXED_ARRAY_TYPE)>
 - construction counter: 0

[1]    5022 trace trap  ../v8/out/x64_9.6.180.6.release/d8 test.js --allow-natives-syntax

Debugging the Program

Now we can debug the d8 engine with gdb:

 gdb ../v8/out/x64_9.6.180.6.release/d8 
 ...
 pwndbg> r --allow-natives-syntax test.js
 Starting program: /home/loorain/v8/v8/out/x64_9.6.180.6.release/d8 ./test.js  --allow-natives-syntax
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7f692cbb1640 (LWP 11776)]
[New Thread 0x7f692c3b0640 (LWP 11777)]
[New Thread 0x7f692bbaf640 (LWP 11778)]
[New Thread 0x7f692b3ae640 (LWP 11779)]
[New Thread 0x7f692abad640 (LWP 11780)]
[New Thread 0x7f692a3ac640 (LWP 11781)]
[New Thread 0x7f6929bab640 (LWP 11782)]
[New Thread 0x7f69293aa640 (LWP 11783)]
[New Thread 0x7f6928ba9640 (LWP 11784)]
[New Thread 0x7f69283a8640 (LWP 11785)]
[New Thread 0x7f6927ba7640 (LWP 11786)]
[New Thread 0x7f69273a6640 (LWP 11787)]
[New Thread 0x7f6926ba5640 (LWP 11788)]
[New Thread 0x7f69263a4640 (LWP 11789)]
[New Thread 0x7f6925ba3640 (LWP 11790)]
DebugPrint: 0x2e008049931: [JSArray]
 - map: 0x02e008203a41 <Map(PACKED_SMI_ELEMENTS)> [FastProperties]
 - prototype: 0x02e0081cc0e9 <JSArray[0]>
 - elements: 0x02e0081d3181 <FixedArray[1]> [PACKED_SMI_ELEMENTS (COW)]
 - length: 1
 - properties: 0x02e00800222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x2e0080048f1: [String] in ReadOnlySpace: #length: 0x02e00814215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x02e0081d3181 <FixedArray[1]> {
           0: 1
 }
0x2e008203a41: [Map]
 - type: JS_ARRAY_TYPE
 - instance size: 16
 - inobject properties: 0
 - elements kind: PACKED_SMI_ELEMENTS
 - unused property fields: 0
 - enum length: invalid
 - back pointer: 0x02e0080023b5 <undefined>
 - prototype_validity cell: 0x02e008142405 <Cell value= 1>
 - instance descriptors #1: 0x02e0081cc59d <DescriptorArray[1]>
 - transitions #1: 0x02e0081cc5b9 <TransitionArray[4]>Transition array #1:
     0x02e00800524d <Symbol: (elements_transition_symbol)>: (transition to HOLEY_SMI_ELEMENTS) -> 0x02e008203ab9 <Map(HOLEY_SMI_ELEMENTS)>

 - prototype: 0x02e0081cc0e9 <JSArray[0]>
 - constructor: 0x02e0081cbe85 <JSFunction Array (sfi = 0x2e00814adc9)>
 - dependent code: 0x02e0080021b9 <Other heap object (WEAK_FIXED_ARRAY_TYPE)>
 - construction counter: 0

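The V8 gdbinit we just added provides some new helper commands, such as job, which prints information about an object. For the other commands, see the gdbinit file itself or https://chromium.googlesource.com/v8/v8/+/refs/heads/main/tools/gdbinit, where they are defined in the define blocks.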


Running job on the array a from before gives:

pwndbg> job 0x2e008049931
0x2e008049931: [JSArray]
 - map: 0x02e008203a41 <Map(PACKED_SMI_ELEMENTS)> [FastProperties]
 - prototype: 0x02e0081cc0e9 <JSArray[0]>
 - elements: 0x02e0081d3181 <FixedArray[1]> [PACKED_SMI_ELEMENTS (COW)]
 - length: 1
 - properties: 0x02e00800222d <FixedArray[0]>
 - All own properties (excluding elements): {
    0x2e0080048f1: [String] in ReadOnlySpace: #length: 0x02e00814215d <AccessorInfo> (const accessor descriptor), location: descriptor
 }
 - elements: 0x02e0081d3181 <FixedArray[1]> {
           0: 1
 }

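With some experience, you will notice that the address passed to job looks odd: it ends in 1. That is because V8 marks pointers to heap objects by setting the lowest bit, so job expects the real address + 1. In the example above, the real address is 0x2e008049930: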

pwndbg> x/4gx 0x2e008049930
0x2e008049930:  0x0800222d08203a41      0x00000002081d3181
0x2e008049940:  0x0000000000000000      0x0000000000000000
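
The two quadwords in this dump line up with the DebugPrint output. The sketch below unpacks them in Python under two assumptions: that this build uses pointer compression (each field is a 32-bit offset from a base formed by the upper 32 bits of the object address), and that the fields appear in the order map, properties, elements, length. Both are inferred from the output above rather than taken from the V8 source, and the variable names are only illustrative.

```python
import struct

tagged = 0x2e008049931          # address printed by %DebugPrint
real = tagged - 1               # clear the heap-object tag bit
base = real & ~0xFFFFFFFF       # assumed base: upper 32 bits of the address

# The two quadwords shown by x/4gx, as little-endian bytes in memory.
raw = struct.pack("<QQ", 0x0800222d08203a41, 0x00000002081d3181)

# Assumed layout: four consecutive 32-bit fields.
map_, properties, elements, length = struct.unpack("<IIII", raw)

print("real address:", hex(real))
print("map:         ", hex(base | map_))
print("properties:  ", hex(base | properties))
print("elements:    ", hex(base | elements))
print("length:      ", length >> 1)   # stored shifted left by one bit
```

The reconstructed values match the map, properties, elements, and length lines that DebugPrint reported for a.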

If you pass the real (untagged) address to job, it is interpreted as a Smi (small integer):

pwndbg> job 0x2e008049930
Smi: 0x4024c98 (67259544)

Also note that 0x4024c98 * 2 is exactly the low 32 bits of the address, 0x8049930.

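This is because job treated the value as plain data. V8 stores small integers in an unusual way: each one, including array lengths, is stored doubled, that is, shifted left by one bit. So when job decides a value is a number, it reports the stored value divided by two as the original value.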
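
The rule that job appears to apply can be sketched in a few lines of Python. This is a simplified model that only mirrors the behaviour seen in this session: a set lowest bit means heap object, otherwise the low 32 bits hold the integer shifted left by one. It ignores negative values, and the function name describe is made up for illustration; it is not part of V8 or gdb.

```python
def describe(value: int) -> str:
    """Classify a word the way the job output above suggests."""
    if value & 1:
        # Tag bit set: a heap object whose real address is value - 1.
        return f"HeapObject at {value - 1:#x}"
    # Tag bit clear: treat the low 32 bits as a Smi and undo the shift.
    smi = (value & 0xFFFFFFFF) >> 1
    return f"Smi: {smi:#x} ({smi})"

print(describe(0x2e008049931))  # the tagged address of a
print(describe(0x2e008049930))  # the same address without the tag
```

For the untagged address this yields the same Smi value that job printed, 0x4024c98 (67259544).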

That is enough for a basic debugging environment.