
    Chrome V8 Source Code, Part 25: The Hardest Bone to Chew, Builtin!

    VSole 2021-11-25 16:36:50

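    Preface

    The next several articles form a dedicated series on Builtins. Builtins implement a large share of V8's core functionality, which shows how important they are. Most Builtins, however, are written in CodeStubAssembler (CSA) and Torque (TQ), both of which read much like assembly, and that makes the source hard to follow. Worse, Builtins cannot be debugged while V8 is running, which makes them harder still to learn. This series therefore explains in detail how to study and debug Builtins, in the hope of offering a useful starting point.

    Abstract

    This is the first article in the Builtin series. It explains what Built-in Functions (Builtins) are and how they are initialized. As V8's built-in functionality, Builtins implement many important features, such as Ignition, the bytecode handlers, and the JavaScript API. Learning Builtins therefore helps in understanding V8's execution logic: you can see, for example, how bytecode is executed or how the string substring method is implemented. This article covers how Builtins are implemented and how they are initialized.

    How Builtins are implemented

    Builtins can be implemented in platform-dependent assembly language, C++, JavaScript, CodeStubAssembler, or Torque. The five approaches differ markedly in ease of use and in performance. The official documentation (v8.dev/docs/torque) describes them as follows: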

    (1) Platform-dependent assembly language: can be highly efficient, but need manual ports to all platforms and are difficult to maintain.

    (2) C++: very similar in style to runtime functions and have access to V8’s powerful runtime functionality, but usually not suited to performance-sensitive areas.

    (3) JavaScript: concise and readable code, access to fast intrinsics, but frequent usage of slow runtime calls, subject to unpredictable performance through type pollution, and subtle issues around (complicated and non-obvious) JS semantics. Javascript builtins are deprecated and should not be added anymore.

    (4) CodeStubAssembler: provides efficient low-level functionality that is very close to assembly language while remaining platform-independent and preserving readability.

    (5) V8 Torque: is a V8-specific domain-specific language that is translated to CodeStubAssembler. As such, it extends upon CodeStubAssembler and offers static typing as well as readable and expressive syntax.

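    Torque is an improved version of CodeStubAssembler. It aims to lower the difficulty of use as far as possible without sacrificing performance, making Builtins somewhat easier to develop.

    Figure 1 (from the official documentation) illustrates how a Builtin is created with Torque.

    First, the developer-written file.tq is translated by the Torque compiler into *-tq-csa.cc/.h files;

    Next, *-tq-csa.cc/.h are compiled into the mksnapshot executable;

    Finally, mksnapshot generates snapshot.bin, which stores the Builtins in binary form.

    To stress the point again: *-tq-csa.cc/.h is Builtin source code that the Torque compiler generates from file.tq.

    When V8 loads the snapshot file by deserialization there is no symbol table, so the source of Torque Builtins cannot be seen while debugging V8. CodeStubAssembler Builtins are also stored in snapshot.bin, so their source is not visible during debugging either. For debugging approaches, see mksnapshot; my own debugging method is explained below.

    Builtin initialization

    A note before reading the source: the debugging method here uses V8 7.9 with v8_use_snapshot = false, because newer versions no longer support v8_use_snapshot = false and Builtin initialization cannot be debugged there. Setting v8_use_snapshot = false disables snapshot.bin, which means V8 creates and initializes the Builtins from C++ source at startup, and that is exactly what we want to observe.

    I consider the C++, CodeStubAssembler, and Torque Builtins the most important, because core features such as Ignition, the bytecode handlers, and the JavaScript API are mostly implemented with these three kinds. They are described in detail below. The entry point for Builtin initialization is: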

    bool Isolate::InitWithoutSnapshot() { return Init(nullptr, nullptr); }
    

    As the name InitWithoutSnapshot() suggests, snapshot.bin is disabled. InitWithoutSnapshot() calls Isolate::Init(), which runs the following code:

    1.  bool Isolate::Init(ReadOnlyDeserializer* read_only_deserializer,
    2.                     StartupDeserializer* startup_deserializer) {
    3.  //..............omitted...............
    4.    bootstrapper_->Initialize(create_heap_objects);
    5.    if (FLAG_embedded_builtins && create_heap_objects) {
    6.      builtins_constants_table_builder_ = new BuiltinsConstantsTableBuilder(this);
    7.    }
    8.    setup_delegate_->SetupBuiltins(this);
    9.    if (FLAG_embedded_builtins && create_heap_objects) {
    10.      builtins_constants_table_builder_->Finalize();
    11.      delete builtins_constants_table_builder_;
    12.      builtins_constants_table_builder_ = nullptr;
    13.      CreateAndSetEmbeddedBlob();
    14.    }
    15.//..............omitted...............
    16.    return true;
    17.  }
    

    Line 8 above enters SetupBuiltins(), which calls SetupBuiltinsInternal() to complete Builtin initialization. The source of SetupBuiltinsInternal() is:

    1.  void SetupIsolateDelegate::SetupBuiltinsInternal(Isolate* isolate) {
    2.    Builtins* builtins = isolate->builtins();
    3.  //omitted...................
    4.    int index = 0;
    5.    Code code;
    6.  #define BUILD_CPP(Name)                                                      \
    7.    code = BuildAdaptor(isolate, index, FUNCTION_ADDR(Builtin_##Name), #Name); \
    8.    AddBuiltin(builtins, index++, code);
    9.  #define BUILD_TFJ(Name, Argc, ...)                              \
    10.    code = BuildWithCodeStubAssemblerJS(                          \
    11.        isolate, index, &Builtins::Generate_##Name, Argc, #Name); \
    12.    AddBuiltin(builtins, index++, code);
    13.  #define BUILD_TFC(Name, InterfaceDescriptor)                      \
    14.    /* Return size is from the provided CallInterfaceDescriptor. */ \
    15.    code = BuildWithCodeStubAssemblerCS(                            \
    16.        isolate, index, &Builtins::Generate_##Name,                 \
    17.        CallDescriptors::InterfaceDescriptor, #Name);               \
    18.    AddBuiltin(builtins, index++, code);
    19.  #define BUILD_TFS(Name, ...)                                                   \
    20.    /* Return size for generic TF builtins (stub linkage) is always 1. */        \
    21.    code =                                                                       \
    22.        BuildWithCodeStubAssemblerCS(isolate, index, &Builtins::Generate_##Name, \
    23.                                     CallDescriptors::Name, #Name);              \
    24.    AddBuiltin(builtins, index++, code);
    25.  #define BUILD_TFH(Name, InterfaceDescriptor)              \
    26.    /* Return size for IC builtins/handlers is always 1. */ \
    27.    code = BuildWithCodeStubAssemblerCS(                    \
    28.        isolate, index, &Builtins::Generate_##Name,         \
    29.        CallDescriptors::InterfaceDescriptor, #Name);       \
    30.    AddBuiltin(builtins, index++, code);
    31.  #define BUILD_BCH(Name, OperandScale, Bytecode)                           \
    32.    code = GenerateBytecodeHandler(isolate, index, OperandScale, Bytecode); \
    33.    AddBuiltin(builtins, index++, code);
    34.  #define BUILD_ASM(Name, InterfaceDescriptor)                                \
    35.    code = BuildWithMacroAssembler(isolate, index, Builtins::Generate_##Name, \
    36.                                   #Name);                                    \
    37.    AddBuiltin(builtins, index++, code);
    38.    BUILTIN_LIST(BUILD_CPP, BUILD_TFJ, BUILD_TFC, BUILD_TFS, BUILD_TFH, BUILD_BCH,
    39.                 BUILD_ASM);
    40.  //omitted...........................
    41.  }
    

    The three core functions of SetupBuiltinsInternal() are explained below:

    (1) BUILD_CPP, BUILD_TFJ, BUILD_TFC, BUILD_TFS, BUILD_TFH, BUILD_BCH, and BUILD_ASM distinguish Builtins by kind. The source comments read:

    // CPP: Builtin in C++. Entered via BUILTIN_EXIT frame.
    //      Args: name
    // TFJ: Builtin in Turbofan, with JS linkage (callable as Javascript function).
    //      Args: name, arguments count, explicit argument names...
    // TFS: Builtin in Turbofan, with CodeStub linkage.
    //      Args: name, explicit argument names...
    // TFC: Builtin in Turbofan, with CodeStub linkage and custom descriptor.
    //      Args: name, interface descriptor
    // TFH: Handlers in Turbofan, with CodeStub linkage.
    //      Args: name, interface descriptor
    // BCH: Bytecode Handlers, with bytecode dispatch linkage.
    //      Args: name, OperandScale, Bytecode
    // ASM: Builtin in platform-dependent assembly.
    //      Args: name, interface descriptor
    

    (2) BUILTIN_LIST, on line 38 of SetupBuiltinsInternal(), defines all Builtins. Its source is:

    1.  #define BUILTIN_LIST(CPP, TFJ, TFC, TFS, TFH, BCH, ASM)  \
    2.    BUILTIN_LIST_BASE(CPP, TFJ, TFC, TFS, TFH, ASM)        \
    3.    BUILTIN_LIST_FROM_TORQUE(CPP, TFJ, TFC, TFS, TFH, ASM) \
    4.    BUILTIN_LIST_INTL(CPP, TFJ, TFS)                       \
    5.    BUILTIN_LIST_BYTECODE_HANDLERS(BCH)
    6.  //================separator=================================
    7.  #define BUILTIN_LIST_FROM_TORQUE(CPP, TFJ, TFC, TFS, TFH, ASM) \
    8.  //...............omitted............................
    9.  TFJ(StringPrototypeToString, 0, kReceiver) \
    10.  TFJ(StringPrototypeValueOf, 0, kReceiver) \
    11.  TFS(StringToList, kString) \
    12.  TFJ(StringPrototypeCharAt, 1, kReceiver, kPosition) \
    13.  TFJ(StringPrototypeCharCodeAt, 1, kReceiver, kPosition) \
    14.  TFJ(StringPrototypeCodePointAt, 1, kReceiver, kPosition) \
    15.  TFJ(StringPrototypeConcat, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    16.  TFJ(StringConstructor, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    17.  TFS(StringAddConvertLeft, kLeft, kRight) \
    18.  TFS(StringAddConvertRight, kLeft, kRight) \
    19.  TFJ(StringPrototypeEndsWith, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    20.  TFS(CreateHTML, kReceiver, kMethodName, kTagName, kAttr, kAttrValue) \
    21.  TFJ(StringPrototypeAnchor, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    22.  TFJ(StringPrototypeBig, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    23.  TFJ(StringPrototypeIterator, 0, kReceiver) \
    24.  TFJ(StringIteratorPrototypeNext, 0, kReceiver) \
    25.  TFJ(StringPrototypePadStart, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    26.  TFJ(StringPrototypePadEnd, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    27.  TFS(StringRepeat, kString, kCount) \
    28.  TFJ(StringPrototypeRepeat, 1, kReceiver, kCount) \
    29.  TFJ(StringPrototypeSlice, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    30.  TFJ(StringPrototypeStartsWith, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    31.  TFJ(StringPrototypeSubstring, SharedFunctionInfo::kDontAdaptArgumentsSentinel) \
    

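    Reading BUILTIN_LIST together with BUILTIN_LIST_FROM_TORQUE shows the names of all Builtins. Lines 9-31 list the Builtins that implement string methods; for example, the Builtin for substring is StringPrototypeSubstring.

    (3) The seven macros (BUILD_CPP, BUILD_TFJ, and so on) work together with BUILTIN_LIST to initialize all Builtins. Taking BUILD_CPP in SetupBuiltinsInternal() as an example for further analysis, the source is: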

      int index = 0;
      Code code;
    #define BUILD_CPP(Name)                                                      \
      code = BuildAdaptor(isolate, index, FUNCTION_ADDR(Builtin_##Name), #Name); \
      AddBuiltin(builtins, index++, code);
    //...................separator.................
    // FUNCTION_ADDR(f) gets the address of a C function f.
    #define FUNCTION_ADDR(f) (reinterpret_cast<v8::internal::Address>(f))
    

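    index starts at 0, and code is a HeapObject-based address pointer that holds the address of the generated Builtin. FUNCTION_ADDR(Builtin_##Name) takes the address of the C++ function Builtin_##Name, and BuildAdaptor() uses that address when it creates the Builtin. The source of BuildAdaptor() is: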

    Code BuildAdaptor(Isolate* isolate, int32_t builtin_index,
                      Address builtin_address, const char* name) {
      HandleScope scope(isolate);
      // Canonicalize handles, so that we can share constant pool entries pointing
      // to code targets without dereferencing their handles.
      CanonicalHandleScope canonical(isolate);
      constexpr int kBufferSize = 32 * KB;
      byte buffer[kBufferSize];
      MacroAssembler masm(isolate, BuiltinAssemblerOptions(isolate, builtin_index),
                          CodeObjectRequired::kYes,
                          ExternalAssemblerBuffer(buffer, kBufferSize));
      masm.set_builtin_index(builtin_index);
      DCHECK(!masm.has_frame());
      Builtins::Generate_Adaptor(&masm, builtin_address);
      CodeDesc desc;
      masm.GetCode(isolate, &desc);
      Handle<Code> code = Factory::CodeBuilder(isolate, desc, Code::BUILTIN)
                              .set_self_reference(masm.CodeObject())
                              .set_builtin_index(builtin_index)
                              .Build();
      return *code;
    }
    

    In the code above, Generate_Adaptor and Factory::CodeBuilder create the Builtin, and code represents the Builtin's address.
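    The address-taking step that BUILD_CPP relies on can be illustrated outside V8. The following is a minimal sketch, not V8 code: Address is assumed to be modelled as std::uintptr_t, and MY_FUNCTION_ADDR, Builtin_Demo, BuiltinFn, and CallThroughAddress are hypothetical names invented for the illustration. It shows a C++ function being turned into a plain address and later called back through that address, which is the role the address plays for a C++ Builtin.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a raw address is modelled as an unsigned integer wide enough
// to hold a pointer. MY_FUNCTION_ADDR is a hypothetical stand-in for
// V8's FUNCTION_ADDR macro.
using Address = std::uintptr_t;
#define MY_FUNCTION_ADDR(f) (reinterpret_cast<Address>(f))

// Hypothetical C++ builtin with a toy signature; real Builtin_##Name
// functions take V8-internal arguments instead.
int Builtin_Demo(int x) { return x + 1; }

using BuiltinFn = int (*)(int);

// Converts the stored address back into a function pointer and calls it.
int CallThroughAddress(Address addr, int arg) {
  assert(addr != 0);  // a null address would mean nothing was registered
  return reinterpret_cast<BuiltinFn>(addr)(arg);
}
```

    Calling CallThroughAddress(MY_FUNCTION_ADDR(Builtin_Demo), 41) runs Builtin_Demo through its address. In V8 the adaptor code generated by Generate_Adaptor plays the part of the caller; the direct cast back to a function pointer here is only a simplification.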

    Returning to #define BUILD_CPP(Name), execution enters AddBuiltin, whose source is:

    void SetupIsolateDelegate::AddBuiltin(Builtins* builtins, int index,
                                          Code code) {
      DCHECK_EQ(index, code.builtin_index());
      builtins->set_builtin(index, code);
    }
    //..............separator.......................
    void Builtins::set_builtin(int index, Code builtin) {
      isolate_->heap()->set_builtin(index, builtin);
    }
    //.............separator..........................
    void Heap::set_builtin(int index, Code builtin) {
      DCHECK(Builtins::IsBuiltinId(index));
      DCHECK(Internals::HasHeapObjectTag(builtin.ptr()));
      // The given builtin may be completely uninitialized thus we cannot check its
      // type here.
      isolate()->builtins_table()[index] = builtin.ptr();
    }
    

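    In the code above, Builtins::set_builtin() calls Heap::set_builtin() to store the Builtin in isolate()->builtins_table(). builtins_table() is a V8_INLINE accessor that returns Address*, which is used as an array with index as the subscript; this array stores all Builtins. At this point Builtin initialization is complete. Figure 2 shows the function call stack.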
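    The whole flow, one list macro driving per-kind build macros that fill a table slot by slot, can be reproduced in miniature. The sketch below is a toy model under stated assumptions, not V8 code: TOY_BUILTIN_LIST has only two kinds, ToyCppBuiltin is an invented entry, ToyCode stands in for Code, and a std::vector stands in for builtins_table(). Only the pattern (index starting at 0, build, then add with index++) follows the source above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical two-kind list; the real BUILTIN_LIST takes seven macros.
#define TOY_BUILTIN_LIST(CPP, TFJ) \
  CPP(ToyCppBuiltin)               \
  TFJ(StringPrototypeCharAt, 1)    \
  TFJ(StringPrototypeToString, 0)

// Count the entries by expanding every item to "+1".
#define TOY_COUNT_ONE(...) +1
constexpr int kToyBuiltinCount =
    0 TOY_BUILTIN_LIST(TOY_COUNT_ONE, TOY_COUNT_ONE);
#undef TOY_COUNT_ONE

// Stand-in for v8::internal::Code: just records what was built.
struct ToyCode {
  int builtin_index;
  std::string kind;
  std::string name;
};

// Mirrors the checks in AddBuiltin / Heap::set_builtin, then stores the entry.
void AddToyBuiltin(std::vector<ToyCode>* table, int index,
                   const ToyCode& code) {
  assert(index == code.builtin_index);
  assert(index >= 0 && index < static_cast<int>(table->size()));
  (*table)[index] = code;
}

// Miniature of SetupBuiltinsInternal(): each list entry expands to
// "build, then add at index++".
std::vector<ToyCode> SetupToyBuiltins() {
  std::vector<ToyCode> table(kToyBuiltinCount);
  int index = 0;
  ToyCode code{};
#define TOY_BUILD_CPP(Name)            \
  code = ToyCode{index, "CPP", #Name}; \
  AddToyBuiltin(&table, index++, code);
#define TOY_BUILD_TFJ(Name, Argc)      \
  code = ToyCode{index, "TFJ", #Name}; \
  AddToyBuiltin(&table, index++, code);
  TOY_BUILTIN_LIST(TOY_BUILD_CPP, TOY_BUILD_TFJ)
#undef TOY_BUILD_CPP
#undef TOY_BUILD_TFJ
  return table;
}
```

    After SetupToyBuiltins() returns, each slot holds the entry whose position in the list equals its index, which is why a Builtin's index can be read off the expanded list.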

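    The Builtin debugging method is summarized as follows:

    (1) Expand the BUILTIN_LIST macro to obtain the index of each Builtin. The preprocessor in VS2019 can be used to expand the macro.

    (2) Set a conditional breakpoint on index. Figure 3 shows how to trace Builtin number 12.

    Setting a breakpoint in the Builtin's source is the simplest and most direct method. If you do not know which way a Builtin is built (for example BUILD_CPP or BUILD_TFS), set a conditional breakpoint in each of the build functions. Figure 4 shows a breakpoint set in the Substring source.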
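    As an alternative to reading the preprocessor output by hand, the name-to-index mapping can be generated from the list itself. The sketch below is a hypothetical illustration on a three-entry toy list, not V8 code: DEMO_BUILTIN_LIST, ToyCppBuiltin, DemoBuiltinId, kDemoBuiltinNames, and DemoBuiltinIndexOf are all invented names. It assumes only that indices are assigned in list order, as in SetupBuiltinsInternal().

```cpp
#include <cassert>
#include <cstring>

// Hypothetical toy list with two kinds of entries.
#define DEMO_BUILTIN_LIST(CPP, TFJ) \
  CPP(ToyCppBuiltin)                \
  TFJ(StringPrototypeCharAt, 1)     \
  TFJ(StringPrototypeSubstring, 2)

// First expansion: one enumerator per entry, so each value is its index.
#define DEMO_ENUM_CPP(Name) k##Name,
#define DEMO_ENUM_TFJ(Name, Argc) k##Name,
enum DemoBuiltinId : int {
  DEMO_BUILTIN_LIST(DEMO_ENUM_CPP, DEMO_ENUM_TFJ) kDemoBuiltinCount
};
#undef DEMO_ENUM_CPP
#undef DEMO_ENUM_TFJ

// Second expansion: the entry names as strings, in the same order.
#define DEMO_NAME_CPP(Name) #Name,
#define DEMO_NAME_TFJ(Name, Argc) #Name,
static const char* const kDemoBuiltinNames[] = {
    DEMO_BUILTIN_LIST(DEMO_NAME_CPP, DEMO_NAME_TFJ)};
#undef DEMO_NAME_CPP
#undef DEMO_NAME_TFJ

// Returns the index of the named entry, or -1 if it is not in the list.
int DemoBuiltinIndexOf(const char* name) {
  assert(name != nullptr);
  for (int i = 0; i < kDemoBuiltinCount; ++i) {
    if (std::strcmp(kDemoBuiltinNames[i], name) == 0) return i;
  }
  return -1;
}
```

    With the index in hand, the breakpoint condition takes the form (int)builtin_index == N, where N is the value found for the Builtin of interest. The indices in this toy list are not the real V8 indices; those depend on the full BUILTIN_LIST of the V8 version being debugged.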

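    Technical summary

    (1) Use a 7.x release of V8 when debugging Builtins; later versions no longer have v8_use_snapshot;

    (2) When building V8, set v8_optimized_debug = false to turn off compiler optimizations;

    (3) Because builtin_index is an int32_t, write (int)builtin_index in conditional breakpoints.

    That's all for today. See you next time.

    My abilities are limited; if you find shortcomings or omissions, corrections are welcome.

    WeChat: qq9123013 (add the note "v8交流")  Email: v8blink@outlook.com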

    This work is released under the CC license; reposts must credit the author and link to this article.