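Developing and Analyzing a Custom Mutator for AFL
Preface:
What is a mutator?
A mutator is a third-party component, usually written by the person doing the fuzzing, that enables structure-aware fuzzing by using a library that mutates input according to a given grammar. This allows much finer-grained handling of the fuzz data.
When is a mutator needed?
When the testcases follow a fixed file format such as XML, a mutator has to be written to fuzz them with precision.
Why write the mutator in Python?
Python has a more complete set of libraries for an XML mutator, so we write the mutator in Python.
Let's first look at how AFL calls into a mutator:
The analysis below covers the C library mutator functions afl_custom_fuzz and afl_custom_init; the Python fuzz and init functions are essentially the same.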
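afl_custom_fuzz call analysis:
If not a single cycle has been completed after 60 minutes, AFL_EXPAND_HAVOC_NOW is set in the environment.


Once AFL_EXPAND_HAVOC_NOW is detected in the environment, afl->expand_havoc is set to 1.

When afl->expand_havoc is 1, afl->limit_time_sig = -1.

Once afl->limit_time_sig <= 0 is confirmed, fuzz_one_original() is executed,

which in turn calls afl_custom_fuzz.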

[Figure: 6.png]
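In the call shown above, el->data is the mutator's private state (the pointer returned by afl_custom_init). The buffer passed along with it holds the testcase derived from the files in the directory given with -i, and the custom mutator turns that buffer into a testcase of its own.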
afl_custom_init call analysis:
Call flow:
setup_custom_mutators -> load_custom_mutator -> afl_custom_init



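The mutator is then handed the wrapped afl_t struct pointer *afl and the random seed, seed.
Development workflow:
According to the official documentation (Custom Mutators | AFLplusplus), we need to define an init() that sets up our seed, so that the randomness used by the mutator stays in sync with the randomness used by AFL. Overall, the XML mutator is written on top of lxml: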
from lxml import etree as ET
Setting up the initialization function init():
def init(seed):
    """
    Called once when AFL starts up. Seed is used to identify the AFL instance in log files
    """
    global __mutator__  # global handle to the mutator object
    global __seed__  # global random seed
    # Get the seed
    __seed__ = seed  # overwrite our seed with the one passed in by AFL
    # Create a global mutation class
    try:
        __mutator__ = XmlMutatorMin(__seed__, verbose=__log__)  # instantiate the class
        log("init(): Mutator created")
    except RuntimeError as e:
        # Exceptions have no .message attribute in Python 3, so format e directly
        log("init(): Can't create mutator: %s" % e)
Note that the object created by __mutator__ = XmlMutatorMin(__seed__, verbose=__log__) is what fuzz() calls into later. Execution then moves to XmlMutatorMin.py, where __init__ initializes the XmlMutatorMin class:
class XmlMutatorMin:
    """
    Optional parameters:
        seed        Seed used by the PRNG (default: "RANDOM")
        verbose     Verbosity (default: False)
    """
    def __init__(self, seed="RANDOM", verbose=False):
        """ Initialize seed, database and mutators """
        # Verbosity
        self.verbose = verbose
        # Initialize PRNG
        self.seed = str(seed)
        if self.seed == "RANDOM":
            random.seed()
        else:
            if self.verbose:
                print("Static seed '%s'" % self.seed)
            random.seed(self.seed)
        # Initialize input and output documents
        self.input_tree = None
        self.tree = None
        # High-level mutators (no database needed)
        hl_mutators_delete = [
            "del_node_and_children",
            "del_node_but_children",
            "del_attribute",
            "del_content",
        ]  # Delete items
        hl_mutators_fuzz = ["fuzz_attribute"]  # Randomly change attribute values
        # Exposed mutators
        self.hl_mutators_all = hl_mutators_fuzz + hl_mutators_delete
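The self.hl_mutators_all list above holds the names of the functions the mutator uses to mutate the data passed in from AFL. A deinit() is also needed as the counterpart of init(), to clean up at shutdown:
def deinit():  # optional for Python
    pass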
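The seeding logic in __init__ can be checked in isolation. The following is a minimal Python 3 sketch, not the author's code: SeededPicker and its pick() method are hypothetical stand-ins that keep only the PRNG setup of XmlMutatorMin. It illustrates that two objects created with the same static seed choose the same sequence of mutator names.

```python
import random


class SeededPicker:
    """Hypothetical stand-in for XmlMutatorMin, reduced to its PRNG setup."""

    def __init__(self, seed="RANDOM"):
        self.seed = str(seed)
        if self.seed == "RANDOM":
            random.seed()            # seed from system entropy
        else:
            random.seed(self.seed)   # static seed: reproducible choices

    def pick(self, names, count):
        # Draw `count` names from the shared module-level PRNG
        return [random.choice(names) for _ in range(count)]


MUTATORS = [
    "fuzz_attribute",
    "del_node_and_children",
    "del_node_but_children",
    "del_attribute",
    "del_content",
]

first = SeededPicker(1234).pick(MUTATORS, 8)
second = SeededPicker(1234).pick(MUTATORS, 8)
print(first == second)
```

Like the original class, this sketch seeds the module-level random generator, so creating a second object resets the sequence for every user of random in the same process.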
Setting up the fuzzing function fuzz():
buf: the data passed in
def fuzz(buf, add_buf, max_size):
    """
    Called for each fuzzing iteration.
    """
    global __mutator__
    # Do we have a working mutator object?
    if __mutator__ is None:
        log("fuzz(): Can't fuzz, no mutator available")
        return buf
    # Try to use the AFL buffer
    via_buffer = True
    # Interpret the AFL buffer (an array of bytes) as a string
    if via_buffer:
        try:
            buf_str = str(buf)
            log("fuzz(): AFL buffer converted to a string")
        except Exception:
            via_buffer = False
            log("fuzz(): Can't convert AFL buffer to a string")
    # Load XML from the AFL string
    if via_buffer:
        try:
            __mutator__.init_from_string(buf_str)
            log(
                "fuzz(): Mutator successfully initialized with AFL buffer (%d bytes)"
                % len(buf_str)
            )
        except Exception:
            via_buffer = False
            log("fuzz(): Can't initialize mutator with AFL buffer")
    # If init from AFL buffer wasn't successful
    if not via_buffer:
        log("fuzz(): Returning unmodified AFL buffer")
        return buf
    # Successful initialization -> mutate
    try:
        __mutator__.mutate(max=5)
        log("fuzz(): Input mutated")
    except Exception:
        log("fuzz(): Can't mutate input => returning buf")
        return buf
    # Convert mutated data to an array of bytes
    try:
        data = bytearray(__mutator__.save_to_string())
        log("fuzz(): Mutated data converted as bytes")
    except Exception:
        log("fuzz(): Can't convert mutated data to bytes => returning buf")
        return buf
    # Everything went fine, returning mutated content
    log("fuzz(): Returning %d bytes" % len(data))
    return data
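The code above processes the buf passed in by AFL in four steps:
1. Convert the buf passed in by AFL to a string:
buf_str = str(buf)
2. Build a tree from the XML stream:
__mutator__.init_from_string(buf_str)
3. Mutate the data that is now held in the tree:
max is the upper bound on how many mutation operations are applied
__mutator__.mutate(max=5)
4. Save the mutated XML tree as a testcase:
data = bytearray(__mutator__.save_to_string())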
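The four steps can be sketched for Python 3, where str() applied to a bytearray yields its printable representation rather than the XML text, and bytearray() on a str requires an encoding. The sketch below is an illustration under stated assumptions, not the article's mutator: it uses the standard library xml.etree.ElementTree instead of lxml so that it runs without extra packages, and its single mutation (dropping one attribute) is a deliberately simple placeholder.

```python
import io
import random
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)


def init(seed):
    # Seed the PRNG with the value AFL passes in
    random.seed(seed)


def fuzz(buf, add_buf, max_size):
    """Return a mutated copy of buf, or buf itself if anything fails."""
    try:
        # Steps 1-2: parse the raw bytes directly, with no str() round trip
        tree = ET.parse(io.BytesIO(bytes(buf)))
    except ET.ParseError:
        return buf
    # Step 3: one toy mutation, deleting a random attribute if any exists
    candidates = [elem for elem in tree.iter() if elem.attrib]
    if candidates:
        elem = random.choice(candidates)
        del elem.attrib[random.choice(sorted(elem.attrib))]
    # Step 4: serialize the tree back to bytes for AFL
    data = bytearray(ET.tostring(tree.getroot()))
    return data if len(data) <= max_size else buf


init(7)
out = fuzz(bytearray(b'<a x="1"><b y="2"/></a>'), None, 4096)
print(out)
```

Returning the unmodified buf on every failure path mirrors the fallback behavior of the fuzz() function shown earlier.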
init_from_string:
Rebuilds the incoming XML stream into a tree, which makes the data easier to work on
def init_from_string(self, input_string):
    """ Initialize the mutator from a XML string """
    # Get a pointer to the top-element
    self.input_tree = self.__parse_xml(input_string)
    # Get a working copy
    self.tree = deepcopy(self.input_tree)  # deepcopy yields a scratch tree to mutate

def __parse_xml(self, xml):
    """ Parse an XML string. Basic wrapper around lxml.parse() """
    try:
        tree = ET.parse(io.BytesIO(xml))  # lxml.etree.parse rebuilds the XML stream into a tree
    except ET.ParseError:
        raise RuntimeError("XML isn't well-formed!")
    except LookupError as e:
        raise RuntimeError(e)
    # Return a document wrapper
    return tree
mutate:
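The mutations applied to the XML data fall into these types:
- delete a node but keep its children
- delete a node without keeping its children
- delete an attribute
- delete content
- fuzz the value of an attribute
1. Making the choice of mutation function random: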
def mutate(self, min=1, max=5):
    """ Execute some high-level mutators between $min and $max times, then some medium-level ones """
    # High-level mutation
    self.__exec_among(self, self.hl_mutators_all, min, max)

def __exec_among(self, module, functions, min_times, max_times):
    """ Randomly execute $functions between $min and $max times """
    for i in xrange(random.randint(min_times, max_times)):
        # Function names are mangled because they are "private"
        getattr(module, "_XmlMutatorMin__" + random.choice(functions))()
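min and max make the number of mutation calls random (that is, how many mutation operations are applied to the data).
__exec_among makes the type of mutation random (that is, it picks one of the mutation types listed above at random).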
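The dispatch relies on Python's name mangling: a method named __foo inside class C is stored as _C__foo, which is why the lookup string is built with the class name as a prefix. A minimal Python 3 sketch of the same pattern follows; TinyMutator, alpha and beta are hypothetical names used only for illustration, and range() takes the place of the Python 2 xrange().

```python
import random


class TinyMutator:
    """Hypothetical class; only the dispatch logic mirrors the code above."""

    def __init__(self):
        self.calls = []
        self.names = ["alpha", "beta"]

    def mutate(self, min=1, max=5):
        self.__exec_among(self, self.names, min, max)

    def __exec_among(self, module, functions, min_times, max_times):
        # Run a random number of randomly chosen mutators
        for _ in range(random.randint(min_times, max_times)):
            # "__alpha" defined in this class is stored as "_TinyMutator__alpha"
            getattr(module, "_TinyMutator__" + random.choice(functions))()

    def __alpha(self):
        self.calls.append("alpha")

    def __beta(self):
        self.calls.append("beta")


random.seed(0)
m = TinyMutator()
m.mutate(min=2, max=4)
print(m.calls)
```

Looking methods up by string keeps the list of exposed mutators as plain data, at the cost of a hard-coded class name in the prefix.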
2. Deleting a node:
def __del_node(self, delete_children):
    """ Called by the __del_node_* mutators """
    # Select a node to modify (but the root one)
    (rand_elem_id, rand_elem) = self.__pick_element(exclude_root_node=True)
    # If the document includes only a top-level element
    # Then we can't pick a element (given that "exclude_root_node = True")
    # Is the document deep enough?
    if rand_elem is None:
        if self.verbose:
            print("Can't delete a node: document not deep enough!")
        return
    # Log something
    if self.verbose:
        but_or_and = "and" if delete_children else "but"
        print(
            "Deleting tag #%i '%s' %s its children"
            % (rand_elem_id, rand_elem.tag, but_or_and)
        )
    if delete_children is False:
        # Link children of the random (soon to be deleted) node to its parent
        for child in rand_elem:
            rand_elem.getparent().append(child)
    # Remove the node
    rand_elem.getparent().remove(rand_elem)
The code calls getparent() on the lxml element to reach the parent of the selected node. If delete_children is False, append() first re-attaches each child of the selected node to that parent. In either case, remove() then deletes the selected node from its parent.
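The re-parenting step can be tried without lxml. The sketch below is an illustration under assumptions rather than the article's code: it uses the standard library xml.etree.ElementTree, whose elements have no getparent(), so a child-to-parent map is built first. delete_node is a hypothetical helper name, and the target must not be the root element.

```python
import xml.etree.ElementTree as ET


def delete_node(root, target, delete_children):
    """Remove target from the tree under root; optionally keep its children."""
    # Stdlib elements have no getparent(), so build a child -> parent map
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parent = parent_of[target]
    if not delete_children:
        # list(target) takes a snapshot, so the loop never walks a child
        # list that is being modified at the same time
        for child in list(target):
            target.remove(child)
            parent.append(child)
    parent.remove(target)


root = ET.fromstring("<r><a><b/><c/></a><d/></r>")
delete_node(root, root.find("a"), delete_children=False)
print([e.tag for e in root])  # ['d', 'b', 'c']
```

Taking a snapshot of the children before moving them is a precaution that also applies when porting the loop to other tree libraries.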
Without keeping the child nodes:
def __del_node_and_children(self):
    """High-level minimizing mutator
    Delete a random node and its children (i.e. delete a random tree)"""
    self.__del_node(True)
Keeping the child nodes:
def __del_node_but_children(self):
    """High-level minimizing mutator
    Delete a random node but its children (i.e. link them to the parent of the deleted node)"""
    self.__del_node(False)
3. Deleting an attribute:
def __del_attribute(self):
    """High-level minimizing mutator
    Delete a random attribute from a random node"""
    # Select a node to modify
    (rand_elem_id, rand_elem) = self.__pick_element()
    # Get all the attributes
    attribs = rand_elem.keys()
    # Is there attributes?
    if len(attribs) < 1:
        if self.verbose:
            print("No attribute: can't delete!")
        return
    # Pick a random attribute
    rand_attrib_id = random.randint(0, len(attribs) - 1)
    rand_attrib = attribs[rand_attrib_id]
    # Log something
    if self.verbose:
        print(
            "Deleting attribute #%i '%s' of tag #%i '%s'"
            % (rand_attrib_id, rand_attrib, rand_elem_id, rand_elem.tag)
        )
    # Delete the attribute
    rand_elem.attrib.pop(rand_attrib)
The element's attrib.pop() is used to delete one attribute of the node.
4. Deleting content:
def __del_content(self):
    """High-level minimizing mutator
    Delete the attributes and children of a random node"""
    # Select a node to modify
    (rand_elem_id, rand_elem) = self.__pick_element()
    # Log something
    if self.verbose:
        print("Resetting tag #%i '%s'" % (rand_elem_id, rand_elem.tag))
    # Reset the node
    rand_elem.clear()
The element's clear() is used to delete the content inside the node.
5. Fuzzing the value of an attribute:
def __fuzz_attribute(self):
    """ Fuzz (part of) an attribute value """
    # Select a node to modify
    (rand_elem_id, rand_elem) = self.__pick_element()
    # Get all the attributes
    attribs = rand_elem.keys()
    # Is there attributes?
    if len(attribs) < 1:
        if self.verbose:
            print("No attribute: can't replace!")
        return
    # Pick a random attribute
    rand_attrib_id = random.randint(0, len(attribs) - 1)  # pick a random attribute index
    rand_attrib = attribs[rand_attrib_id]  # the attribute at that index
    # We have the attribute to modify
    # Get its value
    attrib_value = rand_elem.get(rand_attrib)  # value of the chosen attribute
    # print("- Value: " + attrib_value)
    # Should we work on the whole value?
    func_call = r"(?P<func>[a-zA-Z:\-]+)\((?P<args>.*?)\)"
    p = re.compile(func_call)
    l = p.findall(attrib_value)
    if random.choice((True, False)) and l:
        # Randomly pick one the function calls
        (func, args) = random.choice(l)
        # Split by "," and randomly pick one of the arguments
        value = random.choice(args.split(","))
        # Remove superfluous characters
        unclean_value = value
        value = value.strip(" ").strip("'")
        # print("Selected argument: [%s]" % value)
    else:
        value = attrib_value
    # For each type, define some possible replacement values
    choices_number = (
        "0",
        "11111",
        "-128",
        "2",
        "-1",
        "1/3",
        "42/0",
        "1094861636 idiv 1.0",
        "-1123329771506872 idiv 3.8",
        "17=$numericRTF",
        str(3 + random.randrange(0, 100)),
    )
    choices_letter = (
        "P" * (25 * random.randrange(1, 100)),
        "%s%s%s%s%s%s",
        "foobar",
    )
    choices_alnum = (
        "Abc123",
        "020F0302020204030204",
        "020F0302020204030204" * (random.randrange(5, 20)),
    )
    # Fuzz the value
    if random.choice((True, False)) and value == "":
        # Empty value: leave it alone
        new_value = value
    elif random.choice((True, False)) and value.isdigit():
        # Replace a number with a number
        new_value = random.choice(choices_number)
    elif random.choice((True, False)) and value.isalpha():
        # Replace text with text
        new_value = random.choice(choices_letter)
    elif random.choice((True, False)) and value.isalnum():
        # Replace an alphanumeric value with an alphanumeric value
        new_value = random.choice(choices_alnum)
    else:
        # Default type
        new_value = random.choice(choices_alnum + choices_letter + choices_number)
    # If we worked on a substring, apply changes to the whole string
    if value != attrib_value:
        # No ' around empty values
        if new_value != "" and value != "":
            new_value = "'" + new_value + "'"
        # Apply changes
        new_value = attrib_value.replace(unclean_value, new_value)
    # Log something
    if self.verbose:
        print(
            "Fuzzing attribute #%i '%s' of tag #%i '%s'"
            % (rand_attrib_id, rand_attrib, rand_elem_id, rand_elem.tag)
        )
    # Modify the attribute
    rand_elem.set(rand_attrib, new_value.decode("utf-8"))
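In essence, the function uses a regular expression to select, at random, all or part of an attribute value, then replaces it with an entry of the same type from a hand-made dictionary.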
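The regular-expression step can be tried on its own. The sketch below assumes the pattern has two named groups, func and args, as the (func, args) unpacking in __fuzz_attribute suggests. The attribute value concat('abc', 'def') is a made-up example.

```python
import re

# Named groups: "func" is the function name, "args" the raw argument list
FUNC_CALL = re.compile(r"(?P<func>[a-zA-Z:\-]+)\((?P<args>.*?)\)")

attrib_value = "concat('abc', 'def')"
calls = FUNC_CALL.findall(attrib_value)
print(calls)  # [('concat', "'abc', 'def'")]

# Split the argument list and strip spaces and quotes, as the mutator does
func, args = calls[0]
cleaned = [arg.strip(" ").strip("'") for arg in args.split(",")]
print(cleaned)  # ['abc', 'def']
```

With two groups in the pattern, findall() returns one (func, args) tuple per match, which is the shape that random.choice(l) unpacks in the mutator.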
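A mutator meant for real use has to be rewritten as a Python 3 compatible version, and the formatted output has to be wrapped in bytearray() before it works properly.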
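Calling the Python library from AFL++:
export PYTHONPATH=/home/mutator/
export AFL_PYTHON_MODULE=mutator
afl-fuzz ....
PYTHONPATH: the path of the directory that contains mutator.py
export AFL_PYTHON_MODULE: names mutator.py inside that directory; note that the .py suffix must be left off
afl-fuzz ....: the afl-fuzz command line to run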
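Before starting afl-fuzz, the module can be smoke-tested from plain Python. The sketch below is only an approximation of the setup above and makes no claim about how AFL++ loads the module internally: it writes a throwaway module named demo_mutator (a hypothetical name) into a temporary directory, adds that directory to sys.path in place of PYTHONPATH, imports the module by its name without the .py suffix, and calls init() and fuzz() once.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Write a throwaway mutator module so the sketch is self-contained
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "demo_mutator.py"), "w") as fh:
    fh.write(textwrap.dedent('''
        def init(seed):
            pass

        def fuzz(buf, add_buf, max_size):
            return bytearray(buf[:max_size])
    '''))

# Stand-ins for PYTHONPATH=<directory> and AFL_PYTHON_MODULE=<name without .py>
sys.path.insert(0, workdir)
module = importlib.import_module("demo_mutator")

module.init(1234)
result = module.fuzz(bytearray(b"<a/>"), bytearray(b""), 1024)
print(type(result).__name__, len(result))
```

A check like this catches import errors and wrong return types early, before they surface inside a fuzzing run.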