如何对 python 函数体进行哈希处理?

问题描述 投票:0回答:2
所以我正在研究一种分布式构建系统。系统允许将脚本片段作为构建步骤执行。我需要能够以注释和文档字符串不影响哈希的方式对这些代码片段进行哈希处理。我通过使用 ast 模块来解析代码,然后执行 ast.dump 并对生成的字符串进行哈希处理,从而达到了这一目标。对我来说,自然的下一步是清除每个 FunctionDef 节点主体中的第一个 Expr(str()) 节点。

有没有更好的方法来解决这个问题?我不禁认为这是一个必须已经解决过很多次的问题,但我在堆栈溢出上找不到任何内容。

python function hash
2个回答
5
投票
所以这是我迄今为止找到的最好的解决方案。

import ast import hashlib import inspect def _remove_docstring(node): ''' Removes all the doc strings in a FunctionDef or ClassDef as node. Arguments: node (ast.FunctionDef or ast.ClassDef): The node whose docstrings to remove. ''' if not (isinstance(node, ast.FunctionDef) or isinstance(node, ast.ClassDef)): return if len(node.body) != 0: docstr = node.body[0] if isinstance(docstr, ast.Expr) and isinstance(docstr.value, ast.Str): node.body.pop(0) #------------------------------------------------------------------------------- def hash_function(func): ''' Produces a hash for the code in the given function. Arguments: func (types.FunctionObject): The function to produce a hash for ''' func_str = inspect.getsource(func) module = ast.parse(func_str) assert len(module.body) == 1 and isinstance(module.body[0], ast.FunctionDef) # Clear function name so it doesn't affect the hash func_node = module.body[0] func_node.name = "" # Clear all the doc strings for node in ast.walk(module): _remove_docstring(node) # Convert the ast to a string for hashing ast_str = ast.dump(module, annotate_fields=False) # Produce the hash fhash = hashlib.sha256(ast_str) result = fhash.hexdigest() return result #------------------------------------------------------------------------------- # Function 1 def test(blah): 'This is a test' class Test(object): ''' My test class ''' print blah def sub_function(foo): '''arg''' print hash_function(test) #------------------------------------------------------------------------------- # Function 2 def test2(blah): 'This is a test' class Test(object): ''' My test class ''' print blah def sub_function(foo): '''arg meh''' print hash_function(test2)
    

0
投票
这是一个基于 Luke 的答案的更新示例,针对 Python 3 进行了更新并支持 lambda 函数。

def hash_function(func, char_length=8): """Produces a hash for the code in the given function. See https://stackoverflow.com/a/49998190/315168 :param char_length: How many characters you want in your hash, to reduce the hash size. :return: Part of hex hash of the function body """ assert callable(func), f"Not a function: {func}" func_str = inspect.getsource(func) # Heurestics if this is a lambda function - in this case ast will fail # Account for ending new line (may or may not be there?) lambda_like = len(func_str.split("\n")) in (1, 2) and "lambda" in func_str if not lambda_like: module = ast.parse(func_str) assert len(module.body) == 1 and isinstance(module.body[0], ast.FunctionDef) # Clear function name so it doesn't affect the hash func_node = module.body[0] func_node.name = "" # Clear all the doc strings for node in ast.walk(module): _remove_docstring(node) # Convert the ast to a string for hashing ast_str = ast.dump(module, annotate_fields=False).encode("utf-8") # Produce the hash fhash = hashlib.sha256(ast_str) else: # Handle lambda special case fhash = hashlib.sha256(func_str.encode("utf-8")) return fhash.hexdigest()[0:char_length]
    
© www.soinside.com 2019 - 2024. All rights reserved.