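I am writing a tool that rewrites C files. Some of them are very old legacy files that declare uninitialized local variables at the start of every function. I need to develop an AST matcher and a corresponding MatchCallback that find the position immediately after the last uninitialized variable in a function, where I can place some special code.
As a bonus, for more recent C code that mixes initialized and uninitialized variables at the start of a functionDecl, I would like to find the position immediately after the function's opening "{".
So far I have identified a candidate clang-query style matcher that finds uninitialized local variables beneath a function decl. It should be a good starting point for a solution.
Here is the Entry handler I use for the MatchCallback. I need help in the run method to find the position where I can add my own code.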
class EntryHandler : public MatchFinder::MatchCallback {
public:
    explicit EntryHandler(Rewriter& rewriter)
        : mRewriter(rewriter)
    {}

    //! This is where the magic happens.
    void run(const MatchFinder::MatchResult &result) override {
        if (const auto* FD = result.Nodes.getNodeAs<FunctionDecl>("func")) {
            // This will be called for each function.
            const auto functionName = FD->getNameInfo().getAsString();
            const auto VD = result.Nodes.getNodeAs<VarDecl>("uninitVar");
            const auto uninitializedVariableName = VD->getNameAsString();

            // I NEED HELP HERE
            // Retrieve all uninitialized variables in the function.
            //for (const auto& next : result.Nodes.getMap()) {
            //    const auto key = next.first;
            //}
            //const auto next = result.Nodes.getNodeAs("uninitVar");
            //for (const auto *VD : Vars) {
            //    // This will be called for each variable declaration in the function.
            //    std::cout << " Found variable: " << VD->getNameAsString() << "\n";
            //}
        }
    }

private:
    Rewriter& mRewriter;
};
// ASTConsumer implementation reads AST produced by the Clang parser.
// Here we register our instrumentation matchers and run them against the AST.
class MyASTConsumer : public ASTConsumer {
public:
    MyASTConsumer(Rewriter& r)
        : mEntryHandler(r) {
        // Add for finding function entry declarations.
        mMatcher.addMatcher(traverse(TK_IgnoreUnlessSpelledInSource,
            functionDecl(isExpansionInMainFile(),
                forEachDescendant(varDecl(unless(
                    hasInitializer(anything())),
                    hasLocalStorage()).bind("uninitVar")))
            .bind("func")), &mEntryHandler);
    }

    void HandleTranslationUnit(ASTContext& context) override {
        // Run the matchers when we have the whole TU parsed.
        mMatcher.matchAST(context);
    }

private:
    EntryHandler mEntryHandler;
    MatchFinder mMatcher;
};
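The task of finding the last uninitialized local variable declaration in a function is not well suited to what Clang AST matchers can do. For example, there is no "last" operator. The forEachDescendant matcher produces multiple match results inside each function (one per uninitialized declaration), and each result is delivered separately to MatchCallback::run. Correlating the resulting callbacks is a simple matter of bookkeeping, but it is a lot of effort for little gain.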
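For comparison, the bookkeeping mentioned above could look roughly like the sketch below. It does not use Clang at all: LastUninitTracker, Seen, onMatch and lastFor are hypothetical names, the function is keyed by its name purely for illustration (a real tool would more likely key on the FunctionDecl pointer), and the offset stands in for a source position.

```cpp
#include <map>
#include <string>

// Hypothetical record of one reported variable: its name and a position
// (for example a file offset) used to decide which one comes last.
struct Seen {
  std::string var;
  unsigned offset;
};

// Collects one "match" per (function, variable) pair and remembers, per
// function, the variable with the greatest position seen so far.
class LastUninitTracker {
public:
  // Models what MatchCallback::run would do for each separate match.
  void onMatch(const std::string &func, const std::string &var,
               unsigned offset) {
    auto it = last_.find(func);
    if (it == last_.end()) {
      last_.emplace(func, Seen{var, offset});
    } else if (offset > it->second.offset) {
      it->second = Seen{var, offset};
    }
  }

  // Returns the last variable recorded for `func`, or nullptr if none.
  const Seen *lastFor(const std::string &func) const {
    auto it = last_.find(func);
    return it == last_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, Seen> last_;
};
```

The answer for a function is only known once all of its matches have arrived, so the rewriting would have to be deferred until after matching. This sketch also inherits the matcher's behavior of seeing variables in nested blocks, which would need extra filtering.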
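Instead, I think this situation calls for direct AST inspection, so that you have full control over the filtering. After using a matcher (if you want) to find all FunctionDecl nodes, navigate to the declarations using:
- FunctionDecl::getBody() to get the body Stmt,
- dyn_cast<> to downcast it to CompoundStmt,
- CompoundStmt::body() to get an iterable sequence of Stmt,
- dyn_cast<> to downcast each statement to DeclStmt,
- DeclStmt::decls() to get an iterable sequence of Decl,
- dyn_cast<> to downcast each declaration to VarDecl,
- and finally, VarDecl::hasInit() to check for an initializer.
The iteration proceeds in source order, so you can get the textually last one simply by remembering the last uninitialized declaration seen during the iteration.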
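The "remember the last one" idea can be seen in isolation in the following minimal sketch. It deliberately avoids the Clang API: ToyVar, ToyStmt, LastUninit and findLastUninit are hypothetical stand-ins for VarDecl, the statements of a CompoundStmt body, and the scan itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang::VarDecl.
struct ToyVar {
  std::string name;
  bool hasInit;
};

// Hypothetical stand-in for one statement of a function body.
struct ToyStmt {
  bool isDeclStmt;            // false models any non-declaration statement
  std::vector<ToyVar> decls;  // empty unless isDeclStmt is true
};

// Result of the scan: which statement holds the last uninitialized variable.
struct LastUninit {
  bool found;
  std::size_t stmtIndex;  // index of the enclosing declaration statement
  std::string varName;
};

// Walk the top-level statements in order and keep overwriting the result,
// so whatever is left at the end is the textually last uninitialized one.
LastUninit findLastUninit(const std::vector<ToyStmt> &body) {
  LastUninit last{false, 0, ""};
  for (std::size_t i = 0; i < body.size(); ++i) {
    if (!body[i].isDeclStmt) {
      continue;  // only declaration statements are of interest
    }
    for (const ToyVar &v : body[i].decls) {
      if (!v.hasInit) {
        last = LastUninit{true, i, v.name};
      }
    }
  }
  return last;
}
```

Because only the top-level statements of the body are visited, declarations inside nested blocks are ignored, which matches the intent of inserting code right after the leading declarations.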
Regarding source locations:
- DeclStmt::getEndLoc() gets the end of the declaration statement.
- CompoundStmt::getBeginLoc() gets the opening brace.
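One detail worth keeping in mind when turning these locations into an insertion point: both point at the start of a token (the terminating ";" or the "{"), so new text belongs one character further on. With a Rewriter, InsertTextAfterToken is intended for this case. The sketch below only illustrates the arithmetic on a plain string; offsetOf and insertAfterToken are hypothetical helpers, not Clang functions, and they assume a valid 1-based line and column and a single-character token.

```cpp
#include <cstddef>
#include <string>

// Convert a 1-based (line, column) pair into a 0-based offset into `text`.
// Returns std::string::npos if the position lies outside the text.
// Assumes col >= 1; it does not check that col stays within the line.
std::size_t offsetOf(const std::string &text, unsigned line, unsigned col) {
  std::size_t pos = 0;
  for (unsigned l = 1; l < line; ++l) {
    pos = text.find('\n', pos);
    if (pos == std::string::npos) {
      return std::string::npos;
    }
    ++pos;  // step past the newline to the start of the next line
  }
  std::size_t off = pos + (col - 1);
  return off < text.size() ? off : std::string::npos;
}

// Insert `code` just after the single-character token at (line, col).
// The text is returned unchanged if the position is out of range.
std::string insertAfterToken(std::string text, unsigned line, unsigned col,
                             const std::string &code) {
  std::size_t off = offsetOf(text, line, col);
  if (off == std::string::npos) {
    return text;
  }
  text.insert(off + 1, code);  // +1 skips over the token itself
  return text;
}
```

For a body whose third line is "  int y;", passing line 3 and column 8 (the semicolon) places the new text directly after the declaration, and passing the position of the opening brace places it at function entry.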
The following complete program demonstrates the procedure:
// match-uninit-decl.cc
// Match uninitialized declarations.
#include "clang/AST/DeclCXX.h" // clang::CXXRecordDecl
#include "clang/ASTMatchers/ASTMatchFinder.h" // clang::ast_matchers::MatchFinder
#include "clang/Tooling/CommonOptionsParser.h" // CommonOptionsParser
#include "clang/Tooling/CompilationDatabase.h" // CompilationDatabase
#include "clang/Tooling/Tooling.h" // clang::tooling::ClangTool
#include "llvm/Support/CommandLine.h" // llvm::cl::OptionCategory
#include <string> // std::string
using clang::CXXRecordDecl;
using clang::ast_matchers::MatchFinder;
using clang::dyn_cast;
using clang::tooling::ClangTool;
using clang::tooling::CommonOptionsParser;
using clang::tooling::CompilationDatabase;
using clang::tooling::newFrontendActionFactory;
using namespace clang::ast_matchers;
class HandleMatch : public MatchFinder::MatchCallback {
public:
  virtual void run(const MatchFinder::MatchResult &Result) override
  {
    // Stringify a source location.
    auto locStr = [&](clang::SourceLocation loc) -> std::string {
      return loc.printToString(*(Result.SourceManager));
    };

    // Stringify a declaration's location.
    auto declLocStr = [&](clang::Decl const *decl) -> std::string {
      return locStr(decl->getLocation());
    };

    llvm::outs() << "match found:\n";

    // Defensively make sure we have a function declaration.
    if (clang::FunctionDecl const *functionDecl =
          Result.Nodes.getNodeAs<clang::FunctionDecl>("func")) {
      llvm::outs() << " func at " << declLocStr(functionDecl) << "\n";

      // If it has a body (i.e., it is not a prototype), get it.
      if (clang::Stmt const *body = functionDecl->getBody()) {
        llvm::outs() << " opening brace at " << locStr(body->getBeginLoc()) << "\n";

        // The body should always be a compound statement in C.
        if (auto compound = dyn_cast<clang::CompoundStmt>(body)) {
          // Keep track of the last uninitialized declaration.
          clang::VarDecl const *lastUninitDecl = nullptr;
          clang::DeclStmt const *lastUninitDeclStmt = nullptr;

          // Look at all statements in the body.
          for (clang::Stmt const *stmt : compound->body()) {
            // Filter for declaration statements.
            if (auto declStmt = dyn_cast<clang::DeclStmt>(stmt)) {
              // Iterate over the declarations in that statement. There
              // can be more than one, e.g., "int x,y;".
              for (clang::Decl const *decl : declStmt->decls()) {
                // Filter for those declaring variables, as opposed to
                // functions, types, etc.
                if (auto varDecl = dyn_cast<clang::VarDecl>(decl)) {
                  // Does it lack an initializer?
                  if (!varDecl->hasInit()) {
                    llvm::outs() << " uninit decl at " << declLocStr(varDecl) << "\n";
                    lastUninitDecl = varDecl;
                    lastUninitDeclStmt = declStmt;
                  }
                }
              }
            }
          }

          if (lastUninitDecl) {
            llvm::outs() << " last uninit decl at " << declLocStr(lastUninitDecl) << "\n";
            llvm::outs() << " end loc: " << locStr(lastUninitDeclStmt->getEndLoc()) << "\n";
          }
          else {
            llvm::outs() << " no uninit decls\n";
          }
        }
      }
    }
  }
};
static llvm::cl::OptionCategory MyToolCategory("my-tool options");

int main(int argc, char const **argv) {
  // Boilerplate.
  auto ExpectedParser = CommonOptionsParser::create(argc, argv, MyToolCategory);
  if (!ExpectedParser) {
    llvm::errs() << ExpectedParser.takeError();
    return 2;
  }
  CommonOptionsParser& OptionsParser = ExpectedParser.get();
  ClangTool Tool(OptionsParser.getCompilations(),
                 OptionsParser.getSourcePathList());

  MatchFinder finder;
  finder.addMatcher(
    // The matcher here just finds function declarations in the main
    // file. All declaration filtering is done afterward.
    functionDecl(
      isExpansionInMainFile()
    ).bind("func"),
    new HandleMatch
  );

  return Tool.run(newFrontendActionFactory(&finder).get());
}

// EOF
Makefile:
# Makefile
# Default target.
all:
.PHONY: all
# ---- Configuration ----
# Source repo directory to optionally use.
LLVM_REPO := $(HOME)/bld/llvm-project-2023-07-14
# If 1, then use the LLVM I built from source.
USE_BUILT_LLVM := 1
# Installation directory from a binary distribution.
# Has five subdirectories: bin include lib libexec share.
ifeq ($(USE_BUILT_LLVM),1)
CLANG_LLVM_INSTALL_DIR = $(LLVM_REPO)/build
else
CLANG_LLVM_INSTALL_DIR = $(HOME)/opt/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04
endif
# ---- llvm-config query results ----
# Program to query the various LLVM configuration options.
LLVM_CONFIG := $(CLANG_LLVM_INSTALL_DIR)/bin/llvm-config
# C++ compiler options to ensure ABI compatibility.
LLVM_CXXFLAGS := $(shell $(LLVM_CONFIG) --cxxflags)
# Directory containing the clang library files, both static and dynamic.
LLVM_LIBDIR := $(shell $(LLVM_CONFIG) --libdir)
# Other flags needed for linking, whether statically or dynamically.
LLVM_LDFLAGS_AND_SYSTEM_LIBS := $(shell $(LLVM_CONFIG) --ldflags --system-libs)
# ---- Compiler options ----
# C++ compiler.
#CXX = g++
CXX = $(CLANG_LLVM_INSTALL_DIR)/bin/clang++
# Compiler options, including preprocessor options.
CXXFLAGS =
CXXFLAGS += -Wall
CXXFLAGS += -Werror
CXXFLAGS += -g
# Get llvm compilation flags.
CXXFLAGS += $(LLVM_CXXFLAGS)
# When using an LLVM I built myself, I need some additional flags.
ifeq ($(USE_BUILT_LLVM),1)
CXXFLAGS += -I$(LLVM_REPO)/clang/include
CXXFLAGS += -I$(LLVM_REPO)/build/tools/clang/include
endif
# Linker options.
LDFLAGS =
# Pull in clang+llvm via libclang-cpp.so, which has everything, but is
# only available as a dynamic library.
LDFLAGS += -lclang-cpp
# Needed for llvm::InitLLVM, which is missing from libclang-cpp.so.
LDFLAGS += -lLLVMSupport
# Arrange for the compiled binary to search the libdir for that library.
# Otherwise, one can set the LD_LIBRARY_PATH envvar before running it.
# Note: the -rpath switch does not work on Windows.
LDFLAGS += -Wl,-rpath=$(LLVM_LIBDIR)
# Get the needed -L search path, plus things like -ldl.
LDFLAGS += $(LLVM_LDFLAGS_AND_SYSTEM_LIBS)
# ---- Recipes ----
# Compile a C++ source file.
%.o: %.cc
	$(CXX) -c -o $@ $(CXXFLAGS) $<
# Executable. It needs to be inside a directory because the Clang
# command line parser reads the executable name (via /proc/self/exe,
# not argv[0]) and appends "../lib" to find its headers.
all: bin/match-uninit-decl.exe
bin/match-uninit-decl.exe: match-uninit-decl.o
	@mkdir -p bin
	$(CXX) -g -Wall -o $@ $^ $(LDFLAGS)
# Symlink to let Clang find its headers in the Clang installation
# directory.
all: lib
lib:
	ln -s $(CLANG_LLVM_INSTALL_DIR)/lib
# Quick test.
.PHONY: run
run: bin/match-uninit-decl.exe
	./bin/match-uninit-decl.exe test.c --
.PHONY: clean
clean:
	$(RM) *.o lib
	rm -rf bin
# EOF
test.c:
// test.c
// Cases for last uninit decl finder.

int f1(void)
{
  int x;
  int y; // last uninit decl
  //   ^ end of its declaration statement
  int z = 5;

  x = 3;
  y = 4;

  return x+y+z;
}

int f2(void)
{
  // No uninit decls.
  int x2 = 2;
  return x2;
}

int f3(void)
{
  int x, y=7; // last uninit is 'x'
  //        ^ end of its declaration statement

  x = 2;

  return x+y;
}

int f4(void)
{
  // No uninit decls in this outermost body statement.
  {
    int x;
    x=3;
    return x;
  }
}

// EOF
Output:
$ ./bin/match-uninit-decl.exe test.c --
match found:
func at $PWD/test.c:4:5
opening brace at $PWD/test.c:5:1
uninit decl at $PWD/test.c:6:7
uninit decl at $PWD/test.c:7:7
last uninit decl at $PWD/test.c:7:7 <-- 'y'
end loc: $PWD/test.c:7:8
match found:
func at $PWD/test.c:17:5
opening brace at $PWD/test.c:18:1
no uninit decls
match found:
func at $PWD/test.c:24:5
opening brace at $PWD/test.c:25:1
uninit decl at $PWD/test.c:26:7
last uninit decl at $PWD/test.c:26:7 <-- 'x'
end loc: $PWD/test.c:26:13
match found:
func at $PWD/test.c:34:5
opening brace at $PWD/test.c:35:1
no uninit decls