libTooling - AST matcher for the last uninitialized variable in a function

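I am writing a tool that rewrites C files. Some of these files are very old legacy code in which every function declares uninitialized local variables at its start. I need to develop an AST matcher and a corresponding MatchCallback that find the location immediately after the last uninitialized variable in a function, where I can insert some special code.

As a bonus, for more recent C code that mixes initialized and uninitialized variables at the start of a functionDecl, I would like to find the location immediately after the function's opening "{".

So far I have identified a candidate clang-query-style matcher that finds the uninitialized local variables inside a function decl. This should be a good starting point for a solution.

Here is the entry handler I use as the MatchCallback. I need help in the run method to find the location where I can add my own code.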

class EntryHandler : public MatchFinder::MatchCallback {
public:
    explicit EntryHandler(Rewriter& rewriter)
        : mRewriter(rewriter)
    {}

    //! This is where the magic happens.
    void run(const MatchFinder::MatchResult &result) override {
        if (const auto* FD = result.Nodes.getNodeAs<FunctionDecl>("func")) {
            // This will be called for each function.
            const auto functionName = FD->getNameInfo().getAsString();
            const auto VD = result.Nodes.getNodeAs<VarDecl>("uninitVar");
            const auto uninitializedVariableName = VD->getNameAsString();
            // I NEED HELP HERE
            // Retrieve all uninitialized variables in the function.
            //for (const auto& next : result.Nodes.getMap()) {
            //    const auto key = next.first;
            //}
            //const auto next = result.Nodes.getNodeAs("uninitVar");

            //for (const auto *VD : Vars) {
            //    // This will be called for each variable declaration in the function.
            //    std::cout << "  Found variable: " << VD->getNameAsString() << "\n";
            //}
        }
    }
private:
    Rewriter& mRewriter;
};


// ASTConsumer implementation reads AST produced by the Clang parser.
// Here we register our instrumentation matchers and run them against the AST.
class MyASTConsumer : public ASTConsumer {
public:
    MyASTConsumer(Rewriter& r)
    : mEntryHandler(r) {
        // Add for finding function entry declarations.
        mMatcher.addMatcher(traverse(TK_IgnoreUnlessSpelledInSource,
            functionDecl(isExpansionInMainFile(), 
                forEachDescendant(varDecl(unless(
                    hasInitializer(anything())), 
                    hasLocalStorage()).bind("uninitVar")))
                    .bind("func")), &mEntryHandler);

    }

    void HandleTranslationUnit(ASTContext& context) override {
        // Run the matchers when we have the whole TU parsed.
        mMatcher.matchAST(context);
    }

private:
    EntryHandler mEntryHandler;
    MatchFinder mMatcher;
};
abstract-syntax-tree matcher libtooling clang-query
1 Answer

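The task of finding the last uninitialized local variable declaration in a function is not well suited to what Clang AST matchers can do. For example, there is no "last" operator. The forEachDescendant matcher produces multiple match results within each function (one per uninitialized declaration), and each of them is passed separately to MatchCallback::run. Correlating the resulting callbacks is a simple bookkeeping task, but it is a lot of effort for little gain.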

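Instead, I think this situation calls for direct AST inspection, so that you have full control over the filtering. After using a matcher (if you like) to find all FunctionDecl nodes, navigate to the declarations using: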

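  • FunctionDecl::getBody() to get the body Stmt,

  • dyn_cast<> to downcast it to CompoundStmt,

  • CompoundStmt::body() to get an iterable sequence of Stmt,

  • dyn_cast<> to downcast each statement to DeclStmt,

  • DeclStmt::decls() to get an iterable sequence of Decl,

  • dyn_cast<> to downcast each declaration to VarDecl, and finally,

  • VarDecl::hasInit() to check for an initializer.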

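The iteration proceeds in source code order, so you can get the textually last uninitialized declaration simply by remembering the most recent one seen during the iteration.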

Regarding source locations:

  • DeclStmt::getEndLoc() gets the end of the declaration statement.

  • CompoundStmt::getBeginLoc() gets the opening brace.
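One detail matters when turning either location into an insertion point. Clang source locations refer to the start of a token, so getEndLoc() on a DeclStmt points at the ";" itself and getBeginLoc() on the body points at the "{". New text therefore has to go after that one-character token. In a real tool, Rewriter::InsertTextAfterToken takes care of this. The sketch below only models the arithmetic on a plain string: offsetOf and insertAfterToken are hypothetical helpers, not Clang APIs, and they assume plain ASCII text, 1-based line:column pairs as Clang prints them, and a one-character token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Clang API): convert a 1-based line:column pair
// into a 0-based offset into `src`.
std::size_t offsetOf(const std::string &src, unsigned line, unsigned col) {
  assert(line >= 1 && col >= 1);
  std::size_t pos = 0;
  for (unsigned l = 1; l < line; ++l) {
    std::size_t nl = src.find('\n', pos);
    assert(nl != std::string::npos && "line number past end of text");
    pos = nl + 1;  // first character of the next line
  }
  return pos + (col - 1);
}

// Hypothetical helper: insert `text` right after the one-character token
// (';' or '{') that starts at line:col.
std::string insertAfterToken(std::string src, unsigned line, unsigned col,
                             const std::string &text) {
  std::size_t tokenStart = offsetOf(src, line, col);
  assert(tokenStart < src.size());
  src.insert(tokenStart + 1, text);  // +1 skips past the token itself
  return src;
}
```

In the actual tool the equivalent step would be to pass the DeclStmt's getEndLoc() to Rewriter::InsertTextAfterToken, or the body's getBeginLoc() when the function has no uninitialized declarations, and let the rewriter measure the token.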

The following complete program demonstrates the procedure:

// match-uninit-decl.cc
// Match uninitialized declarations.

#include "clang/AST/Decl.h"                      // clang::FunctionDecl, clang::VarDecl
#include "clang/AST/Stmt.h"                      // clang::CompoundStmt, clang::DeclStmt
#include "clang/ASTMatchers/ASTMatchFinder.h"    // clang::ast_matchers::MatchFinder
#include "clang/Tooling/CommonOptionsParser.h"   // CommonOptionsParser
#include "clang/Tooling/CompilationDatabase.h"   // CompilationDatabase
#include "clang/Tooling/Tooling.h"               // clang::tooling::ClangTool

#include "llvm/Support/CommandLine.h"            // llvm::cl::OptionCategory

using clang::ast_matchers::MatchFinder;
using clang::dyn_cast;
using clang::tooling::ClangTool;
using clang::tooling::CommonOptionsParser;
using clang::tooling::CompilationDatabase;
using clang::tooling::newFrontendActionFactory;

using namespace clang::ast_matchers;


class HandleMatch : public MatchFinder::MatchCallback {
public:
  virtual void run(const MatchFinder::MatchResult &Result) override
  {
    // Stringify a source location.
    auto locStr = [&](clang::SourceLocation loc) -> std::string {
      return loc.printToString(*(Result.SourceManager));
    };

    // Stringify a declaration's location.
    auto declLocStr = [&](clang::Decl const *decl) -> std::string {
      return locStr(decl->getLocation());
    };

    llvm::outs() << "match found:\n";

    // Defensively make sure we have a function declaration.
    if (clang::FunctionDecl const *functionDecl =
          Result.Nodes.getNodeAs<clang::FunctionDecl>("func")) {
      llvm::outs() << "  func at " << declLocStr(functionDecl) << "\n";

      // If it has a body (i.e., it is not a prototype), get it.
      if (clang::Stmt const *body = functionDecl->getBody()) {
        llvm::outs() << "    opening brace at " << locStr(body->getBeginLoc()) << "\n";

        // The body should always be a compound statement in C.
        if (auto compound = dyn_cast<clang::CompoundStmt>(body)) {
          // Keep track of the last uninitialized declaration.
          clang::VarDecl const *lastUninitDecl = nullptr;
          clang::DeclStmt const *lastUninitDeclStmt = nullptr;

          // Look at all statements in the body.
          for (clang::Stmt const *stmt : compound->body()) {
            // Filter for declaration statements.
            if (auto declStmt = dyn_cast<clang::DeclStmt>(stmt)) {
              // Iterate over the declarations in that statement.  There
              // can be more than one, e.g., "int x,y;".
              for (clang::Decl const *decl : declStmt->decls()) {
                // Filter for those declaring variables, as opposed to
                // functions, types, etc.
                if (auto varDecl = dyn_cast<clang::VarDecl>(decl)) {
                  // Does it lack an initializer?
                  if (!varDecl->hasInit()) {
                    llvm::outs() << "    uninit decl at " << declLocStr(varDecl) << "\n";
                    lastUninitDecl = varDecl;
                    lastUninitDeclStmt = declStmt;
                  }
                }
              }
            }
          }

          if (lastUninitDecl) {
            llvm::outs() << "    last uninit decl at " << declLocStr(lastUninitDecl) << "\n";
            llvm::outs() << "      end loc: " << locStr(lastUninitDeclStmt->getEndLoc()) << "\n";
          }
          else {
            llvm::outs() << "    no uninit decls\n";
          }
        }
      }
    }
  }
};

static llvm::cl::OptionCategory MyToolCategory("my-tool options");

int main(int argc, char const **argv) {
  // Boilerplate.
  auto ExpectedParser = CommonOptionsParser::create(argc, argv, MyToolCategory);
  if (!ExpectedParser) {
    llvm::errs() << ExpectedParser.takeError();
    return 2;
  }
  CommonOptionsParser& OptionsParser = ExpectedParser.get();
  ClangTool Tool(OptionsParser.getCompilations(),
                 OptionsParser.getSourcePathList());

  MatchFinder finder;
  finder.addMatcher(
    // The matcher here just finds function declarations in the main
    // file.  All declaration filtering is done afterward.
    functionDecl(
      isExpansionInMainFile()
    ).bind("func"),

    new HandleMatch
  );
  return Tool.run(newFrontendActionFactory(&finder).get());
}

// EOF

Makefile

# Makefile

# Default target.
all:
.PHONY: all


# ---- Configuration ----
# Source repo directory to optionally use.
LLVM_REPO := $(HOME)/bld/llvm-project-2023-07-14

# If 1, then use the LLVM I built from source.
USE_BUILT_LLVM := 1

# Installation directory from a binary distribution.
# Has five subdirectories: bin include lib libexec share.
ifeq ($(USE_BUILT_LLVM),1)
  CLANG_LLVM_INSTALL_DIR = $(LLVM_REPO)/build
else
  CLANG_LLVM_INSTALL_DIR = $(HOME)/opt/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04
endif

# ---- llvm-config query results ----
# Program to query the various LLVM configuration options.
LLVM_CONFIG := $(CLANG_LLVM_INSTALL_DIR)/bin/llvm-config

# C++ compiler options to ensure ABI compatibility.
LLVM_CXXFLAGS := $(shell $(LLVM_CONFIG) --cxxflags)

# Directory containing the clang library files, both static and dynamic.
LLVM_LIBDIR := $(shell $(LLVM_CONFIG) --libdir)

# Other flags needed for linking, whether statically or dynamically.
LLVM_LDFLAGS_AND_SYSTEM_LIBS := $(shell $(LLVM_CONFIG) --ldflags --system-libs)


# ---- Compiler options ----
# C++ compiler.
#CXX = g++
CXX = $(CLANG_LLVM_INSTALL_DIR)/bin/clang++

# Compiler options, including preprocessor options.
CXXFLAGS =
CXXFLAGS += -Wall
CXXFLAGS += -Werror

CXXFLAGS += -g

# Get llvm compilation flags.
CXXFLAGS += $(LLVM_CXXFLAGS)

# When using an LLVM I built myself, I need some additional flags.
ifeq ($(USE_BUILT_LLVM),1)
  CXXFLAGS += -I$(LLVM_REPO)/clang/include
  CXXFLAGS += -I$(LLVM_REPO)/build/tools/clang/include
endif

# Linker options.
LDFLAGS =

# Pull in clang+llvm via libclang-cpp.so, which has everything, but is
# only available as a dynamic library.
LDFLAGS += -lclang-cpp

# Needed for llvm::InitLLVM, which is missing from libclang-cpp.so.
LDFLAGS += -lLLVMSupport

# Arrange for the compiled binary to search the libdir for that library.
# Otherwise, one can set the LD_LIBRARY_PATH envvar before running it.
# Note: the -rpath switch does not work on Windows.
LDFLAGS += -Wl,-rpath=$(LLVM_LIBDIR)

# Get the needed -L search path, plus things like -ldl.
LDFLAGS += $(LLVM_LDFLAGS_AND_SYSTEM_LIBS)


# ---- Recipes ----
# Compile a C++ source file.
%.o: %.cc
	$(CXX) -c -o $@ $(CXXFLAGS) $<

# Executable.  It needs to be inside a directory because the Clang
# command line parser reads the executable name (via /proc/self/exe,
# not argv[0]) and appends "../lib" to find its headers.
all: bin/match-uninit-decl.exe
bin/match-uninit-decl.exe: match-uninit-decl.o
	@mkdir -p bin
	$(CXX) -g -Wall -o $@ $^ $(LDFLAGS)

# Symlink to let Clang find its headers in the Clang installation
# directory.
all: lib
lib:
	ln -s $(CLANG_LLVM_INSTALL_DIR)/lib

# Quick test.
.PHONY: run
run: bin/match-uninit-decl.exe
	./bin/match-uninit-decl.exe test.c --

.PHONY: clean
clean:
	$(RM) *.o lib
	rm -rf bin


# EOF

test.c

// test.c
// Cases for last uninit decl finder.

int f1(void)
{
  int x;
  int y;           // last uninit decl
  //   ^ end of its declaration statement
  int z = 5;

  x = 3;
  y = 4;

  return x+y+z;
}

int f2(void)
{
  // No uninit decls.
  int x2 = 2;
  return x2;
}

int f3(void)
{
  int x, y=7;      // last uninit is 'x'
  //        ^ end of its declaration statement

  x = 2;

  return x+y;
}

int f4(void)
{
  // No uninit decls in this outermost body statement.

  {
    int x;
    x=3;
    return x;
  }
}

// EOF

Output:

$ ./bin/match-uninit-decl.exe test.c --
match found:
  func at $PWD/test.c:4:5
    opening brace at $PWD/test.c:5:1
    uninit decl at $PWD/test.c:6:7
    uninit decl at $PWD/test.c:7:7
    last uninit decl at $PWD/test.c:7:7        <-- 'y'
      end loc: $PWD/test.c:7:8
match found:
  func at $PWD/test.c:17:5
    opening brace at $PWD/test.c:18:1
    no uninit decls
match found:
  func at $PWD/test.c:24:5
    opening brace at $PWD/test.c:25:1
    uninit decl at $PWD/test.c:26:7
    last uninit decl at $PWD/test.c:26:7       <-- 'x'
      end loc: $PWD/test.c:26:13
match found:
  func at $PWD/test.c:34:5
    opening brace at $PWD/test.c:35:1
    no uninit decls