How to extract a single function from a source file

Question

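I am doing a small academic study of extremely long and complex functions in the Linux kernel. I am trying to figure out whether there is ever a good reason to write functions that are 600 or 800 lines long.

To that end, I would like to find a tool that can extract a function from a .c file, so that I can run some automated tests on it.

For example, if I have the function cifs_parse_mount_options() in the file connect.c, I am looking for a solution that works roughly like this: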

extract /fs/cifs/connect.c cifs_parse_mount_options

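and returns the function's 523 lines of code (!), from the opening brace to the closing brace.

Of course, any way of using an existing software package such as gcc to do this would be very helpful as well.

Thanks,

Udi

EDIT: The answers to "Regex to pull out C function prototype declarations?" convinced me that matching a function declaration with a regex is far from trivial.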

Answer 1

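Why don't you write a small Perl/PHP/Python script, or even a small C++, Java or C# program, that does this?

I don't know of any ready-made tool for it, but code that parses a text file and extracts a function body from a C/C++ source file should take no more than about 20 lines. The only difficult part is locating the beginning of the function, and that should be a relatively simple task with a regex. After that, all you need to do is walk through the rest of the file, keeping track of opening and closing braces; when you reach the closing brace of the function body, you are done.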
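The idea above (find the start with a regex, then count braces) might look roughly like the following Python sketch. The name extract_function and the start-of-definition heuristic are assumptions made for illustration: the definition is expected to begin in column 0, and comments, string literals and character literals are skipped so that braces inside them are not counted. Braces hidden behind preprocessor conditionals or macros are not handled.

```python
import re


def extract_function(source, name):
    """Return the text of the definition of `name` in `source`, or None."""
    # Heuristic (an assumption): an optional return type starting in
    # column 0, then the name, a parameter list, and the opening brace.
    pattern = re.compile(
        r"^(?:[A-Za-z_][\w\s*]*?)?\b" + re.escape(name)
        + r"\s*\([^;{}]*\)\s*\{",
        re.MULTILINE,
    )
    match = pattern.search(source)
    if match is None:
        return None

    i = match.end() - 1  # index of the opening brace
    depth = 0
    while i < len(source):
        ch = source[i]
        if source.startswith("/*", i):  # skip block comments
            end = source.find("*/", i + 2)
            if end == -1:
                return None
            i = end + 2
        elif source.startswith("//", i):  # skip line comments
            end = source.find("\n", i)
            if end == -1:
                return None
            i = end + 1
        elif ch in "\"'":  # skip string and character literals
            i += 1
            while i < len(source) and source[i] != ch:
                i += 2 if source[i] == "\\" else 1
            i += 1
        else:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:  # matching brace of the function body
                    return source[match.start():i + 1]
            i += 1
    return None
```

For the example in the question, this would be called as extract_function(text_of_connect_c, "cifs_parse_mount_options"), where text_of_connect_c is the file content read into a string.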

Answer 2

indent -kr code -o code.out

awk -f split.awk code.out

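You will have to adapt split.awk a bit, since it is somewhat specific to my code and refactoring needs (for example, I have some structs that are not typedefs).

And I am sure you can write a nicer script :-)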

--
BEGIN   { line=0; FS="";
    out=ARGV[ARGC-1]  ".out";
    var=ARGV[ARGC-1]  ".var";
    ext=ARGV[ARGC-1]  ".ext";
    def=ARGV[ARGC-1]  ".def";
    inc=ARGV[ARGC-1]  ".inc";
    typ=ARGV[ARGC-1]  ".typ";
    system ( "rm -f " out " " var " " ext " " def " " inc " " typ );
    }
/^[     ]*\/\/.*/   { print "comment :" $0 "\n"; print $0 >> out ; next ;}
/^#define.*/        { print "define :" $0 ; print $0 >>def ; next;}
/^#include.*/       { print "include :" $0 ; print $0 >>inc ; next;}
/^typedef.*{$/      { print "typedef var :" $0 "\n"; decl="typedef";print $0 >> typ;infile="typ";next;}
/^extern.*$/        { print "extern :" $0 "\n"; print $0 >> ext;infile="ext";next;}
/^[^    }].*{$/     { print "init var :" $0 "\n";decl="var";print $0 >> var; infile="vars";
                print $0;
                fout=gensub("^([^    \\*])*[    ]*([a-zA-Z0-9_]*)\\[.*","\\2","g") ".vars";
                     print "var decl : " $0 "in file " fout;
                     print $0 >fout;
                next;
                        }
/^[^    }].*)$/     { print "func  :" $0 "\n";decl="func"; infile="func";
                print $0;
                fout=gensub("^.*[    \\*]([a-zA-Z0-9_]*)[   ]*\\(.*","\\1","g") ".func";
                     print "function : " $0 "in file " fout;
                     print $0 >fout;
                next;
            }
/^}[    ]*$/        { print "end of " decl ":" $0 "\n"; 
                if(infile=="typ") {
                    print $0 >> typ;
                }else if (infile=="ext"){
                    print $0 >> ext;
                }else if (infile=="var") {
                    print $0 >> var;
                }else if ((infile=="func")||(infile=="vars")) {
                    print $0 >> fout; 
                    fflush (fout);
                    close (fout);
                }else if (infile=="def") {
                    print $0 >> def;
                }else if (infile=="inc"){
                    print $0 >> inc;
                }else print $0 >> out;
                next;
            }
/^[a-zA-Z_]/        { print "extern :" $0 "\n"; print $0 >> var;infile="var";next;}
            { print "other :" $0 "\n" ; 
                if(infile=="typ") {
                    print $0 >> typ;
                }else if (infile=="ext"){
                    print $0 >> ext;
                }else if (infile=="var") {
                    print $0 >> var;
                }else if ((infile=="func")||(infile=="vars")){
                    print $0 >> fout;
                }else if (infile=="def") {
                    print $0 >> def;
                }else if (infile=="inc"){
                    print $0 >> inc;
                }else print $0 >> out;
               next;
               }

Answer 3

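If you are finding it difficult to extract the function names:

1> Use ctags (a program) to extract the function names: ctags -x --c-kinds=fp path_to_file

2> Once you have the function names, write a simple Perl script that takes a function name as an argument and extracts that function's body, as described above.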

Answer 4

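The Bash builtin declare appears to provide similar functionality, but I am not sure how it is implemented. In particular,

declare -F

lists the names of the functions in the current environment: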

declare -f quote
declare -f quote_readline

declare

outputs the functions in the current environment, including their bodies:

quote () 
{ 
    local quoted=${1//\'/\'\\\'\'};
    printf "'%s'" "$quoted"
}
quote_readline () 
{ 
    local ret;
    _quote_readline_by_ref "$1" ret;
    printf %s "$ret"
}

Finally,

declare -f quote

outputs the definition of the quote function:

quote () 
{ 
    local quoted=${1//\'/\'\\\'\'};
    printf "'%s'" "$quoted"
}

Perhaps the underlying machinery can be repurposed to meet your needs.

Answer 5

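You should use something like clang, which actually parses your source code and lets you analyze it. That way it can find functions in several languages, even when macros are involved. You do not stand a chance with regular expressions.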

Answer 6

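I had a similar need to extract a function from C code, and I found that vim (the editor) suits my needs (and is a bit easier), because I don't have to write any external tool or rely on less reliable regexes, which can get tedious.

Test code: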

$ cat -n c.c
   1 #include <stdio.h>
   2 static int
   3 testme (void)
   4 {
   5     int i=1;
   6 
   7     if (i == 1) {
   8           printf("\nDo something\n");
   9     }
  10     return 0;
  11 }
  12 
  13 int main (int argc, char *argv[])
  14 {
  15     testme();
  16     return 0;
  17 }

Using vim in non-interactive (ex) mode with -es:

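step.1 - Use a vim search to go to the start of the function (assuming the function name is at the beginning of a line) -

+/<function-name>

and print the line number -

!echo line(".")

step.2 - Move to the next closing brace at the beginning of a line -

+/^}

and print the line number

step.3 - Quit the file -

+q

step.4 - Now we have a start-line# and an end-line#. We pass them to sed in the form <start>,<end>p (this needs a little massaging with paste before calling sed).

Full command: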

$ vim -es c.c +/testme +'exec(":!echo ".line("."))'  +'/^}'  +'exec(":!echo ".line("."))'  +q | paste -sd "," - | xargs -i{} sed -n {}p c.c
testme (void)
{
    int i=1;

    if (i == 1) {
          printf("\nDo something\n");
    }
    return 0;
}
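For comparison, the same line-based idea (first line that begins with the function name, then the next line that begins with a closing brace) can be sketched in Python. It relies on the same assumptions as the vim command: the function name starts a line and the function's closing brace sits in column 0, as in kernel coding style. The name extract_by_line_search is made up for illustration.

```python
import re


def extract_by_line_search(lines, name):
    """Return the lines from the one starting with `name` to the next '}' line."""
    starts_with_name = re.compile(re.escape(name) + r"\b")
    start = None
    for index, line in enumerate(lines):
        if start is None:
            if starts_with_name.match(line):  # like +/<function-name>
                start = index
        elif line.startswith("}"):  # like +/^}
            return lines[start:index + 1]
    return None
```

As with the vim version, a return type written on the line above the name is not part of the result.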
