Tatsu par g g Per f r芒

问题描述 投票:0回答:1

我在Tatsu中实现了一个语法,用于解析量子程序Quipper ASCII(link)的描述。解析器可以工作,但对于我正在查看的文件来说速度很慢(大约10kB-1MB大小,请参阅resources目录)。解析一些文件大约需要10-30秒。语法非常简单,应该可以很快地解析这些文件。我尝试过的一件事就是尽可能增加切口,以确保没有不必要的回溯。语法指定为

@@grammar :: Quipper
@@whitespace :: /[^\S\n]*/

# The root of any program
start::BCircuit = circuit:circuit subroutines:{subroutine} {newline} $ ;

# A sequence of gates
circuit::Circuit =
    "Inputs:" ~ inputs:arity
    gatelist:{gate newline}
    "Outputs: "outputs:arity
    ;

# "Function" definitions
subroutine::Subroutine =
    newline
    "Subroutine:" ~ name:string newline
    "Shape:" shape:string newline
    "Controllable:" controllable:("yes"|"no"|"classically") newline
    circuit:circuit
    ;

# Wires and their types.
arity = @:",".{type_assignment}+ newline ;
type_assignment::TypeAssignment = number:int ":" type:("Qbit"|"Cbit") ;

# Gate control
control_app = [controlled:controlled] [no_control:no_control];
controlled::Controlled = "with controls=[" ~ controls:",".{int}+ "]" ;
no_control::NoControl = "with nocontrol" ;

# All gates
gate
    =
    | qgate
    | qrot
    | gphase
    | cnot
    | cgate
    | cswap
    | qprep
    | qunprep
    | qinit
    | cinit
    | cterm
    | qmeas
    | qdiscard
    | cdiscard
    | dterm
    | subroutine_call
    | comment
    ;

# Gate definitions
qgate::QGate::Gate = "QGate[" ~ name:string "]" inverse:["*"] "(" qubit:int ")" > control_app;
qrot::QRot::Gate = "QRot[" ~ string "," double "](" int ")" ;
gphase::GPhase::Gate = "Gphase() with t=" ~ timestep:double >control_app "with anchors=[" ~ wires:",".{wire} "]" ;
cnot::CNo::Gate  = "CNot(" ~ wire:wire ")" >control_app ;
cgate::CGat::Gate  = "CGate[" ~ name:string "]" inverse:["*"] "(" wires:",".{wire} ")" no_control;
cswap::CSwap::Gate = "CSwap(" ~ wire1:wire "," wire2:wire ")" >control_app ;
qprep::QPrep::Gate = "QPrep(" ~ wire:wire ")" no_control ;
qunprep::QUnprep::Gate = "QUnprep(" ~ wire:wire ")" no_control ;
qinit::QInit::Gate = state:("QInit0" | "QInit1") ~ "(" wire:wire ")" no_control;
cinit::CInit::Gate = state:("CInit0" | "CInit1") ~ "(" wire:wire ")" no_control;
qterm::QTerm::Gate = state:("QTerm0" | "QTerm1") ~ "(" wire:wire ")" no_control;
cterm::CTerm::Gate = state:("CTerm0" | "CTerm1") ~ "(" wire:wire ")" no_control;
qmeas::QMeas::Gate = "QMeas(" ~ wire:wire ")" ;
qdiscard::QDiscard::Gate = "QDiscard(" ~ wire:wire ")" ;
cdiscard::CDiscard::Gate = "CDiscard(" ~ wire:wire ")" ;
dterm::DTerm::Gate = state:("DTerm0" | "Dterm1") ~ "(" wire:wire ")" ;
subroutine_call::SubCall::Gate = "Subroutine" ~ ["(x" repetitions:int ")"]
    "[" name:string ", shape" shape:string "]"
    inverse:["*"]
    "(" inputs:",".{int}+ ")"
    "-> (" outputs:",".{int}+ ")"
    >control_app ;
comment::Comment::Gate = "Comment[" ~ text:string "](" wires:",".{wire}+ ")" ;

# Reference to an input wire and a textual description
wire::Wire = qubit:int ":" text:string ;

# Literals
string = '"'@:String_literal'"';  # Don't include the quotes.
String_literal::str = /[^"]+/ ;
int::int = /([+|-])?\d+/ ;
double::float = /(-)?\d+\.\d+e(-)?\d+/ ;
newline = /\n/ ;

我已经尝试分析代码以找到性能方面的瓶颈,但是时间花在了整个地方。我使用tatsu grammar.ebnf生成一个解析器,使用tatsu -g生成模型,然后我在测试用例中使用它来解析输入文件。性能结果使用标准python 3.6.4,按tottime排序,用于解析resources/PF中的所有文件:

Tue Feb 27 13:35:58 2018    parser_profiling

         4787639497 function calls (4611051402 primitive calls) in 3255.157 seconds

   Ordered by: internal time
   List reduced from 326 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
144386670  129.008    0.000  491.328    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:245(_eat_regex)
312540554   92.592    0.000  216.890    0.000 {built-in method builtins.isinstance}
15701720/80   88.741    0.000 3249.970   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:561(_invoke_rule)
 34327680   87.957    0.000  370.060    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/ast.py:14(__init__)
 57815550   76.052    0.000  380.881    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:322(matchre)
 95427920   75.086    0.000  149.629    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:208(goto)
 67455280   74.390    0.000  124.299    0.000 /Users/eddie/dev/quippy/.venv/bin/../lib/python3.6/abc.py:178(__instancecheck__)
 45115140   69.378    0.000  671.723    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:260(next_token)
 57815550   67.317    0.000  143.516    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:329(_scanre)
 32497610   63.583    0.000   69.483    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/ast.py:115(__hasattribute__)
15701720/80   63.262    0.000 3249.976   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:501(_call)
 18626000   61.979    0.000  454.538    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:644(_try)
 68655360   57.334    0.000   57.334    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/ast.py:103(__setattr__)
 57815550   54.505    0.000   54.505    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
126339356   49.908    0.000   49.908    0.000 /Users/eddie/dev/quippy/.venv/bin/../lib/python3.6/_weakrefset.py:70(__contains__)
 72092420   48.541    0.000  170.810    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:211(move)
33410950/4018330   47.143    0.000  334.748    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/objectmodel.py:149(_adopt_children)
 48867260   46.582    0.000  724.214    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:237(_next_token)
 37932280   44.424    0.000   91.042    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:277(_add_cst_node)
 48128890   44.030    0.000   58.493    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:256(eat_eol_comments)
 36470510   43.970    0.000  234.524    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/ast.py:48(update)
 24351260   43.216    0.000  120.680    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/ast.py:60(set)
 18626000   41.821    0.000  545.252    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:666(_option)
 75390600   40.445    0.000   40.445    0.000 {built-in method builtins.getattr}
 79996390   39.995    0.000   52.367    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:223(_pos)
246258630   39.345    0.000   39.345    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:155(pos)
38860150/25041530   38.587    0.000  118.596    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/objectmodel.py:108(__cn)
 95427954   38.319    0.000   38.319    0.000 {built-in method builtins.min}
15701720/80   38.148    0.000 3249.974   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:536(_recursive_call)
 32060560   37.741    0.000   45.157    0.000 /usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py:59(__init__)

即使是最大的贡献者(_eat_regex)也无法解释50分钟的运行时间。我也找不到如何在tatsu文档中优化速度。

通过查看在函数中花费的累积时间,可以获得更多的洞察力。我已经通过cumtime对分析结果进行了排序:

Tue Feb 27 13:35:58 2018    parser_profiling

         4787639497 function calls (4611051402 primitive calls) in 3255.157 seconds

   Ordered by: cumulative time
   List reduced from 326 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000 3273.337 3273.337 {built-in method builtins.exec}
        1    0.000    0.000 3273.337 3273.337 <string>:1(<module>)
        1    0.005    0.005 3273.337 3273.337 test_model.py:95(test_optimizer)
       80    0.002    0.000 3272.954   40.912 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:182(parse)
15701720/80   31.630    0.000 3249.977   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:47(wrapper)
15701720/80   63.262    0.000 3249.976   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:501(_call)
15701720/80   38.148    0.000 3249.974   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:536(_recursive_call)
15701720/80   88.741    0.000 3249.970   40.625 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:561(_invoke_rule)
       80    0.001    0.000 3174.181   39.677 /Users/eddie/dev/quippy/_parser.py:82(_start_)
  380/240    0.034    0.000 3169.206   13.205 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:762(_closure)
      220    0.000    0.000 3164.452   14.384 /Users/eddie/dev/quippy/_parser.py:87(block2)
      220    0.003    0.000 3087.977   14.036 /Users/eddie/dev/quippy/_parser.py:121(_subroutine_)
624800/800   15.209    0.000 3076.119    3.845 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:746(_repeat)
      220    0.002    0.000 3024.411   13.747 /Users/eddie/dev/quippy/_parser.py:101(_circuit_)
2578040/1272030    7.935    0.000 2970.683    0.002 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:732(_isolate)
  1875860    3.029    0.000 2839.993    0.002 /Users/eddie/dev/quippy/_parser.py:108(block2)
  1875860   11.120    0.000 2483.056    0.001 /Users/eddie/dev/quippy/_parser.py:217(_gate_)
  1875860   25.944    0.000 1793.816    0.001 /Users/eddie/dev/quippy/_parser.py:256(_qgate_)
 48867260   46.582    0.000  724.214    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:237(_next_token)
 45115140   69.378    0.000  671.723    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:260(next_token)
 14443300   35.990    0.000  577.765    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:598(_invoke_semantic_rule)
49442230/22727050   22.427    0.000  575.361    0.000 {built-in method builtins.next}
 18626000   41.821    0.000  545.252    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:666(_option)
 48128890   19.695    0.000  497.263    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:249(eat_whitespace)
32060560/13779580   12.309    0.000  493.588    0.000 /usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py:79(__enter__)
144386670  129.008    0.000  491.328    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/buffering.py:245(_eat_regex)
 14443300   34.625    0.000  460.941    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/semantics.py:76(_default)
 18626000   61.979    0.000  454.538    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:644(_try)
 17463860   36.024    0.000  415.958    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/contexts.py:606(_token)
  4018330   18.630    0.000  410.825    0.000 /Users/eddie/dev/quippy/.venv/lib/python3.6/site-packages/tatsu/objectmodel.py:19(__init__)

我们可以看到花了相当多的时间来选择执行哪个门(毕竟这是一个很大的选择声明)。在qgate上也花了很多时间,这并不奇怪,因为这些节点在代码中占主导地位。

我可以在语法或解析中添加哪些可能的优化,以便它可以更快地解析这些文件?

performance tatsu
1个回答
© www.soinside.com 2019 - 2024. All rights reserved.