我正在尝试构建一个带有 Deterministic Finite Automaton 实现的 Rust 库。自动机表示为结构体
DFA<'a, A>
,其中通用类型 A
表示自动机正在读取的符号。我希望用户能够以两种不同的方式初始化 DFA
结构:
DFABuilder
API我希望实现不会浪费内存,所以我试图拥有尽可能少的拥有类型,并且我试图从不使用
.clone()
方法。这是由于两种不同的初始化方法而出现问题的地方。
在这种情况下,我可以假设调用代码拥有代表状态(
String
类型)和符号(A
类型)的值,所以我的代码只能引用它(&str
和 &A
,分别)。只要调用代码存在,引用就有效。
在这种情况下,是库中的代码读取文件,并在内部构造
DFA
,也使用DFABuilder
。然而,这里的问题是所有 DFABuilder
方法都期望 references 到通用类型 A
。但是我不能传递引用,因为当文件读取函数结束时,它们引用的值超出了范围。
设计
DFA<'a, A>
和 DFABuilder<'a, A>
结构以及它们中的字段的最佳方法是什么,以便我可以尽可能多地利用引用,同时也能够通过 API 初始化 DFA
结构从一个文件?
我附上了我目前在下面的代码的框架。请注意,状态在内部表示为
u32
类型 - 这是因为在构建过程中,我通过枚举状态对状态进行编码,这样我就不必使用 String
s 或 &str
s.
pub struct DFA<'a, A> {
accept_set: HashSet<u32>,
start_state: u32,
current_state: Option<u32>,
transitions: HashMap<(u32, &A), u32>
}
impl<'a, A> DFA {
// Feed a sequence of symbols to the DFA
// Returns whether the sequence is accepted
pub fn feed_sequence(&mut self, &[A]) -> bool {...}
// Feed a single symbol to the DFA
pub fn feed_symbol(&mut self, &A) {...}
}
pub struct DFABuilder<'a, A> {
string2state: HashMap<&'a str, u32>,
states: Vec<u32>,
alphabet: &'a[&'a A],
accept: HashSet<u32>,
start: u32,
transitions: HashMap<(u32, &'a A), u32>
}
impl<'a, A> DFABuilder {
pub fn with_states(mut self, states: &'a[&'a str]) -> DFABuilder<'a, A> {
// compute string2state encoding
// encode states with string2state
// store encoded states in self
}
pub fn with_alphabet(mut self, alphabet: &'a[&'a A]) -> DFABuilder<'a, A> {...}
pub fn with_accept(mut self, states: &'a[&'a str]) -> DFABuilder<'a, A> {
// encode states with string2state
// store encoded accept states in self
}
pub fn with_start(mut self, state: &'a str) -> DFABuilder<'a, A> {
// encode state with string2state
// store encoded start state in self
}
pub fn with_transition(mut self, transition: (&'a str, &'a A, &'a str)) -> DFABuilder<'a, A> {...}
pub fn build(self) -> DFA<'a, A> {...}
}
pub fn from_file<A: FromStr>(path: &str) -> DFA<'a, A> {
// Reads and parses the file into Vec<String> / Vec<A> types,
// and then uses DFABuilder to build a DFA.
//
// PROBLEM:
// This function necessarily owns the underlying `String` and `A` values
// so I can't pass a reference to them into DFABuilder, because they don't
// live long enough.
// I could redefine the DFABuilder API to take in owned values,
// but that means the calling code would lose ownership when
// initializing via an API.
}
供参考,这是定义文件的样子:
# Full state set Q, separated by spaces
s1 s2 s3 s4 s5 s6
# Full alphabet A, separated by spaces
0 1 2 3 4 5 6 7
# Start state
s1
# Acceptable states, separated by spaces
s5 s6
# List of 0 or more transitions in format STATE SYMBOL STATE
s1 0 s2
s1 1 s3
s1 2 s1
s1 7 s6
...
本质上,我的问题是:
设计这样一个库的最佳实践是什么?如何保持库内存轻,同时为客户端提供直观的 API,并启用从文件初始化?