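This code reads a text file character by character, separates words, numbers, and the symbols defined in a hash table, and prints each lexeme, its type (ID, NUM, or the symbol itself), and its position in the file.
Input (the first line ends with one space, and the second line ends with 3 spaces):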
Hello <= !
Cosc 455!
Hi
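Problem 1: Not only does the output fail to print the position of 'Cosc', but the string is completely out of order. I believe this has something to do with the ' ' that is read just before it, but nothing I have tried makes this line come out right.
Problem 2: The lexeme "Hi" on line 3 is read and saved to curr_lex, but its position and type are not printed the way they are for the earlier lexemes. If I add another word to line 3, the program does not keep reading as it does on the other lines. The output simply stops after the '!'.
Code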
import java.util.Arrays;
import java.util.Hashtable;
import java.util.Scanner;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.FileNotFoundException;

public class MiniLang {
    public static String curr_lex;          // current lexeme
    public static int head;                 // column where the current lexeme starts
    public static char curr_char;           // most recently read character
    public static int ch=0;                 // column of curr_char on the current line
    public static int ln;                   // current line number
    public static boolean end = false;      // set when the end of the file is reached
    public static boolean error = false;    // set when an invalid token is found
    public static boolean touching = false; // true when a symbol directly follows a word/number
    public static Hashtable<String, String[]> table = new Hashtable<String, String[]>();
    static{
        table.put("symbols", new String[]{"_", ":", ";", "(", ")", "<", ">", "=", "!", "+", "-", "*", "/", ">=", "!=", "<="});
    }

    public static void main(String[] args){
        // User input file name...
        Scanner scanner = new Scanner(System.in);
        String fileName = "";
        System.out.print("Enter the file name: ");
        fileName = scanner.nextLine();
        if(fileName.equalsIgnoreCase("quit") || fileName.equalsIgnoreCase("exit")){System.out.println("Exiting..."); return;}
        // Request new file after done, until "quit/exit"...
        while(!fileName.equalsIgnoreCase("quit") || !fileName.equalsIgnoreCase("exit")){
            System.out.println("New file...");
            try{
                // create buffered file...
                File file = new File(fileName);
                FileReader fr = new FileReader(file);
                BufferedReader br = new BufferedReader(fr);
                ln=1;
                while(end==false && error==false){
                    if(touching) next(br);
                    else{
                        nextChar(br);
                        next(br);
                    }
                    if(curr_lex.charAt(0)==' ') break;
                    System.out.println(" " + position() + ": " + kind() + " " + value());
                }
                if(error)System.out.println("ERROR!! Invalid Token...");
            }
            catch(FileNotFoundException e){
                System.out.println("file not found, check file name");
                break;
            }
            System.out.print("Enter the file name: ");
            fileName = scanner.nextLine();
            if(fileName.equalsIgnoreCase("quit") || fileName.equalsIgnoreCase("exit")){System.out.println("Exiting..."); return;}
        }
        scanner.close();
    }

    public static void nextChar(BufferedReader br){
        int c;
        try{
            c = br.read();
            if(c==-1){
                end=true;
                return; // end of text
            }
            ch++;
            char charc = (char)c;
            curr_char = charc;
            if(charc=='\n') {
                // new line: reset the column counters and read the first character of the next line
                ln+=1;
                ch=0;
                head=1;
                nextChar(br);
            }
        }catch(IOException e){System.out.println("invalid input");}
    }

    public static void next(BufferedReader br){
        head = ch; // column where this lexeme starts
        String lex = "";
        if(curr_char==' ')skipWhite(br);
        if(!Arrays.asList(table.get("symbols")).contains(currentChar())){
            lex = lex+curr_char; // first letter/num
            nextChar(br);
            while(curr_char!=' ' && !Arrays.asList(table.get("symbols")).contains(currentChar()) && end==false){
                lex = lex+curr_char; // add non-symbol, non-space characters to the lexeme
                nextChar(br);
            }
            if(Arrays.asList(table.get("symbols")).contains(currentChar())){
                curr_lex = lex; // symbol touching word/num
                touching=true;
                return;
            }
            curr_lex = lex;
            touching = false;
        }else{ // symbol
            lex = lex+curr_char; // add first symbol
            nextChar(br);
            if(Arrays.asList(table.get("symbols")).contains(lex+curr_char) && curr_char!=' '){
                lex = lex+curr_char; // joint symbol
                nextChar(br);
            }else if(Arrays.asList(table.get("symbols")).contains(lex+curr_char)){error=true;}
            curr_lex = lex;
        }
        touching = false;
    }

    public static String position(){
        // current line and char index..
        String pos = ln + ":" + head + " ";
        return pos;
    }

    public static String currentChar(){
        // current char is a static variable
        return curr_char+"";
    }

    public static void skipWhite(BufferedReader br){
        nextChar(br);
    }

    public static String kind(){
        try{
            int num = Integer.parseInt(curr_lex);
            return "NUM";
        }catch(NumberFormatException e){
            if(Arrays.asList(table.get("symbols")).contains(curr_lex)) return curr_lex;
            else return "ID";
        }
    }

    public static String value(){
        // drop a leading '\0' (the default value of curr_char) if one ended up in the lexeme
        if(curr_lex.charAt(0)=='\0'){
            String lex = "";
            for(int i=1; i<curr_lex.length(); i++){
                lex = lex+curr_lex.charAt(i);
            }
            return lex;
        }else return curr_lex;
    }
}
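main() gets the file name and loops until the last character is read or an error occurs.
nextChar() reads the next character into the static variable curr_char.
next() appends consecutive characters to the current lexeme until it reaches a symbol, a space, or the end of the text. (It checks for both one-character and two-character symbols.)
skipWhite() is called whenever curr_char is a space, and it only reads the next character.
position(), kind(), and value() return the relevant information about the current lexeme, which is held in a static variable, so that it can be printed.
Keep in mind that I am not using any regular expressions, including built-in functions that rely on them, such as String.split().
Output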
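To make the intended behavior concrete, here is a minimal sketch of the scanning rule described above, written against a single line held in a String rather than a BufferedReader. It is not the MiniLang code: the class name LexSketch and the method scanLine are hypothetical, and it assumes that any whitespace character (not only ' ') separates lexemes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LexSketch {
    // Same symbols as the MiniLang table, kept in a set for direct lookups.
    static final Set<String> SYMBOLS = new HashSet<>(Arrays.asList(
        "_", ":", ";", "(", ")", "<", ">", "=", "!", "+", "-", "*", "/", ">=", "!=", "<="));

    // Scans one line and returns entries of the form "line:column lexeme".
    // Assumption: every whitespace character is treated as a separator.
    static List<String> scanLine(String text, int line) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) { // skip separators
                i++;
                continue;
            }
            int start = i; // 0-based index where this lexeme begins
            if (SYMBOLS.contains(String.valueOf(c))) {
                i++;
                // Prefer a two-character symbol such as "<=" when one is present.
                if (i < text.length() && SYMBOLS.contains(text.substring(start, i + 1))) {
                    i++;
                }
            } else {
                // Word or number: consume until whitespace, a symbol, or end of text.
                while (i < text.length()
                        && !Character.isWhitespace(text.charAt(i))
                        && !SYMBOLS.contains(String.valueOf(text.charAt(i)))) {
                    i++;
                }
            }
            out.add(line + ":" + (start + 1) + " " + text.substring(start, i));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String entry : scanLine("Hello <= ! ", 1)) {
            System.out.println(entry);
        }
    }
}
```

Because the start index is recorded before a lexeme is consumed, the reported column does not depend on how many characters were read ahead.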
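Within that constraint, the NUM check can also be done by hand, one character at a time, instead of relying on Integer.parseInt() and its exception. The following is only a sketch: KindSketch and isAllDigits are hypothetical names, and it assumes a number is a non-empty run of the digits 0 to 9. Unlike Integer.parseInt(), it does not accept a leading sign and does not reject digit strings that are too large for an int.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class KindSketch {
    // Same symbols as the MiniLang table.
    static final Set<String> SYMBOLS = new HashSet<>(Arrays.asList(
        "_", ":", ";", "(", ")", "<", ">", "=", "!", "+", "-", "*", "/", ">=", "!=", "<="));

    // True when the lexeme is a non-empty run of ASCII digits (no regex involved).
    static boolean isAllDigits(String lex) {
        if (lex.isEmpty()) {
            return false;
        }
        for (int i = 0; i < lex.length(); i++) {
            char c = lex.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Returns "NUM", the symbol itself, or "ID", mirroring the kinds used in the question.
    static String kind(String lex) {
        if (isAllDigits(lex)) {
            return "NUM";
        }
        if (SYMBOLS.contains(lex)) {
            return lex;
        }
        return "ID";
    }

    public static void main(String[] args) {
        for (String lex : new String[]{"Hello", "<=", "455", "!"}) {
            System.out.println(kind(lex) + " " + lex);
        }
    }
}
```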
Enter the file name: input.txt
New file...
1:1 : ID Hello
1:7 : <= <=
1:10 : ! !
Cosc1 : ID
2:6 : NUM 455
2:9 : ! !
Enter the file name:
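Since the line for 'Cosc' looks overwritten rather than just wrong, one thing that could be checked is exactly which characters the reader returns, including ones that are invisible in an editor. The sketch below is a diagnostic only. CharDump and dump are hypothetical names, and the sample string in main (which contains a '\r') is an assumption for illustration, not the confirmed content of input.txt.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CharDump {
    // Returns one entry per character read: its numeric code and a visible name.
    static List<String> dump(Reader reader) {
        List<String> out = new ArrayList<>();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                String shown;
                switch (c) {
                    case ' ':  shown = "SPACE"; break;
                    case '\n': shown = "\\n";   break;
                    case '\r': shown = "\\r";   break;
                    case '\t': shown = "\\t";   break;
                    default:   shown = String.valueOf((char) c);
                }
                out.add(c + " " + shown);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample; pass new java.io.FileReader("input.txt") to inspect the real file.
        for (String entry : dump(new StringReader("! \r\nCosc"))) {
            System.out.println(entry);
        }
    }
}
```

If the dump of the real file shows a code 13 before each code 10, the file uses "\r\n" line endings, and the scanner would then see '\r' as an ordinary character rather than as a space or a line break.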