Trying to program a tokenizer


I'm trying to write a tokenizer that takes an input string such as " 34 56 7899" and produces the individual tokens "34", "56" and "7899". I'm also not allowed to use the standard Java tokenizer or the String method split(). The next() method should return the next token, or null if there are no tokens left.

public String next() {
    int counter1=0;
    int counter2=0;

    for(int i=0;i<token.length();i++) {
        counter1++;
        if((int)token.charAt(i)!=32) {
            for(int j=counter1;j<token.length();j++) {
                if((int)token.charAt(j)!=32) {
                    counter2++;
                }
                else if((int)token.charAt(j)==32||i+counter2==token.length()-1) {
                    ergebnis=token.substring(i, i+counter2);
                    token.replaceAll(token.substring(i, i+counter2)," ");
                    return ergebnis;
                }
            }
        }
    }
    return ergebnis;
}

The problem is that when I run this method from my main class, it doesn't produce the tokens, but I also don't get any error message, so I don't know why the method isn't working.
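
A side note that is not part of the original post, but may explain the behaviour: Java strings are immutable, so token.replaceAll(...) never changes token itself. It returns a new string (and treats its first argument as a regular expression), and that return value is discarded here. A minimal sketch of the effect:

public class ReplaceAllDemo {
    public static void main(String[] args) {
        String token = " 34 56 7899";

        // replaceAll returns a NEW string; token itself is left untouched
        String result = token.replaceAll("34", " ");

        System.out.println(token);  // still " 34 56 7899"
        System.out.println(result); // the copy with "34" replaced
    }
}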

I'd be very grateful for any help with this.

java token tokenize stringtokenizer
2 Answers
0 votes

You're looking for something like this:

public class Tokenizer
{
    private String str;
    private char delimiter;
    private int pos;

    public Tokenizer(String str, char delimiter)
    {
        this.str = str;
        this.delimiter = delimiter;
    }

    public String next()
    {
        // consume delimiters
        while(pos < str.length() && str.charAt(pos) == delimiter) pos++;

        // if we're at the end of the string there are no more tokens
        if(pos == str.length()) return null;

        // record the start position of the next token
        int tokenStart = pos;

        // consume non-delimiter characters
        while(pos < str.length() && str.charAt(pos) != delimiter) pos++;

        // pos is now one past the end of the next token, or the end of the string
        return str.substring(tokenStart, pos);
    }
}

Test:

Tokenizer t = new Tokenizer(" 34 56 7899 ", ' ');

String tk;
while((tk = t.next()) != null) 
    System.out.format("Token <%s>%n", tk);

Output:

Token <34>
Token <56>
Token <7899>
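
As a small addition of my own: with this Tokenizer, next() also returns null once the input is exhausted (or contains only delimiters), which matches the requirement from the question:

Tokenizer empty = new Tokenizer("   ", ' ');
System.out.println(empty.next()); // prints "null"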

1 vote

If this is the kind of thing you're looking for...

First, write a CustomTokenizer as shown below:

import java.util.ArrayList;
import java.util.List;

public class CustomTokenizer {

    private String input;
    private String delimiter;
    private List<String> tokensList;

    public CustomTokenizer(String input, String delimiter){
        this.input = input;
        this.delimiter = delimiter;
    }

    public void tokenize(){
        if(input == null || delimiter == null){
            return ;
        }
        tokensList = new ArrayList<>();
        while(!input.isEmpty()){

            // find the first index of delimiter
            int indexOfDelimiter = input.indexOf(delimiter);

            // delimiter may not be present for the last element, so indexOf() returns -1
            if(indexOfDelimiter == -1){
                // adding the last token to the arrayList
                tokensList.add(input);

                // Doing a substring from string length so that it becomes empty
                input = input.substring(input.length());
            } else {
                // Doing a substring from index 0 till the indexOf delimiter
                String temp = input.substring(0, indexOfDelimiter);
                if(!temp.isEmpty()){
                    tokensList.add(temp);
                }

                // Doing a substring from the first indexOf delimiter till the last of the string
                input = input.substring(indexOfDelimiter+1);
            }
        }
    }

    public boolean hasNext(){
        return !tokensList.isEmpty();
    }

    public String next(){
        if(tokensList.isEmpty())
            return null;
        return tokensList.remove(0);
    }
}

Then write the driver class:

public class Runner {
    public static void main(String[] args) {

        // Pass the input string and delimiter to the CustomTokenizer Constructor
        CustomTokenizer ct = new CustomTokenizer(" 34 56 7899 ", " ");

        // calling the tokenize method to separate the tokens
        ct.tokenize();

        while(ct.hasNext()){
            System.out.println(ct.next());
        }
    }
}

Output:

34
56
7899
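
One more observation of my own: because tokenize() skips empty substrings, consecutive delimiters do not produce empty tokens. For example:

CustomTokenizer ct2 = new CustomTokenizer("34   56", " ");
ct2.tokenize();
while(ct2.hasNext()){
    System.out.println(ct2.next()); // prints 34, then 56
}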
