用于Java客户搜索的Radix(Trie)树实现

问题描述 投票:0回答:1

我正在研究一个项目,需要搜索数百万客户的数据。我想实现基数(trie)搜索算法。我已经阅读并实现了一个简单的字符串集合的基数。但在这里,我有一组客户,并希望通过名称或手机号码进行搜索。

客户类别:

public class Customer {

    String name;
    String mobileNumer;


    public Customer (String name, String phoneNumer) {
        this.name = name;
        this.mobileNumer = phoneNumer;
    }

    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getPhoneNumer() {
        return mobileNumer;
    }
    public void setPhoneNumer(String phoneNumer) {
        this.mobileNumer = phoneNumer;
    }



}

RadixNode类:

import java.util.HashMap;
import java.util.Map;

class RadixNode {
    private final Map<Character, RadixNode> child = new HashMap<>();
    private final Map<Customer, RadixNode> mobileNum = new HashMap<>();
    private boolean endOfWord;

    Map<Character, RadixNode> getChild() {
        return child;
    }

    Map<Customer, RadixNode> getChildPhoneDir() {
        return mobileNum;
    }

    boolean isEndOfWord() {
        return endOfWord;
    }

    void setEndOfWord(boolean endOfWord) {
        this.endOfWord = endOfWord;
    }
}

基数类:

class Radix {
    private RadixNode root;

    Radix() {
        root = new RadixNode();
    }

    void insert(String word) {
        RadixNode current = root;

        for (int i = 0; i < word.length(); i++) {
            current = current.getChild().computeIfAbsent(word.charAt(i), c -> new RadixNode());
        }
        current.setEndOfWord(true);
    }

    void insert(Customer word) {
        RadixNode current = root;
        System.out.println("==========================================");
        System.out.println(word.mobileNumer.length());
        for (int i = 0; i < word.mobileNumer.length(); i++) {
            current = current.getChildPhoneDir().computeIfAbsent(word.mobileNumer.charAt(i), c -> new RadixNode());
            System.out.println(current);
        }
        current.setEndOfWord(true);
    }

    boolean delete(String word) {
        return delete(root, word, 0);
    }

    boolean containsNode(String word) {
        RadixNode current = root;

        for (int i = 0; i < word.length(); i++) {
            char ch = word.charAt(i);
            RadixNode node = current.getChild().get(ch);
            if (node == null) {
                return false;
            }
            current = node;
        }
        return current.isEndOfWord();
    }

    boolean isEmpty() {
        return root == null;
    }

    private boolean delete(RadixNode current, String word, int index) {
        if (index == word.length()) {
            if (!current.isEndOfWord()) {
                return false;
            }
            current.setEndOfWord(false);
            return current.getChild().isEmpty();
        }
        char ch = word.charAt(index);
        RadixNode node = current.getChild().get(ch);
        if (node == null) {
            return false;
        }
        boolean shouldDeleteCurrentNode = delete(node, word, index + 1) && !node.isEndOfWord();

        if (shouldDeleteCurrentNode) {
            current.getChild().remove(ch);
            return current.getChild().isEmpty();
        }
        return false;
    }

    public void displayContactsUtil(RadixNode curNode, String prefix) 
    { 

        // Check if the string 'prefix' ends at this Node 
        // If yes then display the string found so far 
        if (curNode.isEndOfWord()) 
            System.out.println(prefix); 

        // Find all the adjacent Nodes to the current 
        // Node and then call the function recursively 
        // This is similar to performing DFS on a graph 
        for (char i = 'a'; i <= 'z'; i++) 
        { 
            RadixNode nextNode = curNode.getChild().get(i); 
            if (nextNode != null) 
            { 
                    displayContactsUtil(nextNode, prefix + i); 
            } 
        } 
    }


    public boolean displayContacts(String str) 
    { 
        RadixNode prevNode = root; 

        // 'flag' denotes whether the string entered 
        // so far is present in the Contact List 

        String prefix = ""; 
        int len = str.length(); 

        // Display the contact List for string formed 
        // after entering every character 
        int i; 
        for (i = 0; i < len; i++) 
        { 
            // 'str' stores the string entered so far 
            prefix += str.charAt(i); 

            // Get the last character entered 
            char lastChar = prefix.charAt(i); 

            // Find the Node corresponding to the last 
            // character of 'str' which is pointed by 
            // prevNode of the Trie 
            RadixNode curNode = prevNode.getChild().get(lastChar); 

            // If nothing found, then break the loop as 
            // no more prefixes are going to be present. 
            if (curNode == null) 
            { 
                System.out.println("No Results Found for \"" + prefix + "\""); 
                i++; 
                break; 
            } 

            // If present in trie then display all 
            // the contacts with given prefix. 
            System.out.println("Suggestions based on \"" + prefix + "\" are"); 
            displayContactsUtil(curNode, prefix); 

            // Change prevNode for next prefix 
            prevNode = curNode; 
        } 

        for ( ; i < len; i++) 
        { 
            prefix += str.charAt(i); 
            System.out.println("No Results Found for \""  + prefix + "\""); 
        }

        return true;
    }


    public void displayContactsUtil(RadixNode curNode, String prefix, boolean isPhoneNumber) 
    { 

        // Check if the string 'prefix' ends at this Node 
        // If yes then display the string found so far 
        if (curNode.isEndOfWord()) 
            System.out.println(prefix); 

        // Find all the adjacent Nodes to the current 
        // Node and then call the function recursively 
        // This is similar to performing DFS on a graph 
        for (char i = '0'; i <= '9'; i++) 
        { 
            RadixNode nextNode = curNode.getChildPhoneDir().get(i); 
            if (nextNode != null) 
            { 
                    displayContactsUtil(nextNode, prefix + i); 
            } 
        } 
    }

    public boolean displayContacts(String str, boolean isPhoneNumber) 
    { 
        RadixNode prevNode = root; 

        // 'flag' denotes whether the string entered 
        // so far is present in the Contact List 

        String prefix = ""; 
        int len = str.length(); 

        // Display the contact List for string formed 
        // after entering every character 
        int i; 
        for (i = 0; i < len; i++) 
        { 
            // 'str' stores the string entered so far 
            prefix += str.charAt(i); 

            // Get the last character entered 
            char lastChar = prefix.charAt(i); 

            // Find the Node corresponding to the last 
            // character of 'str' which is pointed by 
            // prevNode of the Trie 
            RadixNode curNode = prevNode.getChildPhoneDir().get(lastChar); 

            // If nothing found, then break the loop as 
            // no more prefixes are going to be present. 
            if (curNode == null) 
            { 
                System.out.println("No Results Found for \"" + prefix + "\""); 
                i++; 
                break; 
            } 

            // If present in trie then display all 
            // the contacts with given prefix. 
            System.out.println("Suggestions based on \"" + prefix + "\" are"); 
            displayContactsUtil(curNode, prefix, isPhoneNumber); 

            // Change prevNode for next prefix 
            prevNode = curNode; 
        } 

        for ( ; i < len; i++) 
        { 
            prefix += str.charAt(i); 
            System.out.println("No Results Found for \""  + prefix + "\""); 
        }

        return true;
    } 


}

我试图搜索一个集合,但卡住了。任何帮助/建议将不胜感激。

java data-structures trie radix
1个回答
1
投票

我建议你做两种方法。

第一种方式:用一个trie。 可以将您需要的所有内容存储在一个trie中。您的客户类很好,这是一个可能的RadixNode实现。 我认为不能有两个具有相同名称或相同电话号码的客户。如果不是这种情况(例如可能有同名人物和不同电话号码的人)在评论中告诉我我会编辑。 重要的是要理解的是,如果您想要有两种不同的方式来查找客户,并且您使用单个线索,则每个客户将在您的线索中出现两次。一旦在路径的末尾对应其名称,并且一旦在路径结束后对应其电话号码。

import java.util.HashMap;
import java.util.Map;

class RadixNode {
    private Map<Character, RadixNode> children;
    private Customer customer;

    public RadixNode(){
        this.children = new Map<Character, RadixNode>();
        this.Customer = NULL;
    }
    Map<Character, RadixNode> getChildren() {
        return children;
    }
    boolean hasCustomer() {
        return this.customer != NULL;
    }
    Customer getCustomer() {
        return customer;
    }
    void setCustomer(Customer customer) {
        this.customer = customer;
    }
}

如您所见,只有一个地图存储节点的子节点。那是因为我们可以看到一个电话号码作为一串数字,所以这个特里会存储所有客户......两次。每个姓名一次,每个电话号码一次。 现在让我们看一下插入函数。你的特里需要一个根,我叫它root

public void insert(RadixNode root, Customer customer){
    insert_with_name(root, customer, 0);
    insert_with_phone_nb(root, customer, 0);
}

public void insert_with_name(RadixNode node, Customer customer, int idx){
    if (idx == customer.getName().length()){
        node.setCustomer(customer);
    } else {
        Character current_char = customer.getName().chatAt(idx);
        if (! node.getChlidren().containsKey(current_char){
            RadixNode new_child = new RadixNode();
            node.getChildren().put(current_char, new_child);
        }
        insert_with_name(node.getChildren().get(current_char), customer, idx+1);
    }
}

insert_with_phone_nb()方法类似。只要人们拥有唯一的姓名,唯一的电话号码以及某人的姓名不能是某人的电话号码,这就可以使用。 如您所见,该方法是递归的。我建议你递归地构建你的trie结构(通常,基于树结构的所有东西),因为它可以实现更简单,更通用的清洁代码。 搜索功能几乎是插入功能的复制粘贴:

public void search_by_name(RadixNode node, String name, int idx){
    // returns NULL if there is no user going by that name
    if (idx == name.length()){
        return node.getCustomer();
    } else {
        Character current_char =  name.chatAt(idx);
        if (! node.getChlidren().containsKey(current_char){
            return NULL;
        } else {
            return search_by_name(node.getChildren().get(current_char), name, idx+1);
        }
    }
}

第二种方式:2次尝试 原理是一样的,所有你需要做的就是重用上面的代码,但保留两个不同的root节点,每个节点都会构建一个trie(一个用于名称,一个用于电话号码)。唯一的区别是insert函数(因为它将调用带有2个不同根的insert_with_nameinsert_with_phone_nb),以及搜索函数,它也必须在右侧trie中搜索。

public void insert(RadixNode root_name_trie, RadixNode root_phone_trie, Customer customer){
    insert_with_name(root_name_trie, customer, 0);
    insert_with_phone_nb(root_phone_trie, customer, 0);
}

编辑:评论精确后,可能有同名的客户,这里是一个替代实现,允许RadixNode包含对几个Customer的引用。 例如,用Customer customer替换RadixNode中的Vector<Customer>属性。当然,这些方法必须相应地进行修改,然后通过名称搜索将返回一个客户向量(可能是空的),因为这种搜索可以导致几个结果。 在你的情况下,我会去一个包含客户向量的trie。因此,您可以按名称和电话进行搜索(将数字转换为String),并维护单个数据结构。

© www.soinside.com 2019 - 2024. All rights reserved.