Database search using Lucene

Question

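I'm using Lucene to query a website's database, but I'm running into some problems. I don't actually know whether they come from the indexing or from the searching (more precisely, from how the queries are constructed). As far as I know, when searching across several SQL tables, it is best to create multiple documents per table (I followed tutorials that come close to what I want to do). In my case, I have to search 3 related tables, because each one specifies the level above (e.g. product -> type -> color). So my indexing looks like this: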

        String sql = "select c.idConteudo as ID, c.designacao as DESIGNACAO, cd.texto as DESCRICAO, ctf.webTag as TAG from Conteudo c, ConteudoDetalhe cd, ConteudoTipoFormato ctf where c.idConteudo = cd.idConteudo AND cd.idConteudoTipoFormato = ctf.idConteudoTipoFormato;";
        Statement stmt = connection.createStatement();
        ResultSet rs = stmt.executeQuery(sql);

        Document document;
        while (rs.next()) 
        {          

            String S = new String();
            S += IndexerCounter;

            document = new Document();
            document.add(new Field("ID_ID",S, Field.Store.YES, Field.Index.NO));
            document.add(new Field("ID CONTEUDO", rs.getString("ID"), Field.Store.YES, Field.Index.NO));
            document.add(new Field("DESIGNACAO", rs.getString("DESIGNACAO"), Field.Store.NO, Field.Index.TOKENIZED));
            document.add(new Field("DESCRICAO", rs.getString("DESCRICAO"), Field.Store.NO, Field.Index.TOKENIZED));
            document.add(new Field("TAG", rs.getString("TAG"), Field.Store.NO, Field.Index.TOKENIZED));


            try{
                writer.addDocument(document);
            }catch(CorruptIndexException e){
            }catch(IOException e){
            }catch(Exception e){  }  //just for knowing if something is wrong

            IndexerCounter++;
        }

If I print the results, they look like this:

ID: idConteudo: designacao: texto: webTag

1:1:Xor:xor 1 Descricao:x or
2:1:Xor:xor 2 Descricao:xis Or
3:1:Xor:xor 3 Descricao:exor
4:2:And:and 1 Descricao:and
5:2:And:and 2 Descricao:&
6:2:And:and 3 Descricao:ande
7:2:And:and 4 Descricao:a n d
8:2:And:and 5 Descricao:and,
9:3:Nor:nor 1 Descricao:nor
10:3:Nor:nor 2 Descricao:not or

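What I really want is to take a query (for example, Xor) and search for it in the documents created. So my search code looks like this:

Constructor: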

public Spider(String Query, String Pathh) {
        String[] Q;
        QueryFromUser = new String();
        QueryFromUser = Query;
        QueryToSearch1 = new String();
        QueryToSearch2 = new String();
        Path = Pathh;

        try {
            try {
                Class.forName("com.mysql.jdbc.Driver");
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                return;
            }
            try {
                connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "root", "");
            } catch (SQLException e) {
                e.printStackTrace();
                return;
            }


            Q = Query.split(" ");

            //NOTE: the AND word enables the search engine to search by the various words in a query
            for (int i = 0; i < Q.length; i++) {
                if ((Q.length - i) > 1) //prevents the last one to take a AND
                {
                    QueryToSearch1 += Q[i] + " AND ";
                } else {
                    QueryToSearch1 += Q[i];
                }
            }

            for (int i = 0; i < Q.length; i++) {
                QueryToSearch2 += "+" + Q[i];
            }
            try {
                SEARCHING_CONTENT();
            } catch (ClassNotFoundException ex) {
                Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
            } catch (InstantiationException ex) {
                Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
            } catch (IllegalAccessException ex) {
                Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
            } catch (SQLException ex) {
                Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
            } catch (ParseException ex) {
                Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
            }
            SEARCHING_WEB();  //not for using now
        } catch (CorruptIndexException ex) {
            Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(Spider.class.getName()).log(Level.SEVERE, null, ex);
        }
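
A side note on the two loops above: they can be reproduced with plain string joins. The following is a minimal sketch using only the JDK; the class and method names (`QueryStrings`, `splitTerms`, `allTermsWithAnd`, `allTermsWithPlus`) are hypothetical and not part of the original code. One assumption differs from the constructor: the `+` clauses are separated by spaces here, whereas the loop above concatenates them without any separator (giving `+not+or` for the input `not or`).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class QueryStrings {
    // Splits the raw user input on whitespace and drops empty pieces.
    static List<String> splitTerms(String userQuery) {
        return Arrays.stream(userQuery.trim().split("\\s+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Joins the terms with " AND ", e.g. [not, or] -> "not AND or".
    static String allTermsWithAnd(List<String> terms) {
        return String.join(" AND ", terms);
    }

    // Prefixes every term with "+" and separates the clauses with a space,
    // e.g. [not, or] -> "+not +or".
    static String allTermsWithPlus(List<String> terms) {
        return terms.stream()
                .map(t -> "+" + t)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        List<String> terms = splitTerms("not or");
        System.out.println(allTermsWithAnd(terms));  // not AND or
        System.out.println(allTermsWithPlus(terms)); // +not +or
    }
}
```

Printing both strings before handing them to the parser makes it easy to paste exactly the same text into another tool and compare the results.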

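The idea is that QueryToSearch1 and QueryToSearch2 carry the operators AND and + (I saw this in an online tutorial, I don't quite remember where). So, for a user query "not or", what gets searched is "not AND or", to search for both words simultaneously, and "+not +or", to search for the two words separately. This is one of my doubts: I don't really know whether Lucene queries are constructed like this. In any case, in the Querying method: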

private void SEARCHING_CONTENT() throws CorruptIndexException, IOException, ClassNotFoundException, InstantiationException, IllegalAccessException, SQLException, ParseException {
        Querying(QueryToSearch1);  // search for the whole phrase
        Querying(QueryToSearch2);  //search by individual words
        //Querying(QueryFromUser);  //search by individual words
    }

    private void Querying(String QueryS) throws CorruptIndexException, IOException, ClassNotFoundException, InstantiationException, IllegalAccessException, SQLException, ParseException {
        searcher = new IndexSearcher(IndexReader.open(Path + "/INDEX_CONTENTS"));
        query = new QueryParser("TAG", new StopWords()).parse(QueryS);
        query.toString();
        hits = searcher.search(query);
        pstmt = connection.prepareStatement(sql);

        for (int i = 0; i < hits.length(); i++) {
            id = hits.doc(i).get("TAG");         
            pstmt.setString(1, id);
            displayResults(pstmt);
        }
    }

there are no hits for the query. It is important to point out this line:

  query = new QueryParser("TAG", new StopWords()).parse(QueryS);

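StopWords is a class I wrote that extends StandardAnalyzer, but with a stop-word list that I specify myself, so that words that matter for my search, such as "or" and "and", are not removed (in this case these words may be important).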
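
To illustrate why the stop-word list matters for data like this, here is a simplified model in plain Java. It is not Lucene's analyzer: the class `StopWordDemo`, the method `analyze`, and the contents of `TYPICAL_STOP_WORDS` are assumptions made for this sketch. It only shows the general effect of dropping stop words after lowercasing and tokenizing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordDemo {
    // Assumption: a small sample of words that English stop lists commonly contain.
    static final Set<String> TYPICAL_STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "not", "or", "of", "to"));

    // Lowercases the text, splits it on non-alphanumeric characters,
    // and drops every token found in the given stop-word set.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> noStopWords = new HashSet<>();
        System.out.println(analyze("not or", TYPICAL_STOP_WORDS)); // []
        System.out.println(analyze("not or", noStopWords));        // [not, or]
        System.out.println(analyze("xis Or", TYPICAL_STOP_WORDS)); // [xis]
    }
}
```

In this model, a tag such as "not or" leaves no tokens at all once the sample stop list is applied, which is the situation the custom list is meant to avoid.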

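The problem, as I said, is that the search returns no hits. I'm not sure whether this is because of the indexing or because of how the query is constructed (if the query is badly constructed, there will be no hits).

I would appreciate any help. I'm happy to provide more information if needed.

Thank you very much.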

Answer 1

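An easy first step: use Luke ( https://code.google.com/p/luke/ ) to look at your index. You can run your queries from Luke to check whether they find anything.

Luke is easy to understand because it has a very helpful UI ( https://code.google.com/p/luke/source/browse/wiki/img/overview.png ).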
