JSON extractor for an array of strings

Question · votes: 0 · answers: 2

In Riak, I have this basic user schema and the accompanying user index (I have omitted the Riak-specific fields such as _yz_id and so on):

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="user" version="1.5">

 <fields>
   <field name="email"    type="string"   indexed="true"  stored="false"/>   
   <field name="name"     type="string"   indexed="true"  stored="false"/>   
   <field name="groups"   type="string"   indexed="true"  stored="false" multiValued="true"/>

   <dynamicField name="*" type="ignored"  indexed="false" stored="false" multiValued="true"/>

   <!-- ..riak-specific fields.. -->

 </fields>

 <uniqueKey>_yz_id</uniqueKey>                                                 

 <types>                                                                       
   <fieldType name="string"  class="solr.StrField"     sortMissingLast="true"/>
   <fieldType name="_yz_str" class="solr.StrField"     sortMissingLast="true"/>
   <fieldtype name="ignored" class="solr.StrField"/>                           
 </types>

</schema>

My user JSON looks like this:

{
   "name" : "John Smith",
   "email" : "[email protected]",
   "groups" : [
      "3304cf79",
      "abe155cf"
   ]
}
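As I understand it, Yokozuna's JSON extractor flattens a document into Solr fields, joining nested keys with dots and emitting one entry per array element under the same field name, which is why groups is declared multiValued. A minimal Python sketch of that flattening (my own illustration, not Yokozuna's actual code):

```python
import json

def flatten(value, prefix=""):
    # Flatten a JSON document into (field, value) pairs the way a
    # Solr-style JSON extractor might: nested keys are joined with '.'
    # and each array element is emitted under the same field name.
    if isinstance(value, dict):
        for k, v in value.items():
            yield from flatten(v, f"{prefix}.{k}" if prefix else k)
    elif isinstance(value, list):
        for v in value:
            yield from flatten(v, prefix)
    else:
        yield prefix, value

user = json.loads('''{
   "name" : "John Smith",
   "email" : "[email protected]",
   "groups" : ["3304cf79", "abe155cf"]
}''')

# groups appears once per array element, matching multiValued="true"
pairs = list(flatten(user))
```

Under that model a query like groups:3304cf79 should match, since each group id becomes its own indexed value.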

When I try to search with this query (the URL must be quoted, otherwise the shell treats the & as a background operator):

curl "http://localhost:10018/search/query/user?wt=json&q=groups:3304cf79"

I get no docs in the response.

Why is that? Does the JSON extractor even create index entries for groups?

json riak riak-search
2 Answers
0 votes

The schema is correct. The problem was that it was not the "raw" schema I had used to set the bucket properties. This issue on the Yokozuna GitHub was the culprit: I updated the schema after inserting new data, assuming the index would be reloaded. Currently, it is not.


0 votes

import json
import pandas as pd
from jsonpath_ng import parse

def process_json_data(data_file, mapping_file, root):
    # Load the JSON data
    with open(data_file) as f:
        data = json.load(f)

    # Load the field-to-JSONPath mapping
    with open(mapping_file) as f:
        mapping = json.load(f)

    # Collect one temporary dataframe per datapoint
    frames = []

    # Iterate over each datapoint in the data file
    for datapoint in data[root]:
        datapoint_dict = {}

        # Evaluate each mapped JSONPath expression against the datapoint
        for field, path in mapping.items():
            jsonpath_expr = parse(path)
            matches = jsonpath_expr.find(datapoint)
            if matches:
                # If matches were found, keep all matched values
                datapoint_dict[field] = [m.value for m in matches]
            else:
                # If no match was found, record 'no path'
                datapoint_dict[field] = ['no path']

        # Create a temporary dataframe for this datapoint
        temp_df = pd.json_normalize(datapoint_dict)

        # Identify list-like columns and explode them until none remain
        while True:
            list_cols = [col for col in temp_df.columns
                         if any(isinstance(v, list) for v in temp_df[col])]
            if not list_cols:
                break
            for col in list_cols:
                temp_df = temp_df.explode(col)

        frames.append(temp_df)

    # DataFrame.append was removed in pandas 2.0; concatenate instead
    df = pd.concat(frames, ignore_index=True)
    return df.style.set_properties(**{'border': '1px solid black'})

# Calling the function
df = process_json_data('/content/jsonShredd/data.json', '/content/jsonShredd/mapping.json', 'datapoints')
df
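The repeated explode loop is the part that turns list-valued matches into one row per element. A minimal standalone illustration of that step (the sample datapoint below is invented):

```python
import pandas as pd

# One extracted datapoint: every field holds a list of matched values,
# as produced by the JSONPath mapping step in the function above
datapoint_dict = {"name": ["John Smith"], "groups": ["3304cf79", "abe155cf"]}

# json_normalize yields a single row whose cells contain the lists
temp_df = pd.json_normalize(datapoint_dict)

# Explode list-valued columns until none remain
while True:
    list_cols = [c for c in temp_df.columns
                 if any(isinstance(v, list) for v in temp_df[c])]
    if not list_cols:
        break
    for c in list_cols:
        temp_df = temp_df.explode(c)

# 'name' is now repeated once for each of the two group ids
```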
