Ruby Nokogiri parsing: omitting duplicates

Question

I am parsing an XML file and want to avoid adding duplicate values to my array. Currently, the XML looks like this:

<vulnerable-software-list>
  <product>cpe:/a:octopus:octopus_deploy:3.0.0</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.1</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.2</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.3</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.4</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.5</product>
  <product>cpe:/a:octopus:octopus_deploy:3.0.6</product>
</vulnerable-software-list>

document.xpath("//entry[
  number(substring(translate(last-modified-datetime,'-.T:',''), 1, 12)) > #{last_imported_at} and
  cvss/base_metrics/access-vector = 'NETWORK'
  ]").each do |entry|
  product = entry.xpath('vulnerable-software-list/product').map { |product| product.content.split(':')[-2] }
  effected_versions = entry.xpath('vulnerable-software-list/product').map { |product| product.content.split(':').last }
  puts product
end

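However, because of the XML input, this produces quite a few duplicates, so I end up with an array like ['Redhat','Redhat','Redhat','Fedora'].

I already have effected_versions taken care of, since those values do not repeat.

Is there a way to make .map add only unique values?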

Answer 1

If you need an array of unique values, just call the uniq method on the result of map:

product =
  entry.xpath('vulnerable-software-list/product').map do |product|
    product.content.split(':')[-2]
  end.uniq
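
As a minimal sketch of the same idea without Nokogiri, the snippet below applies map followed by uniq to plain strings. The cpe_strings array is a hypothetical stand-in for the content of the <product> nodes from the question's XML; it is not part of the original code.

```ruby
# Hypothetical stand-in for the text content of the <product> nodes.
cpe_strings = [
  'cpe:/a:octopus:octopus_deploy:3.0.0',
  'cpe:/a:octopus:octopus_deploy:3.0.1',
  'cpe:/a:octopus:octopus_deploy:3.0.2'
]

# Take the second-to-last segment of each CPE string, then drop duplicates.
products = cpe_strings.map { |cpe| cpe.split(':')[-2] }.uniq

puts products.inspect
# prints ["octopus_deploy"]
```

uniq keeps the first occurrence of each value, so the order of the remaining elements matches the order in which they first appeared.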

Answer 2

There are many ways to do this:

input = ['Redhat','Redhat','Redhat','Fedora']

# approach 1
# self explanatory

result = input.uniq

# approach 2
# iterate through vals, and build a hash with the vals as keys
# since hashes cannot have duplicate keys, it provides a 'unique' check

result = input.each_with_object({}) { |val, memo| memo[val] = true }.keys

# approach 3
# Similar to the previous, we iterate through vals and add them to a Set.
# Adding a duplicate value to a set has no effect, and we can convert it to array

require 'set' # needed on Ruby versions that do not load Set automatically

result = input.each_with_object(Set.new) { |val, memo| memo.add(val) }.to_a

If you are not familiar with each_with_object, it is very similar to reduce.
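
To illustrate the similarity, here is a small sketch that builds the same hash with both methods. The variable names with_object and with_reduce are made up for this comparison. The main difference is that each_with_object ignores the block's return value and keeps passing the same object, while reduce uses the block's return value as the next accumulator.

```ruby
input = ['Redhat', 'Redhat', 'Redhat', 'Fedora']

# each_with_object: block arguments are (element, memo); the block's
# return value is ignored and the same hash is passed to every iteration.
with_object = input.each_with_object({}) { |val, memo| memo[val] = true }

# reduce: block arguments are (memo, element); the block must return
# the hash explicitly, because its return value becomes the next memo.
with_reduce = input.reduce({}) do |memo, val|
  memo[val] = true
  memo
end

puts with_object.keys.inspect
puts with_reduce.keys.inspect
# both lines print ["Redhat", "Fedora"]
```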

Regarding performance, you can find some information by searching, for example "What is the fastest way to make a uniq array?"

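From a quick test, I saw these approaches take increasing amounts of time. uniq was 5 times faster than each_with_object, which in turn was 25% slower than the Set.new approach. This is probably because uniq is implemented in C. I only tested with one arbitrary input, though, so this may not hold in all cases.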
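
For anyone who wants to repeat such a measurement, the following is a rough sketch using Ruby's standard Benchmark module. The input array here is invented test data (100,000 strings with 50 distinct values), not the data used in the quick test above, so the timings will vary with the input, the machine, and the Ruby version.

```ruby
require 'benchmark'
require 'set'

# Invented test data: 100,000 strings, only 50 of which are distinct.
input = Array.new(100_000) { |i| "name#{i % 50}" }

Benchmark.bm(18) do |bm|
  bm.report('uniq') { input.uniq }
  bm.report('each_with_object') do
    input.each_with_object({}) { |val, memo| memo[val] = true }.keys
  end
  bm.report('Set') do
    input.each_with_object(Set.new) { |val, memo| memo.add(val) }.to_a
  end
end
```

Benchmark.bm prints user, system, and real time for each labelled block; running it several times gives a better picture than a single run.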
