How do I push a string into a new array in Ruby?

Question

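I want to search a given string for substrings. Whenever the input string contains a substring from the dictionary, I append that substring to an array. At the end I want to tally the array to count how many times each substring occurs.

The problem is that, in my code, each substring from the dictionary is only added to new_array once.

For example: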

dictionary = ["below", "down","go","going","horn","how","howdy","it","i","low","own","part","partner","sit"]

substrings("go going", dictionary)

should output:

{"go"=>2, "going"=>1, "i"=>1}

but I get

{"go"=>1, "going"=>1, "i"=>1}

Here is my code.

def substrings(word, array)
  new_array = []
  array.each do |index|
    if word.downcase.include?(index)
      new_array << index
    end
  end
  puts new_array.tally
end

dictionary = ["below", "down","go","going","horn","how","howdy","it","i","low","own","part","partner","sit"]

substrings("go going", dictionary)

Answer 1

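It depends on how large your dictionary is.

You can simply map each element to its number of occurrences, keeping only the elements that are substrings of the word.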

dictionary.map {|w| [w,word.scan(w).size] if word.include?(w)}.compact.to_h
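A minimal sketch of the same idea wrapped in a method, assuming Ruby 2.7+ for filter_map. The method name count_substrings and the downcasing step are illustrative additions, not part of the one-liner above.

```ruby
# Hypothetical helper: map each dictionary entry to its number of
# non-overlapping occurrences in text, skipping entries that never occur.
def count_substrings(text, dictionary)
  haystack = text.downcase
  dictionary.filter_map do |entry|
    count = haystack.scan(entry).size
    [entry, count] if count.positive?
  end.to_h
end

dictionary = %w[below down go going horn how howdy it i low own part partner sit]
result = count_substrings("go going", dictionary)
p result
```

For the example in the question, result should contain "go" with a count of 2, and "going" and "i" with a count of 1 each.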

Answer 2

You can use scan to count how many times each substring occurs.

def substrings(word, array)
  output = {}
  array.each do |index|
    count_substring_appears = word.scan(index).size
    if count_substring_appears > 0
      output[index] = count_substring_appears
    end
  end

  output
end

Answer 3

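Only "go", "going", and "i" from your dictionary are substrings of your phrase. Each of those words appears in the dictionary only once, so new_array contains ["go", "going", "i"], which tallies to exactly {"go"=>1, "going"=>1, "i"=>1}.

I assume you expected 2 for go because it occurs twice in your phrase. In that case, you could change your method to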

def substrings(word, array) 
  new_array = []
  array.each do |index| 
    word.scan(/#{index}/).each { new_array << index }
  end
  puts new_array.tally
end

word.scan(/#{index}/) returns every occurrence of the substring in your phrase.
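To illustrate the difference, a small sketch: include? only answers yes or no, while scan returns one element per match. The Regexp.escape call is an addition here, assuming a dictionary entry might contain regex metacharacters. It is not needed for the dictionary in the question.

```ruby
phrase = "go going"

contains = phrase.include?("go")  # true, however many times "go" occurs
matches  = phrase.scan(/go/)      # one array element per non-overlapping match

# Assumption: an entry such as "c++" would be read as a pattern rather than
# literal text if interpolated raw; Regexp.escape makes it literal.
escaped = "i like c++ and c++".scan(/#{Regexp.escape("c++")}/)

p contains       # true
p matches.size   # 2
p escaped.size   # 2
```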

Answer 4

Another option is to split word and use Array#product, after which you can use Enumerable#tally as you wanted.

word = "go going"
word.split.product(dictionary).select { |a, b| a.include? b }.map(&:last).tally

#=> {"go"=>2, "going"=>1, "i"=>1}

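The output is different when word = "gogoing", because that string is split into a one-element array. So I can't say whether this is the behavior you are looking for.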
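A short sketch of that difference. The wrapper name tally_per_word is hypothetical and only there to run the one-liner on two inputs.

```ruby
dictionary = %w[below down go going horn how howdy it i low own part partner sit]

# Hypothetical wrapper around the one-liner above: each dictionary entry is
# counted once per whitespace-separated word that contains it.
def tally_per_word(text, dictionary)
  text.split.product(dictionary)
      .select { |word, entry| word.include?(entry) }
      .map(&:last)
      .tally
end

spaced = tally_per_word("go going", dictionary)  # two words, both contain "go"
joined = tally_per_word("gogoing", dictionary)   # one word, so "go" counts once

p spaced["go"]  # 2
p joined["go"]  # 1
```

A scan-based count would report 2 for "go" in "gogoing", so the two approaches only agree when each word contains an entry at most once.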

Answer 5

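If I understand correctly, we are given an array dictionary of words and a string str, and we are to construct a hash whose keys are elements of dictionary and whose values equal the number of non-overlapping¹ substrings of str that match the key. The returned hash should exclude keys whose value is zero.

This answer addresses the situation where, in

substrings(str, dictionary)

dictionary is large, str is not overly large (I elaborate on what that means below), and efficiency is important.

We begin by defining a helper method whose purpose will become clear.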

def substr_counts(str)
  str.split.each_with_object(Hash.new(0)) do |word,h|
    (1..word.size).each do |sub_len|
      (0..word.size-sub_len).each do |start_idx|
        h[word[start_idx,sub_len]] += 1
      end
    end
  end
end

For the example given in the question:

substr_counts("go going")
  #=> {"g"=>3, "o"=>2, "go"=>2, "i"=>1, "n"=>1, "oi"=>1, "in"=>1, "ng"=>1,
  #    "goi"=>1, "oin"=>1, "ing"=>1, "goin"=>1, "oing"=>1, "going"=>1}

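As you can see, this method breaks str into words, computes every substring of each word, and returns a hash whose keys are the substrings and whose values are the total number of times each substring occurs across all the words. Note that, as written, it counts every start position, so for 'bobobo' it would count 'bobo' twice, which differs from the non-overlapping definition in footnote 1.

The desired hash can now be constructed quickly.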

def cover_count(str, dictionary)
  h = substr_counts(str)
  dictionary.each_with_object({}) do |word,g|
    g[word] = h[word] if h.key?(word)
  end
end

dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", 
              "it", "i", "low", "own", "part", "partner", "sit"]

cover_count("go going", dictionary)
  #=> {"go"=>2, "going"=>1, "i"=>1}

Another example:

str = "lowner partnership lownliest"
cover_count(str, dictionary)
  #=> {"i"=>2, "low"=>2, "own"=>2, "part"=>1, "partner"=>1}

Here:

substr_counts(str)
  #=> {"l"=>3, "o"=>2, "w"=>2, "n"=>3, "e"=>3, "r"=>3, "lo"=>2,
  #    ...
  #    "wnliest"=>1, "lownlies"=>1, "ownliest"=>1, "lownliest"=>1} 
substr_counts(str).size
  #=> 109

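There is an obvious trade-off here. If str is long, and especially if it contains long words², building h will take too long to justify the savings from not having to check, for each word in dictionary, whether it is contained in each word of str. If, however, building h is worthwhile, the overall time savings could be substantial.

1. By "non-overlapping" I mean that if str equals 'bobobo' it contains one, not two, substrings 'bobo'.

2. substr_counts("antidisestablishmentarianism").size #=> 385, which is not too bad.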
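Footnote 1 distinguishes overlapping from non-overlapping counts. A small sketch of the difference, using String#scan for the non-overlapping count and a zero-width lookahead for the overlapping one. The lookahead pattern is an illustration added here, not something the answer uses.

```ruby
str = "bobobo"

# String#scan resumes after the end of each match, so matches never overlap.
non_overlapping = str.scan("bobo").size

# A zero-width lookahead matches at every start position instead,
# so overlapping occurrences are counted as well.
overlapping = str.scan(/(?=bobo)/).size

p non_overlapping  # 1
p overlapping      # 2
```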

Answer 6

You have to count how many times each string occurs in word, so use scan:

def substrings(word, array)
  hash = {}
  text = word.downcase  # check and count against the same downcased text
  array.each do |index|
    if text.include?(index)
      hash[index] = text.scan(/#{index}/).length
    end
  end
  puts hash
end

Answer 7

I'd start with this:

dictionary = %w[down go going it i]
target = 'go going'

dictionary.flat_map { |w|
  target.scan(Regexp.new(w, Regexp::IGNORECASE))
}.reject(&:empty?).tally
# => {"go"=>2, "going"=>1, "i"=>1}
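One thing to watch with the case-insensitive regexp: the tallied keys are the matched text, not the dictionary entries, so differently cased matches land under different keys. A sketch of that, assuming a mixed-case target (my example, not from the question), with a downcase step to merge them:

```ruby
dictionary = %w[down go going it i]
target = 'Go going'

# Matches keep the casing found in target, so "Go" and "go" tally separately.
raw = dictionary
  .flat_map { |w| target.scan(Regexp.new(w, Regexp::IGNORECASE)) }
  .tally

# Downcasing each match first merges them under a single key.
normalized = dictionary
  .flat_map { |w| target.scan(Regexp.new(w, Regexp::IGNORECASE)) }
  .map(&:downcase)
  .tally

p raw.keys         # ["Go", "go", "going", "i"]
p normalized.keys  # ["go", "going", "i"]
```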
