Is there a nice way to check whether a string contains at least one string from an array of strings?

Time: 2023-02-10 01:33:28

string.include?(other_string) is used to check if a string contains another string. Is there a nice way to check if a string contains at least one string from an array of strings?

string_1 = "a monkey is an animal. dogs are fun"

arrays_of_strings_to_check_against = ['banana', 'fruit', 'animal', 'dog']

This would return true, because string_1 contains the string 'animal'. If we remove 'animal' from arrays_of_strings_to_check_against, it would return false.

Note that the string 'dog' from arrays_of_strings_to_check_against should not match 'dogs' from string_1, because it has to be a complete match.

I'm using Rails 3.2.0 and Ruby 1.9.2

6 solutions

#1


7  

arrays_of_strings_to_check_against.map{ |o| string_1 =~ /\b#{Regexp.escape(o)}\b/ }.any?

Or even:

arrays_of_strings_to_check_against.any?{ |o| string_1 =~ /\b#{Regexp.escape(o)}\b/ }
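
The block form can be wrapped in a small helper that returns a strict boolean. This is only a minimal sketch: the method name contains_any_word? is hypothetical and not part of the original answer.

```ruby
# Hypothetical helper (the name is an assumption, not from the original answer).
# Returns true if any candidate occurs in text as a whole word.
def contains_any_word?(text, candidates)
  candidates.any? do |candidate|
    # Regexp.escape guards against candidates containing regex metacharacters;
    # \b anchors the match at word boundaries, so 'dog' does not match "dogs".
    !(text =~ /\b#{Regexp.escape(candidate)}\b/).nil?
  end
end

string_1 = "a monkey is an animal. dogs are fun"

p contains_any_word?(string_1, ['banana', 'fruit', 'animal', 'dog']) # true, via 'animal'
p contains_any_word?(string_1, ['banana', 'fruit', 'dog'])           # false, 'dog' is not a whole word here
```

Because \b only marks a transition between word and non-word characters, this sketch is likely to miss candidates that begin or end with punctuation.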

#2


4  

If arrays_of_strings_to_check_against contains only single words, and not multi-word strings, you can intersect the two arrays with &. If the result is non-empty, there was a match. Before calling .split(' '), however, you must remove every character that is neither a word character nor whitespace. Otherwise the check would fail in this case, because "animal." (with the trailing period) isn't in your array.

if (string_1.gsub(/[^\w\s]/, '').split(' ') & arrays_of_strings_to_check_against).length > 0
  puts "Match!!"
end

Update after comments: case-insensitive version

if (string_1.downcase.gsub(/[^\w\s]/, '').split(' ') & arrays_of_strings_to_check_against).length > 0
  puts "Match!!"
end
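
To see why the punctuation has to go before splitting, it may help to compare the tokens with and without the gsub step. A small sketch using the question's data; the names candidates, raw_tokens and clean_tokens are illustrative only.

```ruby
string_1 = "a monkey is an animal. dogs are fun"
candidates = ['banana', 'fruit', 'animal', 'dog'] # illustrative name for the array to check against

# Without stripping punctuation, "animal." keeps its trailing period.
raw_tokens = string_1.split(' ')

# Removing everything that is neither a word character nor whitespace fixes that.
clean_tokens = string_1.gsub(/[^\w\s]/, '').split(' ')

p(raw_tokens & candidates)   # []
p(clean_tokens & candidates) # ["animal"]
```

Array#& compares whole elements, which is why "dogs" never matches 'dog' here, and also why this approach cannot find multi-word strings.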

#3


3  

str1  = "a monkey is an animal. dogs are fun"
str2  = "a monkey is a primate. dogs are fun"
words = %w[banana fruit animal dog]
word_test = /\b(?:#{ words.map{|w| Regexp.escape(w) }.join("|") })\b/i

p str1 =~ word_test,  #=> 15
  str2 =~ word_test   #=> nil

If you get nil there was no match; otherwise you'll get an integer (which is truthy, so you can treat it just like true) giving the character offset at which the match starts.

If you absolutely must have true or false, you can do:

any_match = !!(str1 =~ word_test)

The regular expression created by interpolation is:

/\b(?:banana|fruit|animal|dog)\b/i

…where the \b matches a "word boundary", thus preventing dog from matching in dogs.
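
A quick way to convince yourself of the word-boundary behaviour is to try the interpolated pattern against a few strings. The inputs below are made up for illustration:

```ruby
word_test = /\b(?:banana|fruit|animal|dog)\b/i

p "dogs are fun"  =~ word_test  # nil: "dog" is followed by "s", so there is no boundary
p "my dog is fun" =~ word_test  # 3: "dog" stands alone
p "An ANIMAL."    =~ word_test  # 3: /i ignores case, and "." is a non-word character
```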

Edit: The answer above no longer uses Regexp.union since that creates a case-sensitive regex, while the question requires case-insensitive.

Alternatively, we can force everything to lowercase before the test to gain case-insensitivity:

words = %w[baNanA Fruit ANIMAL dog]
word_test = /\b#{ Regexp.union(words.map(&:downcase)) }\b/
p str1.downcase =~ word_test,
  str2.downcase =~ word_test

#4


2  

Regexp.union is your friend in this case. Consider:

# the words we're looking for...
target_words = %w[ore sit ad sint est lore]

search_text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.'

# define a search ignoring case that looks for partial words...
partial_words_regex = /#{ Regexp.union(target_words).source }/i
partial_words_regex.to_s # => "(?i-mx:ore|sit|ad|sint|est|lore)"

# define a search ignoring case that looks for whole words...
whole_words_regex = /\b(?:#{ Regexp.union(target_words).source })\b/i
whole_words_regex.to_s # => "(?i-mx:\\b(?:ore|sit|ad|sint|est|lore)\\b)"

# find the first hit...
search_text[whole_words_regex] # => "sit"

# find all partial word hits...
search_text.scan(partial_words_regex) # => ["Lore", "sit", "ad", "ore", "lore", "ad", "lore", "sint", "est"]

# find all whole word hits...
search_text.scan(whole_words_regex) # => ["sit", "ad", "sint", "est"]

Putting it all in context:

string_1 = "a monkey is an animal. dogs are fun"
arrays_of_strings_to_check_against = ['banana', 'fruit', 'animal', 'dog']
string_1[Regexp.union(arrays_of_strings_to_check_against)] # => "animal"
string_1.scan(Regexp.union(arrays_of_strings_to_check_against)) # => ["animal", "dog"]
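
Note that the plain union above also reports "dog", because it matches inside "dogs". If only complete words should count, as the question asks, the \b wrapping shown earlier in this answer can be applied to the same data. A sketch, with whole_word_regex and without_animal as illustrative names:

```ruby
string_1 = "a monkey is an animal. dogs are fun"
arrays_of_strings_to_check_against = ['banana', 'fruit', 'animal', 'dog']

# Wrap the union in word boundaries so only complete words match.
whole_word_regex = /\b(?:#{ Regexp.union(arrays_of_strings_to_check_against).source })\b/

p string_1.scan(whole_word_regex)   # ["animal"]
p !string_1[whole_word_regex].nil?  # true

# Without 'animal' in the list nothing matches, since 'dog' is not a whole word in "dogs".
without_animal = /\b(?:#{ Regexp.union(['banana', 'fruit', 'dog']).source })\b/
p !string_1[without_animal].nil?    # false
```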

#5


0  

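# Returns true if string_1 contains any of the given strings as a substring.
# Note: include? matches substrings, so 'dog' also matches "dogs".
def check_string(string_1, arrays_of_strings_to_check_against)
  arrays_of_strings_to_check_against.any? { |item| string_1.include?(item) }
end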

#6


0  

(string_1.scan(/\w+/) & arrays_of_strings_to_check_against).size > 0
