Extracting words that contain numbers in R

Time: 2021-08-03 20:05:20

I have strings in which some words contain digits, either inside the word or at its start or end. How do I extract only those words?

s <- c("An ex4mple", "anothe 3xample", "A thir7", "And sentences w1th w0rds as w3ll")

Expected output:
c("ex4mple", "3xample", "thir7", "w1th w0rds w3ll")

A word can contain more than one digit.

1 answer

#1


We can split each string on whitespace with strsplit, which returns a list, and loop over its elements with sapply. Inside the loop, grep matches the words that consist only of letters from start (^) to end ($); with invert = TRUE and value = TRUE it returns the words that do not fit that pattern instead. Finally, paste joins them back into a single string.

sapply(strsplit(s, "\\s+"), function(x) 
  paste(grep("^[A-Za-z]+$", x, invert = TRUE, value = TRUE), collapse=' '))
#[1] "ex4mple"         "3xample"         "thir7"           "w1th w0rds w3ll"
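
Note that the inverted match keeps every token that is not made up purely of letters, so a token with trailing punctuation such as "done." would be kept too. A minimal sketch of a variant that tests directly for a digit is shown here; the helper name keep_digit_words is hypothetical and not part of the original answer.

```r
# Same sample data as in the question
s <- c("An ex4mple", "anothe 3xample", "A thir7",
       "And sentences w1th w0rds as w3ll")

# Hypothetical helper: keep only the tokens that contain at least one digit
keep_digit_words <- function(x) {
  tokens <- strsplit(x, "\\s+")            # split each string on whitespace
  vapply(tokens,
         function(w) paste(w[grepl("[0-9]", w)], collapse = " "),
         character(1))                     # one result string per input string
}

res <- keep_digit_words(s)
print(res)
```

On the sample data this gives the same result as the inverted grep. A string with no digit-bearing word, such as "Well done.", comes back as an empty string.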

Or we can use str_extract_all from stringr

library(stringr)
sapply(str_extract_all(s, '[A-Za-z]*[0-9]+[A-Za-z]*'), paste, collapse=' ')
#[1] "ex4mple"         "3xample"         "thir7"           "w1th w0rds w3ll"
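
The question says a word may contain more than one digit. The pattern above allows only one run of digits per match, so a word like "a1b2c" would be split into "a1b" and "2c". A hedged base R sketch that lets letters and digits mix freely around at least one digit follows; the extra fifth test string and the variable names are assumptions added for illustration.

```r
# Sample data from the question, plus one assumed string with two digit runs
s <- c("An ex4mple", "anothe 3xample", "A thir7",
       "And sentences w1th w0rds as w3ll", "mixed a1b2c token")

# Any run of letters/digits that contains at least one digit
pattern <- "[[:alnum:]]*[0-9][[:alnum:]]*"

# gregexpr finds every match; regmatches extracts them as a list
hits <- regmatches(s, gregexpr(pattern, s))

# Join the matches of each string with a single space
res2 <- vapply(hits, paste, character(1), collapse = " ")
print(res2)
```

The same pattern can be passed to str_extract_all in place of the one shown above if stringr is preferred.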

Data:

s <- c("An ex4mple", "anothe 3xample", "A thir7", "And sentences w1th w0rds as w3ll")
