Replace numbers of at least 5 digits in a string

Time: 2021-10-22 20:06:33

I have a column in a data frame with addresses that are a composite of unit/house number, street name, locality, postcode, and phone number.

The postcode is a four-digit number.

Here is an example:

"26A JULIA STREET ANYTOWN 8523 71245632"

I want to strip the phone numbers but keep the postcodes and other numbers to return:

"26A JULIA STREET ANYTOWN 8523"

I have tried the following:

str_replace(string=field_name$ADDRESS, pattern="\\d{5,}", replacement="")

It does not remove the phone numbers. Can anyone point out where I am going wrong?

1 solution

#1


3  

I personally like the extra detail of the stringi package (and stringr just wraps it anyway):

library(stringi)
library(magrittr)

field_name <- data.frame(ADDRESS="26A JULIA STREET ANYTOWN 8523 71245632", stringsAsFactors=FALSE)

stri_replace_last_regex(field_name$ADDRESS, "[[:digit:]]{5,}", "") %>% 
  stri_trim()
## [1] "26A JULIA STREET ANYTOWN 8523"
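
For comparison, here is a minimal base R sketch of the same idea that needs no extra packages. It assumes the phone number is always the last token in the address, and it writes the result back into the column. Forgetting that assignment is one possible reason the original attempt appeared to do nothing, though the thread does not confirm this. The second sample row (OTHERTOWN) is hypothetical and is only there to show that an address without a phone number is left alone.

```r
# Sample data: the first row is the example from the question,
# the second row is a hypothetical address with no phone number.
field_name <- data.frame(
  ADDRESS = c("26A JULIA STREET ANYTOWN 8523 71245632",
              "3 HIGH STREET OTHERTOWN 8001"),
  stringsAsFactors = FALSE
)

# Remove a run of 5 or more digits at the end of the string, together with
# any whitespace around it. sub() returns a new vector and does not modify
# its input, so the result has to be assigned back to the column.
field_name$ADDRESS <- sub("[[:space:]]*[[:digit:]]{5,}[[:space:]]*$", "",
                          field_name$ADDRESS)

print(field_name$ADDRESS)
```

Anchoring the pattern with `$` means a long digit run elsewhere in the address would be kept. If phone numbers can appear mid-string, the unanchored pattern from the answer above is the better fit.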
