I have a column in a data frame with addresses that are a composite of unit/house number, street name, locality, postcode, and phone number.
The postcode is a four-digit number.
Here is an example:
"26A JULIA STREET ANYTOWN 8523 71245632"
I want to strip the phone numbers but keep the postcodes and other numbers to return:
"26A JULIA STREET ANYTOWN 8523"
I have tried the following:
str_replace(string=field_name$ADDRESS, pattern="\\d{5,}", replacement="")
It does not remove the phone numbers. Can anyone point out where I am going wrong?
1 Solution
#1
3
I personally like the extra detail of the stringi package (and stringr just wraps it anyway):
library(stringi)
library(magrittr)
field_name <- data.frame(ADDRESS="26A JULIA STREET ANYTOWN 8523 71245632", stringsAsFactors=FALSE)
stri_replace_last_regex(field_name$ADDRESS, "[[:digit:]]{5,}", "") %>%
stri_trim()
## [1] "26A JULIA STREET ANYTOWN 8523"
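For comparison, here is a minimal base R sketch of the same idea that needs no extra packages. It anchors the pattern to the end of the string, so only a trailing run of five or more digits is dropped, and it assigns the result back to the column. Whether the original attempt failed because the result was never assigned is only a guess, since the question does not show that step. The `addresses` data frame and its second row are hypothetical, added to show that a row without a phone number is left alone.

```r
# Hypothetical sample data; the second row has no phone number.
addresses <- data.frame(
  ADDRESS = c("26A JULIA STREET ANYTOWN 8523 71245632",
              "4 HIGH STREET OTHERTOWN 8001"),
  stringsAsFactors = FALSE
)

# Drop a run of 5+ digits at the very end of the string, together with
# any whitespace around it. The four-digit postcode is too short to match.
# sub() returns a new vector, so the result has to be assigned back.
addresses$ADDRESS <- sub("[[:space:]]*[[:digit:]]{5,}[[:space:]]*$", "",
                         addresses$ADDRESS)

print(addresses$ADDRESS)
## [1] "26A JULIA STREET ANYTOWN 8523" "4 HIGH STREET OTHERTOWN 8001"
```

Anchoring with `$` also removes the need for a separate trim step, because the whitespace before the phone number is part of the match.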