Split a string on the first comma

Time: 2022-08-22 12:55:01

How can I efficiently split the following string on the first comma using base R?

x <- "I want to split here, though I don't want to split elsewhere, even here."
strsplit(x, ???)

Desired outcome (2 strings):

[[1]]
[1] "I want to split here"   "though I don't want to split elsewhere, even here."

Thank you in advance.

EDIT: I didn't think to mention this. The solution needs to generalize to a column or vector of strings like this one:

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

The outcome can be two columns, one long vector (of which I can take every other element), or a list of strings with each index ([[n]]) holding two strings.

Apologies for the lack of clarity.

5 solutions

#1


11  

Here's what I'd probably do. It may seem hacky, but since sub() and strsplit() are both vectorized, it will also work smoothly when handed multiple strings.

XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"                               
# [2] "though I don't want to split elsewhere, even here."
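
The same idea carries over to the vector `y` from the question. As a minimal sketch (not part of the original answer), base R's `regmatches()` with `invert = TRUE` can return the text on either side of the first match found by `regexpr()`, which avoids the placeholder string altogether. The helper name `split_first_comma` is hypothetical, and strings that contain no comma are not handled specially here.

```r
# Hypothetical helper: split each string at its first comma (plus any
# whitespace that follows it), using only base R.
split_first_comma <- function(s) {
  # regexpr() reports only the first match in each element;
  # invert = TRUE returns the text before and after that match.
  regmatches(s, regexpr(",\\s*", s), invert = TRUE)
}

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
parts <- split_first_comma(y)

# One row per input string, one column per piece.
do.call(rbind, parts)
```

The result of `split_first_comma()` is a list with two strings per element, which matches one of the output shapes the question asks for; `do.call(rbind, ...)` turns it into a two-column matrix.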

#2


8  

From the stringr package:

library(stringr)
str_split_fixed(x, pattern = ', ', n = 2)
#      [,1]                  
# [1,] "I want to split here"
#      [,2]                                                
# [1,] "though I don't want to split elsewhere, even here."

(That's a matrix with one row and two columns.)

#3


3  

Here is yet another solution, with a regular expression to capture what is before and after the first comma.

x <- "I want to split here, though I don't want to split elsewhere, even here."
library(stringr)
str_match(x, "^(.*?),\\s*(.*)")[,-1] 
# [1] "I want to split here"                              
# [2] "though I don't want to split elsewhere, even here."
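
Since the question asks for base R, here is a rough base-only counterpart to the capture-group idea, offered as a sketch rather than as this answer's method. It swaps `str_match()` for `regexec()` plus `regmatches()`, and uses `[^,]*` instead of the lazy `.*?` to stop at the first comma.

```r
x <- "I want to split here, though I don't want to split elsewhere, even here."

# [^,]* matches everything up to (not including) the first comma;
# the second group captures whatever follows the comma and optional whitespace.
m <- regmatches(x, regexec("^([^,]*),\\s*(.*)$", x))

# The first element of each match is the full match; drop it to keep
# only the two captured groups.
pieces <- lapply(m, function(g) g[-1])
pieces[[1]]
```

Because `regexec()` and `regmatches()` are vectorized, the same two lines should also work when `x` is a character vector.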

#4


2  

library(stringr)

str_sub(x,end = min(str_locate(string=x, ',')-1))

This will get the first bit you want. Change the start= and end= arguments in str_sub to get whatever else you want.

Such as:

str_sub(x,start = min(str_locate(string=x, ',')+1 ))

and wrap in str_trim to get rid of the leading space:

str_trim(str_sub(x,start = min(str_locate(string=x, ',')+1 )))
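
The locate-then-substring approach can also be written without stringr. The following is a sketch under that assumption, not the answer's own code: it uses `regexpr()` to find the first comma in each element of `y`, then `substr()`, `substring()` and `trimws()` to cut and tidy the two pieces. The variable names `before` and `after` are illustrative, and strings with no comma are not handled specially.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Position of the first comma in each string (-1 where there is none).
pos <- as.integer(regexpr(",", y, fixed = TRUE))

# Everything before the comma, and everything after it with the
# leading space trimmed off (trimws() needs R >= 3.2.0).
before <- substr(y, 1, pos - 1)
after  <- trimws(substring(y, pos + 1))

data.frame(before, after, stringsAsFactors = FALSE)
```

This gives the two-column layout mentioned in the question, one row per input string.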

#5


2  

This works, but I like Josh O'Brien's better:

y <- strsplit(x, ",")
sapply(y, function(x) data.frame(x = x[1],
    z = paste(x[-1], collapse = ",")), simplify = FALSE)

Inspired by Chase's response.

A number of people gave non-base approaches, so I figured I'd add the one I usually use (though in this case I needed a base response):

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
library(reshape2)
colsplit(y, ",", c("x","z"))
