dplyr mutate in R - add column as concat of columns

Time: 2022-10-07 22:54:12

I have a problem using the mutate function from dplyr to add a new column to a data frame. The new column should be of character type and consist of the words from the other columns (which are also of character type), sorted and concatenated. For example, for the following data frame:

> library(datasets)
> states.df <- data.frame(name = as.character(state.name),
+                         region = as.character(state.region),
+                         division = as.character(state.division))
> 
> head(states.df, 3)
     name region           division
1 Alabama  South East South Central
2  Alaska   West            Pacific
3 Arizona   West           Mountain 

I would like to get a new column with the following first element:

"Alabama_East South Central_South" 

I tried this:

mutate(states.df,
   concated_column = paste0(sort(name, region, division), collapse="_"))

But I received an error:

Error in sort(1:50, c(2L, 4L, 4L, 2L, 4L, 4L, 1L, 2L, 2L, 2L, 4L, 4L,  : 
  'decreasing' must be a length-1 logical vector.
Did you intend to set 'partial'?

Thank you for any help in advance!

2 Solutions

#1


22  

You need to use sep = rather than collapse =: sep joins the columns row by row, while collapse would squash all rows into a single string. Also, why use sort? Note that I used paste, not paste0.

library(dplyr)
states.df <- data.frame(name = as.character(state.name),
                        region = as.character(state.region), 
                        division = as.character(state.division))
res = mutate(states.df,
   concated_column = paste(name, region, division, sep = '_'))
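
To make the difference between sep and collapse concrete, here is a minimal base R sketch. The vectors first and second are made up for illustration and are not part of the original data.

```r
# Two small illustrative vectors (hypothetical, not from states.df).
first  <- c("Alabama", "Alaska")
second <- c("South", "West")

# sep joins the i-th elements of each vector: one result per row.
by_row <- paste(first, second, sep = "_")
print(by_row)      # "Alabama_South" "Alaska_West"

# collapse additionally joins all of those results into ONE string,
# which is why it is the wrong tool for building a per-row column.
all_in_one <- paste(first, second, sep = "_", collapse = "|")
print(all_in_one)  # "Alabama_South|Alaska_West"
```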

As far as the sorting goes, you are not using sort correctly: it takes a single vector, and its second argument is decreasing, which is why passing region there raises the error. Maybe you want:

as.data.frame(lapply(states.df, sort))

This sorts each column independently and creates a new data.frame from those columns, so the values in a given row no longer belong to the same state.
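
If the goal is the one stated in the question (sort the three words within each row, then join them), one possible sketch in base R is shown below. It is not taken from either answer; it assumes mapply is acceptable in place of a dplyr-only solution, and the same mapply call could be used as the right-hand side inside mutate. Sort order for strings depends on the locale, so treat the printed order as typical rather than guaranteed.

```r
# Rebuild the data frame from the question (state.* come from the datasets package).
states.df <- data.frame(name = as.character(state.name),
                        region = as.character(state.region),
                        division = as.character(state.division),
                        stringsAsFactors = FALSE)

# mapply walks the three columns in parallel, one row at a time.
# For each row: combine the three values, sort them, join with "_".
states.df$concated_column <- mapply(
  function(...) paste(sort(c(...)), collapse = "_"),
  states.df$name, states.df$region, states.df$division,
  USE.NAMES = FALSE
)

head(states.df$concated_column, 3)
```

Here collapse is the right argument, because inside the helper function the three values of one row form a single vector that should become one string.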

#2


2  

Adding on to Paul's answer. If you want to sort the rows, you could try order. Here is an example:

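# Sort the rows first, then build the column from the sorted data frame so
# each concatenated value stays on the row it was computed from.
sorted.df <- states.df[order(states.df$name, states.df$region, states.df$division), ]
res1 <- mutate(sorted.df,
          concated_column = apply(sorted.df, 1, 
                                  function(x) paste0(x, collapse = "_")))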

Here order sorts the rows of the data.frame states.df by name, breaking ties by region and then by division. Note that this reorders rows; it does not sort the values within a row.
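
As a small illustration of what order returns, here is a sketch on a made-up three-row data frame (df and its values are hypothetical, chosen so that a tie in name has to be broken by region).

```r
# Hypothetical toy data: two rows share the name "a".
df <- data.frame(name = c("b", "a", "a"),
                 region = c("x", "z", "y"),
                 stringsAsFactors = FALSE)

# order() returns row positions in sorted order, not the sorted values.
# The tie between the two "a" rows is broken by region ("y" before "z").
idx <- order(df$name, df$region)
print(idx)        # 3 2 1

# Indexing with that permutation reorders whole rows together.
sorted_df <- df[idx, ]
print(sorted_df)
```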

