How can I reshape a data.table (long to wide) without applying an aggregation function such as sum or mean? I looked at dcast/melt/reshape/etc., but I don't get the desired result.
This is my data:
DT <- data.table(id = c("1","1","2","3"), score = c("5", "4", "5", "6"))
Original format:
> DT
id score
 1     5
 1     4
 2     5
 3     6
Desired format:
id score1 score2
 1      5      4
 2      5     NA
 3      6     NA
I currently work around it with:
DT <- DT[, list(list(score)), by=id]
But then the first cell contains:
c("5", "4")
And I need to split it (I use the package splitstackshape):
DT <- cSplit(DT, "V1", ",")
This is probably not the most efficient method. What is a better way?
1 solution
#1
You can use getanID (from splitstackshape) to create a unique .id within each value of the grouping variable id. Then reshape with dcast.data.table (or simply dcast from data.table version 1.9.5 onward) and, if needed, rename the columns with setnames:
library(splitstackshape)
res <- dcast(getanID(DT, 'id'), id~.id,value.var='score')
setnames(res, 2:3, paste0('score', 1:2))[]
#   id score1 score2
#1:  1      5      4
#2:  2      5     NA
#3:  3      6     NA
Or using only data.table (note that := adds the .id column to DT by reference):
dcast(DT[, .id:=paste0('score', 1:.N), by=id],
      id~.id, value.var='score')
#   id score1 score2
#1:  1      5      4
#2:  2      5     NA
#3:  3      6     NA
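The same per-group counter can also be built inside the dcast formula, which avoids adding a helper column to DT. This is a minimal sketch, assuming a data.table version that provides rowid() (1.9.8 or later, as far as I know); the name res is only illustrative.

```r
library(data.table)

DT <- data.table(id = c("1", "1", "2", "3"), score = c("5", "4", "5", "6"))

# rowid(id) numbers the rows within each id (1, 2, 1, 1);
# prefix = "score" turns those numbers into the labels score1, score2, ...
# DT itself is left unchanged because no column is added by reference.
res <- dcast(DT, id ~ rowid(id, prefix = "score"), value.var = "score")
print(res)
```

Groups with fewer scores than the largest group are padded with NA, as in the desired format.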
Or, starting from the code you were using (fewer characters):
cSplit(DT[, toString(score), by=id], 'V1', ',')
#   id V1_1 V1_2
#1:  1    5    4
#2:  2    5   NA
#3:  3    6   NA
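If you would rather not depend on splitstackshape for the split step, the collapse-then-split idea can be sketched with data.table's own tstrsplit(). This is an illustration under assumptions, not a drop-in replacement for cSplit: the names collapsed, parts, new_cols and wide are hypothetical, and the separator ", " relies on toString() joining values with a comma and a space.

```r
library(data.table)

DT <- data.table(id = c("1", "1", "2", "3"), score = c("5", "4", "5", "6"))

# Collapse the scores of each id into one string, e.g. "5, 4"
collapsed <- DT[, .(V1 = toString(score)), by = id]

# Split the strings back into one vector per position;
# shorter groups are filled with NA
parts <- tstrsplit(collapsed$V1, ", ", fixed = TRUE)

# Attach the pieces as score1, score2, ... and drop the helper column
new_cols <- paste0("score", seq_along(parts))
wide <- copy(collapsed)[, (new_cols) := parts][, V1 := NULL]
print(wide)
```

The values stay character here; convert them afterwards if you need numbers.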