Replicate rows of a data.table by a column value

I have a dataset that is structured as follows:

data <- data.table(ID=1:10,Tenure=c(2,3,4,2,1,1,3,4,5,2),Var=rnorm(10))

    ID Tenure         Var
 1:  1      2 -0.72892371
 2:  2      3 -1.73534591
 3:  3      4  0.47007030
 4:  4      2  1.33173044
 5:  5      1 -0.07900914
 6:  6      1  0.63493316
 7:  7      3 -0.62710577
 8:  8      4 -1.69238758
 9:  9      5 -0.85709328
10: 10      2  0.10716830

I need to replicate each row N = Tenure times. For example, I need to replicate the first row 2 times (since Tenure = 2).

I need my transformed dataset to look like the following:

setkey(data,ID)
print(data[,.(ID=rep(ID,Tenure))][data][, Indx := 1:.N, by=ID])

   ID Tenure        Var Indx
1:  1      2 -0.7289237    1
2:  1      2 -0.7289237    2
3:  2      3 -1.7353459    1
4:  2      3 -1.7353459    2
5:  2      3 -1.7353459    3
6:  3      4  0.4700703    1
...
...

Is there a more efficient (more idiomatic data.table) way to do this? My approach is pretty slow. I was thinking there should be a way to do this with a by-without-by merge using .EACHI.

2 Solutions

#1

I don't think using a key/merge is helpful here. Just expand by passing a vector of row indices:

DT <- data[rep(1:.N,Tenure)][,Indx:=1:.N,by=ID]
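
To see why this works: rep(1:.N, Tenure) builds an integer vector in which row number i appears Tenure[i] times, and subsetting by that vector repeats the rows. Below is a minimal sketch on a smaller, made-up table (the name `small` and its values are illustrative, not from the question). It also uses base R's sequence() as one possible alternative to the grouped counter, which assumes each ID occupies exactly one row in the input:

```r
library(data.table)

# Hypothetical three-row table, for illustration only
small <- data.table(ID = 1:3, Tenure = c(2, 3, 1), Var = c(0.5, -1.2, 0.3))

# Row number i appears Tenure[i] times in the index vector
idx <- rep(seq_len(nrow(small)), small$Tenure)

# Subsetting by the index vector repeats each row accordingly
expanded <- small[idx]

# sequence() restarts a 1..n counter for each run length,
# so no grouping by ID is needed (assumes one input row per ID)
expanded[, Indx := sequence(small$Tenure)]

print(expanded)
```

If an ID can appear on several input rows, the grouped form Indx := 1:.N, by = ID from the line above is the safer choice, since sequence() counts per input row rather than per ID.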

#2

You could try:

library(splitstackshape)
expandRows(data, "Tenure", drop = FALSE)[,Indx:=1:.N,by=ID][]

Or:

library(dplyr)
library(splitstackshape)
expandRows(data, "Tenure", drop = FALSE) %>% 
  group_by(ID) %>%
  mutate(Indx = row_number(Tenure))

Which gives:

    ID Tenure        Var Indx
 1:  1      2 -0.8808717    1
 2:  1      2 -0.8808717    2
 3:  2      3  0.5962590    1
 4:  2      3  0.5962590    2
 5:  2      3  0.5962590    3
 6:  3      4  0.1197176    1
 7:  3      4  0.1197176    2
 8:  3      4  0.1197176    3
 9:  3      4  0.1197176    4
10:  4      2 -0.2821739    1
