Getting the row-wise minimum in a data frame

Time: 2023-01-06 17:03:36

I am working with a dataframe that has 65 variables in it. The first variable catalogs a person, and the next 64 variables indicate the geographic distance that person is from each of 64 locations. Using R, I would like to create a new variable that catalogs the shortest distance for each person to one of those 64 locations.

For example: if person X is 35, 50, 79, 100, 450...miles away from the locations, I would like the new variable to automatically assign them a 35, because this is the shortest distance. Any help with this would be much appreciated. Thanks.

3 solutions

#1


8  

df <- data.frame(let=letters[1:25], d1=sample(1:25,25), d2=sample(1:25,25), d3=sample(1:25,25))

df$shortest <- apply(df[,2:4],1,min)

The second line applies the function min to each row (the 1 in the call means "by row") and assigns the result to a new column in my data.frame df. See ?apply for more explanation of what the second line is doing. Be careful to skip the first column, and any other columns that aren't distances:

apply(df,1,min) gives completely different answers, since it's finding the "min" of strings (the letter column forces every value in the row to be compared as text):

> min(2:10)
[1] 2
> min(as.character(2:10))
[1] "10"
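
To see why the ID column matters, here is a minimal sketch on a made-up data frame (toy, id, d1 and d2 are hypothetical names, not from the question). It assumes the usual behaviour that apply() converts a data frame with as.matrix() before looping over rows, and shows one way to pick only the numeric columns rather than hard-coding 2:4.

```r
# Hypothetical toy data: one ID column and two numeric distance columns.
toy <- data.frame(id = c("a", "b"), d1 = c(2, 10), d2 = c(10, 3))

# apply() works on a matrix, so the data frame is converted first.
# With a character column present, the whole matrix becomes character.
is.character(as.matrix(toy))   # TRUE

# Selecting only the numeric columns keeps the comparison numeric.
num_cols <- sapply(toy, is.numeric)
toy$shortest <- apply(toy[, num_cols], 1, min)
toy$shortest                   # 2 for row "a", 3 for row "b"
```

Selecting columns by type rather than by position is only a convenience; with 64 distance columns, df[, 2:65] or df[, -1] would do the same job.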

#2


12  

Or, using the example of Justin:

df$shortest <- do.call(pmin,df[-1])

See also ?pmin and ?do.call, and note that you can drop the first variable in your data frame by using list-style indexing (so no comma at all; see also ?Extract).
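
As a rough illustration of how this works, the sketch below uses a small made-up data frame (toy and its column names are hypothetical) to show that do.call() passes each distance column to pmin() as a separate argument, and that the result agrees with the row-wise apply() approach. The na.rm = TRUE variant is an addition beyond the original answer, for the case where some distances are missing.

```r
# Hypothetical toy data: an ID column plus three distance columns,
# with one missing distance in the last row.
toy <- data.frame(id = c("a", "b", "c"),
                  d1 = c(35, 12, NA),
                  d2 = c(50,  7, 40),
                  d3 = c(79, 90, 15))

dist_cols <- toy[-1]                   # list-style indexing drops "id"

# Equivalent to pmin(toy$d1, toy$d2, toy$d3): element-wise minimum.
by_pmin  <- do.call(pmin, dist_cols)
by_apply <- apply(dist_cols, 1, min)   # same values, computed row by row

# Both give NA for row "c" because d1 is missing there.
# Appending na.rm = TRUE to the argument list ignores missing values.
with_na_rm <- do.call(pmin, c(dist_cols, na.rm = TRUE))
with_na_rm                             # 35 7 15
```

pmin() is vectorised over whole columns, so on a wide or long data frame it typically avoids the per-row function calls that apply() makes.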

#3


4  

I'd approach this with apply, but transform or another approach could work too.

# fake data set: 5 people (rows) x 20 distance columns
DF <- as.data.frame(matrix(sample(1:100, 100, replace = TRUE), 5, 20))
DF <- data.frame(ID = LETTERS[1:5], DF)

#solution
DF$newvar <- apply(DF[,-1], 1, min)
