Dividing columns by colSums in R

Time: 2023-02-08 22:58:38

I am trying to scale the values in a matrix so that each column adds up to one. I have tried:

m = matrix(c(1:9),nrow=3, ncol=3, byrow=T)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

colSums(m)
[1] 12 15 18

m/colSums(m)
          [,1]      [,2] [,3]
[1,] 0.08333333 0.1666667 0.25
[2,] 0.26666667 0.3333333 0.40
[3,] 0.38888889 0.4444444 0.50

colSums(m/colSums(m))
[1] 0.7388889 0.9444444 1.1500000

So obviously this doesn't work. I then tried this:

m = m/matrix(rep(colSums(m),3), nrow=3, ncol=3, byrow=T)
          [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000

colSums(m)
[1] 1 1 1

So this works, but it feels like I'm missing something here. This can't be how it is routinely done. I'm certain I am being stupid here. Any help you can give would be appreciated. Cheers, Davy

2 solutions

#1


39  

See ?sweep, e.g.:

> sweep(m,2,colSums(m),`/`)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000

Or you can transpose the matrix so that colSums(m) gets recycled correctly. Don't forget to transpose back afterwards, like this:

> t(t(m)/colSums(m))
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
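
To see why the transpose matters, here is a minimal sketch using the 3x3 matrix from the question. R stores a matrix column by column and recycles a shorter vector in that order, so plain m / colSums(m) ends up dividing row i by the i-th column sum. The names cs, wrong and right are only illustrative.

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)
cs <- colSums(m)  # column sums: 12 15 18

# Plain division recycles cs down each column, so row i is divided by cs[i].
wrong <- m / cs
stopifnot(isTRUE(all.equal(wrong[2, ], m[2, ] / cs[2])))

# After transposing, the columns of m become rows, so the recycling lines up
# with the column sums; the outer t() restores the original orientation.
right <- t(t(m) / cs)
stopifnot(isTRUE(all.equal(colSums(right), c(1, 1, 1))))
```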

Or use the function prop.table(), which does essentially the same thing:

> prop.table(m,2)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000

The time differences are rather small. The sweep() function and the t() trick are the most flexible solutions; prop.table() only covers this particular case.
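
As a rough illustration of that flexibility (a sketch on the question's matrix, with by_row and centered as made-up names), sweep() accepts any margin and any binary function, whereas prop.table() always divides by the marginal sums:

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# Same idea along rows: divide each row by its row sum (MARGIN = 1).
by_row <- sweep(m, 1, rowSums(m), `/`)
stopifnot(isTRUE(all.equal(rowSums(by_row), c(1, 1, 1))))

# A different operation: subtract each column's mean to center the columns.
centered <- sweep(m, 2, colMeans(m), `-`)
stopifnot(isTRUE(all.equal(colMeans(centered), c(0, 0, 0))))

# prop.table() covers only the "divide by the marginal sum" case.
stopifnot(isTRUE(all.equal(prop.table(m, 2), sweep(m, 2, colSums(m), `/`))))
```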

#2


5  

As usual, Joris has a great answer. Two others that came to mind:

# Essentially your answer
f1 <- function() m / rep(colSums(m), each = nrow(m))
# Two calls to transpose
f2 <- function() t(t(m) / colSums(m))
# Joris
f3 <- function() sweep(m, 2, colSums(m), `/`)
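
In case the each argument in f1 looks opaque, this small sketch (using the question's 3x3 matrix; divisor and explicit are illustrative names) shows that it builds the same divisor the question constructed by hand with byrow:

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# each = nrow(m) repeats every column sum once per row, in column-major order.
divisor <- rep(colSums(m), each = nrow(m))
print(divisor)  # 12 12 12 15 15 15 18 18 18

# Reshaped, it is the same matrix the question built with byrow = TRUE.
explicit <- matrix(rep(colSums(m), 3), nrow = 3, ncol = 3, byrow = TRUE)
stopifnot(identical(matrix(divisor, nrow = 3), explicit))
```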

Joris' answer is the fastest on my machine:

> m <- matrix(rnorm(1e7), ncol = 10000)
> library(rbenchmark)
> benchmark(f1,f2,f3, replications=1e5, order = "relative")
  test replications elapsed relative user.self sys.self user.child sys.child
3   f3       100000   0.386   1.0000     0.385    0.001          0         0
1   f1       100000   0.421   1.0907     0.382    0.002          0         0
2   f2       100000   0.465   1.2047     0.386    0.003          0         0
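
One caveat about the call above: benchmark() times the expressions it receives, and a bare f1 evaluates to the function object without calling it, so those timings probably do not reflect the actual division. Below is a sketch of a variant that calls the functions. The smaller matrix and the lower replication count are arbitrary choices, not from the original post, made so that it finishes quickly. No timings are shown because they depend on the machine.

```r
# Smaller matrix than in the original post so the sketch runs quickly (assumption).
set.seed(1)
m <- matrix(rnorm(1e5), ncol = 100)

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# All three should agree before their speed is compared.
stopifnot(isTRUE(all.equal(f1(), f2())), isTRUE(all.equal(f1(), f3())))

# Note the parentheses: f1() runs the function, a bare f1 only looks it up.
# The guard skips the timing when rbenchmark is not installed.
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  print(rbenchmark::benchmark(f1(), f2(), f3(),
                              replications = 100, order = "relative"))
}
```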
