Get the maximum value for each row between two dates?

Time: 2021-06-21 19:16:55

I'm learning SQL and ran into a challenge.

I have the following table:

tbl <- data.frame(
  id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"),
  value = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1),
  score = c(1, 0, 1, 2, 3, 4, 3, 2, 1, 0),
  date = as.Date(c("2001-1-1", "2002-1-1", "2003-1-1", "2005-1-1",
                   "2005-1-1", "2007-1-1", "2008-1-1", "2010-1-1",
                   "2011-1-1", "2012-1-1"), "%Y-%m-%d")
)


+---------+-------+-------+----------+
| id_name | value | score |   date   |
+---------+-------+-------+----------+
| a       |     1 |     1 | 2001-1-1 |
| a       |    -1 |     0 | 2002-1-1 |
| b       |     1 |     1 | 2003-1-1 |
| c       |     1 |     2 | 2005-1-1 |
| d       |     1 |     3 | 2005-1-1 |
| f       |     1 |     4 | 2007-1-1 |
| b       |    -1 |     3 | 2008-1-1 |
| c       |    -1 |     2 | 2010-1-1 |
| d       |    -1 |     1 | 2011-1-1 |
| f       |    -1 |     0 | 2012-1-1 |
+---------+-------+-------+----------+

My goal is this:

For each id_name, I'd like to get the date of the maximum score in tbl, considering all rows whose date falls between the first and last dates on which that id_name appears (inclusive). If several rows tie for the maximum score, I want the earliest date.

For example, id_name 'a' should return '2001-1-1', since the maximum score in its date range is 1, and id_name 'b' should return '2007-1-1', since the maximum score in its date range is 4:

+---------+----------+
| id_name |   date   |
+---------+----------+
| a       | 2001-1-1 |
| b       | 2007-1-1 |
+---------+----------+

This is what I have so far:

   sqldf("
  SELECT 
    id_name,
    date,
    score
  FROM
    tbl As d
  WHERE
    score = (
                        SELECT MAX(score)
                        FROM tbl As b
                        WHERE 
                          date >= (
                                        SELECT MIN(date)
                                        FROM tbl
                                        WHERE id_name = b.id_name
                          ) AND
                          date <= (
                                        SELECT MAX(date)
                                        FROM tbl
                                        WHERE id_name = b.id_name
                          )
    )
  ")

The problem is that it returns the rows with the global maximum score, regardless of the current row's id_name.

Thanks!

1 Solution

#1


I think a correlated subquery in the WHERE clause will fit the bill here:

SELECT id_name, date
FROM tbl as t1
WHERE score = (SELECT max(score) FROM tbl WHERE id_name = t1.id_name)
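
Note that the query above takes the maximum over each id_name's own rows only, so for 'b' it would pick the row dated 2008-1-1 (score 3) rather than 2007-1-1. If the goal is the maximum over every row of tbl that falls inside the id_name's date range, a variant along the following lines might work. This is only a sketch: it assumes "between the dates" means from the id_name's earliest date to its latest date, inclusive, and the aliases sp, r, r2, start_date and end_date are illustrative names that do not appear in the question.

```r
library(sqldf)

# Sample data from the question
tbl <- data.frame(
  id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"),
  value = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1),
  score = c(1, 0, 1, 2, 3, 4, 3, 2, 1, 0),
  date = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01", "2005-01-01",
                   "2005-01-01", "2007-01-01", "2008-01-01", "2010-01-01",
                   "2011-01-01", "2012-01-01"))
)

res <- sqldf("
  SELECT sp.id_name, MIN(r.date) AS date   -- earliest date breaks ties
  FROM (
    -- first and last date on which each id_name appears
    SELECT id_name, MIN(date) AS start_date, MAX(date) AS end_date
    FROM tbl
    GROUP BY id_name
  ) AS sp
  JOIN tbl AS r
    ON r.date BETWEEN sp.start_date AND sp.end_date
  WHERE r.score = (
    -- highest score among all rows inside this id_name's date range
    SELECT MAX(r2.score)
    FROM tbl AS r2
    WHERE r2.date BETWEEN sp.start_date AND sp.end_date
  )
  GROUP BY sp.id_name
  ORDER BY sp.id_name
")

print(res)
```

On the sample data this should give 2001-01-01 for 'a' and 2007-01-01 for 'b', matching the expected output in the question. The derived table keeps the date range per id_name in one place, so both the join and the correlated subquery refer to the same bounds.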
