SQL查找表中第一次出现的数据集

时间:2022-08-23 11:03:06

Say if I have a table:

假如我有一张桌子:

CREATE TABLE T
(
    TableDTM  TIMESTAMP  NOT NULL,
    Code      INT        NOT NULL
);

And I insert some rows:

我插入一些行:

INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:00:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:10:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:20:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:30:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:40:00', 0);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 10:50:00', 1);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:00:00', 1);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:10:00', 1);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:20:00', 0);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:30:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:40:00', 5);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 11:50:00', 3);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 12:00:00', 3);
INSERT INTO T (TableDTM, Code) VALUES ('2011-01-13 12:10:00', 3);

So I end up with a table similar to:

所以我最终得到了一张类似于的表:

2011-01-13 10:00:00, 5
2011-01-13 10:10:00, 5
2011-01-13 10:20:00, 5
2011-01-13 10:30:00, 5
2011-01-13 10:40:00, 0
2011-01-13 10:50:00, 1
2011-01-13 11:00:00, 1
2011-01-13 11:10:00, 1
2011-01-13 11:20:00, 0
2011-01-13 11:30:00, 5
2011-01-13 11:40:00, 5
2011-01-13 11:50:00, 3
2011-01-13 12:00:00, 3
2011-01-13 12:10:00, 3

How can I select the first date of each set of identical numbers, so I end up with this:

如何选择每组相同数字的第一个日期,所以我最终得到:

2011-01-13 10:00:00, 5
2011-01-13 10:40:00, 0
2011-01-13 10:50:00, 1
2011-01-13 11:20:00, 0
2011-01-13 11:30:00, 5
2011-01-13 11:50:00, 3

I've been messing about with sub queries and the like for most of the day and for some reason I can't seem to crack it. I'm sure there's a simple way somewhere!

在一天中的大部分时间里,我一直在搞乱子查询等等,出于某种原因我似乎无法破解它。我确定在某个地方有一个简单的方法!

I would probably want to exclude the 0's from the results, but that's not important for now..

我可能想要从结果中排除0,但这对于现在来说并不重要。

4 个解决方案

#1


4  

Revised 15 Jan 11

I'm sure there's a simple way somewhere!

我确定在某个地方有一个简单的方法!

Yes, there is. But first, two Issues.

就在这里。但首先是两个问题。

  1. The table is not a Relational Database table. It does not have an unique key, which is demanded by the RM and Normalisation (specifically that each row must have an unique identifier; not necessarily a PK). Therefore SQL, a standard language, for operating on Relational Database tables, cannot perform basic operations on it.

    该表不是关系数据库表。它没有唯一的密钥,这是RM和规范化所要求的(具体地说,每一行必须具有唯一的标识符;不一定是PK)。因此,用于在关系数据库表上操作的SQL(标准语言)无法对其执行基本操作。

    • it is a Heap (data structure, inserted and deleted in chronological order), with records not rows.
    • 它是一个堆(数据结构,按时间顺序插入和删除),记录不是行。
    • any and all operations using SQL will be horribly slow, and will not be correct
    • 使用SQL的任何和所有操作都会非常慢,并且不正确
    • SET ROWCOUNT to 1, perform row processing, and SQL will work on the Heap just fine
    • 将ROWCOUNT设置为1,执行行处理,SQL就可以在Heap上运行了
    • your best bet is use any unix utiliy to operate on it (awk, cut, chop). They are blindingly fast. The awk script required to answer your requirement would take 3 mins to write and it will run in seconds for millions of records (I wrote a few last week).
      .
    • 你最好的选择是使用任何unix utiliy来操作它(awk,cut,chop)。他们的速度非常快。响应你的要求所需的awk脚本需要3分钟才能写入,并且它将在几秒钟内运行数百万条记录(我上周写了一些)。 。

    So the question really is SQL to find the first occurrence of sets of data in a non-relational Heap.

    所以问题实际上是SQL在非关系堆中找到第一组数据集。

    Now if your question was SQL to find the first occurrence of sets of data in a Relational table, implying of course some unique row identifier, that would be (a) easy in SQL, and (b) fast in any flavour of SQL ...

    现在,如果你的问题是SQL来查找Relational表中第一次出现的数据集,当然暗示一些独特的行标识符,这将是(a)在SQL中很容易,以及(b)在任何SQL的快速中都很快。 。

    • except Oracle, which is known to handle subqueries badly (specifically Tony Andrews' comments, he is a well-known authority on Oracle). In which case, use Materialised Views.
      .
    • 除了Oracle之外,已知它会严重处理子查询(特别是Tony Andrews的评论,他是Oracle的知名权威)。在这种情况下,请使用物化视图。 。
  2. The question is very generic (no complaint). But many of these specific needs are usually applied within a larger context, and the context has requirements which are absent from the specification here. Generally the need is for a simple Subquery (but in Oracle use a Materialised View to avoid the subquery). And the subquery, too, depends on the outer context, the outer query. Therefore the answer to the small generic question will not contain the answer to the actual specific need.

    这个问题非常通用(没有投诉)。但是,这些特定需求中的许多通常在更大的上下文中应用,并且上下文具有本说明书中不存在的要求。通常需要一个简单的子查询(但在Oracle中使用物化视图来避免子查询)。子查询也取决于外部上下文,外部查询。因此,小通用问题的答案将不包含实际特定需求的答案。


Anyway, I do not wish to avoid the question. Why don't we use a real world example, rather than a simple generic one; and find the first or last occurrence, or minimum or maximum value, of a set of data, within another set of data, in a Relational table ?

无论如何,我不想回避这个问题。为什么我们不使用现实世界的例子,而不是简单的通用例子;并在Relational表中查找另一组数据中的一组数据的第一个或最后一个或最小值或最大值?

Main Query

主要查询

Let's use the ▶Data Model◀ from your previous question.

让我们使用上一个问题中的▶数据模型◀。

Report all Alerts since a certain date, with the peak Value for the duration, that are not Acknowledged

报告自特定日期以来的所有警报,其持续时间的峰值为未确认

Since you will be using exactly the same technique (with different table and column names) for all your temporal and History requirements, you need to fully understand the basic construct of a Subquery, and its different applications.

由于您将使用完全相同的技术(具有不同的表和列名称)来满足您的所有时间和历史要求,因此您需要完全理解子查询的基本构造及其不同的应用程序。

Introduction

Note that you have, not only a pure 5NF Database, with Relational Identifiers (composite keys), you have full Temporal capability throughout, and the temporal requirement is rendered without breaking 5NF (No Update Anomalies), which means the ValidToDateTime for periods and durations is derived, and not duplicated in data. Point is, that complicates things, hence this is not the best example for a tutorial on Subqueries.

请注意,您不仅拥有纯5NF数据库,还具有关系标识符(复合键),您具有完整的Temporal功能,并且在不破坏5NF(无更新异常)的情况下呈现时间要求,这意味着有效期和持续时间的ValidToDateTime是派生的,而不是在数据中重复。点是,这使事情变得复杂,因此这不是子查询教程的最佳示例。

  • Remember the SQL engine is a set-processor, so we approach the problem with a set-oriented mindset
    • do not dumb the engine down to row-processing; that is very slow
    • 不要把发动机笨到行处理;那很慢
    • and more important, unnecessary
    • 更重要的是,没有必要
  • 请记住,SQL引擎是一个集合处理器,所以我们用面向集合的思维方式处理问题,不要将引擎愚蠢到行处理;这是非常缓慢,更重要,不必要的
  • Subqueries are normal SQL. The syntax I am using is straight ISO/IEC/ANSI SQL.
    • if you cannot code subqueries in SQL, you will be very limited; and then need to introduce data duplication or use large result sets as Materialised Views or temporary tables or all manner of additional data and additional processing, which will be s.l.o.w to v.e.r.y s.l.o.w, not to mention completely unnecessary
    • 如果你不能在SQL中编写子查询,那么你将非常有限;然后需要引入数据复制或使用大型结果集作为物化视图或临时表或所有方式的附加数据和附加处理,这将是s.l.o.w到v.e.r.y s.l.o.w,更不用说完全没必要了
    • if there is anything you cannot do in a truly Relational Database (and my Data Models always are) without switching to row-processing or inline views or temp tables, ask for help, which is what you have done here.
    • 如果在真正的关系数据库(以及我的数据模型总是如此)中你无法做任何事情而不切换到行处理或内联视图或临时表,请求帮助,这就是你在这里所做的。
  • 子查询是普通的SQL。我使用的语法是直接的ISO / IEC / ANSI SQL。如果你不能在SQL中编写子查询,那么你将非常有限;然后需要引入数据复制或使用大型结果集作为物化视图或临时表或所有方式的附加数据和附加处理,这将非常慢,更不用说如果有任何你不能做的事情是完全没必要的真正的关系数据库(以及我的数据模型总是如此),无需切换到行处理或内联视图或临时表,请求帮助,这就是你在这里所做的。
  • You need to fully understand the first Subquery (simpler) before attempting to understand the second; etc.
  • 在尝试理解第二个子查询之前,您需要完全理解第一个子查询(更简单);等等

Method

First build the Outer query using minimum joins, etc, based on the structure of the result set that you need, and nothing more. It is very important that the structure of the outer query is resolved first; otherwise you will go back and forth trying to make the subquery fit the outer query, and vice versa.

首先使用最小连接等构建外部查询,具体取决于您需要的结果集的结构,仅此而已。首先解析外部查询的结构是非常重要的;否则你会来回尝试使子查询适合外部查询,反之亦然。

  • That happens to require a Subquery as well. So leave that part out for now, and pick that up later. For now, the Outer query gets all (not un-acknowledged) Alerts after a certain date
  • 这恰好需要一个子查询。所以现在就把这个部分留下来,然后再把它拿出来。现在,外部查询在特定日期之后获得所有(不是未确认的)警报

The ▶SQL code◀ required is on page 1 (sorry, the SO edit features are horrible, it destroys the formatting, and the code is already formatted).

需要▶SQL代码◀是第1页(抱歉,SO编辑功能很糟糕,它会破坏格式化,代码已经格式化)。

Then build the Subquery to fill each cell.

然后构建子查询以填充每个单元格。

Subquery (1) Derive Alert.Value

子查询(1)导出Alert.Value

That is a simple derived datum, select the Value from the Reading that generated the Alert. The tables are related, the cardinality is 1::1, so it is a straight join on the PK.

这是一个简单的派生数据,从生成Alert的Reading中选择Value。这些表是相关的,基数是1 :: 1,所以它是PK上的直接连接。

  • The type of Subquery required here is a Correlated Subquery, we need to correlate a table in the Outer query to a table in the (inner) Subquery.
    • in order to do that, we need an Alias for the table in the Outer query, to correlate it to a table in the Subquery.
    • 为了做到这一点,我们需要外部查询中的表的别名,以将其与子查询中的表相关联。
    • to make the distinction, I have used aliases only for such required correlation, and fully qualified names for plain joins
    • 为了区分,我只使用了别名来表示这种必需的相关性,以及普通连接的完全限定名
  • 此处所需的子查询类型是相关子查询,我们需要将外部查询中的表与(内部)子查询中的表相关联。为了做到这一点,我们需要外部查询中的表的别名,以将其与子查询中的表相关联。为了区分,我只使用了别名来表示这种必需的相关性,以及普通连接的完全限定名
  • Subqueries are very fast in any engine (except Oracle)
  • 任何引擎中的子查询都非常快(Oracle除外)
  • SQL is a cumbersome language. But that's all we have. So get used to it.
  • SQL是一种繁琐的语言。但这就是我们所拥有的一切。所以习惯它。

The ▶SQL code◀ required is on page 2.

需要▶SQL代码◀是第2页。

I have purposely given you a mix of joins in the Outer Query vs obtaining data via Subquery, so that you can learn (you could alternately obtain Alert.Value via a join, but that would be even more cumbersome).

我有意在外部查询中添加了多个连接,并通过子查询获取数据,这样您就可以学习(您可以通过连接交替获取Alert.Value,但这会更加繁琐)。

The next Subquery we need derives Alert.PeakValue. For that we need to determine the Temporal Duration of the Alert. We have the beginning of the Alert Duration; we need to determine the end of the Duration, which is the next (temporally) Reading.Value that is within range. That requires a Subquery as well, which we better handle first.

我们需要的下一个子查询派生Alert.PeakValue。为此,我们需要确定警报的时间持续时间。我们有警报持续时间的开始;我们需要确定持续时间的结束,这是范围内的下一个(暂时的)Reading.Value。这也需要一个Subquery,我们最好先处理它。

  • Work the logic from the inside, outward. Good old BODMAS.
  • 从内部向外工作逻辑。好老BODMAS。

Subquery (2) Derive Alert.EndDtm

子查询(2)导出Alert.EndDtm

A slightly more complex Suquery to select the first Reading.ReadingDtm, that is greater than or equal to the Alert.ReadingDtm, that has a Reading.Value which is less than or equal to its Sensor.UpperLimit.

稍微复杂的Suquery,用于选择第一个Reading.ReadingDtm,它大于或等于Alert.ReadingDtm,其Reading.Value小于或等于其Sensor.UpperLimit。

Handling 5NF Temporal Data

处理5NF时态数据

For handling temporal requirements in a 5NF Database (in which EndDateTime is not stored, as is duplicate data), we work on a StartDateTime only, and the EndDateTime is derived: it is the next StartDateTime. This is the Temporal notion of Duration.

为了处理5NF数据库中的时间要求(其中未存储EndDateTime,如同重复数据),我们仅处理StartDateTime,并导出EndDateTime:它是下一个StartDateTime。这是持续时间的时间概念。

  • Technically, it is one millisec (whatever the resolution for the Datatype used) less.
  • 从技术上讲,它是一毫秒(无论所使用的数据类型的分辨率如何)更少。
  • However, in order to be reasonable, we can speak of, and report, EndDateTime as simply the Next.StartDateTime, and ignore the one millisecond issue.
  • 但是,为了合理,我们可以说EndDateTime并将其报告为Next.StartDateTime,并忽略一毫秒的问题。
  • The code should always use >= This.StartDateTime and < Next.StartDateTime.
    • That eliminates a slew of avoidable bugs
    • 这消除了大量可避免的错误
    • Note that these comparison operators, which bracket the Temporal Duration, and should be used in a conventional manner throughout as per above, are quite independent of similar comparison operators related to business logic, eg. Sensor.UpperLimit (ie. watch for it, because both are often located in one WHERE clause, and it is easy to mix them up or get confused).
    • 注意,这些比较运算符包括时间持续时间,并且应该按照上面的传统方式使用,它们完全独立于与业务逻辑相关的类似比较运算符,例如。 Sensor.UpperLimit(即监视它,因为它们通常都位于一个WHERE子句中,很容易将它们混淆或混淆)。
  • 代码应始终使用> = This.StartDateTime和 。这消除了一系列可避免的错误请注意,这些比较运算符包含时间持续时间,并且应该按照上面的常规方式使用,它们完全独立于与业务逻辑相关的类似比较运算符,例如。>

The ▶SQL code◀ required, along with test data used, is on page 3.

所需的▶SQL代码◀以及使用的测试数据在第3页。

Subquery (3) Derive Alert.PeakValue

子查询(3)派生Alert.PeakValue

Now it is easy. Select the MAX(Value) from Readings between Alert.ReadingDtm and Alert.EndDtm, the duration of the Alert.

现在很容易。从Alert.ReadingDtm和Alert.EndDtm之间的读数中选择MAX(值),即警报的持续时间。

The ▶SQL code◀ required is on page 4.

需要▶SQL代码◀是第4页。

Scalar Subquery

标量子查询

In addition to being Correlated Subqueries, the above are all Scalar Subqueries, as they return a single value; each cell in the grid can be filled with only one value. (Non-Scalar Subqueries, that return multiple values, are quite legal, but not for the above.)

除了是相关子查询之外,以上都是标量子查询,因为它们返回单个值;网格中的每个单元格只能填充一个值。 (返回多个值的非标量子查询非常合法,但不适用于上述情况。)

Subquery (4) Acknowledged Alerts

子查询(4)已确认的警报

Ok, now that you have a handle on the above Correlated Scalar Subqueries, those that fill cells in a set, a set that is defined by the Outer query, let's look at a Subquery that can be used to constrain the Outer query. We do not really want all Alerts (above), we want Un-Acknowledged Alerts: the Identifiers that exist in Alert, that do not exist in Acknowledgement. That is not filling cells, that is changing the content of the Outer set. Of course, that means changing the WHERE clause.

好了,现在您已经掌握了上述相关标量子查询,那些填充集合中单元格的子集,一个由外部查询定义的集合,让我们看一下可用于约束外部查询的子查询。我们并不真正想要所有警报(上图),我们需要未确认的警报:警报中存在的标识符,在确认中不存在。这不是填充单元格,即改变外部集合的内容。当然,这意味着更改WHERE子句。

  • We are not changing the structure of the Outer set, so there is no change to the FROM and existing WHERE clauses.
  • 我们不会更改外部集的结构,因此FROM和现有的WHERE子句没有变化。

Simply add a WHERE condition to exclude the set of Acknowledged Alerts. 1::1 cardinality, straight Correlated join.

只需添加WHERE条件即可排除已确认的警报集。 1 :: 1基数,直相关联接。

The ▶SQL code◀ required is on page 5.

需要▶SQL代码◀是第5页。

The difference is, this is a non-Scalar Subquery, producing a set of rows (one column). We have an entire set of Alerts (the Outer set) matched against an entire set of Acknowledgements.

不同的是,这是一个非标量子查询,产生一组行(一列)。我们有一整套警报(外部集)与一整套致谢相匹配。

  • The matching is processed because we have told the engine that the Subquery is Correlated, by using an alias (no need for cumbersome joins to be identified)
  • 处理匹配是因为我们通过使用别名告知引擎子查询是相关的(不需要识别繁琐的连接)
  • Use 1, because we are performing an existence check. Visualise it as a column added onto the Alert set defined by the Outer query.
  • 使用1,因为我们正在执行存在检查。将其可视化为添加到外部查询定义的警报集上的列。
  • Never use * because we do not need the entire set of columns, and that will be slower
  • 永远不要使用*,因为我们不需要整套列,而且速度会慢一些
  • Likewise, failing to use a correlation, means a WHERE NOT IN () is required, but again, that constructs the defined column set, then compares the two sets. Much slower.
  • 同样,未使用相关性意味着需要WHERE NOT IN(),但同样,构造定义的列集,然后比较两个集合。慢得多。

Subquery (5) Actioned Alerts

子查询(5)操作警报

As an alternative constraint on the Outer query, for un-actioned Alerts, instead of (4), exclude the set of Actioned Alerts. Straight Correlated join.

作为外部查询的替代约束,对于未操作的警报,而不是(4),排除一组操作警报。直接相关联接。

The ▶SQL code◀ required is on page 5.

需要▶SQL代码◀是第5页。

This code has been tested on Sybase ASE 15.0.3 using 1000 Alerts and 200 Acknowledgements, of different combinations; and the Readings and Alerts identified in the document. Zero milliseconds execution time (0.003 second resolution) for all executions.

此代码已在Sybase ASE 15.0.3上使用1000个警报和200个已确认的不同组合进行测试;以及文件中确定的读数和警报。所有执行的零毫秒执行时间(0.003秒分辨率)。

If you need it, here is the ▶SQL Code in Text Format◀.

如果需要,可以使用文本格式的▶SQL代码◀。

Response to Comments

(6) ▶Register Alert from Reading◀
This code executes in a loop (provided), selecting new Readings which are out-of-range, and creating Alerts, except where applicable Alerts already exist.

(6)▶从阅读中注册警报◀此代码在循环(提供)中执行,选择超出范围的新读数,并创建警报,除非适用的警报已存在。

(7) ▶Load Alert From Reading◀
Given that you have a full set of test data for Reading, this code uses a modified form of (6) to load the applicable Alerts.

(7)▶从阅读中加载警报◀鉴于您有一整套用于阅读的测试数据,此代码使用修改后的(6)形式加载适用的警报。

Common Problem

It is "simple" when you know how. I repeat, writing SQL without the ability to write Subqueries is very limiting; it is essential for handling Relational Databases, which is what SQL was designed for.

当你知道如何时,它是“简单的”。我再说一遍,编写没有编写子查询能力的SQL是非常有限的;它对于处理关系数据库至关重要,这是SQL的设计目标。

  • Half the reason developers implement unnormalised data heaps (massive data duplication) is because they cannot write the subqueries required for Normalised structures
    • it is not that they have "denormalised for performance"; it is that they cannot code for Normalised. I have seen it a hundred times.
    • 并不是说他们“为表现而非正常化”;它们无法为Normalized编码。我已经看过它一百次了。
    • Case in point here: you have a fully Normalised Relational Database, and the difficulty is coding for it, and you were contemplating duplicating tables for processing purposes.
    • 举个例子:你有一个完全规范化的关系数据库,困难就是为它编码,你正在考虑复制表以便进行处理。
  • 开发人员实现非标准化数据堆(大规模数据复制)的一半原因是因为它们无法编写规范化结构所需的子查询,因此它们并非“为了性能而非规范化”;它们无法为Normalized编码。我已经看过它一百次了。举个例子:你有一个完全规范化的关系数据库,困难就是为它编码,你正在考虑复制表以便进行处理。
  • And that is not counting the added complexity of a temporal database; or a 5NF temporal database.
  • 这还不包括时间数据库增加的复杂性;或5NF时态数据库。
  • Normalisation means Never Duplicate Anything, more recently known as Don't Repeat Yourself
  • 规范化意味着永远不要复制任何东西,最近被称为“不要重复自己”
  • Master Suqueries and you will be in the 98th percentile: Normalised, true Relational Databases; zero data duplication; very high performance.
  • Master Suqueries,你将处于第98个百分点:规范化,真实的关系数据库;零数据重复;非常高的性能。

I think you can figure out the remaining queries you have.

我想你可以找出你剩下的查询。

Relational Identifier

Note, this example also happens to demonstrate the power of using Relational Identifiers, in that several tables in-between the ones we want do not have to be joined (yes! the truth is Relational Identifiers means less, not more, joins, than Id keys). Simply follow the solid lines.

注意,这个例子也恰好证明了使用关系标识符的能力,因为我们想要的几个表之间不必连接(是的!事实是关系标识符意味着更少,而不是更多,连接,而不是Id键)。只需按照实线。

  • Your temporal requirement demands keys containing DateTime. Imagine trying to code the above with Id PKs, there would be two levels of processing: one for the joins (and there would be far more of them), and another for the data processing.
  • 您的时间要求需要包含DateTime的键。想象一下尝试用Id PK编写上面的代码,会有两个级别的处理:一个用于连接(并且会有更多的连接),另一个用于数据处理。

Label

I try to stay away from colloquial labels ("nested", "inner", etc) because they are not specific, and stick to specific technical terms. For completeness and understanding:

我试图远离口语标签(“嵌套”,“内部”等)因为它们不具体,并坚持特定的技术术语。为了完整和理解:

  • a Subquery after the FROM clause, is a Materialised View, a result set derived in one query and then fed into the FROM clause of another query, as a "table".
    • The Oracle types call this Inline View.
    • Oracle类型调用此内联视图。
    • In most cases, you can write Correlated Subqueries as Materialised Views, but that is massively more I/O and processing (since Oracles handling of subqueries is abyssmal, for Oracle only, Materialised Views are "faster").
      .
    • 在大多数情况下,您可以将相关子查询编写为物化视图,但这大大增加了I / O和处理(因为Oracles处理子查询非常简单,仅对于Oracle,物化视图“更快”)。 。
  • FROM子句之后的子查询,是一个物化视图,一个查询中派生的结果集,然后作为“表”输入另一个查询的FROM子句。 Oracle类型调用此内联视图。在大多数情况下,您可以将相关子查询编写为物化视图,但这大大增加了I / O和处理(因为Oracles处理子查询非常简单,仅对于Oracle,物化视图“更快”)。 。
  • A Subquery in the WHERE clause is a Predicate Subquery, because it changes the content of the result set (that which it is predicated upon). It can return either a Scalar (one value) or non-Scalar (many values).

    WHERE子句中的子查询是谓词子查询,因为它更改了结果集的内容(它所基于的内容)。它可以返回标量(一个值)或非标量(多个值)。

    • for Scalars, use WHERE column =, or any scalar operator

      对于Scalars,请使用WHERE column =或任何标量运算符

    • for non-Scalars, use WHERE [NOT] EXISTS, or WHERE column [NOT] IN

      对于非Scalars,使用WHERE [NOT] EXISTS或WHERE列[NOT] IN

  • A Suquery in the WHERE clause does not need to be Correlated; the following works just fine. Identify all superfluous appendages:

    WHERE子句中的Suquery不需要相关;以下工作正常。识别所有多余的附属物:

    SELECT  [Never] = FirstName,
            [Acted] = LastName 
        FROM User 
        WHERE UserId NOT IN ( SELECT DISTINCT UserId
            FROM Action
            )

#2


1  

Try this:

尝试这个:

SELECT MIN(TableDTM) TableDTM, Code
FROM
(
    SELECT T1.TableDTM, T1.Code, MIN(T2.TableDTM) XTableDTM
    FROM T T1
    LEFT JOIN T T2
    ON T1.TableDTM <= T2.TableDTM
    AND T1.Code <> T2.Code
    GROUP BY T1.TableDTM, T1.Code
) X
GROUP BY XTableDTM, Code
ORDER BY 1;

#3


1  

PostgreSQL supports window functions, have a look at this

PostgreSQL支持窗口函数,看看这个

[EDIT] Try the following:

[编辑]尝试以下内容:

SELECT TableDTM, Code FROM
(
    SELECT TableDTM,
           Code,
           LAG(Code, 1, NULL) OVER (ORDER BY TableDTM) AS PrevCode
    FROM   T
)
WHERE PrevCode<>Code OR PrevCode IS NULL;

#4


0  

could you try something like

你可以尝试一下吗?

"SELECT DISTINCT Code, (SELECT MIN(TableDTM) FROM T AS Q WHERE Q.Code = T.Code) As TableDTM FROM T;"

and if you need to exclude the 0, change it in:

如果您需要排除0,请将其更改为:

 SELECT DISTINCT Code, (SELECT MIN(TableDTM) FROM T AS Q WHERE Q.Code = T.Code) As TableDTM FROM T WHERE Code <> 0;

#1


4  

Revised 15 Jan 11

I'm sure there's a simple way somewhere!

我确定在某个地方有一个简单的方法!

Yes, there is. But first, two Issues.

就在这里。但首先是两个问题。

  1. The table is not a Relational Database table. It does not have an unique key, which is demanded by the RM and Normalisation (specifically that each row must have an unique identifier; not necessarily a PK). Therefore SQL, a standard language, for operating on Relational Database tables, cannot perform basic operations on it.

    该表不是关系数据库表。它没有唯一的密钥,这是RM和规范化所要求的(具体地说,每一行必须具有唯一的标识符;不一定是PK)。因此,用于在关系数据库表上操作的SQL(标准语言)无法对其执行基本操作。

    • it is a Heap (data structure, inserted and deleted in chronological order), with records not rows.
    • 它是一个堆(数据结构,按时间顺序插入和删除),记录不是行。
    • any and all operations using SQL will be horribly slow, and will not be correct
    • 使用SQL的任何和所有操作都会非常慢,并且不正确
    • SET ROWCOUNT to 1, perform row processing, and SQL will work on the Heap just fine
    • 将ROWCOUNT设置为1,执行行处理,SQL就可以在Heap上运行了
    • your best bet is use any unix utiliy to operate on it (awk, cut, chop). They are blindingly fast. The awk script required to answer your requirement would take 3 mins to write and it will run in seconds for millions of records (I wrote a few last week).
      .
    • 你最好的选择是使用任何unix utiliy来操作它(awk,cut,chop)。他们的速度非常快。响应你的要求所需的awk脚本需要3分钟才能写入,并且它将在几秒钟内运行数百万条记录(我上周写了一些)。 。

    So the question really is SQL to find the first occurrence of sets of data in a non-relational Heap.

    所以问题实际上是SQL在非关系堆中找到第一组数据集。

    Now if your question was SQL to find the first occurrence of sets of data in a Relational table, implying of course some unique row identifier, that would be (a) easy in SQL, and (b) fast in any flavour of SQL ...

    现在,如果你的问题是SQL来查找Relational表中第一次出现的数据集,当然暗示一些独特的行标识符,这将是(a)在SQL中很容易,以及(b)在任何SQL的快速中都很快。 。

    • except Oracle, which is known to handle subqueries badly (specifically Tony Andrews' comments, he is a well-known authority on Oracle). In which case, use Materialised Views.
      .
    • 除了Oracle之外,已知它会严重处理子查询(特别是Tony Andrews的评论,他是Oracle的知名权威)。在这种情况下,请使用物化视图。 。
  2. The question is very generic (no complaint). But many of these specific needs are usually applied within a larger context, and the context has requirements which are absent from the specification here. Generally the need is for a simple Subquery (but in Oracle use a Materialised View to avoid the subquery). And the subquery, too, depends on the outer context, the outer query. Therefore the answer to the small generic question will not contain the answer to the actual specific need.

    这个问题非常通用(没有投诉)。但是,这些特定需求中的许多通常在更大的上下文中应用,并且上下文具有本说明书中不存在的要求。通常需要一个简单的子查询(但在Oracle中使用物化视图来避免子查询)。子查询也取决于外部上下文,外部查询。因此,小通用问题的答案将不包含实际特定需求的答案。


Anyway, I do not wish to avoid the question. Why don't we use a real world example, rather than a simple generic one; and find the first or last occurrence, or minimum or maximum value, of a set of data, within another set of data, in a Relational table ?

无论如何,我不想回避这个问题。为什么我们不使用现实世界的例子,而不是简单的通用例子;并在Relational表中查找另一组数据中的一组数据的第一个或最后一个或最小值或最大值?

Main Query

主要查询

Let's use the ▶Data Model◀ from your previous question.

让我们使用上一个问题中的▶数据模型◀。

Report all Alerts since a certain date, with the peak Value for the duration, that are not Acknowledged

报告自特定日期以来的所有警报,其持续时间的峰值为未确认

Since you will be using exactly the same technique (with different table and column names) for all your temporal and History requirements, you need to fully understand the basic construct of a Subquery, and its different applications.

由于您将使用完全相同的技术(具有不同的表和列名称)来满足您的所有时间和历史要求,因此您需要完全理解子查询的基本构造及其不同的应用程序。

Introduction

Note that you have, not only a pure 5NF Database, with Relational Identifiers (composite keys), you have full Temporal capability throughout, and the temporal requirement is rendered without breaking 5NF (No Update Anomalies), which means the ValidToDateTime for periods and durations is derived, and not duplicated in data. Point is, that complicates things, hence this is not the best example for a tutorial on Subqueries.

请注意,您不仅拥有纯5NF数据库,还具有关系标识符(复合键),您具有完整的Temporal功能,并且在不破坏5NF(无更新异常)的情况下呈现时间要求,这意味着有效期和持续时间的ValidToDateTime是派生的,而不是在数据中重复。点是,这使事情变得复杂,因此这不是子查询教程的最佳示例。

  • Remember the SQL engine is a set-processor, so we approach the problem with a set-oriented mindset
    • do not dumb the engine down to row-processing; that is very slow
    • 不要把发动机笨到行处理;那很慢
    • and more important, unnecessary
    • 更重要的是,没有必要
  • 请记住,SQL引擎是一个集合处理器,所以我们用面向集合的思维方式处理问题,不要将引擎愚蠢到行处理;这是非常缓慢,更重要,不必要的
  • Subqueries are normal SQL. The syntax I am using is straight ISO/IEC/ANSI SQL.
    • if you cannot code subqueries in SQL, you will be very limited; and then need to introduce data duplication or use large result sets as Materialised Views or temporary tables or all manner of additional data and additional processing, which will be s.l.o.w to v.e.r.y s.l.o.w, not to mention completely unnecessary
    • 如果你不能在SQL中编写子查询,那么你将非常有限;然后需要引入数据复制或使用大型结果集作为物化视图或临时表或所有方式的附加数据和附加处理,这将是s.l.o.w到v.e.r.y s.l.o.w,更不用说完全没必要了
    • if there is anything you cannot do in a truly Relational Database (and my Data Models always are) without switching to row-processing or inline views or temp tables, ask for help, which is what you have done here.
    • 如果在真正的关系数据库(以及我的数据模型总是如此)中你无法做任何事情而不切换到行处理或内联视图或临时表,请求帮助,这就是你在这里所做的。
  • 子查询是普通的SQL。我使用的语法是直接的ISO / IEC / ANSI SQL。如果你不能在SQL中编写子查询,那么你将非常有限;然后需要引入数据复制或使用大型结果集作为物化视图或临时表或所有方式的附加数据和附加处理,这将非常慢,更不用说如果有任何你不能做的事情是完全没必要的真正的关系数据库(以及我的数据模型总是如此),无需切换到行处理或内联视图或临时表,请求帮助,这就是你在这里所做的。
  • You need to fully understand the first Subquery (simpler) before attempting to understand the second; etc.
  • 在尝试理解第二个子查询之前,您需要完全理解第一个子查询(更简单);等等

Method

First build the Outer query using minimum joins, etc, based on the structure of the result set that you need, and nothing more. It is very important that the structure of the outer query is resolved first; otherwise you will go back and forth trying to make the subquery fit the outer query, and vice versa.

首先使用最小连接等构建外部查询,具体取决于您需要的结果集的结构,仅此而已。首先解析外部查询的结构是非常重要的;否则你会来回尝试使子查询适合外部查询,反之亦然。

  • That happens to require a Subquery as well. So leave that part out for now, and pick that up later. For now, the Outer query gets all (not un-acknowledged) Alerts after a certain date
  • 这恰好需要一个子查询。所以现在就把这个部分留下来,然后再把它拿出来。现在,外部查询在特定日期之后获得所有(不是未确认的)警报

The ▶SQL code◀ required is on page 1 (sorry, the SO edit features are horrible, it destroys the formatting, and the code is already formatted).

需要▶SQL代码◀是第1页(抱歉,SO编辑功能很糟糕,它会破坏格式化,代码已经格式化)。

Then build the Subquery to fill each cell.

然后构建子查询以填充每个单元格。

Subquery (1) Derive Alert.Value

子查询(1)导出Alert.Value

That is a simple derived datum, select the Value from the Reading that generated the Alert. The tables are related, the cardinality is 1::1, so it is a straight join on the PK.

这是一个简单的派生数据,从生成Alert的Reading中选择Value。这些表是相关的,基数是1 :: 1,所以它是PK上的直接连接。

  • The type of Subquery required here is a Correlated Subquery, we need to correlate a table in the Outer query to a table in the (inner) Subquery.
    • in order to do that, we need an Alias for the table in the Outer query, to correlate it to a table in the Subquery.
    • 为了做到这一点,我们需要外部查询中的表的别名,以将其与子查询中的表相关联。
    • to make the distinction, I have used aliases only for such required correlation, and fully qualified names for plain joins
    • 为了区分,我只使用了别名来表示这种必需的相关性,以及普通连接的完全限定名
  • 此处所需的子查询类型是相关子查询,我们需要将外部查询中的表与(内部)子查询中的表相关联。为了做到这一点,我们需要外部查询中的表的别名,以将其与子查询中的表相关联。为了区分,我只使用了别名来表示这种必需的相关性,以及普通连接的完全限定名
  • Subqueries are very fast in any engine (except Oracle)
  • 任何引擎中的子查询都非常快(Oracle除外)
  • SQL is a cumbersome language. But that's all we have. So get used to it.
  • SQL是一种繁琐的语言。但这就是我们所拥有的一切。所以习惯它。

The ▶SQL code◀ required is on page 2.

需要▶SQL代码◀是第2页。

I have purposely given you a mix of joins in the Outer Query vs obtaining data via Subquery, so that you can learn (you could alternately obtain Alert.Value via a join, but that would be even more cumbersome).

我有意在外部查询中添加了多个连接,并通过子查询获取数据,这样您就可以学习(您可以通过连接交替获取Alert.Value,但这会更加繁琐)。

The next Subquery we need derives Alert.PeakValue. For that we need to determine the Temporal Duration of the Alert. We have the beginning of the Alert Duration; we need to determine the end of the Duration, which is the next (temporally) Reading.Value that is within range. That requires a Subquery as well, which we better handle first.

我们需要的下一个子查询派生Alert.PeakValue。为此,我们需要确定警报的时间持续时间。我们有警报持续时间的开始;我们需要确定持续时间的结束,这是范围内的下一个(暂时的)Reading.Value。这也需要一个Subquery,我们最好先处理它。

  • Work the logic from the inside, outward. Good old BODMAS.
  • 从内部向外工作逻辑。好老BODMAS。

Subquery (2) Derive Alert.EndDtm

子查询(2)导出Alert.EndDtm

A slightly more complex Suquery to select the first Reading.ReadingDtm, that is greater than or equal to the Alert.ReadingDtm, that has a Reading.Value which is less than or equal to its Sensor.UpperLimit.

稍微复杂的Suquery,用于选择第一个Reading.ReadingDtm,它大于或等于Alert.ReadingDtm,其Reading.Value小于或等于其Sensor.UpperLimit。

Handling 5NF Temporal Data

处理5NF时态数据

For handling temporal requirements in a 5NF Database (in which EndDateTime is not stored, as is duplicate data), we work on a StartDateTime only, and the EndDateTime is derived: it is the next StartDateTime. This is the Temporal notion of Duration.

为了处理5NF数据库中的时间要求(其中未存储EndDateTime,如同重复数据),我们仅处理StartDateTime,并导出EndDateTime:它是下一个StartDateTime。这是持续时间的时间概念。

  • Technically, it is one millisec (whatever the resolution for the Datatype used) less.
  • 从技术上讲,它是一毫秒(无论所使用的数据类型的分辨率如何)更少。
  • However, in order to be reasonable, we can speak of, and report, EndDateTime as simply the Next.StartDateTime, and ignore the one millisecond issue.
  • 但是,为了合理,我们可以说EndDateTime并将其报告为Next.StartDateTime,并忽略一毫秒的问题。
  • The code should always use >= This.StartDateTime and < Next.StartDateTime.
    • That eliminates a slew of avoidable bugs
    • 这消除了大量可避免的错误
    • Note that these comparison operators, which bracket the Temporal Duration, and should be used in a conventional manner throughout as per above, are quite independent of similar comparison operators related to business logic, eg. Sensor.UpperLimit (ie. watch for it, because both are often located in one WHERE clause, and it is easy to mix them up or get confused).
    • 注意,这些比较运算符包括时间持续时间,并且应该按照上面的传统方式使用,它们完全独立于与业务逻辑相关的类似比较运算符,例如。 Sensor.UpperLimit(即监视它,因为它们通常都位于一个WHERE子句中,很容易将它们混淆或混淆)。
  • 代码应始终使用> = This.StartDateTime和 。这消除了一系列可避免的错误请注意,这些比较运算符包含时间持续时间,并且应该按照上面的常规方式使用,它们完全独立于与业务逻辑相关的类似比较运算符,例如。>

The ▶SQL code◀ required, along with test data used, is on page 3.

所需的▶SQL代码◀以及使用的测试数据在第3页。

Subquery (3) Derive Alert.PeakValue

子查询(3)派生Alert.PeakValue

Now it is easy. Select the MAX(Value) from Readings between Alert.ReadingDtm and Alert.EndDtm, the duration of the Alert.

现在很容易。从Alert.ReadingDtm和Alert.EndDtm之间的读数中选择MAX(值),即警报的持续时间。

The ▶SQL code◀ required is on page 4.

需要▶SQL代码◀是第4页。

Scalar Subquery

标量子查询

In addition to being Correlated Subqueries, the above are all Scalar Subqueries, as they return a single value; each cell in the grid can be filled with only one value. (Non-Scalar Subqueries, that return multiple values, are quite legal, but not for the above.)

除了是相关子查询之外,以上都是标量子查询,因为它们返回单个值;网格中的每个单元格只能填充一个值。 (返回多个值的非标量子查询非常合法,但不适用于上述情况。)

Subquery (4) Acknowledged Alerts

子查询(4)已确认的警报

Ok, now that you have a handle on the above Correlated Scalar Subqueries, those that fill cells in a set, a set that is defined by the Outer query, let's look at a Subquery that can be used to constrain the Outer query. We do not really want all Alerts (above), we want Un-Acknowledged Alerts: the Identifiers that exist in Alert, that do not exist in Acknowledgement. That is not filling cells, that is changing the content of the Outer set. Of course, that means changing the WHERE clause.

好了,现在您已经掌握了上述相关标量子查询,那些填充集合中单元格的子集,一个由外部查询定义的集合,让我们看一下可用于约束外部查询的子查询。我们并不真正想要所有警报(上图),我们需要未确认的警报:警报中存在的标识符,在确认中不存在。这不是填充单元格,即改变外部集合的内容。当然,这意味着更改WHERE子句。

  • We are not changing the structure of the Outer set, so there is no change to the FROM and existing WHERE clauses.
  • 我们不会更改外部集的结构,因此FROM和现有的WHERE子句没有变化。

Simply add a WHERE condition to exclude the set of Acknowledged Alerts. 1::1 cardinality, straight Correlated join.

只需添加WHERE条件即可排除已确认的警报集。 1 :: 1基数,直相关联接。

The ▶SQL code◀ required is on page 5.

需要▶SQL代码◀是第5页。

The difference is, this is a non-Scalar Subquery, producing a set of rows (one column). We have an entire set of Alerts (the Outer set) matched against an entire set of Acknowledgements.

不同的是,这是一个非标量子查询,产生一组行(一列)。我们有一整套警报(外部集)与一整套致谢相匹配。

  • The matching is processed because we have told the engine that the Subquery is Correlated, by using an alias (no need for cumbersome joins to be identified)
  • 处理匹配是因为我们通过使用别名告知引擎子查询是相关的(不需要识别繁琐的连接)
  • Use 1, because we are performing an existence check. Visualise it as a column added onto the Alert set defined by the Outer query.
  • 使用1,因为我们正在执行存在检查。将其可视化为添加到外部查询定义的警报集上的列。
  • Never use * because we do not need the entire set of columns, and that will be slower
  • 永远不要使用*,因为我们不需要整套列,而且速度会慢一些
  • Likewise, failing to use a correlation, means a WHERE NOT IN () is required, but again, that constructs the defined column set, then compares the two sets. Much slower.
  • 同样,未使用相关性意味着需要WHERE NOT IN(),但同样,构造定义的列集,然后比较两个集合。慢得多。

Subquery (5) Actioned Alerts

子查询(5)操作警报

As an alternative constraint on the Outer query, for un-actioned Alerts, instead of (4), exclude the set of Actioned Alerts. Straight Correlated join.

作为外部查询的替代约束,对于未操作的警报,而不是(4),排除一组操作警报。直接相关联接。

The ▶SQL code◀ required is on page 5.

需要▶SQL代码◀是第5页。

This code has been tested on Sybase ASE 15.0.3 using 1000 Alerts and 200 Acknowledgements, of different combinations; and the Readings and Alerts identified in the document. Zero milliseconds execution time (0.003 second resolution) for all executions.

此代码已在Sybase ASE 15.0.3上使用1000个警报和200个已确认的不同组合进行测试;以及文件中确定的读数和警报。所有执行的零毫秒执行时间(0.003秒分辨率)。

If you need it, here is the ▶SQL Code in Text Format◀.

如果需要,可以使用文本格式的▶SQL代码◀。

Response to Comments

(6) ▶Register Alert from Reading◀
This code executes in a loop (provided), selecting new Readings which are out-of-range, and creating Alerts, except where applicable Alerts already exist.

(6)▶从阅读中注册警报◀此代码在循环(提供)中执行,选择超出范围的新读数,并创建警报,除非适用的警报已存在。

(7) ▶Load Alert From Reading◀
Given that you have a full set of test data for Reading, this code uses a modified form of (6) to load the applicable Alerts.

(7)▶从阅读中加载警报◀鉴于您有一整套用于阅读的测试数据,此代码使用修改后的(6)形式加载适用的警报。

Common Problem

It is "simple" when you know how. I repeat, writing SQL without the ability to write Subqueries is very limiting; it is essential for handling Relational Databases, which is what SQL was designed for.

当你知道如何时,它是“简单的”。我再说一遍,编写没有编写子查询能力的SQL是非常有限的;它对于处理关系数据库至关重要,这是SQL的设计目标。

  • Half the reason developers implement unnormalised data heaps (massive data duplication) is because they cannot write the subqueries required for Normalised structures
    • it is not that they have "denormalised for performance"; it is that they cannot code for Normalised. I have seen it a hundred times.
    • 并不是说他们“为表现而非正常化”;它们无法为Normalized编码。我已经看过它一百次了。
    • Case in point here: you have a fully Normalised Relational Database, and the difficulty is coding for it, and you were contemplating duplicating tables for processing purposes.
    • 举个例子:你有一个完全规范化的关系数据库,困难就是为它编码,你正在考虑复制表以便进行处理。
  • 开发人员实现非标准化数据堆(大规模数据复制)的一半原因是因为它们无法编写规范化结构所需的子查询,因此它们并非“为了性能而非规范化”;它们无法为Normalized编码。我已经看过它一百次了。举个例子:你有一个完全规范化的关系数据库,困难就是为它编码,你正在考虑复制表以便进行处理。
  • And that is not counting the added complexity of a temporal database; or a 5NF temporal database.
  • 这还不包括时间数据库增加的复杂性;或5NF时态数据库。
  • Normalisation means Never Duplicate Anything, more recently known as Don't Repeat Yourself
  • 规范化意味着永远不要复制任何东西,最近被称为“不要重复自己”
  • Master Suqueries and you will be in the 98th percentile: Normalised, true Relational Databases; zero data duplication; very high performance.
  • Master Suqueries,你将处于第98个百分点:规范化,真实的关系数据库;零数据重复;非常高的性能。

I think you can figure out the remaining queries you have.

我想你可以找出你剩下的查询。

Relational Identifier

Note, this example also happens to demonstrate the power of using Relational Identifiers, in that several tables in-between the ones we want do not have to be joined (yes! the truth is Relational Identifiers means less, not more, joins, than Id keys). Simply follow the solid lines.

注意,这个例子也恰好证明了使用关系标识符的能力,因为我们想要的几个表之间不必连接(是的!事实是关系标识符意味着更少,而不是更多,连接,而不是Id键)。只需按照实线。

  • Your temporal requirement demands keys containing DateTime. Imagine trying to code the above with Id PKs, there would be two levels of processing: one for the joins (and there would be far more of them), and another for the data processing.
  • 您的时间要求需要包含DateTime的键。想象一下尝试用Id PK编写上面的代码,会有两个级别的处理:一个用于连接(并且会有更多的连接),另一个用于数据处理。

Label

I try to stay away from colloquial labels ("nested", "inner", etc) because they are not specific, and stick to specific technical terms. For completeness and understanding:

我试图远离口语标签(“嵌套”,“内部”等)因为它们不具体,并坚持特定的技术术语。为了完整和理解:

  • a Subquery after the FROM clause, is a Materialised View, a result set derived in one query and then fed into the FROM clause of another query, as a "table".
    • The Oracle types call this Inline View.
    • Oracle类型调用此内联视图。
    • In most cases, you can write Correlated Subqueries as Materialised Views, but that is massively more I/O and processing (since Oracles handling of subqueries is abyssmal, for Oracle only, Materialised Views are "faster").
      .
    • 在大多数情况下,您可以将相关子查询编写为物化视图,但这大大增加了I / O和处理(因为Oracles处理子查询非常简单,仅对于Oracle,物化视图“更快”)。 。
  • FROM子句之后的子查询,是一个物化视图,一个查询中派生的结果集,然后作为“表”输入另一个查询的FROM子句。 Oracle类型调用此内联视图。在大多数情况下,您可以将相关子查询编写为物化视图,但这大大增加了I / O和处理(因为Oracles处理子查询非常简单,仅对于Oracle,物化视图“更快”)。 。
  • A Subquery in the WHERE clause is a Predicate Subquery, because it changes the content of the result set (that which it is predicated upon). It can return either a Scalar (one value) or non-Scalar (many values).

    WHERE子句中的子查询是谓词子查询,因为它更改了结果集的内容(它所基于的内容)。它可以返回标量(一个值)或非标量(多个值)。

    • for Scalars, use WHERE column =, or any scalar operator

      对于Scalars,请使用WHERE column =或任何标量运算符

    • for non-Scalars, use WHERE [NOT] EXISTS, or WHERE column [NOT] IN

      对于非Scalars,使用WHERE [NOT] EXISTS或WHERE列[NOT] IN

  • A Suquery in the WHERE clause does not need to be Correlated; the following works just fine. Identify all superfluous appendages:

    WHERE子句中的Suquery不需要相关;以下工作正常。识别所有多余的附属物:

    SELECT  [Never] = FirstName,
            [Acted] = LastName 
        FROM User 
        WHERE UserId NOT IN ( SELECT DISTINCT UserId
            FROM Action
            )

#2


1  

Try this:

尝试这个:

SELECT MIN(TableDTM) TableDTM, Code
FROM
(
    SELECT T1.TableDTM, T1.Code, MIN(T2.TableDTM) XTableDTM
    FROM T T1
    LEFT JOIN T T2
    ON T1.TableDTM <= T2.TableDTM
    AND T1.Code <> T2.Code
    GROUP BY T1.TableDTM, T1.Code
) X
GROUP BY XTableDTM, Code
ORDER BY 1;

#3


1  

PostgreSQL supports window functions, have a look at this

PostgreSQL支持窗口函数,看看这个

[EDIT] Try the following:

[编辑]尝试以下内容:

SELECT TableDTM, Code FROM
(
    SELECT TableDTM,
           Code,
           LAG(Code, 1, NULL) OVER (ORDER BY TableDTM) AS PrevCode
    FROM   T
)
WHERE PrevCode<>Code OR PrevCode IS NULL;

#4


0  

could you try something like

你可以尝试一下吗?

"SELECT DISTINCT Code, (SELECT MIN(TableDTM) FROM T AS Q WHERE Q.Code = T.Code) As TableDTM FROM T;"

and if you need to exclude the 0, change it in:

如果您需要排除0,请将其更改为:

 SELECT DISTINCT Code, (SELECT MIN(TableDTM) FROM T AS Q WHERE Q.Code = T.Code) As TableDTM FROM T WHERE Code <> 0;