SQL/MySQL: select DISTINCT/UNIQUE but return all columns?

Date: 2021-06-16 04:26:22
SELECT DISTINCT field1, field2, field3, ......   FROM table

I am trying to write the following SQL statement, but I want it to return all columns. Is this possible? Something like:

SELECT DISTINCT field1, * from table

15 answers

#1


309  

You're looking for a GROUP BY:

select *
from table
group by field1

This can occasionally be written with a DISTINCT ON clause (a PostgreSQL extension):

select distinct on (field1) *
from table

On most platforms, however, neither of the above will work, because the behavior on the other columns is unspecified. (The first works in MySQL, if that's what you're using, provided the ONLY_FULL_GROUP_BY SQL mode is off; it is on by default since MySQL 5.7.5.)

You could fetch the distinct fields and stick to picking a single arbitrary row each time.


On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:


-- number the rows inside each field1 group, then keep the first one
select *
from (
   select t.*,
          row_number() over (partition by field1 order by field2) as rn
   from table t
   ) ranked
where rn = 1

On platforms without window functions (for example MySQL before 8.0 or SQLite before 3.25), you'll need to write subqueries that join the entire table with itself, so this is not recommended there.
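For reference, here is a minimal sketch of that self-join, run through Python's built-in sqlite3 module so it can be tried without a database server. The items table and its rows are made up for illustration, and "first row" is assumed to mean the smallest field2 within each field1 group.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, field1 TEXT, field2 INTEGER);
    INSERT INTO items (field1, field2) VALUES
        ('a', 30), ('a', 10), ('b', 20), ('b', 25), ('c', 5);
""")

# Join every row to the smallest field2 of its field1 group.
# If field2 ties within a group, more than one row per group is returned.
first_per_group = conn.execute("""
    SELECT t.*
    FROM items AS t
    JOIN (
        SELECT field1, MIN(field2) AS min_field2
        FROM items
        GROUP BY field1
    ) AS m
      ON m.field1 = t.field1 AND m.min_field2 = t.field2
    ORDER BY t.field1
""").fetchall()

print(first_per_group)
```

Each distinct field1 comes back once, with all of its columns, as long as the ordering column has no ties inside a group.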

#2


44  

From the phrasing of your question, I understand that you want to select the distinct values for a given field and, for each such value, to have all the other column values in the same row listed. Most DBMSs will not allow this with either DISTINCT or GROUP BY, because the result is not determined.

Think of it like this: if a value of field1 occurs more than once, which value of field2 should be listed (given that two rows share the same field1 but have two distinct values of field2)?

You can, however, use aggregate functions (explicitly, for every field that you want to be shown) and use GROUP BY instead of DISTINCT:

SELECT field1, MAX(field2), COUNT(field3), SUM(field4), .... FROM table GROUP BY field1
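To see what such a query returns, here is a small sketch using Python's sqlite3 module. The table t and its rows are hypothetical; only the shape of the query comes from the answer above.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (field1 TEXT, field2 INTEGER, field3 TEXT, field4 INTEGER);
    INSERT INTO t VALUES
        ('a', 1, 'x', 10),
        ('a', 5, NULL, 20),
        ('b', 3, 'y', 7);
""")

# One output row per field1; every other column goes through an aggregate,
# so the result is fully determined.
summary = conn.execute("""
    SELECT field1, MAX(field2), COUNT(field3), SUM(field4)
    FROM t
    GROUP BY field1
    ORDER BY field1
""").fetchall()

print(summary)
```

Note that COUNT(field3) skips NULLs, so group 'a' reports 1 rather than 2.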

#3


15  

If I understood your problem correctly, it's similar to one I just had. You want to be able to limit DISTINCT to a specified field, rather than applying it to all the data.

If you use GROUP BY without an aggregate function, whichever field you GROUP BY will be your DISTINCT field.

If you write your query as:

SELECT * from table GROUP BY field1;

It will show your results with a single row for each distinct value of field1.

For example, suppose you have a table with name, address and city. If a single person has multiple addresses recorded but you want just one address per person, you can query as follows:

SELECT * FROM persons GROUP BY name;

The result will be that only one instance of that name appears, with one of its addresses, and the others are omitted from the resulting table. Caution: if your fields hold atomic values such as firstName and lastName, you want to group by both:

SELECT * FROM persons GROUP BY lastName, firstName;

because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to take those things into consideration. Hope this helps.
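A quick way to check the caution about lastName is to run both groupings side by side. This is only a sketch with Python's sqlite3 module; the persons rows are invented, and only the grouped columns are selected so the output does not depend on which row the engine happens to pick.

```python
import sqlite3

# Hypothetical data: Ann Smith has two addresses, Bob Smith has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (firstName TEXT, lastName TEXT, address TEXT);
    INSERT INTO persons VALUES
        ('Ann', 'Smith', '1 Oak St'),
        ('Ann', 'Smith', '9 Elm St'),
        ('Bob', 'Smith', '4 Pine Rd');
""")

# Grouping by lastName alone merges Ann and Bob into one group.
by_last = conn.execute(
    "SELECT lastName FROM persons GROUP BY lastName"
).fetchall()

# Grouping by both columns keeps one group per person.
by_both = conn.execute("""
    SELECT lastName, firstName
    FROM persons
    GROUP BY lastName, firstName
    ORDER BY lastName, firstName
""").fetchall()

print(len(by_last), len(by_both))
```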

#4


10  

SELECT  c2.field1 ,
        field2
FROM    (SELECT DISTINCT
                field1
         FROM   dbo.TABLE AS C
        ) AS c1
        JOIN dbo.TABLE AS c2 ON c1.field1 = c2.field1

#5


3  

Great question @aryaxt -- you can tell it was a great question because you asked it 5 years ago and I stumbled upon it today trying to find the answer!


I just tried to edit the accepted answer to include this, but in case my edit does not make it in:


If your table is not that large, and assuming your primary key is an auto-incrementing integer, you could do something like this:

SELECT
  table.*
FROM table
-- be able to take out dupes later
LEFT JOIN (
  SELECT field, MAX(id) AS id
  FROM table
  GROUP BY field
) AS noDupes ON noDupes.id = table.id
WHERE
  -- this will result in only the last instance being seen
  noDupes.id IS NOT NULL
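Here is a runnable sketch of the same idea using Python's sqlite3 module. The events table, its note column and the rows are hypothetical, and a plain JOIN is used since it keeps exactly the rows that the LEFT JOIN plus IS NOT NULL filter would keep.

```python
import sqlite3

# Hypothetical table with an auto-incrementing integer primary key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        field TEXT,
        note TEXT
    );
    INSERT INTO events (field, note) VALUES
        ('a', 'first a'), ('b', 'only b'), ('a', 'second a');
""")

# For each value of field, keep only the row with the highest id.
latest = conn.execute("""
    SELECT e.*
    FROM events AS e
    JOIN (
        SELECT field, MAX(id) AS id
        FROM events
        GROUP BY field
    ) AS no_dupes ON no_dupes.id = e.id
    ORDER BY e.id
""").fetchall()

print(latest)
```

Because id is unique, this variant never returns more than one row per value of field.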

#6


2  

You can do it with a WITH clause.


For example:


-- DISTINCT does not expose rowid, so pick one rowid per (a, b, c) group instead
WITH c AS (SELECT a, b, c, MIN(rowid) AS rid FROM tableName GROUP BY a, b, c)
SELECT r.* FROM tableName r JOIN c ON c.rid = r.rowid

This also allows you to select only the rows selected in the WITH clause's query.

#7


1  

For SQL Server you can use the dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...


with t as (
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
    select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
    select 
        *, 
        total_dr_rows = count(*) over(partition by dr)
    from (
        select 
            *, 
            dr = dense_rank() over(order by col1, col2, col3),
            dr_rn = row_number() over(partition by col1, col2, col3 order by other)
        from 
            t
    ) x
)

select * from tdr where total_dr_rows > 1

This takes a row count for each distinct combination of col1, col2, and col3.

#8


1  

That's a really good question. I have read some useful answers here already, but perhaps I can add a more precise explanation.

Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you have the following table 'locations'.

--country-- --city--
 France      Lyon
 Poland      Krakow
 France      Paris
 France      Marseille
 Italy       Milano

Now the query


SELECT country FROM locations
GROUP BY country

will result in:


--country--
 France
 Poland
 Italy

However, the following query


SELECT country, city FROM locations
GROUP BY country

...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?


In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare the alphabetical order of string values.


SELECT country, MAX(city) FROM locations
GROUP BY country

will result in:


--country-- --city--
 France      Paris
 Poland      Krakow
 Italy       Milano

or:


SELECT country, MIN(city) FROM locations
GROUP BY country

will result in:


--country-- --city--
 France      Lyon
 Poland      Krakow
 Italy       Milano
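The MAX() and MIN() results above can be reproduced with a short sketch in Python's sqlite3 module. It loads the same 'locations' rows shown earlier; the variable names are my own.

```python
import sqlite3

# The 'locations' table from the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (country TEXT, city TEXT);
    INSERT INTO locations VALUES
        ('France', 'Lyon'), ('Poland', 'Krakow'), ('France', 'Paris'),
        ('France', 'Marseille'), ('Italy', 'Milano');
""")

# MAX() and MIN() compare strings alphabetically, so each country gets
# its alphabetically last or first city.
max_city = dict(conn.execute(
    "SELECT country, MAX(city) FROM locations GROUP BY country"
))
min_city = dict(conn.execute(
    "SELECT country, MIN(city) FROM locations GROUP BY country"
))

print(max_city["France"], min_city["France"])  # Paris Lyon
```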

These functions are a good solution as long as you are fine with selecting your value from either end of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.

The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hand:

SELECT
     countrylist.*,
     (SELECT TOP 1 city
      FROM locations
      WHERE
           country = countrylist.country
           AND city LIKE 'M%'
      ORDER BY city   -- make the choice deterministic
     ) AS city
FROM
(SELECT country FROM locations
 GROUP BY country) countrylist

will result in:


--country-- --city--
 France      Marseille
 Poland      NULL
 Italy       Milano
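For anyone not on SQL Server, here is a sketch of the same correlated subquery run through Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 is used instead, and an ORDER BY is added inside the subquery so the pick is deterministic; both are my substitutions, not part of the query above.

```python
import sqlite3

# The 'locations' table from the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (country TEXT, city TEXT);
    INSERT INTO locations VALUES
        ('France', 'Lyon'), ('Poland', 'Krakow'), ('France', 'Paris'),
        ('France', 'Marseille'), ('Italy', 'Milano');
""")

# For each country, look up one city starting with 'M'.
# Countries without such a city get NULL (None in Python).
m_city = conn.execute("""
    SELECT countrylist.country,
           (SELECT city
            FROM locations
            WHERE country = countrylist.country
              AND city LIKE 'M%'
            ORDER BY city
            LIMIT 1) AS city
    FROM (SELECT country FROM locations GROUP BY country) AS countrylist
    ORDER BY countrylist.country
""").fetchall()

print(m_city)
```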

#9


0  

SELECT *
FROM tblname AS ex
GROUP BY duplicate_values
ORDER BY ex.VISITED_ON DESC
LIMIT 0, 30

The column in ORDER BY is just an example; you can also use the ID field there.

#10


0  

I would suggest using


SELECT  * from table where field1 in 
(
  select distinct field1 from table
)

This way, if the same value of field1 appears in multiple rows, all of those records will be returned.
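It is worth checking what this returns in practice. In the sketch below (Python's sqlite3 module, with a made-up table t), the IN filter gives back exactly the same rows as an unfiltered SELECT, assuming field1 contains no NULLs.

```python
import sqlite3

# Hypothetical table: the value 'a' appears in two rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, field1 TEXT);
    INSERT INTO t (field1) VALUES ('a'), ('a'), ('b');
""")

filtered = conn.execute("""
    SELECT * FROM t
    WHERE field1 IN (SELECT DISTINCT field1 FROM t)
    ORDER BY id
""").fetchall()
everything = conn.execute("SELECT * FROM t ORDER BY id").fetchall()

# Every row's field1 is in the distinct list, so nothing is filtered out.
print(filtered == everything)
```

So this form returns all records, duplicates included, rather than one row per distinct field1.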

#11


-1  

Add GROUP BY on the field you want to check for duplicates. Your query may look like:

SELECT field1, field2, field3, ......   FROM table GROUP BY field1

field1 will be checked to exclude duplicate records.

Or you may write the query like this:

SELECT *  FROM table GROUP BY field1

Duplicate records of field1 are excluded from the SELECT.

#12


-1  

It can be done with an inner query:

$query = "SELECT * 
            FROM (SELECT field
                FROM table
                ORDER BY id DESC) as rows               
            GROUP BY field";

#13


-2  

Just include all of your fields in the GROUP BY clause.


#14


-2  

SELECT DISTINCT FIELD1, FIELD2, FIELD3 FROM TABLE1 works if the values of all three columns are unique in the table.


If, for example, you have multiple identical values for first name, but the last name and other information in the selected columns are different, the record will be included in the result set.
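A small sketch of that behavior, using Python's sqlite3 module and an invented people table: DISTINCT compares the whole combination of selected columns, so only rows that match on every column collapse.

```python
import sqlite3

# Hypothetical data: two identical rows and one that differs in last name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT, city TEXT);
    INSERT INTO people VALUES
        ('Ann', 'Smith', 'Lyon'),
        ('Ann', 'Smith', 'Lyon'),
        ('Ann', 'Jones', 'Paris');
""")

# The two identical rows collapse into one; the Jones row stays
# even though its first_name repeats.
distinct_rows = conn.execute("""
    SELECT DISTINCT first_name, last_name, city
    FROM people
    ORDER BY last_name
""").fetchall()

print(distinct_rows)
```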

#15


-3  

SELECT * from table where field in (SELECT distinct field from table)
