SQL: selecting the middle third of all values in a column in PostgreSQL

Time: 2021-05-08 07:57:00

Suppose I have a column of heights. How can I select all and only those height values that are in neither the top 30% nor the bottom 30% of values?

UPDATE:

I'd like the answer for PostgreSQL (or, failing that, MySQL -- I'm using Rails).

5 solutions

#1


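WITH cte AS (
 -- "column" and "table" are placeholders for the real column and table names
 SELECT *, NTILE(100) OVER (ORDER BY column) AS rank
 FROM table)
-- buckets 1-30 are the bottom 30% and buckets 71-100 the top 30%
SELECT * FROM cte WHERE rank BETWEEN 31 AND 70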
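As a rough check of the bucket arithmetic, here is a minimal sketch that runs the same kind of NTILE(100) query through Python's built-in sqlite3 module. SQLite (3.25 or later, for window functions) only stands in for PostgreSQL here, and the table name `people`, the alias `bucket`, and the helper `middle_heights` are assumptions made for illustration. With NTILE(100), buckets 1 to 30 hold the bottom 30% and buckets 71 to 100 the top 30%, so the middle is buckets 31 through 70.

```python
import sqlite3


def middle_heights(heights):
    """Return the heights that fall in NTILE(100) buckets 31..70."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, created only for this illustration.
    conn.execute("CREATE TABLE people (height INTEGER)")
    conn.executemany(
        "INSERT INTO people (height) VALUES (?)", [(h,) for h in heights]
    )
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT height, NTILE(100) OVER (ORDER BY height) AS bucket
            FROM people
        )
        SELECT height FROM cte
        WHERE bucket BETWEEN 31 AND 70
        ORDER BY height
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# 100 distinct heights (101..200), so every row gets its own bucket.
result = middle_heights(range(101, 201))
print(len(result), result[0], result[-1])
```

Note that with fewer than 100 rows NTILE(100) only numbers buckets 1 to n, so the 31 to 70 filter no longer lines up with percentiles; this sketch assumes the table is large enough for that not to matter.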

#2


For SQL Server 2005 +

SELECT
    *
FROM
    MyTable M
EXCEPT
SELECT
    *
FROM
    (SELECT TOP 30 PERCENT
        *
    FROM
        MyTable M
    ORDER BY
        Height
    UNION ALL
    SELECT TOP 30 PERCENT
        *
    FROM
        MyTable M
    ORDER BY
        Height DESC) foo

#3


For SQL Server 2005+, you should use the NTILE() function for this.

SELECT *
FROM   (
         SELECT ntile(3) over(order by AddressId) as Percentile, *
         FROM   (
                SELECT top 100 *
                FROM   Person.Address
           ) t
       ) t
where Percentile = 2

#4


You're asking for PostgreSQL, and that doesn't support NTILE or TOP X PERCENT.

Without either of those, I can think of a query like this to retrieve the middle rows:

select *
from MyTable
where Height not in (
    -- each branch is parenthesized so that its own order by / limit applies to it
    (select Height from MyTable order by Height desc
     limit ((select count(*) from MyTable)*0.3))
    union
    (select Height from MyTable order by Height
     limit ((select count(*) from MyTable)*0.3))
)

Now, I'm not sure whether PostgreSQL supports a limit calculated in a subquery, and I don't have a PostgreSQL database at hand to try it on.

#5


Postgres only accepts constants in the LIMIT clause, so the solution above does not work.

Your select is something like this:

SELECT *
  FROM (SELECT T.HEIGHT,
               -- this gives the "ranking" of each row
               -- by counting all the heights that are smaller than
               -- the height in that row
               (SELECT COUNT(*) + 1
                  FROM <table> T1
                 WHERE T1.HEIGHT < T.HEIGHT
               ) AS RANK,
               -- this gives the count of rows in the table
               (SELECT COUNT(*)
                  FROM <table> T1
               ) AS REC_COUNT
          FROM <table> T
         ORDER BY T.HEIGHT
       ) T
 -- now list only the rows whose ranking is above the bottom 30% and not in the top 30%
 WHERE T.RANK > (T.REC_COUNT*0.30) AND T.RANK <= (T.REC_COUNT*0.70)

This should work in any database that supports subselects (subqueries).

This does not handle ties in "heights", but that could be done using the primary key:

SELECT COUNT(*) + 1
  FROM <table> T1 
 WHERE (T1.HEIGHT < T.HEIGHT)
    OR (T1.HEIGHT = T.HEIGHT and T1.PK_FIELD < T.PK_FIELD)

Regards.
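To see how the counting approach from answer #5 behaves when heights are tied, here is a small sketch that runs it through Python's built-in sqlite3 module, again only as a stand-in for PostgreSQL. The table `people`, its `id` primary key, the alias `pos`, and the helper `middle_by_counting` are all hypothetical names chosen for the example. The bounds are written with integer arithmetic (`pos * 10 > rec_count * 3`) rather than `* 0.30`, which is an assumption of this sketch made to keep floating-point rounding out of the comparison.

```python
import sqlite3


def middle_by_counting(heights):
    """Rank rows by counting smaller rows, breaking ties on the primary key."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; id is filled in automatically in insertion order.
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, height INTEGER)")
    conn.executemany(
        "INSERT INTO people (height) VALUES (?)", [(h,) for h in heights]
    )
    rows = conn.execute(
        """
        SELECT height
        FROM (
            SELECT t.height,
                   -- position of the row: smaller heights come first,
                   -- equal heights are ordered by id
                   (SELECT COUNT(*) + 1
                      FROM people t1
                     WHERE t1.height < t.height
                        OR (t1.height = t.height AND t1.id < t.id)) AS pos,
                   (SELECT COUNT(*) FROM people) AS rec_count
            FROM people t
        ) ranked
        -- keep positions above the bottom 30% and at or below the 70% mark
        WHERE pos * 10 > rec_count * 3
          AND pos * 10 <= rec_count * 7
        ORDER BY pos
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Ten rows with ties: positions 4..7 are the middle 40%.
result = middle_by_counting([150, 160, 160, 170, 170, 170, 180, 180, 190, 200])
print(result)
```

Because the tie-breaker gives every row a distinct position, exactly four of the ten rows are returned even though some heights repeat; without it, tied rows would share a position and could all fall on the same side of a boundary.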
