What is the best way to implement this SQL query?

I have a PRODUCTS table, and each product can have multiple attributes so I have an ATTRIBUTES table, and another table called ATTRIBPRODUCTS which sits in the middle. The attributes are grouped into classes (type, brand, material, colour, etc), so people might want a product of a particular type, from a certain brand.

PRODUCTS
product_id
product_name

ATTRIBUTES
attribute_id
attribute_name
attribute_class

ATTRIBPRODUCTS
attribute_id
product_id

When someone is looking for a product they can select one or many of the attributes. The problem I'm having is returning a single product that has multiple attributes. This should be really simple I know but SQL really isn't my thing and past a certain point I get a bit lost in the logic. The problem is I'm trying to check each attribute class separately so I want to end up with something like:

SELECT DISTINCT products.product_id
FROM         attribproducts 
INNER JOIN products ON attribproducts.product_id = products.product_id
WHERE     (attribproducts.attribute_id IN (9,10,11)
AND        attribproducts.attribute_id IN (60,61))

I've used IN to separate the blocks of attributes of different classes, so that I end up with the products which are of certain types, but also of certain brands. From the results I've had, it seems to be the AND between the IN clauses that's causing the problem.

Can anyone help a little? I don't have the luxury of completely refactoring the database unfortunately, there is a lot more to it than this bit, so any suggestions how to work with what I have will be gratefully received.

6 Answers

#1

Take a look at the answers to the question "SQL: Many-To-Many table AND query". It's exactly the same problem. Cletus gave two possible solutions there, neither of which is trivial (but then again, there simply is no trivial solution).

#2

SELECT DISTINCT p.product_id
FROM products p
INNER JOIN attribproducts ptype on p.product_id = ptype.product_id
INNER JOIN attribproducts pbrand on p.product_id = pbrand.product_id 
WHERE ptype.attribute_id IN (9,10,11) 
    AND pbrand.attribute_id IN (60,61)

#3

Try this:

select * from products p, attribproducts a1, attribproducts a2
  where p.product_id = a1.product_id
    and p.product_id = a2.product_id
    and a1.attribute_id in (9,10,11)
    and a2.attribute_id in (60,61);

#4

This will return no rows because the WHERE clause is tested against one row at a time, so you're only matching rows whose attribute_id is (either 9, 10 or 11) AND (either 60 or 61).

Because those sets don't intersect, you'll get no rows.

If you use OR instead, it'll give products with attributes that are in the set 9, 10, 11, 60, 61, which isn't what you want either, although you'll then get multiple rows for each product.
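
To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration (they are not from the question), and only the ATTRIBPRODUCTS table is modelled:

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribproducts (attribute_id INTEGER, product_id INTEGER);
INSERT INTO attribproducts VALUES
  (9, 1), (60, 1),   -- product 1: a wanted type and a wanted brand
  (10, 2), (62, 2),  -- product 2: a wanted type, but some other brand
  (61, 3);           -- product 3: a wanted brand only
""")

# AND: a single row has a single attribute_id, so it can never be in both lists.
and_rows = conn.execute("""
SELECT product_id FROM attribproducts
WHERE attribute_id IN (9, 10, 11) AND attribute_id IN (60, 61)
""").fetchall()

# OR: every row that is in either list matches, one result row per attribute.
or_rows = conn.execute("""
SELECT product_id FROM attribproducts
WHERE attribute_id IN (9, 10, 11) OR attribute_id IN (60, 61)
ORDER BY product_id
""").fetchall()

print(and_rows)  # []
print(or_rows)   # [(1,), (1,), (2,), (3,)]
```

Product 1 appears twice in the OR result, and products 2 and 3 appear even though each satisfies only one of the two classes.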

You could use that select as a subquery in a GROUP BY statement, grouping by product and ordering the groups by the number of matching attributes. That will give you the best matches first.
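
One possible reading of the GROUP BY idea, sketched with Python's sqlite3 module and invented sample rows. The first query ranks products by how many of the selected attributes they carry. The second adds a HAVING clause, which is my own addition rather than something this answer spells out, to keep only products that match at least one attribute from each list:

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribproducts (attribute_id INTEGER, product_id INTEGER);
INSERT INTO attribproducts VALUES
  (9, 1), (60, 1),   -- product 1: type + brand
  (10, 2), (62, 2),  -- product 2: type, unwanted brand
  (61, 3),           -- product 3: brand only
  (9, 4), (10, 4);   -- product 4: two types, no brand
""")

# Rank products by the number of selected attributes they have.
ranked = conn.execute("""
SELECT product_id, COUNT(*) AS matches
FROM attribproducts
WHERE attribute_id IN (9, 10, 11, 60, 61)
GROUP BY product_id
ORDER BY matches DESC, product_id
""").fetchall()

# Stricter variant: require at least one hit in each attribute list.
strict = conn.execute("""
SELECT product_id
FROM attribproducts
WHERE attribute_id IN (9, 10, 11, 60, 61)
GROUP BY product_id
HAVING SUM(CASE WHEN attribute_id IN (9, 10, 11) THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN attribute_id IN (60, 61) THEN 1 ELSE 0 END) > 0
ORDER BY product_id
""").fetchall()

print(ranked)  # [(1, 2), (4, 2), (2, 1), (3, 1)]
print(strict)  # [(1,)]
```

Note that the match count alone does not guarantee one attribute per class: product 4 scores 2 with two types and no brand, which is why the stricter HAVING variant may be the safer choice.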

Alternatively (as another answer shows), you could join with a new copy of the table for each attribute set, giving you only those products that match all attribute sets.
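
A runnable sketch of that join-per-attribute-set approach, again using Python's sqlite3 module. The table and column names follow the question; the sample rows and product names are made up for the demonstration:

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE attribproducts (attribute_id INTEGER, product_id INTEGER);
INSERT INTO products VALUES
  (1, 'Product A'), (2, 'Product B'), (3, 'Product C'), (4, 'Product D');
INSERT INTO attribproducts VALUES
  (9, 1), (10, 1), (60, 1),  -- product 1: two wanted types and a wanted brand
  (10, 2), (62, 2),          -- product 2: wanted type, other brand
  (61, 3),                   -- product 3: wanted brand only
  (11, 4), (61, 4);          -- product 4: wanted type and wanted brand
""")

# One copy of attribproducts per attribute set; each copy must find a match.
JOIN_SQL = """
FROM products p
INNER JOIN attribproducts ptype  ON p.product_id = ptype.product_id
INNER JOIN attribproducts pbrand ON p.product_id = pbrand.product_id
WHERE ptype.attribute_id IN (9, 10, 11)
  AND pbrand.attribute_id IN (60, 61)
ORDER BY p.product_id
"""

# Without DISTINCT, a product yields one row per matching combination.
raw = conn.execute("SELECT p.product_id " + JOIN_SQL).fetchall()
matches = conn.execute("SELECT DISTINCT p.product_id " + JOIN_SQL).fetchall()

print(raw)      # [(1,), (1,), (4,)]
print(matches)  # [(1,), (4,)]
```

Each additional attribute class the user filters on needs one more joined copy of attribproducts, so the query text would have to be built dynamically.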

#5

It sounds like you have a data schema that is GREAT for storage but terrible for selecting/reporting. When you have a data structure of OBJECT, ATTRIBUTE, OBJECT-ATTRIBUTE and OBJECT-ATTRIBUTE-VALUE you can store many objects with many different attributes per object. This is sometimes referred to as "Vertical Storage".

However, when you want to retrieve a list of objects with all of their attribute values, you have to make a variable number of joins. It is much easier to retrieve data when it is stored horizontally (in defined columns of data).

I have run into this scenario several times. Since you cannot change the existing data structure, my suggestion would be to write a "layer" of tables on top. Dynamically create a table for each object/product you have. Then dynamically create static columns in those new tables for each attribute. Pretty much, you need to "flatten" your vertically stored attributes/values into static columns, converting from a vertical architecture into a horizontal one.

Use the "flattened" tables for reporting, and use the vertical tables for storage.
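
As a rough sketch of what such a flattening layer might look like, the following uses Python's sqlite3 module. It is a simplification of the idea above, not Mark's actual implementation: it builds a single hypothetical products_flat table with one column per attribute class, and it assumes each product has at most one attribute per class (MAX picks one arbitrarily otherwise). The sample rows are invented.

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE attributes (attribute_id INTEGER PRIMARY KEY,
                         attribute_name TEXT, attribute_class TEXT);
CREATE TABLE attribproducts (attribute_id INTEGER, product_id INTEGER);
INSERT INTO products VALUES (1, 'Oak chair'), (2, 'Steel lamp');
INSERT INTO attributes VALUES
  (9, 'chair', 'type'), (12, 'lamp', 'type'),
  (60, 'Acme', 'brand'), (61, 'Bolt', 'brand');
INSERT INTO attribproducts VALUES (9, 1), (60, 1), (12, 2), (61, 2);
""")

def quote_ident(name):
    """Quote a value for use as an SQL identifier (column name)."""
    return '"' + name.replace('"', '""') + '"'

# Discover the attribute classes; each becomes a static column.
classes = [row[0] for row in conn.execute(
    "SELECT DISTINCT attribute_class FROM attributes ORDER BY attribute_class")]

column_defs = ", ".join(f"{quote_ident(c)} TEXT" for c in classes)
conn.execute(
    "CREATE TABLE products_flat "
    f"(product_id INTEGER PRIMARY KEY, product_name TEXT, {column_defs})")

# Pivot the vertical rows into one row per product.
pivot_exprs = ", ".join(
    "MAX(CASE WHEN a.attribute_class = ? THEN a.attribute_name END)"
    for _ in classes)
conn.execute(f"""
INSERT INTO products_flat
SELECT p.product_id, p.product_name, {pivot_exprs}
FROM products p
LEFT JOIN attribproducts ap ON ap.product_id = p.product_id
LEFT JOIN attributes a ON a.attribute_id = ap.attribute_id
GROUP BY p.product_id, p.product_name
""", classes)

flat_rows = conn.execute(
    "SELECT * FROM products_flat ORDER BY product_id").fetchall()
acme_chairs = conn.execute(
    """SELECT product_name FROM products_flat
       WHERE "type" = 'chair' AND "brand" = 'Acme'""").fetchall()

print(flat_rows)    # [(1, 'Oak chair', 'Acme', 'chair'), (2, 'Steel lamp', 'Bolt', 'lamp')]
print(acme_chairs)  # [('Oak chair',)]
```

With the flattened table, the "type AND brand" search becomes an ordinary WHERE clause on two columns. The flat table would need to be rebuilt or kept in sync whenever the vertical tables change.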

If you need sample code or more details, just ask me.

I hope this is clear. I have not had much coffee yet :)

Thanks, - Mark

#6

You can use multiple inner joins -- I think this would work:

select distinct p.product_id
from products p
inner join attribproducts a1 on a1.product_id=p.product_id
inner join attribproducts a2 on a2.product_id=p.product_id
where a1.attribute_id in (9,10,11) 
  and a2.attribute_id in (60,61)
