如何使用Django中的QuerySet过滤最新的m2m对象

时间:2022-10-04 14:15:58

I have the following model:

我有以下模型:

class Customer(SomeInheritedModel):
    name = models.CharField(max_length=50)
    ...

class Account(SomeInheritedModel):
    customer = models.ForeignKey(Customer, related_name='accounts')
    ...

class Product(SomeInheritedModel):
    name = models.CharField(max_length=50)
    ...

class License(SomeInheritedModel):
    account = models.ForeignKey(Account)
    product = models.ForeignKey(Product)
    maintenance = models.ManyToManyField('Maintenance', related_name="maintenances")

class Maintenance(SomeInheritedModel):
    start_date = models.DateTimeField(null=True)
    expiration_date = models.DateTimeField(null=True)

Once a license maintenance is renewed, a new Maintenance object is created. This way I can track back to all Maintenance a particular License has had.

一旦许可证维护被更新,就会创建一个新的维护对象。这样我就可以追踪到所有的维护,一个特定的许可证。

Now I want to generate a report to show me all Customers which License are about to expire - based on their Maintenance expiry date. And I want only the latest Maintenance object a License has, because it is the latest sold. I don't want the others.

现在,我想生成一个报告,向我展示所有的客户,根据他们的维护有效期即将过期的许可证。而且我只想要一个许可证拥有的最新的维护对象,因为它是最新售出的。我不想要其他人。

I know I could achieve this with a QuerySet and a for loop, but that would be a bit costy to the server, given a have a lot of entries.

我知道我可以通过一个QuerySet和一个for循环来实现这一点,但是这对服务器来说有点麻烦,因为a有很多条目。

Is there a way to do this filtering through a QuerySet? Something like this:

是否有一种方法可以通过一个QuerySet来进行过滤?是这样的:

Customer.objects.filter(accounts__licenses__maintenances__expiry_date__last__range=(now().date(), one_month_into_future().date()))

.date Customer.objects.filter(accounts__licenses__maintenances__expiry_date__last__range =(现在()(),one_month_into_future().date()))

I know I can use __last in some ocasions, but that doesn't quite work if I have to specify something after that.

我知道我可以在一些ocasions中使用__last,但如果我必须在之后指定一些东西,那就不太管用了。

Edit

I found my answer through what @hynekcer suggested. You can use annotate.

我通过@hynekcer的建议找到了答案。您可以使用注释。

License.objects.filter(foo=True)
    .annotate(max_exp_date=models.Max('maintenances__expiration_date'))\
    .filter(max_exp_date__gte=report.start_date, max_exp_date__lte=report.end_date)\
    .select_related('account__customer')

3 个解决方案

#1


1  

tl;dr )

The current important answer is the EDIT at the end.

当前重要的答案是最后的编辑。

Yes, it is important to ignore all older related objects (maintenances), even those which are still valid, because there may exist a renewed maintenance.

是的,忽略所有旧的相关对象(维护)是很重要的,即使是那些仍然有效的对象,因为可能存在更新的维护。

I think that you simplified your real model so much, that it doesn't work good. You have two chained ManyToMany relations with anonymous (not explicit) relationship table. This makes impossible to write right query sets.

我认为你把你的真实模型简化了很多,以至于它不能正常工作。你有两个用匿名(不明确的)关系表锁住的。这样就不可能编写正确的查询集。

Bugs:

错误:

1) You use the same name ("licenses" and also "maintenances") for a field and its related_name. It is a nonsense because: docs

1)对字段及其related_name使用相同的名称(“许可证”和“维护”)。这是无稽之谈,因为:文档

related_name
The name to use for the relation from the related object back to this one. It’s also the default value for related_query_name (the name to use for the reverse filter name from the target model).

related_name从相关对象返回到此对象的关系要使用的名称。它也是related_query_name的默认值(用于目标模型的反向过滤器名)。

It is not useful that you will see a reverse queryset maintenances on object Maintenance to License. Similarly a queryset `licenses' on License to Customer. You can easily rename the related_name because it does't change the database and doesn't cause a migration.

您将看到关于对象维护的反向查询集维护许可,这是没有用的。类似地,向客户颁发许可证的查询集“许可证”。您可以轻松地重命名related_name,因为它不会更改数据库,也不会导致迁移。

2) Is the License a common or individual object? If it is individual then it doesn't need many-to-many relationship to Customer object. If it is common then you can't track a paid maintenance to individual customers through it. (You also don't mean that two customers are co-owners of one license! Are they? :-) You probably mean a common LicensedProduct and an individual License that connects the customer with that product. I understand that the user can buy one maintenace for more licenses and and many-to-many is here good.

2)许可是公共的还是单独的对象?如果是单独的,那么它不需要与客户对象的多对多关系。如果它是常见的,那么您就不能通过它来跟踪单个客户的付费维护。(你也不是说两个客户是一个许可证的共有人!他们是吗?:-)你可能指的是一个通用的被许可产品和一个连接客户和该产品的单独的许可。我理解用户可以购买一个维护来获得更多的许可,多对多是好的。

First I fix the model (some way I guess before I can ask you)

首先我修正了模型(在我问你之前,我想是这样)

class Customer(SomeInheritedModel):
    # "licenses" is the reverse query to License
    # optionally you can enable many-to-many relation to licensed products
    # lic_products = models.ManyToManyField(

class Product(models.Model):
    pass  # licensed product details

class License(SomeInheritedModel):
    customer = models.ForeignKey(Customer, related_name='licenses')
    product = models.ForeignKey(Product, related_name='+')  # used '+' because not deeded
    maintenances = models.ManyToManyField(
        Maintenance,
        through='LicenseMaintenance',
        through_fields=('license', 'maintenance'),
        related_name='licenses')

class Maintenance(SomeInheritedModel):
    start_date = DateTimeField(null=True)
    expiration_date = DateTimeField(null=True)

class LicenseMaintenance(models.Model):
    license = models.ForeignKey(License, on_delete=models.CASCADE)
    maintenance = models.ForeignKey(Maintenance, on_delete=models.CASCADE)

querysets: (can be simplified much by removing order_by and related fields)

querysets:(可以通过删除order_by和相关字段来简化)

remind_start = datetime.datetime.now(tz=TIMEZONE)
remind_end = remind_start + datetime.timedelta(days=30)

expiring_lic_maintenances = (
    LicenseMaintenance.objects.values('license',
                                      'license__customer',
                                      'license__customer__name')
    .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
    .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
    .order_by('license__customer__name', 'license__customer', 'license')
)   # some small detail can be used like e.g. customer name in the example, not used later

expiring_licenses = (
    License.objects.filter(
        license__in=expiring_lic_maintenances.values_list('license', flat=True))
    .select_related('customer', 'product')
    .order_by('license__customer__name', 'license__customer', 'license')
)   # that queryset with subquery is executed by one SQL command

Not more than two SQL requests are executed by running these querysets:

运行这些queryset不会执行超过两个SQL请求:

# but I prefer a simple map and queryset with subquery:
expiration_map = {x.license_id: x.max_exp_date for x in expiring_lic_maintenances}


for lic in expiring_licenses:
    print("{name}, your maintenance for {lic_name} is expiring on {exp_date}".format(
        name=lic.customer.name,
        lic_name=lic.product.name,
        exp_date=expiration_map[lic.id],
    ))

I expect, that it is a new project and you don't need migrations yet for modified models. I wrote similar code so many times enough that I didn't verified it now. A mistake could happen and you can inform me sufficient time before the end of bounty easily.

我希望,这是一个新的项目,您还不需要对修改后的模型进行迁移。我写了很多类似的代码,所以现在还没有验证。一个错误可能会发生,你可以在赏金结束前告诉我足够的时间。


EDIT after edited question:
Aggregation functions work correctly on many-to-many fields without explicit model for join table in the current Django version:

编辑已编辑的问题:在当前Django版本中,聚合函数可以在多对多字段上正确地工作,而不需要显式的连接表模型:

>>> expiring = (
...     License.objects.values('id',
...                            'account__customer',
...                            'account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__gte=remind_start, max_exp_date__lte=remind_end)
... )

and look at compiled SQL:

看看编译后的SQL:

>>> str(expiring.query)
SELECT app_license.id, app_account.customer_id, app_customer.name, MAX(app_maintenance.expiration_date) AS max_exp_date
    FROM app_license INNER JOIN app_account ON (app_license.account_id = app_account.id)
    INNER JOIN app_customer ON (app_account.customer_id = app_customer.id)
    LEFT OUTER JOIN app_license_maintenance ON (app_license.id = app_license_maintenance.license_id)
    LEFT OUTER JOIN app_maintenance ON (app_license_maintenance.maintenance_id = app_maintenance.id)
    GROUP BY app_license.id, app_account.customer_id, app_customer.name
    HAVING (MAX(app_maintenance.expiration_date) >= 2017-04-07T13:45:35.485755 AND
            MAX(app_maintenance.expiration_date) <= 2017-03-08T13:45:35.485755
            )

In generally this is compiled by two outer joins.

通常,这是由两个外部连接编译的。


If you find a more complicated case, where it doesn't work or the query is slow because it is more complicated for some database engines to optimize with outer joins, you can everytimes get the implicit model and run the query on it because it is the top model in the relation hierarchy:

如果你找到一个更复杂的情况下,它不工作或查询是缓慢的,因为它是更复杂的数据库引擎优化与外部连接,你每次可以隐式模型和运行查询它,因为它是*模型层次结构的关系:

we can explore the implicit intermediate model of the table:

我们可以探索表的隐式中间模型:

>>> License.maintenance.through
app.models.License_maintenance
>>> LicenseMaintenance = License.maintenance.through
>>> LicenseMaintenance._meta.fields
(<django.db.models.fields.AutoField: id>,
 <django.db.models.fields.related.ForeignKey: license>,
 <django.db.models.fields.related.ForeignKey: maintenance>)

and use it: (all joins are automatically compiled to inner joins)

并使用它:(所有连接都自动编译为内部连接)

>>> expiring = (
...     LicenseMaintenance.objects.values('license',
...                                       'license__account__customer',
...                                       'license__account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
... )

#2


1  

In this situation you have 2 options: First is to use prefetch_related:

在这种情况下,您有两个选项:首先是使用prefetch_related:

from django.db.models import Prefetch

now = timezone.now()
maintenance_qs = Maintenance.objects.filter(expiry_date__lte=now).order_by('-expire_date')
license_qs = License.objects.filter(maintenances__expiry_date__lte=now).\
    prefetch_related(
        Prefetch('maintenances', queryset=maintenance_qs)
    ).order_by(-'maintenances__expiry_date')
customers = Customer.objects.prefetch_related(Prefetch('licenses', queryset=license_qs))

It will hit database 3 times, you can read more about prefetch_related and Prefetch object. It will return all licenses and all maintenance but it will be sorted and you can take only 1 item. You can use it like this.

它将访问数据库3次,您可以阅读更多关于prefetch_related和Prefetch对象的信息。它将返回所有的许可证和所有的维护,但是它将被排序并且你只能取1个项目。你可以这样使用它。

for customer in customers:
    last_license = customer.licenses.all()[0]
    last_maintenance = last_license.maintenances.all()[0]

Or you can try to use raw SQL. Your query look like:

或者您可以尝试使用原始SQL。您的查询看起来像:

customers = Customer.objects.raw(
'''
SELECT * FROM (
    SELECT "yourapp_customer"."id", 
           "yourapp_license"."id", 
           "yourapp_maintenance"."id",
           "yourapp_maintanance"."start_date",
           "yourapp_maintanance"."expiration_date",
           MAX("yourapp_maintanance"."expiration_date") over (partition by "yourapp_customer"."id") as last_expired
    FROM "yourapp_customer"
    LEFT OUTER JOIN "yourapp_customer_licenses" ON
        "yourapp_customer"."id" = "yourapp_customer_licenses"."customer_id"
    LEFT OUTER JOIN "yourapp_license" ON
        "yourapp_license"."id" = "yourapp_customer_licenses"."license_id"
    LEFT OUTER JOIN "yourapp_license_maintenances" ON
        "yourapp_license"."id" = "yourapp_license_maintenances"."license_id"
    LEFT OUTER JOIN "yourapp_maintanance" ON
        "yourapp_maintanance"."id" = "yourapp_license_maintenances"."maintanance_id"
    WHERE "yourapp_maintanance"."expiration_date" < NOW()
) AS T
where expiration_date = last_expired
'''
)

It should work much faster, but using this query you can't construct License and Maintenance objects. All properties will be stored in Customer model. You can read more about window functions

它的工作速度应该快得多,但是使用这个查询,您不能构造许可证和维护对象。所有属性都将存储在Customer模型中。您可以阅读有关窗口函数的更多信息

#3


-1  

Use models.Model, not simple object... For example

使用模型。模型,而不是简单的对象……例如

class Customer(models.Model):
    licenses = models.ManyToManyField(License, related_name="licenses")

then run

然后运行

manage.py makemigration

#1


1  

tl;dr )

The current important answer is the EDIT at the end.

当前重要的答案是最后的编辑。

Yes, it is important to ignore all older related objects (maintenances), even those which are still valid, because there may exist a renewed maintenance.

是的,忽略所有旧的相关对象(维护)是很重要的,即使是那些仍然有效的对象,因为可能存在更新的维护。

I think that you simplified your real model so much, that it doesn't work good. You have two chained ManyToMany relations with anonymous (not explicit) relationship table. This makes impossible to write right query sets.

我认为你把你的真实模型简化了很多,以至于它不能正常工作。你有两个用匿名(不明确的)关系表锁住的。这样就不可能编写正确的查询集。

Bugs:

错误:

1) You use the same name ("licenses" and also "maintenances") for a field and its related_name. It is a nonsense because: docs

1)对字段及其related_name使用相同的名称(“许可证”和“维护”)。这是无稽之谈,因为:文档

related_name
The name to use for the relation from the related object back to this one. It’s also the default value for related_query_name (the name to use for the reverse filter name from the target model).

related_name从相关对象返回到此对象的关系要使用的名称。它也是related_query_name的默认值(用于目标模型的反向过滤器名)。

It is not useful that you will see a reverse queryset maintenances on object Maintenance to License. Similarly a queryset `licenses' on License to Customer. You can easily rename the related_name because it does't change the database and doesn't cause a migration.

您将看到关于对象维护的反向查询集维护许可,这是没有用的。类似地,向客户颁发许可证的查询集“许可证”。您可以轻松地重命名related_name,因为它不会更改数据库,也不会导致迁移。

2) Is the License a common or individual object? If it is individual then it doesn't need many-to-many relationship to Customer object. If it is common then you can't track a paid maintenance to individual customers through it. (You also don't mean that two customers are co-owners of one license! Are they? :-) You probably mean a common LicensedProduct and an individual License that connects the customer with that product. I understand that the user can buy one maintenace for more licenses and and many-to-many is here good.

2)许可是公共的还是单独的对象?如果是单独的,那么它不需要与客户对象的多对多关系。如果它是常见的,那么您就不能通过它来跟踪单个客户的付费维护。(你也不是说两个客户是一个许可证的共有人!他们是吗?:-)你可能指的是一个通用的被许可产品和一个连接客户和该产品的单独的许可。我理解用户可以购买一个维护来获得更多的许可,多对多是好的。

First I fix the model (some way I guess before I can ask you)

首先我修正了模型(在我问你之前,我想是这样)

class Customer(SomeInheritedModel):
    # "licenses" is the reverse query to License
    # optionally you can enable many-to-many relation to licensed products
    # lic_products = models.ManyToManyField(

class Product(models.Model):
    pass  # licensed product details

class License(SomeInheritedModel):
    customer = models.ForeignKey(Customer, related_name='licenses')
    product = models.ForeignKey(Product, related_name='+')  # used '+' because not deeded
    maintenances = models.ManyToManyField(
        Maintenance,
        through='LicenseMaintenance',
        through_fields=('license', 'maintenance'),
        related_name='licenses')

class Maintenance(SomeInheritedModel):
    start_date = DateTimeField(null=True)
    expiration_date = DateTimeField(null=True)

class LicenseMaintenance(models.Model):
    license = models.ForeignKey(License, on_delete=models.CASCADE)
    maintenance = models.ForeignKey(Maintenance, on_delete=models.CASCADE)

querysets: (can be simplified much by removing order_by and related fields)

querysets:(可以通过删除order_by和相关字段来简化)

remind_start = datetime.datetime.now(tz=TIMEZONE)
remind_end = remind_start + datetime.timedelta(days=30)

expiring_lic_maintenances = (
    LicenseMaintenance.objects.values('license',
                                      'license__customer',
                                      'license__customer__name')
    .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
    .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
    .order_by('license__customer__name', 'license__customer', 'license')
)   # some small detail can be used like e.g. customer name in the example, not used later

expiring_licenses = (
    License.objects.filter(
        license__in=expiring_lic_maintenances.values_list('license', flat=True))
    .select_related('customer', 'product')
    .order_by('license__customer__name', 'license__customer', 'license')
)   # that queryset with subquery is executed by one SQL command

Not more than two SQL requests are executed by running these querysets:

运行这些queryset不会执行超过两个SQL请求:

# but I prefer a simple map and queryset with subquery:
expiration_map = {x.license_id: x.max_exp_date for x in expiring_lic_maintenances}


for lic in expiring_licenses:
    print("{name}, your maintenance for {lic_name} is expiring on {exp_date}".format(
        name=lic.customer.name,
        lic_name=lic.product.name,
        exp_date=expiration_map[lic.id],
    ))

I expect, that it is a new project and you don't need migrations yet for modified models. I wrote similar code so many times enough that I didn't verified it now. A mistake could happen and you can inform me sufficient time before the end of bounty easily.

我希望,这是一个新的项目,您还不需要对修改后的模型进行迁移。我写了很多类似的代码,所以现在还没有验证。一个错误可能会发生,你可以在赏金结束前告诉我足够的时间。


EDIT after edited question:
Aggregation functions work correctly on many-to-many fields without explicit model for join table in the current Django version:

编辑已编辑的问题:在当前Django版本中,聚合函数可以在多对多字段上正确地工作,而不需要显式的连接表模型:

>>> expiring = (
...     License.objects.values('id',
...                            'account__customer',
...                            'account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__gte=remind_start, max_exp_date__lte=remind_end)
... )

and look at compiled SQL:

看看编译后的SQL:

>>> str(expiring.query)
SELECT app_license.id, app_account.customer_id, app_customer.name, MAX(app_maintenance.expiration_date) AS max_exp_date
    FROM app_license INNER JOIN app_account ON (app_license.account_id = app_account.id)
    INNER JOIN app_customer ON (app_account.customer_id = app_customer.id)
    LEFT OUTER JOIN app_license_maintenance ON (app_license.id = app_license_maintenance.license_id)
    LEFT OUTER JOIN app_maintenance ON (app_license_maintenance.maintenance_id = app_maintenance.id)
    GROUP BY app_license.id, app_account.customer_id, app_customer.name
    HAVING (MAX(app_maintenance.expiration_date) >= 2017-04-07T13:45:35.485755 AND
            MAX(app_maintenance.expiration_date) <= 2017-03-08T13:45:35.485755
            )

In generally this is compiled by two outer joins.

通常,这是由两个外部连接编译的。


If you find a more complicated case, where it doesn't work or the query is slow because it is more complicated for some database engines to optimize with outer joins, you can everytimes get the implicit model and run the query on it because it is the top model in the relation hierarchy:

如果你找到一个更复杂的情况下,它不工作或查询是缓慢的,因为它是更复杂的数据库引擎优化与外部连接,你每次可以隐式模型和运行查询它,因为它是*模型层次结构的关系:

we can explore the implicit intermediate model of the table:

我们可以探索表的隐式中间模型:

>>> License.maintenance.through
app.models.License_maintenance
>>> LicenseMaintenance = License.maintenance.through
>>> LicenseMaintenance._meta.fields
(<django.db.models.fields.AutoField: id>,
 <django.db.models.fields.related.ForeignKey: license>,
 <django.db.models.fields.related.ForeignKey: maintenance>)

and use it: (all joins are automatically compiled to inner joins)

并使用它:(所有连接都自动编译为内部连接)

>>> expiring = (
...     LicenseMaintenance.objects.values('license',
...                                       'license__account__customer',
...                                       'license__account__customer__name')
...     .annotate(max_exp_date=models.Max('maintenance__expiration_date'))
...     .filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
... )

#2


1  

In this situation you have 2 options: First is to use prefetch_related:

在这种情况下,您有两个选项:首先是使用prefetch_related:

from django.db.models import Prefetch

now = timezone.now()
maintenance_qs = Maintenance.objects.filter(expiry_date__lte=now).order_by('-expire_date')
license_qs = License.objects.filter(maintenances__expiry_date__lte=now).\
    prefetch_related(
        Prefetch('maintenances', queryset=maintenance_qs)
    ).order_by(-'maintenances__expiry_date')
customers = Customer.objects.prefetch_related(Prefetch('licenses', queryset=license_qs))

It will hit database 3 times, you can read more about prefetch_related and Prefetch object. It will return all licenses and all maintenance but it will be sorted and you can take only 1 item. You can use it like this.

它将访问数据库3次,您可以阅读更多关于prefetch_related和Prefetch对象的信息。它将返回所有的许可证和所有的维护,但是它将被排序并且你只能取1个项目。你可以这样使用它。

for customer in customers:
    last_license = customer.licenses.all()[0]
    last_maintenance = last_license.maintenances.all()[0]

Or you can try to use raw SQL. Your query look like:

或者您可以尝试使用原始SQL。您的查询看起来像:

customers = Customer.objects.raw(
'''
SELECT * FROM (
    SELECT "yourapp_customer"."id", 
           "yourapp_license"."id", 
           "yourapp_maintenance"."id",
           "yourapp_maintanance"."start_date",
           "yourapp_maintanance"."expiration_date",
           MAX("yourapp_maintanance"."expiration_date") over (partition by "yourapp_customer"."id") as last_expired
    FROM "yourapp_customer"
    LEFT OUTER JOIN "yourapp_customer_licenses" ON
        "yourapp_customer"."id" = "yourapp_customer_licenses"."customer_id"
    LEFT OUTER JOIN "yourapp_license" ON
        "yourapp_license"."id" = "yourapp_customer_licenses"."license_id"
    LEFT OUTER JOIN "yourapp_license_maintenances" ON
        "yourapp_license"."id" = "yourapp_license_maintenances"."license_id"
    LEFT OUTER JOIN "yourapp_maintanance" ON
        "yourapp_maintanance"."id" = "yourapp_license_maintenances"."maintanance_id"
    WHERE "yourapp_maintanance"."expiration_date" < NOW()
) AS T
where expiration_date = last_expired
'''
)

It should work much faster, but using this query you can't construct License and Maintenance objects. All properties will be stored in Customer model. You can read more about window functions

它的工作速度应该快得多,但是使用这个查询,您不能构造许可证和维护对象。所有属性都将存储在Customer模型中。您可以阅读有关窗口函数的更多信息

#3


-1  

Use models.Model, not simple object... For example

使用模型。模型,而不是简单的对象……例如

class Customer(models.Model):
    licenses = models.ManyToManyField(License, related_name="licenses")

then run

然后运行

manage.py makemigration