Combining two unrelated tables/models with the same primary key in Django


I have two unrelated tables that share the same primary key.

ip            mac
11.11.11.11   48-C0-09-1F-9B-54
33.33.33.33   4E-10-A3-BC-B8-9D
44.44.44.44   CD-00-60-08-56-2A
55.55.55.55   23-CE-D3-B1-39-A6

ip            type     owner
22.22.22.22   laptop   John Doe
33.33.33.33   server   XYZ Department
44.44.44.44   VM       Mary Smith
66.66.66.66   printer  ZWV Department

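The first table is refreshed automatically every minute. I cannot change the database structure or the script that populates it.

Both tables have ip as their PRIMARY KEY.

In a view, I would like to display a table like this: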

ip           mac               type    owner          Alert
11.11.11.11  48-C0-09-1F-9B-54                        Unauthorized
55.55.55.55  23-CE-D3-B1-39-A6                        Unauthorized
22.22.22.22                    laptop  John Doe       Down
66.66.66.66                    printer ZWV Department Down
33.33.33.33  4E-10-A3-BC-B8-9D server  XYZ Department OK
44.44.44.44  CD-00-60-08-56-2A VM      Mary Smith     OK

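How should I model this? Should I make one of the two primary keys a foreign key to the other?

Once the code is in production there will be a lot of data, so I want to make sure it is fast enough.

What is the fastest way to retrieve the data?


Update:

I tried using a OneToOneField on the second table.

This lets me get the records that are in both tables, as well as the records for unauthorized devices (IPs missing from the second table):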

ip           mac               type    owner          Alert
11.11.11.11  48-C0-09-1F-9B-54                        Unauthorized
55.55.55.55  23-CE-D3-B1-39-A6                        Unauthorized
33.33.33.33  4E-10-A3-BC-B8-9D server  XYZ Department OK
44.44.44.44  CD-00-60-08-56-2A VM      Mary Smith     OK

But I cannot get the devices that are down (IPs missing from the first table):

22.22.22.22                    laptop  John Doe       Down
66.66.66.66                    printer ZWV Department Down

I asked for help here, but it seems this cannot be done with OneToOneField.

django django-models
1 answer

5 votes

General idea

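You can use qs.union:

  • Create 2 models with no relation between them. Don't forget to use class Meta: managed = False
  • Select from the first model, annotate it using a subquery, and union it with the second: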
from django.db import models
from django.db.models import F, OuterRef, Subquery, Value
from django.db.models.functions import Coalesce

# OperationalDevice fields: ip, mac
# AllowedDevice fields: ip, type, owner

USE_EMPTY_STR_AS_DEFAULT = True

null_char_field = models.CharField(null=True)
if USE_EMPTY_STR_AS_DEFAULT:
    default_value = ''
else:
    default_value = None

# By default Expressions treat strings as "field_name" so if you want to use
# empty string as a second argument for Coalesce, then you should wrap it in
# `Value()`.
# `None` can be used there without wrapping in `Value()`, but in
# `.annotate(type=NoneValue)` it still should be wrapped, so it's easier to
# just "always wrap".
default_value = Value(default_value, output_field=null_char_field)

operational_devices_subquery = OperationalDevice.objects.filter(ip=OuterRef('ip'))


qs1 = (
    AllowedDevice.objects
    .all()
    .annotate(
        mac=Coalesce(
            Subquery(operational_devices_subquery.values('mac')[:1]),
            default_value,
            output_field=null_char_field,
        ),
    )
)

qs2 = (
    OperationalDevice.objects
    .exclude(
        ip__in=qs1.values('ip'),
    )
    .annotate(
        type=default_value,
        owner=default_value,
    )
)

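# NOTE: union() combines columns by position, not by name, so make sure both
# querysets select ip, mac, type and owner in the same order
# (inspect `str(final_qs.query)` to check).
final_qs = qs1.union(qs2)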
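
For reference, the SQL shape that this kind of union corresponds to can be tried directly with the standard-library sqlite3 module. This is only a sketch and not the SQL Django generates: the table names operational_device and allowed_device are assumptions, the sample rows come from the question, and the Alert column (which the querysets above do not produce) is computed with CASE.

```python
import sqlite3

# In-memory database with the sample data from the question.
# Table names are hypothetical; the real tables may be named differently.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operational_device (ip TEXT PRIMARY KEY, mac TEXT);
    CREATE TABLE allowed_device (ip TEXT PRIMARY KEY, type TEXT, owner TEXT);
    INSERT INTO operational_device VALUES
        ('11.11.11.11', '48-C0-09-1F-9B-54'),
        ('33.33.33.33', '4E-10-A3-BC-B8-9D'),
        ('44.44.44.44', 'CD-00-60-08-56-2A'),
        ('55.55.55.55', '23-CE-D3-B1-39-A6');
    INSERT INTO allowed_device VALUES
        ('22.22.22.22', 'laptop', 'John Doe'),
        ('33.33.33.33', 'server', 'XYZ Department'),
        ('44.44.44.44', 'VM', 'Mary Smith'),
        ('66.66.66.66', 'printer', 'ZWV Department');
""")

FULL_JOIN_SQL = """
    -- Left part: every allowed device, with its mac if it is currently seen.
    SELECT a.ip AS ip, COALESCE(o.mac, '') AS mac, a.type AS type,
           a.owner AS owner,
           CASE WHEN o.ip IS NULL THEN 'Down' ELSE 'OK' END AS alert
    FROM allowed_device AS a
    LEFT JOIN operational_device AS o ON o.ip = a.ip
    UNION ALL
    -- Right part: devices seen on the network but missing from the allowed list.
    SELECT o.ip, o.mac, '', '', 'Unauthorized'
    FROM operational_device AS o
    WHERE NOT EXISTS (SELECT 1 FROM allowed_device AS a WHERE a.ip = o.ip)
    ORDER BY ip
"""

rows = conn.execute(FULL_JOIN_SQL).fetchall()
for row in rows:
    print(row)
```

Both halves of the UNION ALL list their columns in the same order (ip, mac, type, owner, alert), which is the same constraint that applies to the two querysets passed to union().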

Generic approach for multiple fields

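A more complex but "universal" approach could use Model._meta.get_fields(). It is easier to use when the "second" model has more than one extra field (not just ip, mac). Example code (untested, but it gives the general idea):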

# One more import:
from django.db.models.fields import NOT_PROVIDED

common_field_name = 'ip'

# OperationalDevice fields: ip, mac, some_more_fields ...
# AllowedDevice fields: ip, type, owner

operational_device_fields = OperationalDevice._meta.get_fields()
operational_device_fields_names = {_f.name for _f in operational_device_fields}  # or set((_f.name for ...))

allowed_device_fields = AllowedDevice._meta.get_fields()
allowed_device_fields_names = {_f.name for _f in allowed_device_fields}  # or set((_f.name for ...))

operational_devices_subquery = OperationalDevice.objects.filter(ip=OuterRef(common_field_name))

left_joined_qs = (  # "Kind-of". Assuming AllowedDevice to be "left" and OperationalDevice to be "right"
    AllowedDevice.objects
    .all()
    .annotate(
        **{
            _f.name: Coalesce(
                Subquery(operational_devices_subquery.values(_f.name)[:1]),  # slice, not index: keep it a queryset
                Value(_f.get_default()),  # Use defaults from model definition
                output_field=_f,
            )
            for _f in operational_device_fields
            if _f.name not in allowed_device_fields_names
            # NOTE: if fields other than `ip` "overlap", then you might consider
            # changing logic here. Current implementation keeps fields from the
            # AllowedDevice
        }
        # Unpacked dict is partially equivalent to this:
        # mac=Coalesce(
        #     Subquery(operational_devices_subquery.values('mac')[:1]),
        #     default_for_mac_eg_fallback_text_value,
        #     output_field=null_char_field,
        # ),
        # other_field = Coalesce(...),
        # ...
    )
)

lonely_right_rows_qs = (
    OperationalDevice.objects
    .exclude(
        ip__in=AllowedDevice.objects.all().values(common_field_name),
    )
    .annotate(
        **{
            _f.name: Value(_f.get_default(), output_field=_f)  # Use defaults from model definition
            for _f in allowed_device_fields
            if _f.name not in operational_device_fields_names
            # NOTE: See previous NOTE
        }
    )
)

final_qs = left_joined_qs.union(lonely_right_rows_qs)

Using OneToOneField for "better" SQL

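In theory, you could define device_info = models.OneToOneField(OperationalDevice, db_column='ip', primary_key=True, related_name='status_info') on AllowedDevice (on Django 2.0 and later the field also needs an on_delete argument). In that case the first queryset can be defined without Subquery: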

from django.db.models import F

# Now 'ip' is not in field names ('device_info' is there), so add it:
allowed_device_fields_names.add(common_field_name)

# NOTE: I think this approach will result in a more compact SQL query without 
# multiple `(SELECT "some_field" FROM device_info_table ... ) as "some-field"`.
# This also might result in better query performance.
honest_join_qs = (
    AllowedDevice.objects
    .all()
    .annotate(
        **{
            _f.name: F(f'device_info__{_f.name}')
            for _f in operational_device_fields
            if _f.name not in allowed_device_fields_names
        }
    )
)

final_qs = honest_join_qs.union(lonely_right_rows_qs)
# or:
# final_qs = honest_join_qs.union(
#     OperationalDevice.objects.filter(status_info__isnull=True).annotate(**missing_fields_annotation)
# )
# I'm not sure which approach performs better.
# The commented-out one will produce SQL like:
# `SELECT ... FROM "device_info_table" LEFT OUTER JOIN "status_info_table" ON ("device_info_table"."ip" = "status_info_table"."ip") WHERE "status_info_table"."ip" IS NULL`
#
# So it might be a little better than the first one, `union(QS.exclude(ip__in=honest_join_qs.values('ip')))`,
# because the latter uses SQL like this:
# `SELECT ... FROM "device_info_table" WHERE NOT ip IN (SELECT ip FROM "status_info_table")`
#
# But it's better to measure the timings of both approaches to be sure.
# @GrannyAching, can you compare them and tell in the comments which one is better?
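
The two SQL shapes mentioned in the comments can be compared outside Django with the standard-library sqlite3 module. The sketch below is an illustration under assumptions: it reuses the table names from the comments (device_info_table, status_info_table), loads the sample rows from the question, and the helper show_plan is hypothetical. SQLite's planner is not the same as the production database's, so the real measurement still has to be done there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device_info_table (ip TEXT PRIMARY KEY, mac TEXT);
    CREATE TABLE status_info_table (ip TEXT PRIMARY KEY, type TEXT, owner TEXT);
    INSERT INTO device_info_table VALUES
        ('11.11.11.11', '48-C0-09-1F-9B-54'),
        ('33.33.33.33', '4E-10-A3-BC-B8-9D'),
        ('44.44.44.44', 'CD-00-60-08-56-2A'),
        ('55.55.55.55', '23-CE-D3-B1-39-A6');
    INSERT INTO status_info_table VALUES
        ('22.22.22.22', 'laptop', 'John Doe'),
        ('33.33.33.33', 'server', 'XYZ Department'),
        ('44.44.44.44', 'VM', 'Mary Smith'),
        ('66.66.66.66', 'printer', 'ZWV Department');
""")

# Anti-join written as LEFT OUTER JOIN ... WHERE ... IS NULL.
LEFT_JOIN_SQL = """
    SELECT d.ip FROM device_info_table AS d
    LEFT OUTER JOIN status_info_table AS s ON d.ip = s.ip
    WHERE s.ip IS NULL
    ORDER BY d.ip
"""

# Same anti-join written as NOT IN (subquery).
NOT_IN_SQL = """
    SELECT ip FROM device_info_table
    WHERE ip NOT IN (SELECT ip FROM status_info_table)
    ORDER BY ip
"""

left_join_rows = conn.execute(LEFT_JOIN_SQL).fetchall()
not_in_rows = conn.execute(NOT_IN_SQL).fetchall()


def show_plan(sql):
    """Print SQLite's query plan for a statement (hypothetical helper)."""
    for plan_row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(plan_row)


show_plan(LEFT_JOIN_SQL)
show_plan(NOT_IN_SQL)
```

Both statements return the same unauthorized IPs here; which one is faster depends on the database engine, the indexes and the data volume.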

P.S. To automate the model definitions, you can use manage.py inspectdb.

P.P.S. Perhaps multi-table inheritance with a custom OneToOneField(..., parent_link=True) would be more helpful than using union.


4 votes

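Since ip is the primary key and the first table is updated frequently, I suggest changing the second table instead: convert its ip into a OneToOneField that points to the first table's ip.

This is what your models should look like: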

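from django.db import models

class ModelA(models.Model):
    ip = models.GenericIPAddressField(unique=True)
    mac = models.CharField(max_length=17, null=True, blank=True)

class ModelB(models.Model):
    # on_delete is required since Django 2.0; CASCADE is only an assumed choice.
    ip = models.OneToOneField(ModelA, on_delete=models.CASCADE)
    # CharField needs max_length; 64 and 128 are placeholder values.
    type = models.CharField(max_length=64)
    owner = models.CharField(max_length=128)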

docs

You can also use a separate column for the one-to-one relation:

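class ModelB(models.Model):
    ip = models.GenericIPAddressField(unique=True)
    # max_length values are placeholders; CASCADE is only an assumed choice.
    type = models.CharField(max_length=64)
    owner = models.CharField(max_length=128)
    modelA = models.OneToOneField(ModelA, on_delete=models.CASCADE)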

This way you can keep the IP address as the key and still use the modelA field to refer to the ModelA table.


3 votes

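After getting a value from one of the two tables, just query the other table by id. Because the two tables are separate, you have to run an extra query. You don't need to create an explicit relationship, since you are looking rows up by their "id / ip". So once you have the first value, called 'first_object', just look up its counterpart in the other table.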

other_columns = ModelB.objects.get(id=first_object.id)

Then, if you only want to "add" the required columns to the other model and pass a single object on to wherever you need it:

first_object.attr1 = other_columns.attr1
...
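
Looking up the counterpart one object at a time costs one extra query per row. A minimal sketch of the same idea done in bulk, assuming both tables have already been loaded as lists of dicts (for example with .values()): the function merge_by_ip and the key names are hypothetical, and the Alert rules follow the table in the question.

```python
def merge_by_ip(operational_rows, allowed_rows):
    """Full outer join of two lists of dicts on 'ip', done in Python."""
    operational = {row["ip"]: row for row in operational_rows}
    allowed = {row["ip"]: row for row in allowed_rows}

    merged = []
    # Union of both key sets, so IPs present in only one table are kept.
    for ip in sorted(operational.keys() | allowed.keys()):
        seen = operational.get(ip)   # row from the auto-refreshed table, or None
        known = allowed.get(ip)      # row from the inventory table, or None
        if seen is not None and known is not None:
            alert = "OK"
        elif seen is not None:
            alert = "Unauthorized"   # on the network but not in the inventory
        else:
            alert = "Down"           # in the inventory but not on the network
        merged.append({
            "ip": ip,
            "mac": seen["mac"] if seen is not None else "",
            "type": known["type"] if known is not None else "",
            "owner": known["owner"] if known is not None else "",
            "alert": alert,
        })
    return merged


operational_rows = [
    {"ip": "11.11.11.11", "mac": "48-C0-09-1F-9B-54"},
    {"ip": "33.33.33.33", "mac": "4E-10-A3-BC-B8-9D"},
]
allowed_rows = [
    {"ip": "22.22.22.22", "type": "laptop", "owner": "John Doe"},
    {"ip": "33.33.33.33", "type": "server", "owner": "XYZ Department"},
]

merged = merge_by_ip(operational_rows, allowed_rows)
for row in merged:
    print(row)
```

This keeps the number of queries at two regardless of the number of rows, at the price of holding both tables in memory.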