How to manage "groups" in a database?

Time: 2021-11-29 11:19:06

I've asked this question here, but I don't think I got my point across.

Let's say I have the following tables (all PK are IDENTITY fields):

  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, etc.)

  • Borrowers (BorrowerId(PK), PersonId, LoanId)

Let's say Mr. Smith has 2 loans in his own name, 3 joint loans with his wife, and 1 joint loan with his mistress. For the purposes of the application I want to GROUP people, so that I can easily single out the loans that Mr. Smith took out jointly with his wife.

To accomplish that I added a BorrowerGroup table, so now I have the following (all PK are IDENTITY fields):

  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)

  • BorrowerGroup(GroupId (PK))
  • Borrowers (BorrowerId(PK), GroupId, PersonId)

Now Mr. Smith is in 3 groups (himself, him and his wife, him and his mistress) and I can easily look up his activity in any of those groups.

The problems with the new design:

The only way to generate a new BorrowerGroup is by inserting MAX(GroupId)+1 with IDENTITY_INSERT ON, which just doesn't feel right. Also, the notion of a table with 1 column is kind of weird.

I'm a firm believer in surrogate keys, and would like to stick to that design if possible.

This application does not care about individuals; the GROUP is treated as an individual.

Is there a better way to group people for the purpose of this application?

6 answers

#1


The design of the database seems OK. Why do you have to use MAX(GroupId)+1 when you create a new group? Can't you just create the row and then use SCOPE_IDENTITY() to return the new ID?

e.g.

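-- Insert a row that consists only of its generated IDENTITY value
INSERT INTO BorrowerGroup DEFAULT VALUES;
-- Return the identity value generated by that insert in the current scope
SELECT SCOPE_IDENTITY();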

(See this other question)

(edit to SQL courtesy of this question)
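
To make the whole round trip concrete, here is a minimal sketch that creates a group and its members in one transaction. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, purely so that it runs anywhere; that substitution is an assumption, not part of the original answer. SQLite's AUTOINCREMENT key plays the role of IDENTITY and cursor.lastrowid plays the role of SCOPE_IDENTITY(). The helper name create_group is hypothetical, and the tables follow the question's second design.

```python
import sqlite3


def create_group(conn, person_ids):
    """Create a new borrower group with the given members; return its id."""
    with conn:  # one transaction: commit on success, roll back on error
        # A row made up only of its generated key, like DEFAULT VALUES above.
        cur = conn.execute("INSERT INTO BorrowerGroup DEFAULT VALUES")
        group_id = cur.lastrowid  # SQLite's counterpart of SCOPE_IDENTITY()
        conn.executemany(
            "INSERT INTO Borrowers (GroupId, PersonId) VALUES (?, ?)",
            [(group_id, person_id) for person_id in person_ids],
        )
    return group_id


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE BorrowerGroup (GroupId INTEGER PRIMARY KEY AUTOINCREMENT);
    CREATE TABLE Borrowers (
        BorrowerId INTEGER PRIMARY KEY AUTOINCREMENT,
        GroupId    INTEGER NOT NULL REFERENCES BorrowerGroup (GroupId),
        PersonId   INTEGER NOT NULL
    );
    """
)

smith_alone = create_group(conn, [1])        # person 1 = Smith
smith_and_wife = create_group(conn, [1, 2])  # person 2 = his wife
print(smith_alone, smith_and_wife)
```

No MAX(GroupId)+1 and no IDENTITY_INSERT are needed: the database hands out the key, and the caller only reads it back.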

#2


You could just remove the BorrowerGroup table; it carries no information of its own. That information is already present in the loans that people share (I just assume you have a PeopleLoans table).

People          Loans           PeopleLoans
-----------     ------------    -----------
1  Smith         6  S1    60    1   6
2  Wife          7  S2    60    1   7
3  Mistress      8  S+W1  74    1   8
                 9  S+W2  74    1   9
                10  S+W3  74    1  10
                11  S+M1  89    1  11
                                2   8
                                2   9
                                2  10
                                3  11

So your borrower groups are essentially the loans themselves: loans 6 and 7 with Smith only, 8 to 10 with Smith and Wife, and 11 with Smith and Mistress. There is no need for BorrowerGroup in the first place, because the groups are identical to the loans grouped by the people involved.

But it might be quite hard to retrieve this information efficiently, so you could think about adding a GroupId directly to Loans. Ignoring the second column of Loans (it is there just for readability), the third column should represent your groups. These values are redundant, so you have to be careful if you change them.

If you find a good way to derive a unique GroupId from the ids of the people involved, you could make it a computed column. If a string would be okay as a group id, you could just sort the ids of the people and concatenate them with a separator.

Group 60 (Smith only) would get the id '1', group 74 would become '1.2', and group 89 would become '1.3'. Not that smart, but unique and easy to compute.
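
As a rough illustration of the "sort and concatenate" idea, the sketch below derives such a key in Python from the PeopleLoans rows shown in the table above. The function name group_key and the choice of "." as separator are assumptions made for this example; in a real database the same logic would have to live in a computed column, trigger, or application code.

```python
from collections import defaultdict

# (PersonId, LoanId) pairs copied from the PeopleLoans table above.
people_loans = [
    (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11),
    (2, 8), (2, 9), (2, 10),
    (3, 11),
]


def group_key(person_ids, sep="."):
    """Sort the ids numerically and join them, so member order never matters."""
    return sep.join(str(person_id) for person_id in sorted(set(person_ids)))


# Collect the borrowers of each loan.
borrowers_by_loan = defaultdict(list)
for person_id, loan_id in people_loans:
    borrowers_by_loan[loan_id].append(person_id)

# Bucket the loans by their derived group key.
loans_by_group = defaultdict(list)
for loan_id, person_ids in sorted(borrowers_by_loan.items()):
    loans_by_group[group_key(person_ids)].append(loan_id)

print(dict(loans_by_group))
```

The separator is what keeps the key unique: without it, the group made of persons 1 and 2 and the single person 12 would both end up with the key "12".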

#3


Use the original schema:

  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, etc.)

  • Borrowers (BorrowerId(PK), PersonId, LoanId)

Then just query for the data you need (your example, finding the loans that husband and wife are both on):

SELECT
    l.*
    FROM Borrowers            b1
        INNER JOIN Borrowers  b2 ON b1.LoanId=b2.LoanId
        INNER JOIN Loans       l ON b1.LoanId=l.LoanId
    WHERE b1.PersonId=@HusbandID
        AND b2.PersonId=@WifeID
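
One thing worth checking with this query is what "jointly" should mean. The self-join returns every loan on which both people appear, including loans that have further co-borrowers. The sketch below runs it next to a GROUP BY/HAVING variant that returns only loans held by exactly those two people. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, with named parameters in place of @HusbandID and @WifeID. The sample rows, the amounts, and loan 12 with three borrowers are hypothetical, and the variant assumes a person appears at most once per loan in Borrowers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Loans (LoanId INTEGER PRIMARY KEY, Amount INTEGER);
    CREATE TABLE Borrowers (
        BorrowerId INTEGER PRIMARY KEY,
        PersonId   INTEGER NOT NULL,
        LoanId     INTEGER NOT NULL REFERENCES Loans (LoanId)
    );
    """
)
with conn:
    # Hypothetical data: person 1 = husband, 2 = wife, 3 = a third borrower.
    conn.executemany(
        "INSERT INTO Loans (LoanId, Amount) VALUES (?, ?)",
        [(loan_id, 1000) for loan_id in (8, 9, 10, 11, 12)],
    )
    conn.executemany(
        "INSERT INTO Borrowers (PersonId, LoanId) VALUES (?, ?)",
        [(1, 8), (2, 8), (1, 9), (2, 9), (1, 10), (2, 10),
         (1, 11), (3, 11),
         (1, 12), (2, 12), (3, 12)],
    )

# The self-join from the answer: both people are on the loan.
BOTH_ON_LOAN = """
    SELECT l.LoanId
    FROM Borrowers b1
    JOIN Borrowers b2 ON b1.LoanId = b2.LoanId
    JOIN Loans l      ON b1.LoanId = l.LoanId
    WHERE b1.PersonId = :husband AND b2.PersonId = :wife
    ORDER BY l.LoanId
"""

# Variant: the loan has exactly two borrowers, and both are in the pair.
EXACTLY_PAIR = """
    SELECT b.LoanId
    FROM Borrowers b
    GROUP BY b.LoanId
    HAVING COUNT(*) = 2
       AND SUM(CASE WHEN b.PersonId IN (:husband, :wife) THEN 1 ELSE 0 END) = 2
    ORDER BY b.LoanId
"""

params = {"husband": 1, "wife": 2}
both_on_loan = [row[0] for row in conn.execute(BOTH_ON_LOAN, params)]
exactly_pair = [row[0] for row in conn.execute(EXACTLY_PAIR, params)]
print(both_on_loan, exactly_pair)
```

With this data the two result lists differ only in loan 12, the one with a third co-borrower. Which behaviour is correct depends on how the application defines a group.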

#4


I would do something more like this:

  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)

  • BorrowerGroup(BorrowerGroupId (PK))

  • PersonBelongsToBorrowerGroup(BorrowerGroupId (PK), PersonId(PK))

I got rid of the Borrowers table. Just store the info in the BorrowerGroup table. That's my preference.
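
A small sketch of how this layout could behave, again using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable). The composite primary key on PersonBelongsToBorrowerGroup means the same person cannot be added to the same group twice. The sample rows and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
conn.executescript(
    """
    CREATE TABLE People (PersonId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE BorrowerGroup (BorrowerGroupId INTEGER PRIMARY KEY);
    CREATE TABLE PersonBelongsToBorrowerGroup (
        BorrowerGroupId INTEGER NOT NULL
            REFERENCES BorrowerGroup (BorrowerGroupId),
        PersonId        INTEGER NOT NULL REFERENCES People (PersonId),
        PRIMARY KEY (BorrowerGroupId, PersonId)  -- one membership per pair
    );
    CREATE TABLE Loans (
        LoanId          INTEGER PRIMARY KEY,
        Amount          INTEGER,
        BorrowerGroupId INTEGER NOT NULL
            REFERENCES BorrowerGroup (BorrowerGroupId)
    );
    """
)
with conn:
    conn.executemany("INSERT INTO People VALUES (?, ?)",
                     [(1, "Smith"), (2, "Wife")])
    conn.executemany("INSERT INTO BorrowerGroup VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO PersonBelongsToBorrowerGroup VALUES (?, ?)",
                     [(1, 1), (2, 1), (2, 2)])  # group 2 = Smith and Wife
    conn.executemany("INSERT INTO Loans VALUES (?, ?, ?)",
                     [(6, 1000, 1), (8, 2000, 2)])

# Adding Smith to group 2 a second time violates the composite key.
try:
    with conn:
        conn.execute("INSERT INTO PersonBelongsToBorrowerGroup VALUES (2, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Who is behind loan 8? Go from the loan to its group, then to the members.
members_of_loan_8 = [
    name for (name,) in conn.execute(
        """
        SELECT p.Name
        FROM Loans l
        JOIN PersonBelongsToBorrowerGroup m
          ON m.BorrowerGroupId = l.BorrowerGroupId
        JOIN People p ON p.PersonId = m.PersonId
        WHERE l.LoanId = ?
        ORDER BY p.PersonId
        """,
        (8,),
    )
]
print(duplicate_rejected, members_of_loan_8)
```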

#5


The consensus seems to be to omit the BorrowerGroup table, and I have to agree. Using MAX(GroupId)+1 has all sorts of ACID/transaction issues, and avoiding those issues is the main reason why IDENTITY fields exist.

That said, the SQL that KM provided looks good. There are any number of ways to get the same results: joins, sub-selects, and so on. The real issue there is knowing the dataset. Given the explanation you provided, the datasets are going to be very small, which also supports removing the BorrowerGroup table.

#6


I would have a group table and then a groupmembers (borrowers) table to model the many-to-many relationship between loans and people. This allows tracking data about the group beyond just a list of its members (I believe someone else made this suggestion?).

CREATE TABLE LoanGroup
(
    ID int NOT NULL 
    , Group_Name char(50) NULL 
    , Date_Started datetime NULL 
    , Primary_ContactID int NULL
    , Group_Type varchar(25)
)
