
Time: 2021-11-29 11:19:06

I've asked this question here, but I don't think I got my point across.


Let's say I have the following tables (all PK are IDENTITY fields):


  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, etc.)

  • Borrowers (BorrowerId(PK), PersonId, LoanId)

Let's say Mr. Smith has 2 loans in his own name, 3 joint loans with his wife, and 1 joint loan with his mistress. For the purposes of the application I want to GROUP people, so that I can easily single out the loans that Mr. Smith took out jointly with his wife.


To accomplish that I added a BorrowerGroup table, so now I have the following (all PK are IDENTITY fields):


  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)

  • BorrowerGroup(GroupId (PK))
  • Borrowers (BorrowerId(PK), GroupId, PersonId)

Now Mr. Smith is in 3 groups (himself, him and his wife, him and his mistress) and I can easily look up his activity in any of those groups.


The problems with the new design:


The only way to generate a new BorrowerGroup is by inserting MAX(GroupId)+1 with IDENTITY_INSERT ON, which just doesn't feel right. Also, the notion of a table with 1 column is kind of weird.


I'm a firm believer in surrogate keys, and would like to stick to that design if possible.


This application does not care about individuals; the GROUP is treated as an individual.


Is there a better way to group people for the purpose of this application?

6 answers


The design of the database seems OK. Why do you have to use MAX(GroupId)+1 when you create a new group? Can't you just create the row and then use SCOPE_IDENTITY() to return the new ID?




(See this other question)


(edit to SQL courtesy of this question)
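A minimal T-SQL sketch of that suggestion, assuming SQL Server 2008 or later, a BorrowerGroup table whose only column is the IDENTITY column GroupId, and a Borrowers table with the (GroupId, PersonId) columns from the question. The two person ids are made-up sample values, not data from the thread.

```sql
-- Sketch only: table and column names follow the question; the ids are sample values.
DECLARE @HusbandId int = 1, @WifeId int = 2;   -- hypothetical PersonId values
DECLARE @NewGroupId int;

BEGIN TRANSACTION;

-- A table whose only column is an IDENTITY can be inserted into with DEFAULT VALUES,
-- so neither MAX(GroupId)+1 nor IDENTITY_INSERT is needed.
INSERT INTO BorrowerGroup DEFAULT VALUES;

-- SCOPE_IDENTITY() returns the last identity value generated in the current scope.
SET @NewGroupId = SCOPE_IDENTITY();

-- Attach the members to the freshly created group.
INSERT INTO Borrowers (GroupId, PersonId) VALUES (@NewGroupId, @HusbandId);
INSERT INTO Borrowers (GroupId, PersonId) VALUES (@NewGroupId, @WifeId);

COMMIT TRANSACTION;

SELECT @NewGroupId AS NewGroupId;
```

Wrapping the three inserts in one transaction keeps a group from being left without members if a later insert fails.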



You could just remove the BorrowerGroups table, since it carries no information. That information is already present in the loans that people share. I just assume you have a PeopleLoans table.


People          Loans           PeopleLoans
-----------     ------------    -----------
1  Smith         6  S1    60    1   6
2  Wife          7  S2    60    1   7
3  Mistress      8  S+W1  74    1   8
                 9  S+W2  74    1   9
                10  S+W3  74    1  10
                11  S+M1  89    1  11
                                2   8
                                2   9
                                2  10
                                3  11

So your BorrowerGroups are essentially the Loans grouped by borrower: loans 6 and 7 with Smith only, 8 to 10 with Smith and Wife, and 11 with Smith and Mistress. There is no need for BorrowerGroups in the first place, because they are identical to Loans grouped by the people involved.


But it might be quite hard to retrieve this information efficiently, so you could think about adding a GroupId directly to Loans. Ignoring the second column of Loans (which is there just for readability), the third column should represent your groups. They are redundant, so you have to be careful if you change them.


If you find a good way to derive a unique GroupId from the ids of the people involved, you could make it a computed column. If a string would be okay as a group id, you could just order the ids of the people and concatenate them with a separator.


Group 60 with Smith only would get id '1', group 74 would become '1.2', and group 89 would become '1.3'. Not that smart, but unique and easy to compute.
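The concatenated key can be sketched in T-SQL as follows. This assumes SQL Server 2017 or later for STRING_AGG, and uses a table variable as a stand-in for the assumed PeopleLoans table, filled with the sample rows shown above. The name GroupKey is made up for this illustration.

```sql
-- Table variable standing in for the assumed PeopleLoans table (sample rows from above).
DECLARE @PeopleLoans TABLE (PersonId int NOT NULL, LoanId int NOT NULL);

INSERT INTO @PeopleLoans (PersonId, LoanId) VALUES
    (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11),
    (2, 8), (2, 9), (2, 10),
    (3, 11);

-- One row per loan; the key is the sorted person ids joined with '.'.
SELECT pl.LoanId,
       STRING_AGG(CAST(pl.PersonId AS varchar(10)), '.')
           WITHIN GROUP (ORDER BY pl.PersonId) AS GroupKey
FROM @PeopleLoans AS pl
GROUP BY pl.LoanId
ORDER BY pl.LoanId;
-- Loans 6 and 7 get '1', loans 8 to 10 get '1.2', loan 11 gets '1.3'.
```

As far as I know, a computed column cannot read another table directly, so in practice this would more likely live in a view or behind a scalar function than in a plain computed column.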



Use the original schema:


  • People (PersonId (PK), Name, SSN, etc.)

  • Loans (LoanId (PK), Amount, etc.)

  • Borrowers (BorrowerId(PK), PersonId, LoanId)

Then just query for the data you need (your example of finding husband and wife on the same loans):


    SELECT l.LoanId, l.Amount
    FROM Borrowers            b1
        INNER JOIN Borrowers  b2 ON b1.LoanId=b2.LoanId
        INNER JOIN Loans       l ON b1.LoanId=l.LoanId
    WHERE b1.PersonId=@HusbandID
        AND b2.PersonId=@WifeID
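The self-join above returns every loan that both people are on, including any that also have a third borrower. If the group should match exactly (husband and wife and nobody else), one possible variant is a GROUP BY with HAVING. The sketch below assumes SQL Server 2008 or later and that a person appears at most once per loan. The table variable and its rows are illustrative stand-ins for the Borrowers table, with 1 as Smith, 2 as his wife and 3 as the mistress.

```sql
-- Table variable standing in for Borrowers; rows are illustrative sample data.
DECLARE @Borrowers TABLE (
    BorrowerId int IDENTITY(1,1) PRIMARY KEY,
    PersonId   int NOT NULL,
    LoanId     int NOT NULL
);

INSERT INTO @Borrowers (PersonId, LoanId) VALUES
    (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11),
    (2, 8), (2, 9), (2, 10),
    (3, 11);

DECLARE @HusbandID int = 1, @WifeID int = 2;

-- Keep loans that have exactly two borrowers, both of them from the requested pair.
SELECT b.LoanId
FROM @Borrowers AS b
GROUP BY b.LoanId
HAVING COUNT(*) = 2
   AND SUM(CASE WHEN b.PersonId IN (@HusbandID, @WifeID) THEN 1 ELSE 0 END) = 2
ORDER BY b.LoanId;
-- Returns loans 8, 9 and 10 for this sample data.
```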


I would do something more like this:


  • People (PersonId (PK), Name, SSN, etc.)


  • Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)


  • BorrowerGroup(BorrowerGroupId (PK))

  • PersonBelongsToBorrowerGroup(BorrowerGroupId (PK), PersonId(PK))


I got rid of the Borrowers table. Just store the info in the BorrowerGroup table. That's my preference.
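As DDL, that layout might look like the sketch below. The table and key names come from the list above; the column types, nullability, and foreign keys are my assumptions, since the thread does not specify them.

```sql
-- Sketch only: types and constraints are assumed, names follow the list above.
CREATE TABLE People (
    PersonId int IDENTITY(1,1) PRIMARY KEY,
    Name     nvarchar(100) NOT NULL,
    SSN      char(11) NULL
);

CREATE TABLE BorrowerGroup (
    BorrowerGroupId int IDENTITY(1,1) PRIMARY KEY
);

CREATE TABLE Loans (
    LoanId          int IDENTITY(1,1) PRIMARY KEY,
    Amount          decimal(18, 2) NOT NULL,
    BorrowerGroupId int NOT NULL REFERENCES BorrowerGroup (BorrowerGroupId)
);

-- Junction table: the composite primary key stops a person
-- from being added to the same group twice.
CREATE TABLE PersonBelongsToBorrowerGroup (
    BorrowerGroupId int NOT NULL REFERENCES BorrowerGroup (BorrowerGroupId),
    PersonId        int NOT NULL REFERENCES People (PersonId),
    PRIMARY KEY (BorrowerGroupId, PersonId)
);
```

Using the composite key instead of a separate BorrowerId surrogate is what lets this design drop the Borrowers table.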



The consensus seems to be to omit the BorrowerGroup table, and I have to agree. Using MAX(GroupId)+1 invites all sorts of ACID/transaction issues, which is the main reason IDENTITY fields exist.


That said, the SQL that KM provided looks good. There are any number of ways to get the same results: joins, sub-selects and so on. The real issue is knowing the dataset. Given the explanation you provided, the datasets are going to be very small, which also supports removing the BorrowerGroup table.



I would have a group table and then a groupmembers(borrowers) table to accomplish the many-to-many relationship between loans and people. This allows the tracking of data on the group other than just a list of members (I believe someone else made this suggestion?).


    CREATE TABLE BorrowerGroup (   -- table name assumed from the question
        ID int NOT NULL
        , Group_Name char(50) NULL
        , Date_Started datetime NULL
        , Primary_ContactID int NULL
        , Group_Type varchar(25)
    )

