Why should there be an ID column in the users table?

Time: 2021-10-05 06:36:14

Obviously we already have another unique piece of information about each user: the username. So why do we need another unique value for each user? Why should each user also have an id? What would happen if we omitted the id column?

5 answers

#1


11  

Even if your username is unique, there are a few advantages to having an extra id column instead of using the varchar as your primary key.

  • Some people prefer to use an integer column as the primary key, to serve as a surrogate key that never needs to change, even if other columns are subject to change. Although there's nothing preventing a natural primary key from being changeable too, you'd have to use cascading foreign key constraints to ensure that the foreign keys in related tables are updated in sync with any such change.

  • The primary key being a 32-bit integer instead of a varchar can save space. The saving is repeated in every other table that references your user table, because each foreign key column there is an int rather than a varchar, and that can be a good reason on its own.

  • Inserting into the primary key index is a little more efficient if you add new rows to the end of the index, compared to wedging them into the middle of the index. Indexes in MySQL tables are usually B+Tree data structures, and you can study these to understand how they perform.

  • Some application frameworks prefer the convention that every table in your database has a primary key column called id, instead of using natural keys or compound keys. Following such conventions can make certain programming tasks simpler.
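
The first point above can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (users_natural, posts_natural, users_surrogate, posts_surrogate) and the sample data are made up for the example. It shows a username change having to cascade through a foreign key when the username is the primary key, versus touching a single row when an integer id is the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on

conn.executescript("""
-- Natural key: the username itself is the primary key.
CREATE TABLE users_natural (
    username TEXT PRIMARY KEY
);
CREATE TABLE posts_natural (
    post_id INTEGER PRIMARY KEY,
    author  TEXT NOT NULL
            REFERENCES users_natural(username) ON UPDATE CASCADE
);

-- Surrogate key: an integer id is the primary key, username is merely unique.
CREATE TABLE users_surrogate (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE posts_surrogate (
    post_id   INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES users_surrogate(id)
);

INSERT INTO users_natural VALUES ('alice');
INSERT INTO posts_natural (author) VALUES ('alice'), ('alice');

INSERT INTO users_surrogate (id, username) VALUES (1, 'alice');
INSERT INTO posts_surrogate (author_id) VALUES (1), (1);
""")

# Renaming with a natural key: the cascade rewrites every referencing row.
conn.execute("UPDATE users_natural SET username = 'alice2' WHERE username = 'alice'")
natural_authors = [r[0] for r in conn.execute(
    "SELECT author FROM posts_natural ORDER BY post_id")]

# Renaming with a surrogate key: only the users row changes.
conn.execute("UPDATE users_surrogate SET username = 'alice2' WHERE id = 1")
surrogate_authors = [r[0] for r in conn.execute(
    "SELECT author_id FROM posts_surrogate ORDER BY post_id")]

print(natural_authors)
print(surrogate_authors)
```

Without the ON UPDATE CASCADE clause, the first UPDATE would be rejected as long as posts still reference the old username, which is the trade-off the bullet describes.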

None of these issues are deal-breakers. And there are also advantages to using natural keys:

  • If you look up rows by username more often than you search by id, it can be better to choose the username as the primary key, and take advantage of the index-organized storage of InnoDB. Make your primary lookup column be the primary key, if possible, because primary key lookups are more efficient in InnoDB (you should be using InnoDB in MySQL).

  • As you noticed, if you already have a unique constraint on username, it seems a waste of storage to keep an extra id column you don't need.

  • Using a natural key means that foreign keys contain a human-readable value, instead of an arbitrary integer id. This allows queries to use the foreign key value without having to join back to the parent table for the "real" value.
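
A small sketch of the natural-key points above, again assuming Python's sqlite3 module as a stand-in for MySQL (storage details such as InnoDB's index-organized tables are not modeled here). The users and comments_natural tables and their rows are hypothetical. It shows a lookup directly on the primary key and a report built from the foreign key column alone, with no join back to the parent table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    username TEXT PRIMARY KEY,
    email    TEXT NOT NULL
);
CREATE TABLE comments_natural (
    comment_id INTEGER PRIMARY KEY,
    username   TEXT NOT NULL REFERENCES users(username),
    body       TEXT NOT NULL
);
INSERT INTO users VALUES
    ('alice', 'alice@example.com'),
    ('bob',   'bob@example.com');
INSERT INTO comments_natural (username, body) VALUES
    ('alice', 'first'), ('bob', 'second'), ('alice', 'third');
""")

# Primary-key lookup by the column the application searches on most often.
email = conn.execute(
    "SELECT email FROM users WHERE username = ?", ("alice",)
).fetchone()[0]

# The foreign key already holds the readable value, so no JOIN to users is needed.
rows = conn.execute(
    "SELECT username, COUNT(*) FROM comments_natural "
    "GROUP BY username ORDER BY username"
).fetchall()

print(email, rows)
```

With an integer author_id instead, the same report would need a join to users to show names rather than numbers.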

The point is that there's no rule that covers 100% of cases. I often recommend that you should keep your options open, and use natural keys, compound keys, and surrogate keys even in a single database.

I cover some issues of surrogate keys in the chapter "ID Required" in my book SQL Antipatterns: Avoiding the Pitfalls of Database Programming.

#2


7  

This identifier is known as a Surrogate Key. The page I linked lists both the advantages and disadvantages.

In practice, I have found them advantageous because even superkey data can change over time (e.g. a user's email address may change, and then every relation that references it must change too), whereas a surrogate key never needs to change for the data it identifies, because its value carries no meaning in the relation.

It's also nice from a JOIN standpoint because it can be an integer with a smaller key length than a varchar.

I can say that in practice I prefer to use them. I have been bitten too many times by multiple-column primary keys, or by a data-derived superkey used across tables, that later had to become non-unique because requirements changed during development. That is not a situation you want to deal with.

#3


3  

In MySQL we have:

 1: Index fields, 2: Unique fields, and 3: PK fields.
index means the field can be looked up directly
unique means each value may appear in only one row of the table
PK = index + unique (and it cannot be NULL)

In a table you may have many unique fields, such as username, passport code, or email.
But you still need a field like ID that is both unique and indexed (= PK). First, it always identifies one thing and never changes; second, it is unique; and third, it is simple (because it is usually a number).
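
As a rough sketch of the three kinds of fields, assuming Python's sqlite3 module in place of MySQL: the users table, the idx_users_country index, and the try_insert helper are all hypothetical names made up for this example. It shows a unique field rejecting a duplicate, and the unique fields changing while the id stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,   -- PK: unique and indexed
    username TEXT NOT NULL UNIQUE,  -- unique field
    email    TEXT NOT NULL UNIQUE,  -- another unique field
    country  TEXT
);
CREATE INDEX idx_users_country ON users(country);  -- plain index: duplicates allowed
INSERT INTO users (username, email, country) VALUES
    ('alice', 'alice@example.com', 'NL'),
    ('bob',   'bob@example.com',   'NL');
""")

def try_insert(username, email):
    """Return True if the row was accepted, False if a unique constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, email) VALUES (?, ?)",
                (username, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_username_ok = try_insert("alice", "other@example.com")
new_user_ok = try_insert("carol", "carol@example.com")

# The unique fields can change while the id stays fixed.
with conn:
    conn.execute(
        "UPDATE users SET username = 'alice_renamed', email = 'a@example.org' "
        "WHERE id = 1"
    )
row = conn.execute("SELECT id, username FROM users WHERE id = 1").fetchone()

print(duplicate_username_ok, new_user_ok, row)
```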

#4


2  

One reason to have a numeric id is that an index on it is leaner than an index on a text field, which reduces the index size and the processing time required to look up a specific user. It also takes fewer bytes to store a reference to a user from a different table (this being a relational database).
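
To put rough numbers on the byte savings, here is a back-of-the-envelope sketch in Python. It assumes a 4-byte MySQL INT and a VARCHAR column short enough to need only a 1-byte length prefix; it ignores row and index overhead, and the usernames, the row count, and the function names are invented for illustration.

```python
INT_BYTES = 4  # a MySQL INT occupies 4 bytes

def varchar_bytes(value: str, encoding: str = "utf-8") -> int:
    """Approximate stored size of a short VARCHAR value: its bytes plus a 1-byte length prefix."""
    data = value.encode(encoding)
    if len(data) > 255:
        raise ValueError("this sketch only models values up to 255 bytes")
    return len(data) + 1

def key_column_bytes(usernames, referencing_rows_per_user):
    """Estimate bytes spent on the key column across all referencing rows."""
    total_rows = len(usernames) * referencing_rows_per_user
    as_int = total_rows * INT_BYTES
    as_varchar = sum(varchar_bytes(u) for u in usernames) * referencing_rows_per_user
    return as_int, as_varchar

usernames = ["alice", "bob", "christopher_1987"]
as_int, as_varchar = key_column_bytes(usernames, referencing_rows_per_user=1000)
print(as_int, as_varchar)
```

In InnoDB the difference tends to be larger than this estimate suggests, because every secondary index also stores the primary key value of each row.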

#5


2  

In my opinion, every table should have a unique, auto-incremented id.

Here are some practical reasons. If you have duplicate rows, you can readily determine which row to delete. If you want to know the order in which rows were inserted, you have that information in the id. As for users, there's more than one "John Smith" in the world. An id provides a key for foreign references.
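
The duplicate-row case might look something like this. It is a sketch using Python's sqlite3 module rather than MySQL, with a made-up signups table and sample rows. MySQL does not accept a subquery on the table being deleted from in this exact form, so the DELETE would need restructuring there (for example through a derived table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE signups (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,
    name  TEXT NOT NULL,
    email TEXT NOT NULL
);
INSERT INTO signups (name, email) VALUES
    ('John Smith', 'john@example.com'),
    ('Jane Doe',   'jane@example.com'),
    ('John Smith', 'john@example.com'),        -- accidental duplicate
    ('John Smith', 'other.john@example.com');  -- a different John Smith
""")

# Keep the earliest row of each (name, email) group and delete the later copies.
with conn:
    conn.execute("""
        DELETE FROM signups
        WHERE id NOT IN (
            SELECT MIN(id) FROM signups GROUP BY name, email
        )
    """)

# Ordering by id lists the surviving rows in the order their ids were assigned.
remaining = conn.execute(
    "SELECT id, name, email FROM signups ORDER BY id"
).fetchall()
print(remaining)
```

Without the id column, the two identical John Smith rows could not be told apart, so there would be no simple way to delete one and keep the other.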

Finally, just about anything that might describe a user -- a name, an address, a telephone number, an email address -- could change over time.
