MySQL: Field length. Does it really matter?

Time: 2022-12-25 22:18:57

I'm working with some database abstraction layers, and most of them use attributes like "String", which maps to VARCHAR(250), or INTEGER, which has a length of 11 digits. But suppose I have something that will always be shorter than 250 characters. Should I go and make the column shorter? Does it really make any meaningful difference?

Thanks in advance!

5 answers

#1


8  

INT length does nothing for storage. All INTs are 4 bytes. The number you can set is only a display width, used for zerofill (and who uses that!?).

VARCHAR length does more. It's the maximum length of the field. VARCHAR is saved so that only the actual data is stored, so the declared length doesn't matter for storage. These days you can have VARCHARs bigger than 255 bytes, up to 65,535 bytes (256^2-1). The difference is in the bytes used to store the field length. With a single-byte character set, VARCHAR(8), VARCHAR(100) and VARCHAR(255) use 1 byte to save the field length. VARCHAR(1000) uses 2.
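
To make the length-prefix rule concrete, here is a rough Python sketch of the arithmetic. The helper name `varchar_storage_bytes` and its parameters are made up for illustration. It only models "length prefix + actual data" (1 length byte if the column can never need more than 255 bytes, otherwise 2) and ignores row overhead and storage-engine details.

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          max_bytes_per_char: int = 1,
                          encoding: str = "latin-1") -> int:
    """Approximate bytes used to store `value` in a VARCHAR(declared_chars) column.

    Hypothetical helper: models only the length prefix plus the encoded data.
    """
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared column length")
    # The prefix size depends on the largest byte length the column could ever hold.
    max_column_bytes = declared_chars * max_bytes_per_char
    prefix = 1 if max_column_bytes <= 255 else 2
    return prefix + len(value.encode(encoding))


print(varchar_storage_bytes("Mario", 8))     # 6
print(varchar_storage_bytes("Mario", 255))   # 6
print(varchar_storage_bytes("Mario", 1000))  # 7
```

With a multi-byte character set the prefix can grow sooner: under the same assumptions, `varchar_storage_bytes("Mario", 255, max_bytes_per_char=4, encoding="utf-8")` gives 7, because the column could hold up to 1,020 bytes.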

Hope that helps =)

edit
I almost always make my VARCHARs 250 long. Actual length should be checked in the app anyway. For bigger fields I use TEXT (and those are stored differently, so can be much much longer).

edit
I don't know how current this is, but it used to help me (understand): http://help.scibit.com/Mascon/masconMySQL_Field_Types.html

#2


1  

Not so sure in MySQL, but in MS SQL it only makes a difference for sufficiently large databases. Typically, I like to use smaller fields for a) the space saving (it never hurts to practice good habits) and b) for the implied validation (if you know a certain field should never be more than 10 characters, why allow eleven, let alone 250?).

#3


1  

First, remember that the database is meant to store facts and is designed to protect itself against bad data. Thus, the reason you do not want to allow a user to enter 250 characters for a first name is that a user will put all kinds of data in there that is not a first name. They'll put their whole name, their underwear size, a novel about what they did last summer and so on. Thus, you want to strive to enforce that the data is as correct as possible. It is a mistake to assume that the application is the sole protector against bad data. You want users to tell you that they had a problem stuffing War and Peace into a given column.

Thus, the most important question is, "What is the most appropriate value for the data being stored?" Ideally, you would use an int and a check constraint to ensure that the values have an appropriate range (e.g. greater than zero, less than a billion etc.). Unfortunately, this has been one of MySQL's greatest weaknesses: before version 8.0.16 it parses check constraints but does not enforce them. On those versions you must implement the integrity checks in triggers, which admittedly is more cumbersome.

Will choosing an int (4 bytes) over a tinyint (1 byte) make an appreciable difference? Obviously, it depends on the amount of data. If you will have no more than 10 rows, the answer is obviously "No". If you will have 10 billion rows, the answer is obviously "Yes". However, IMO, this is premature optimization. It is far better to focus on ensuring correctness first.
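
As a back-of-the-envelope check of those two extremes, the raw column savings can be worked out as below. This is only a sketch: `bytes_saved` is a made-up helper, and it counts the column data alone, not indexes or per-row overhead.

```python
INT_BYTES = 4
TINYINT_BYTES = 1


def bytes_saved(rows: int) -> int:
    """Raw column bytes saved by storing a TINYINT instead of an INT."""
    return rows * (INT_BYTES - TINYINT_BYTES)


for rows in (10, 10_000_000_000):
    saved = bytes_saved(rows)
    print(f"{rows:,} rows -> {saved:,} bytes saved (~{saved / 1024**3:.2f} GiB)")
```

That is 30 bytes for 10 rows against 30,000,000,000 bytes (roughly 28 GiB) for 10 billion rows, and the second figure repeats in every index that includes the column.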

For text, you should ask whether your data needs to support Chinese, Japanese or other non-ANSI values (i.e., should you use nvarchar or varchar?). Does this value represent a real-world code, like a currency code or bank code, that has a specific specification?

#4


0  

I think Rudie is wrong, not all INTs are 4 bytes... in MySQL you have:

tinyint = 1 byte, smallint = 2 bytes, mediumint = 3 bytes, int = 4 bytes, bigint = 8 bytes.
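
The byte sizes determine the range each type can hold. A small Python sketch of that relationship follows, where the `INT_TYPE_BYTES` table and the `int_range` helper are illustrative names, not anything provided by MySQL:

```python
# Storage size in bytes for each MySQL integer type, as listed above.
INT_TYPE_BYTES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4, "bigint": 8}


def int_range(type_name: str, unsigned: bool = False) -> tuple:
    """Return (min, max) for a two's-complement integer of the given type's size."""
    bits = INT_TYPE_BYTES[type_name] * 8
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name in INT_TYPE_BYTES:
    print(name, int_range(name), int_range(name, unsigned=True))
```

So an age column fits comfortably in an unsigned tinyint (0 to 255), whatever display width is written after the type name.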

I think Rudie refers to the "display width", that is, the number you put between parentheses when you are creating a column, e.g.:

age INT(3)

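You're only telling the RDBMS the width at which to display the value (with ZEROFILL it pads shorter values to 3 digits). It does not limit the range of values the column can store.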
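
As a rough illustration of display width: as far as the MySQL documentation describes it, the width is a padding width used with ZEROFILL rather than a cap, so wider values are still shown in full. The Python helper below, `display_zerofill`, is a made-up name that only mimics that display behavior.

```python
def display_zerofill(value: int, display_width: int) -> str:
    """Mimic how an INT(display_width) ZEROFILL value is shown: pad, never truncate."""
    if value < 0:
        # ZEROFILL columns are unsigned, so negative values do not apply here.
        raise ValueError("ZEROFILL implies an unsigned column")
    return str(value).zfill(display_width)


print(display_zerofill(7, 3))      # 007
print(display_zerofill(12345, 3))  # 12345
```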

And VARCHARs are variable-length character strings, so if you declare, let's say, name VARCHAR(5000) and you store a name like "Mario", you are only using 7 bytes (5 for the data and 2 for the length of the value).

#5


0  

The correct field size serves to limit the bad data that can be put in. For instance suppose you have a phone number field. If you allow 250 characters, you will often end up with things like the following in the phone field (an example not taken at random):

Call the good-looking blonde secretary instead.

So, first, limiting the length is part of how we enforce data integrity rules. As such, it is critical.
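
A minimal simulation of what a length limit buys you, written in Python rather than SQL. It assumes MySQL's documented behavior that strict SQL mode rejects an over-long value with an error, while non-strict mode truncates it and raises a warning. The `store_varchar` function is hypothetical and stands in for the column.

```python
def store_varchar(value: str, max_chars: int, strict: bool = True) -> str:
    """Simulate storing `value` in a VARCHAR(max_chars) column (illustrative only)."""
    if len(value) <= max_chars:
        return value
    if strict:
        # Comparable to strict SQL mode rejecting the row with a "data too long" error.
        raise ValueError(f"data too long: {len(value)} > {max_chars} characters")
    # Comparable to non-strict mode: the value is cut to fit (MySQL also warns).
    return value[:max_chars]


note = "Call the good-looking blonde secretary instead."
try:
    store_varchar(note, 20)
except ValueError as err:
    print("rejected:", err)
print(store_varchar("555-0100", 20))
```

With a 20-character phone column the note is rejected outright and the user has to tell you about it. With 250 characters it would be stored silently.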

Second, there is only so much space on a data page, and while some databases will allow you to create tables where the potential record is longer than the width of the data page, they often will not allow you to actually exceed it when storing the data. This can lead to some very hard-to-find bugs when suddenly one record can't be saved. I don't know whether MySQL does this, but I know SQL Server does, and it is very hard to figure out what is wrong. So making data the correct size can be critical to preventing bugs.
