Why are such specific data types needed when creating database tables?

Time: 2023-02-07 16:30:06

Take the following create table statement:

create table fruit
(
  count int,
  name varchar(32),
  size float
);

Instead of those specific data types, why not just have "string", "number", and "boolean" or, better yet, not have to specify any data types at all?

What are the technical reasons for having such specific data types (as opposed to generic types or no types at all)?

10 solutions

#1


4  

Imagine 20 million rows in a table, with an int column where all the numbers are 1 through 10.

If you used a tinyint for that column, each value would take 1 byte. If you used a regular int, each value would take 4 bytes. That's four times the disk space: 3 extra bytes per row, or about 60 MB more across 20 million rows.
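The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the byte sizes are the typical ones quoted in this thread, the helper name `column_bytes` is made up for illustration, and real engines add page and row overhead on top.

```python
# Typical bytes per value for SQL integer types (exact sizes vary by engine).
BYTES_PER_VALUE = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8}


def column_bytes(type_name: str, row_count: int) -> int:
    """Raw storage for one column, ignoring page and row overhead."""
    return BYTES_PER_VALUE[type_name] * row_count


ROWS = 20_000_000
tiny = column_bytes("tinyint", ROWS)
regular = column_bytes("int", ROWS)
extra_mb = (regular - tiny) / 1_000_000  # decimal megabytes

print(f"tinyint: {tiny:,} bytes, int: {regular:,} bytes, extra: {extra_mb:.0f} MB")
```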

Theoretically, you could design a database engine to "smart config" a table, but imagine our theoretical table when the database suddenly decides it needs to allocate more bytes for the data in that column. The whole table would need to be re-paged, and performance would slow to a crawl, potentially for hours, while the engine restructured the table. There are so many edge cases and ways to get it wrong that it would be more work to stay on top of automatic configuration than to just design your application properly in the first place.
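To get a feel for why widening a column is expensive, here is a toy sketch using Python's `array` module as a stand-in for a fixed-width column. It is not how any particular engine works; `append_value` is a hypothetical helper. The point is only that once a value no longer fits, every existing value has to be copied into the wider layout.

```python
from array import array

# A fixed-width "column": every value occupies exactly one signed byte.
column = array("b", [1, 5, 10, 3])


def append_value(col: array, value: int) -> array:
    """Append value, rebuilding the column with wider slots if it does not fit."""
    try:
        col.append(value)
        return col
    except OverflowError:
        # The value is too big for one byte, so every existing value
        # is copied into a new column with 16-bit slots ('h').
        wider = array("h", list(col))
        wider.append(value)
        return wider


column = append_value(column, 7)    # fits in one byte, no rebuild
before = column.itemsize
column = append_value(column, 300)  # does not fit, whole column is rewritten
after = column.itemsize

print(before, after, list(column))
```

With four values the copy is instant; with millions of rows spread over disk pages, the same rewrite is what would stall the table.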

#2


3  

It sets a strategy for sorting and indexing, and it enforces data integrity.

Imagine this.

MyNumberField as generic: "1234", 13, 35, "1234afgas"

Why are some of those strings, and why are there letters in "1234afgas"?

With type constraints, those values would not be allowed.
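A small sketch of that rejection, using Python's built-in `sqlite3` module. SQLite is lenient about declared column types, so the rule is spelled out here with a `CHECK` on `typeof()`; a strictly typed engine would reject the bad value from the column type alone. The table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK makes the "integers only" rule explicit for SQLite.
conn.execute(
    "CREATE TABLE readings ("
    " my_number_field INTEGER CHECK (typeof(my_number_field) = 'integer'))"
)

accepted, rejected = [], []
for value in [13, 35, "1234afgas"]:
    try:
        conn.execute("INSERT INTO readings VALUES (?)", (value,))
        accepted.append(value)
    except sqlite3.IntegrityError:
        # The constraint refused the value, so it never reaches the table.
        rejected.append(value)

print("accepted:", accepted)
print("rejected:", rejected)
```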

#3


3  

Because the types differ in size and storage:

tinyint = 1 byte

smallint = 2 bytes

int = 4 bytes

bigint = 8 bytes

So if you know you only need to store values up to a certain range, there is no need to use bigint and incur the overhead of storing extra bytes per row.

The same holds for strings (char, varchar, etc.).

There are also built-in constraints: you can't store the letter A in an int, so the data will be clean.
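As a rough illustration of picking the narrowest type for a known range, the sketch below maps the four integer types to `struct` format codes of the same width. It assumes signed storage; some engines define tinyint as unsigned (0 to 255), so check your own database. The function names are hypothetical.

```python
import struct

# struct format codes standing in for signed SQL integer types.
# Assumption: signed two's-complement storage of 1, 2, 4 and 8 bytes.
FORMATS = {"tinyint": "<b", "smallint": "<h", "int": "<i", "bigint": "<q"}


def signed_range(type_name):
    """Smallest and largest value that fits in the given type."""
    bits = 8 * struct.calcsize(FORMATS[type_name])
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_type(low, high):
    """Narrowest type whose range covers low..high."""
    for name in FORMATS:  # dicts keep insertion order: narrowest first
        type_low, type_high = signed_range(name)
        if type_low <= low and high <= type_high:
            return name
    raise ValueError("range does not fit in bigint")


for name in FORMATS:
    print(name, struct.calcsize(FORMATS[name]), "bytes", signed_range(name))

print(smallest_type(1, 10))
print(smallest_type(0, 100_000))
```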

#4


2  

You are not only telling the database system how you are going to use the data (string, boolean, number); you are also telling it which internal representation to use. This matters for space, indexing, and performance.
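Here is a small sketch of what "internal representation" changes in practice, comparing the same numbers stored as text and as fixed-width 32-bit integers. The sample values are arbitrary, and this is only an illustration of the idea, not of any engine's actual on-disk format.

```python
import struct

values = [9, 10, 2, 1234567]

# Representation 1: generic text. Size depends on the number of digits.
as_text = [str(v).encode("ascii") for v in values]
# Representation 2: fixed-width 32-bit integers. Always 4 bytes.
as_int32 = [struct.pack("<i", v) for v in values]

text_sizes = [len(b) for b in as_text]
int_sizes = [len(b) for b in as_int32]

# Text compares character by character, so "10" sorts before "2".
sorted_as_text = sorted(str(v) for v in values)
sorted_as_numbers = sorted(values)

print(text_sizes, int_sizes)
print(sorted_as_text)
print(sorted_as_numbers)
```

An index built on the text form would return rows in the first order, which is rarely what anyone wants from a numeric column.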

#5


2  

To add to what everyone else has posted, there is also a huge issue with data integrity. Imagine you stored the value "1" in the database: should it be treated as TRUE, as the numeric value 1, or as the string "1"?

If two columns each have a value of "1", does col1 + col2 equal the number 2 or the string "11"?
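The same ambiguity is easy to reproduce in Python, where `+` does different things depending on the operand types. `add_columns` is just an illustrative name; how a database resolves the mixed case depends on the engine, and Python's choice (refusing) is only one option.

```python
def add_columns(col1, col2):
    """'+' means different things depending on the operands' types."""
    return col1 + col2


numeric = add_columns(1, 1)     # arithmetic addition
text = add_columns("1", "1")    # string concatenation
truthy = bool(1) and bool("1")  # both values also count as true

# Mixing the two is refused outright in Python.
try:
    add_columns(1, "1")
    mixed = "allowed"
except TypeError:
    mixed = "rejected"

print(numeric, repr(text), truthy, mixed)
```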

#6


1  

Aside from what's already been said, there are databases that do not require data types, such as SQLite (http://www.sqlite.org/).
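For the curious, a quick sketch with Python's built-in `sqlite3` module shows this. The columns are declared with no type at all, and SQLite records a storage class per value instead of per column. The table mirrors the one in the question, except that the first column is renamed `quantity` here to sidestep any clash with the `count()` function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (quantity, name, size)")  # no types declared
conn.execute("INSERT INTO fruit VALUES (3, 'apple', 7.5)")
conn.execute("INSERT INTO fruit VALUES ('three', 42, 'large')")  # also accepted

# typeof() reports the storage class SQLite chose for each stored value.
rows = conn.execute(
    "SELECT typeof(quantity), typeof(name), typeof(size) FROM fruit ORDER BY rowid"
).fetchall()

for row in rows:
    print(row)
```

The second insert is exactly the kind of row a typed schema would be expected to refuse.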

#7


0  

There are databases out there that are not typed; the one that comes to mind is IBM's Universe DB (aka Pick). With that DB, all fields are of the string type, and you define how they are used via a "dictionary".

Having used both strongly typed DBs and Universe extensively, I'm partial to the strongly typed ones from a programming standpoint.
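To make the "everything is a string plus a dictionary" idea concrete, here is a toy model in Python. It is not Universe's actual API or file format, just a hypothetical sketch of the general approach; `record`, `dictionary`, and `read_field` are invented names.

```python
# Every field is stored as a string, as described above.
record = {"count": "12", "name": "apple", "size": "7.5"}

# Hypothetical "dictionary": maps a field name to the conversion that
# gives the stored string its meaning when it is read.
dictionary = {"count": int, "name": str, "size": float}


def read_field(rec, field):
    """Interpret a stored string according to the dictionary entry."""
    return dictionary[field](rec[field])


total = read_field(record, "count") + 1

# Nothing stops bad data from being stored; the error appears only on read.
record["count"] = "12abc"
try:
    read_field(record, "count")
    failure = None
except ValueError as exc:
    failure = type(exc).__name__

print(total, failure)
```

The write succeeds and the failure surfaces later, at read time, which is one reason a programmer might prefer the strongly typed approach.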

#8


0  

The same basic question could be asked of any type anywhere. Why have types in classes? A type is a limitation on, and an expectation of, the data. You expect to get type x so you can deal with type x. You don't want to handle infinite possibilities and do lots of type checking every time you touch a piece of data.

Types, whether primitive or user-defined, are there to define the structure of what is being held. A type says that N is an X and that you can do with N all the things an X can do.

You are saying, for instance, that you are dealing with an integer that can be in a certain range of numbers, -X to X, versus a big integer that can be in a larger range, -Z to Z (as a specific example). Usage expectations will fall within those ranges.

You are also, as others have mentioned, defining how to store the information at a lower level. Even saying you have an integer is somewhat of an abstraction away from the machine.

#9


0  

Aside from storage, a data type is also a kind of constraint. If you know, for instance, that an account number will always hold exactly 8 characters, defining that in the type (nchar(8), for example) is the most logical and performant thing you can do.

That way you set the domain immediately in the field's type (or at least part of it; it can be further refined by other constraints).
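A sketch of the account-number example, again with Python's `sqlite3`. SQLite does not enforce declared lengths such as nchar(8), so the 8-character rule is written as a `CHECK` constraint here, which also shows the "further refined by other constraints" part. The table name and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT NOT NULL"
    " CHECK (typeof(account_number) = 'text' AND length(account_number) = 8))"
)


def try_insert(value):
    """Return True if the database accepts the account number."""
    try:
        conn.execute("INSERT INTO account VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


results = {v: try_insert(v) for v in ["AB123456", "AB1234", "AB123456789"]}
print(results)
```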

#10


0  

One of the primary functions of a database is to perform operations on huge amounts of data efficiently. Being very specific about data types increases the number of things the database engine can assume about the data it's storing. Therefore, it has to perform fewer calculations and runs faster, and it can avoid allocating storage it won't need, which makes the database smaller and therefore faster still.
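One concrete example of such an assumption: if every row has the same known width, the position of row n can be computed instead of searched for. The sketch below uses Python's `struct` module with an assumed layout of a 32-bit integer followed by a 32-bit float; real engines use more elaborate page formats, so treat this as an illustration only.

```python
import struct

# Assumed row layout: count as int32, size as float32, so 8 bytes per row.
ROW = struct.Struct("<if")
rows = [(3, 1.5), (10, 2.25), (7, 0.5)]

# Store all rows back to back in one buffer, like a simplified data page.
page = b"".join(ROW.pack(count, size) for count, size in rows)


def read_row(n):
    """Jump straight to row n: its offset is n * ROW.size, no scanning."""
    return ROW.unpack_from(page, n * ROW.size)


print(ROW.size, len(page))
print(read_row(1))
```

With untyped, variable-length values the engine would have to store extra length information or walk the data to find the same row.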

One of the other primary functions of a database is to ensure data integrity. The more exactly you specify what sort of data should be stored in a field, the less likely you are to accidentally store the wrong data there. It's analogous to why your C compiler is so picky about the code you write: you would much rather deal with compile-time errors than run-time errors.
