在关系数据库中存储矩阵。

时间:2022-10-03 16:17:50

I am working on a project for a client and going through the initial database design. The project will be a simple web app for tracking processes and their outcomes within a matrix diagram, I am looking for a good way to store these in relational tables.

我正在为一个客户做一个项目,并进行最初的数据库设计。这个项目将是一个简单的web应用程序,用于在矩阵图中跟踪过程及其结果。

Right now I am thinking I have a general table for Routines which the x and y coords will map too and maybe off from that a lookup table containing the ID of coordinates in which a "hit" is recorded. Anyone have any better ways of doing this?

现在我在想,我有一个通用的例程表,x和y的coords也会映射到它,也许从那个查找表开始,它包含了记录“命中”的坐标ID。有人有更好的方法吗?

Thanks!

谢谢!

EDIT:

编辑:

This is just the beginning of the project so I have limited detail as of yet, but my main reasoning behind multiple tables is because the matrices will be completely dynamic in size and generic so that each one may be different and they will be tied to a user

这仅仅是项目的开始,所以我还没有详细的细节,但是我在多个表背后的主要原因是,这些矩阵将会是完全动态的,大小和通用的,这样每一个都可能不同,它们将被绑定到一个用户。

I also forgot to mention that order of the x/y values are important, which further supported my reasoning behind having multiple tables for x y and values, from this I strongly assume that needing to know each individual cell is important

我还忘了提到x/y值的顺序很重要,这进一步支持了我为x y和值设置多个表的推理,因此我强烈假设需要知道每个单元格是重要的

EXAMPLE:

例子:

The basic example (albeit abstract) of this lies in the process regarding a restaurant. The actions being stuff along the lines of sit down, order food, look over menu, order drinks, eat, pay, etc. the outcomes being order taken, drinks delivered, food delivered, change given. While seemingly simple it becomes complex when taken into consideration things happen differently with each occurrence, also in the case of take out or buffets. the order of the actions and outcomes becomes integral in seeing the differences between the situations

最基本的例子(虽然抽象)是关于餐厅的过程。这些行为包括坐下来,点餐,看菜单,点饮料,吃,付,等等这些结果是订单,饮料,送餐,改变。虽然看起来很简单,但考虑到每一件事发生的不同,事情就会变得复杂起来。行动和结果的顺序在看到情况之间的差异时变得不可分割

4 个解决方案

#1


19  

There are lots of way to do this, we would need a lot more information to be more specific about what would be best for you. However, here are the two SOP ways:

有很多方法可以做到这一点,我们需要更多的信息来更具体地告诉你什么是最适合你的。然而,这里有两个SOP方法:

Either a separate table for each matrix:

每个矩阵的单独表格:

CREATE TABLE YourMatrixName(
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar](50) NULL,
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_YourMatrixName ON dbo.YourMatrixName
    (ColNo, RowNo); 
GO

Or, all of the matrices in one table:

或者,所有的矩阵在一个表中:

CREATE TABLE Matrices(
    MatrixName varchar(24) NOT NULL,
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([MatrixName] ASC, [RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_Matrices ON dbo.Matrices
    (ColNo, RowNo); 
GO

These are standard normal form, virtually all other ways of doing it are not well normalized. Some advantages of these approaches:

这些是标准的正规形式,实际上所有其他的方法都没有很好地标准化。这些方法的一些优点:

  1. You do not have to fill in every cell, only the ones you are using. Or have a default value (0 or "") and skip those.
  2. 您不必填充每个单元格,只需填充正在使用的单元格。或者有一个默认值(0或“”)并跳过它们。
  3. This is easily the most flexible approach, even in the "all in one" model, there is no need to restrict them to the same size in any way, and it is very easy to resize them.
  4. 这是最灵活的方法,即使在“all in one”模型中,也不需要以任何方式将它们限制在相同的大小,而且调整大小非常容易。
  5. You can easily query the contents of the matrix, something that is increasingly difficult in more compact storage methods.
  6. 您可以很容易地查询矩阵的内容,这在更紧凑的存储方法中越来越困难。
  7. "Hit"s or any other aspect of the matrix cells are easy to implement as additional fields in the rows. Make them Null-able if you're worried about the additional space, and index them if you want to query/report on these attributes separately. Its also just as easy to retrofit features like this with this model also.
  8. “Hit”或矩阵单元格的任何其他方面都很容易实现为行中的附加字段。如果您担心额外的空间,让它们为空,并且如果您想要分别查询/报告这些属性,则可以对它们进行索引。它同样也很容易对这种模型进行改造。

The primary disadvantage is that there is typically a high space to data overhead. Many assume that there is also high overhead to Insert or retrieve new matrices but in fact there are several documented techniques that can make it quite fast.

主要的缺点是数据开销通常有很大的空间。许多人认为插入或检索新矩阵也有很高的开销,但实际上有一些文档化的技术可以使它非常快。

#2


4  

Is your matrix dense of sparse? If it's sparse, it may be better for each entry to just store a list of hit's, rather than have a full 2D table that is mostly 0's.

你的矩阵密度稀疏吗?如果是稀疏的,那么对于每个条目,最好只存储hit's的列表,而不是一个完整的2D表,其中大部分都是0。

#3


3  

Rather than two tables, I would just use one table: (x, y, outcome). Beyond that it's hard to give any more advice with the limited information given.

而不是两个表,我只使用一个表:(x, y,结果)。除此之外,在有限的信息中很难给出更多的建议。

#4


2  

Video memory, a very simple 2D matrix is stored as follows:

视频记忆,一个非常简单的二维矩阵存储如下:

ABCD
EFGH
IJKL

in ram sequentially like an array as

在ram中按顺序排列,就像数组一样。

A,B,C,D,E,F,G,H,I,J,K,L

A,B,C,D,E,F,G,H,I,J,K,L

element x,y can be found at array offset

元素x,y可以在阵列偏移处找到

[y*width+x]

for instance, x=2,y=2 (zero-based) refers to element K.

例如,x=2,y=2(从零开始)指的是元素K。

[y*width+x]=[2*4+2]=10. array element 10 (again zero-based) = K, so you're good.

[y *宽度+ x]=[2 * 4 + 2]= 10。数组元素10(也是基于零的)= K,所以你做得很好。

Storing in a comma-delimited list will let you put a matrix of any size in an nvarchar field. This assumes that you don't need to query individual cells in SQL, but just grab the matrix as a whole and process it client-side.

在逗号分隔的列表中存储,可以在nvarchar字段中放置任意大小的矩阵。这假定您不需要在SQL中查询单个单元格,而只需获取整个矩阵并在客户端处理它。

Your table may look like this:

你的桌子可能是这样的:

tbl_matrices
----
id
user_id
matrix nvarchar(max)

Also, this works very well if you're matrices are sparse, otherwise you'll wind up with a lot of empty ,,,,, elements. You can work around that as well, though.

同样,如果你的矩阵是稀疏的,这也能很好地工作,否则你会得到很多空的,,,,元素。不过,你也可以在这方面工作。

#1


19  

There are lots of way to do this, we would need a lot more information to be more specific about what would be best for you. However, here are the two SOP ways:

有很多方法可以做到这一点,我们需要更多的信息来更具体地告诉你什么是最适合你的。然而,这里有两个SOP方法:

Either a separate table for each matrix:

每个矩阵的单独表格:

CREATE TABLE YourMatrixName(
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar](50) NULL,
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_YourMatrixName ON dbo.YourMatrixName
    (ColNo, RowNo); 
GO

Or, all of the matrices in one table:

或者,所有的矩阵在一个表中:

CREATE TABLE Matrices(
    MatrixName varchar(24) NOT NULL,
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([MatrixName] ASC, [RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_Matrices ON dbo.Matrices
    (ColNo, RowNo); 
GO

These are standard normal form, virtually all other ways of doing it are not well normalized. Some advantages of these approaches:

这些是标准的正规形式,实际上所有其他的方法都没有很好地标准化。这些方法的一些优点:

  1. You do not have to fill in every cell, only the ones you are using. Or have a default value (0 or "") and skip those.
  2. 您不必填充每个单元格,只需填充正在使用的单元格。或者有一个默认值(0或“”)并跳过它们。
  3. This is easily the most flexible approach, even in the "all in one" model, there is no need to restrict them to the same size in any way, and it is very easy to resize them.
  4. 这是最灵活的方法,即使在“all in one”模型中,也不需要以任何方式将它们限制在相同的大小,而且调整大小非常容易。
  5. You can easily query the contents of the matrix, something that is increasingly difficult in more compact storage methods.
  6. 您可以很容易地查询矩阵的内容,这在更紧凑的存储方法中越来越困难。
  7. "Hit"s or any other aspect of the matrix cells are easy to implement as additional fields in the rows. Make them Null-able if you're worried about the additional space, and index them if you want to query/report on these attributes separately. Its also just as easy to retrofit features like this with this model also.
  8. “Hit”或矩阵单元格的任何其他方面都很容易实现为行中的附加字段。如果您担心额外的空间,让它们为空,并且如果您想要分别查询/报告这些属性,则可以对它们进行索引。它同样也很容易对这种模型进行改造。

The primary disadvantage is that there is typically a high space to data overhead. Many assume that there is also high overhead to Insert or retrieve new matrices but in fact there are several documented techniques that can make it quite fast.

主要的缺点是数据开销通常有很大的空间。许多人认为插入或检索新矩阵也有很高的开销,但实际上有一些文档化的技术可以使它非常快。

#2


4  

Is your matrix dense of sparse? If it's sparse, it may be better for each entry to just store a list of hit's, rather than have a full 2D table that is mostly 0's.

你的矩阵密度稀疏吗?如果是稀疏的,那么对于每个条目,最好只存储hit's的列表,而不是一个完整的2D表,其中大部分都是0。

#3


3  

Rather than two tables, I would just use one table: (x, y, outcome). Beyond that it's hard to give any more advice with the limited information given.

而不是两个表,我只使用一个表:(x, y,结果)。除此之外,在有限的信息中很难给出更多的建议。

#4


2  

Video memory, a very simple 2D matrix is stored as follows:

视频记忆,一个非常简单的二维矩阵存储如下:

ABCD
EFGH
IJKL

in ram sequentially like an array as

在ram中按顺序排列,就像数组一样。

A,B,C,D,E,F,G,H,I,J,K,L

A,B,C,D,E,F,G,H,I,J,K,L

element x,y can be found at array offset

元素x,y可以在阵列偏移处找到

[y*width+x]

for instance, x=2,y=2 (zero-based) refers to element K.

例如,x=2,y=2(从零开始)指的是元素K。

[y*width+x]=[2*4+2]=10. array element 10 (again zero-based) = K, so you're good.

[y *宽度+ x]=[2 * 4 + 2]= 10。数组元素10(也是基于零的)= K,所以你做得很好。

Storing in a comma-delimited list will let you put a matrix of any size in an nvarchar field. This assumes that you don't need to query individual cells in SQL, but just grab the matrix as a whole and process it client-side.

在逗号分隔的列表中存储,可以在nvarchar字段中放置任意大小的矩阵。这假定您不需要在SQL中查询单个单元格,而只需获取整个矩阵并在客户端处理它。

Your table may look like this:

你的桌子可能是这样的:

tbl_matrices
----
id
user_id
matrix nvarchar(max)

Also, this works very well if you're matrices are sparse, otherwise you'll wind up with a lot of empty ,,,,, elements. You can work around that as well, though.

同样,如果你的矩阵是稀疏的,这也能很好地工作,否则你会得到很多空的,,,,元素。不过,你也可以在这方面工作。