What is the best way to store media files in a database?

Time: 2022-10-22 09:34:24

I want to store a large number of sound files in a database, but I don't know whether that is good practice. I would like to know the pros and cons of doing it this way.

I also thought about the possibility of storing "links" to those files, but maybe that would cause more problems than it solves. Any experience in this direction is welcome :)

Note: The database will be MySQL.

8 answers

#1


83  

Every system I know of that stores large numbers of big files stores them externally to the database. You store all of the queryable data for the file (title, artist, length, etc.) in the database, along with a partial path to the file. When it's time to retrieve the file, you extract the file's path, prepend some file root (or URL) to it, and return that.

So, you'd have a "location" column with a partial path in it, like "a/b/c/1000", which you then map to "http://myserver/files/a/b/c/1000.mp3".
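As a rough illustration of that mapping, here is a minimal Python sketch. The names `MEDIA_ROOT_URL` and `media_url` and the default "mp3" extension are assumptions made for the example, not part of the answer; the root would normally come from configuration so it can be repointed.

```python
from urllib.parse import urljoin

# Hypothetical root URL; in practice this would be read from configuration.
MEDIA_ROOT_URL = "http://myserver/files/"


def media_url(location: str, extension: str = "mp3", root: str = MEDIA_ROOT_URL) -> str:
    """Map a stored partial path such as 'a/b/c/1000' to a full URL."""
    # urljoin only appends to the root if the root ends with a slash.
    root = root if root.endswith("/") else root + "/"
    return urljoin(root, f"{location.lstrip('/')}.{extension}")


print(media_url("a/b/c/1000"))
```

Only the partial path lives in the database, so switching to another server is a matter of passing (or configuring) a different `root`.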

Make sure that you have an easy way to point the media database at a different server/directory, in case you need that for data recovery. Also, you might need a routine that re-syncs the database with the contents of the file archive.
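One possible shape for such a re-sync check, sketched in Python. It only reports differences and leaves the decision about what to repair to the caller. The function name `find_mismatches` is hypothetical, and it assumes the database stores forward-slash relative paths that include the file extension.

```python
from pathlib import Path


def find_mismatches(db_locations, archive_root):
    """Compare paths recorded in the database with files present on disk.

    Returns (missing_on_disk, untracked_on_disk) as sets of relative paths.
    """
    root = Path(archive_root)
    # Collect every regular file under the archive root as a relative POSIX path.
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    in_db = set(db_locations)
    return in_db - on_disk, on_disk - in_db
```

Rows in the first set point at files that no longer exist; files in the second set have no row and could be re-imported or cleaned up.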

Also, if you're going to have thousands of media files, don't store them all in one giant directory - that's a performance bottleneck on some file systems. Instead, break them up into multiple balanced sub-trees.
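One way to get balanced sub-trees, assuming each file has a sequential integer ID, is to derive the directory levels from the digits of the ID. The sketch below is just one such scheme; `shard_path`, `levels` and `fanout` are made-up names for the example.

```python
def shard_path(file_id: int, levels: int = 3, fanout: int = 100) -> str:
    """Spread files over `levels` directory levels with at most `fanout` entries each."""
    width = len(str(fanout - 1))  # zero-pad so directory names have equal length
    parts = []
    remaining = file_id
    for _ in range(levels):
        parts.append(f"{remaining % fanout:0{width}d}")
        remaining //= fanout
    # Most significant chunk first, then the ID itself as the file name.
    return "/".join(reversed(parts)) + f"/{file_id}"


print(shard_path(1000))  # 00/10/00/1000
```

Because the lowest digits change fastest, consecutive IDs land in different leaf directories, which keeps the tree evenly filled as the collection grows.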

#2


16  

I think storing them in the database is OK, as long as you use a good implementation. You can read this older but good article for ideas on how to keep large amounts of data in the database from affecting performance.

http://www.dreamwerx.net/phpforum/?id=1

I've had literally hundreds of gigabytes loaded in MySQL databases without any issues. Design and implementation are key: do it wrong and you'll suffer.

More DB advantages (not already mentioned):

  • Works better in a load-balanced environment
  • You can build in more backend storage scalability

#3


8  

I've experimented with doing it both ways in different projects, and we finally decided that it's easier to use the file system. After all, the file system is already optimized for storing, retrieving, and indexing files.

The one tip I would offer is to store only a "root-relative" path to the file in the database, then have your program or your queries/stored procedures/middleware use an installation-specific root parameter to retrieve the file.

For example, if you store XYZ.Wav in C:\MyProgram\Data\Sounds\X\, the full path would be

C:\MyProgram\Data\Sounds\X\XYZ.Wav

But you would store the path and/or filename in the database as:

X\XYZ.Wav

Elsewhere, in the database or in your program's configuration files, store a root path such as SoundFilePath, set to

C:\MyProgram\Data\Sounds\

That way, if you move your program installation, you don't have to update the database. Of course, where you split the root from the database path is up to you.
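A small Python sketch of that lookup, using the example paths from this answer. `SOUND_FILE_PATH` stands in for the SoundFilePath setting and `resolve` is a hypothetical helper; `PureWindowsPath` is used only so the Windows-style paths behave the same on any platform.

```python
from pathlib import PureWindowsPath

# Installation-specific root; hypothetically read from a config file or settings table.
SOUND_FILE_PATH = PureWindowsPath(r"C:\MyProgram\Data\Sounds")


def resolve(stored_path: str, root: PureWindowsPath = SOUND_FILE_PATH) -> PureWindowsPath:
    """Join the configured root with the root-relative path stored in the database."""
    relative = PureWindowsPath(stored_path)
    # Reject values that would escape or replace the configured root.
    if relative.anchor or ".." in relative.parts:
        raise ValueError(f"expected a root-relative path, got {stored_path!r}")
    return root / relative


print(resolve(r"X\XYZ.Wav"))  # C:\MyProgram\Data\Sounds\X\XYZ.Wav
```

Moving the installation then means changing the one root value, while every stored row stays untouched.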

Also, if there are going to be lots of files, find some way of hashing the paths so you don't wind up with one directory containing hundreds or thousands of files (in my little example, there are subdirectories based on the first character of the filename, but you can go deeper or use random hashes). This makes search indexers happy as well.
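For the "go deeper or use hashes" variant, here is a sketch that derives the subdirectories from a hash of the filename. The choice of SHA-256, two hex characters per level, and the name `hashed_location` are all assumptions for illustration.

```python
import hashlib


def hashed_location(filename: str, depth: int = 2) -> str:
    """Build a relative path whose directories come from a hash of the filename."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Two hex characters per level gives up to 256 subdirectories at each level.
    shards = [digest[2 * i : 2 * i + 2] for i in range(depth)]
    return "/".join(shards + [filename])


print(hashed_location("XYZ.Wav"))
```

The result is deterministic, so the same filename always maps to the same directory, but it is still worth storing the computed path in the database so the scheme can change later without breaking old rows.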

#4


7  

Advantages of using a database:

  • Easy to join sound files with other data.
  • Avoids file I/O operations that bypass database security.
  • No need for separate operations to delete sound files when database records are deleted.

Disadvantages of using a database:

  • Database bloat
  • Databases can be more expensive than file systems
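To make the "separate operations" point concrete, this is roughly what deletion looks like when the files live outside the database. It is only a sketch: the standard-library `sqlite3` module stands in for MySQL so the example runs without a server, and the `sound` table with a `location` column, as well as the `delete_sound` helper, are assumptions.

```python
import sqlite3
from pathlib import Path


def delete_sound(conn: sqlite3.Connection, archive_root: str, sound_id: int) -> bool:
    """Delete the row first, then the file it pointed to. Returns False if no row exists."""
    # Assumed schema: sound(id INTEGER PRIMARY KEY, location TEXT NOT NULL)
    row = conn.execute("SELECT location FROM sound WHERE id = ?", (sound_id,)).fetchone()
    if row is None:
        return False
    with conn:  # commits the row deletion, or rolls back on error
        conn.execute("DELETE FROM sound WHERE id = ?", (sound_id,))
    # If this step fails, a stray file stays on disk, but no row points at a missing file.
    (Path(archive_root) / row[0]).unlink(missing_ok=True)
    return True
```

With the data stored in the row itself, the second step disappears entirely, which is the advantage listed above.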

#5


4  

You could store them as BLOBs (or LONGBLOBs) and then retrieve the data when you actually want to access the media files.
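For reference, the BLOB route looks roughly like this. The sketch uses Python's built-in `sqlite3` purely so it runs without a server; with MySQL you would use a MySQL driver and declare the column as LONGBLOB. The table layout and the `store_sound` / `load_sound` helpers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sound (id INTEGER PRIMARY KEY, title TEXT NOT NULL, data BLOB NOT NULL)"
)


def store_sound(title: str, data: bytes) -> int:
    """Insert the raw bytes together with their metadata; returns the new row id."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute("INSERT INTO sound (title, data) VALUES (?, ?)", (title, data))
    return cur.lastrowid


def load_sound(sound_id: int) -> bytes:
    """Fetch the raw bytes back out of the database."""
    row = conn.execute("SELECT data FROM sound WHERE id = ?", (sound_id,)).fetchone()
    if row is None:
        raise KeyError(sound_id)
    return row[0]
```

Note that each read pulls the whole file through the database connection, which is part of the trade-off discussed in the other answers.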

or

You could simply store the media files on a drive and store the metadata in the DB.

I lean toward the latter method. I don't know how this is generally done elsewhere, but I suspect that many others would do the same.

You can store links (partial paths to the data) and then retrieve this info. This makes it easy to move files between drives and still access them.

I store the relative path of each file in the DB along with other metadata about the files. The base path can then be changed on the fly if I need to relocate the actual data to another drive (either local or via UNC path).

That's how I do it. I'm sure others will have ideas too.

#6


3  

Some advantages of using blobs to store files:

  • Lower management overhead - use a single tool to back up / restore, etc.
  • No possibility for database and filesystem to be out of sync
  • Transactional capability (if needed)

Some disadvantages:

  • Fills up your database server's RAM with useless rubbish, using memory that could otherwise hold rows, indexes, etc.
  • Makes your DB backups very large, hence less manageable
  • Not as convenient as a filesystem to serve to clients (e.g. with a web server)

What about performance? Your mileage may vary. Filesystems vary enormously in their performance, and so do databases. In some cases a filesystem will win (probably with fewer, larger files). In some cases a DB might be better (maybe with a very large number of smallish files).

In any case, don't worry; do what seems best at the time.

Some databases offer a built-in web server to serve blobs. At the time of writing, MySQL does not.

#7


2  

Store them as external files, then save the path in a varchar field. Putting large binary blobs into a relational database is generally very inefficient - they only use up space and slow things down, because the caches fill up with data that is of no use there. And there's nothing to be gained - the blobs themselves cannot be searched. You might want to save the media metadata in the database, though.

#8


1  

A simple solution would be to just store the relative locations of the files as strings and let the filesystem handle the rest. I've tried it on a project (we were storing office file attachments to a survey), and it worked fine.
