Is it a good idea to store image files in a Mongo database?

Time: 2022-10-17 23:54:56

When working with MySQL, it is a bad idea to store images as BLOBs in the database, as it makes the database quite large, which is harmful for normal usage of the database. It is better to save image files on disk and store links to them in the database.

However, I think this is different for MongoDB, as increasing the database file size has a negligible influence on performance (this is the reason that MongoDB can successfully handle billions of records).

Do you think it is better to save image files in MongoDB (using GridFS) to reduce the number of files stored on the server, or is it still better to keep the database as small as possible?

4 answers

#1


11  

The problem isn't so much that the database gets big; databases can handle that (although MongoDB isn't as good as many others in that respect). The problem is that to send the data to the client, it first has to be moved into RAM by the database, then copied over to the application's memory, then handed off to the kernel to be sent through the socket. That wastes a lot of RAM and CPU cycles. The reason it's better to keep large files in the filesystem is that it's easier to avoid the copying: you can ask the kernel to stream the file from disk to the socket directly.

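As a rough illustration of "ask the kernel to stream the file", here is a minimal Python sketch. The helper name `send_file_zero_copy` and the local socket pair are only for demonstration and are not from the answer; a real server would pass the client connection instead.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the file at `path` over `sock`; return the number of bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform supports it,
        # so the bytes are not copied through this process's memory.
        # On other platforms it falls back to plain send() calls.
        return sock.sendfile(f)


if __name__ == "__main__":
    # Demo over a local socket pair (hypothetical stand-in for a client socket).
    payload = b"\x89PNG fake image bytes" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    server, client = socket.socketpair()
    try:
        sent = send_file_zero_copy(server, tmp.name)
        server.close()  # signals end-of-stream to the reader
        received = b"".join(iter(lambda: client.recv(4096), b""))
        print(sent, received == payload)
    finally:
        client.close()
        os.unlink(tmp.name)
```

With a database in the path there is no file descriptor to hand to the kernel, which is why the extra copies described above are hard to avoid.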

The downside of storing large files in the filesystem is that they are much harder to distribute. Using a database, and something like Mongo's GridFS, makes it possible to scale out. You just have to make sure you don't copy the whole file into the application's memory at once, but one chunk at a time. Most web app frameworks have some support for sending chunked HTTP responses nowadays.

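The "one chunk at a time" idea could look roughly like this. It is a sketch that works on any file-like object with a `read(size)` method; an in-memory buffer stands in for the stored file, and the names `iter_chunks` and `CHUNK_SIZE` are made up for the example. PyMongo's GridFS file objects also expose `read(size)`, so they should fit the same pattern, but that part is not exercised here.

```python
import io
from typing import BinaryIO, Iterator

# Assumption: 255 KiB, chosen to match the GridFS default chunk size.
CHUNK_SIZE = 255 * 1024


def iter_chunks(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield `source` piece by piece so only one chunk is held in memory."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means end of file
            break
        yield chunk


if __name__ == "__main__":
    fake_image = io.BytesIO(b"x" * (600 * 1024))  # stand-in for a stored file
    for chunk in iter_chunks(fake_image):
        # A web framework would write each chunk to the response here.
        print(len(chunk))
```

Most frameworks accept a generator like this as a response body and send it as a chunked HTTP response.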

#2


5  

The answer is yes. Back in the old cave-man days, servers had mutable file systems. This was great till we tried to scale things.

Cave-people nowadays build apps with immutable deployments. Heroku and Dokku are examples of this. Because the web app servers have no state, they can be created, upgraded, scaled, and destroyed easily.

Since we still have files, we need to put them somewhere. There are several solutions: NFS, our database, someone else's database.

  • NFS is a 'network file system' which lets you do file I/O on network resources. If you're dealing with the network anyway, IMHO it doesn't add much value unless it's what you know already.

  • Our database - For MongoDB there are two options: (file > 16 MB) ? GridFS : BinData

  • Someone else's database - Some are basic like Amazon S3 and some offer extra services like Cloudinary or Dropbox.

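The `(file > 16 MB) ? GridFS : BinData` rule from the MongoDB option can be spelled out as a small Python helper. This is only a sketch: the function name and the `overhead` allowance for the document's other fields are hypothetical, while the 16 MiB figure is MongoDB's maximum BSON document size.

```python
# MongoDB's maximum BSON document size (16 MiB).
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def choose_storage(file_size: int, overhead: int = 1024) -> str:
    """Return "BinData" if the file fits inside one document, else "GridFS".

    `overhead` is a hypothetical allowance for the other fields stored
    in the same document (filename, content type, and so on).
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size + overhead <= MAX_BSON_DOCUMENT_BYTES:
        return "BinData"
    return "GridFS"


if __name__ == "__main__":
    print(choose_storage(200 * 1024))        # a typical thumbnail
    print(choose_storage(50 * 1024 * 1024))  # a large original
```

Typical web images are far below the limit, so the embedded binary field is often enough unless you want GridFS's chunked reads.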

If you're on a big-budget enterprise team and someone spends 40 hrs a week taking care of servers, then sure, use the file system. If you're building web apps that scale, putting files in the DB makes sense.

If you're concerned about performance:

1) Use a proxy (e.g. nginx) or a CDN to host your content for clients. Your server should just be serving cache misses.

2) Use streaming I/O. Nodeschool has a cool tutorial for Node.js.

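For point 1, a proxy or CDN can only absorb traffic if the app marks image responses as cacheable. A minimal Python sketch of the headers involved follows; the helper names and the one-day `max_age` default are assumptions for illustration, not something the answer prescribes.

```python
import hashlib
from typing import Dict, Mapping


def cache_headers(body: bytes, max_age: int = 86400) -> Dict[str, str]:
    """Build response headers that let a proxy or CDN cache an image."""
    # A content hash works as a validator: it changes when the image does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }


def is_not_modified(request_headers: Mapping[str, str], body: bytes) -> bool:
    """True if the client's cached copy is current, so a 304 can be sent."""
    return request_headers.get("If-None-Match") == cache_headers(body)["ETag"]


if __name__ == "__main__":
    image = b"fake image bytes"
    headers = cache_headers(image)
    print(headers)
    print(is_not_modified({"If-None-Match": headers["ETag"]}, image))
```

With headers like these in place, repeat requests are answered by the cache and the database is read only on a miss.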

#3


2  

MongoDB's GridFS is designed for this sort of storage and is quite handy for storing image files across many different servers in a way that all servers can use them.

#4


0  

Storing images is not a good idea in any DB, because:

  • read/write to a DB is always slower than a filesystem
  • your DB backups grow to be huge and more time consuming
  • access to the files now requires going through your app and DB layers

The last two are the real killers.

Source: Three things you should never put in your database.

So if you have the time to build your application carefully, it is better not to store your pictures in MongoDB.

However, if you are close to a deadline, and the database will stay small, will not grow much, and will never exceed the available RAM on the machine running your application, then I think (as opposed to the author of the cited article) you may consider storing the images in MongoDB. It's simple, convenient, quick to implement, and gives you some flexibility.
