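Design of a Distributed File Storage System Based on FastDFS

Date: 2022-04-28 22:34:09

Preface

FastDFS is an open-source, lightweight distributed file system. It manages files and provides file storage, file synchronization, and file access (upload and download), addressing the problems of large-capacity storage and load balancing. It is particularly well suited to online services built around files, such as photo album sites and video sites.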

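FastDFS Architecture

A FastDFS service has three roles: the tracker server, the storage server, and the client.

tracker server: The tracker mainly handles scheduling and load balancing. It manages all storage servers and groups. After starting, each storage server connects to the tracker, reports which group it belongs to (among other information), and keeps sending periodic heartbeats. From these heartbeats the tracker builds a group ---> [storage server list] mapping table. The tracker manages very little metadata and keeps it all in memory. Because that metadata is generated entirely from what the storage servers report, the tracker itself does not need to persist any data. Trackers are peers, so scaling out the tracker service is easy: simply add more tracker servers. Every tracker receives storage heartbeats and builds the metadata needed to serve reads and writes. Compared with Master-Slave architectures, the advantage is that there is no single point of failure and the tracker does not become a bottleneck, because the data itself is ultimately transferred between the client and an available storage server.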
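The heartbeat-driven mapping table described above can be sketched in a few lines of Python. This is only an illustration, not FastDFS's actual implementation: the class name `TrackerRegistry` and its methods are hypothetical, and treating `check_active_interval` as the heartbeat expiry window is an assumption made for the example.

```python
import time
from collections import defaultdict


class TrackerRegistry:
    """In-memory group -> {storage address: last heartbeat time} table (hypothetical)."""

    def __init__(self, check_active_interval=120):
        # Assumption: a storage is considered alive if it reported within this many seconds.
        self.check_active_interval = check_active_interval
        self._groups = defaultdict(dict)

    def heartbeat(self, group, storage_addr, now=None):
        """Record that a storage server belonging to `group` has reported in."""
        self._groups[group][storage_addr] = time.time() if now is None else now

    def active_storages(self, group, now=None):
        """Return the storages of `group` with a recent heartbeat, sorted by address."""
        now = time.time() if now is None else now
        return sorted(
            addr
            for addr, seen in self._groups.get(group, {}).items()
            if now - seen <= self.check_active_interval
        )


registry = TrackerRegistry(check_active_interval=120)
registry.heartbeat("group1", "192.168.1.11:23000", now=0)
registry.heartbeat("group1", "192.168.1.12:23000", now=100)
print(registry.active_storages("group1", now=130))
# ['192.168.1.12:23000']  (192.168.1.11 last reported 130 s earlier, outside the 120 s window)
```

Nothing here is written to disk, which mirrors the point made above: a restarted tracker can rebuild the whole table from the next round of heartbeats.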

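storage server: The storage server provides capacity and redundancy. Storage is organized by group: a group can contain several storage servers whose data mirror one another, and the usable capacity of a group is that of its smallest storage server, so it is recommended that all storage servers in a group have the same configuration. Organizing storage by group makes application isolation, load balancing, and per-group replica counts straightforward. The drawbacks are that a group's capacity is limited by the storage capacity of a single machine, and that when a machine in a group fails, its data can only be recovered by resynchronizing from the other machines in the same group (replace the bad disk, remount it, and restart fdfs_storaged).

When there are multiple groups, there are three strategies for choosing the group to store a file in: round robin, load balance (upload to the group with the most free space), and specify group (upload to a named group).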
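As a rough sketch of these three strategies, the following Python class picks a group from a table of free space per group. The numeric constants follow the store_lookup values documented in tracker.conf (0, 1, 2); the class name `GroupSelector`, its methods, and the free-space figures are hypothetical and only serve the illustration.

```python
ROUND_ROBIN, SPECIFY_GROUP, LOAD_BALANCE = 0, 1, 2  # store_lookup values in tracker.conf


class GroupSelector:
    """Pick an upload group according to a store_lookup-style policy (hypothetical)."""

    def __init__(self, store_lookup, store_group=None):
        self.store_lookup = store_lookup
        self.store_group = store_group
        self._next = 0  # round-robin cursor

    def select(self, free_mb_by_group):
        groups = sorted(free_mb_by_group)
        if not groups:
            raise ValueError("no groups available")
        if self.store_lookup == ROUND_ROBIN:
            group = groups[self._next % len(groups)]
            self._next += 1
            return group
        if self.store_lookup == SPECIFY_GROUP:
            if self.store_group not in free_mb_by_group:
                raise KeyError(f"unknown group: {self.store_group}")
            return self.store_group
        if self.store_lookup == LOAD_BALANCE:
            # The group with the most free space wins.
            return max(groups, key=lambda g: free_mb_by_group[g])
        raise ValueError(f"unsupported store_lookup: {self.store_lookup}")


free = {"group1": 500, "group2": 1200}  # illustrative free space in MB
print(GroupSelector(LOAD_BALANCE).select(free))             # group2
rr = GroupSelector(ROUND_ROBIN)
print([rr.select(free) for _ in range(3)])                  # ['group1', 'group2', 'group1']
print(GroupSelector(SPECIFY_GROUP, "group1").select(free))  # group1
```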

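Within a group, a storage server relies on the local file system. It can be configured with multiple data storage directories: instead of using RAID, mount each disk on its own directory and configure those directories as the storage server's data directories.

When a storage server receives a write request, it picks one of the storage directories according to the configured rule. To avoid having too many files in a single directory, on first start the storage server creates two levels of subdirectories in each data storage directory, 256 per level, for a total of 65536. Each newly written file is routed by hash to one of these subdirectories and stored there directly as a local file.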
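To make the two-level layout concrete, here is a minimal Python sketch that routes a file name to one of the 256 x 256 subdirectories. The use of CRC32 as the hash, the two-digit uppercase hexadecimal directory names, and the function name `subdir_for` are assumptions for illustration; the hash FastDFS actually uses may differ.

```python
import zlib
from pathlib import PurePosixPath

SUBDIR_COUNT_PER_PATH = 256  # matches subdir_count_per_path in storage.conf


def subdir_for(filename, store_path="/usr/local/fastdfs",
               subdir_count=SUBDIR_COUNT_PER_PATH):
    """Map a file name to <store_path>/data/<level1>/<level2> by hashing it."""
    h = zlib.crc32(filename.encode("utf-8"))      # stand-in hash, not FastDFS's own
    level1 = h % subdir_count                     # first-level directory index
    level2 = (h // subdir_count) % subdir_count   # second-level directory index
    return PurePosixPath(store_path) / "data" / f"{level1:02X}" / f"{level2:02X}"


print(SUBDIR_COUNT_PER_PATH ** 2)   # 65536 leaf directories per store path
print(subdir_for("example.jpg"))    # some directory of the form .../data/XX/YY
```

Because the hash is deterministic, the same name always lands in the same directory, and a large number of files spreads roughly evenly over the 65536 leaves.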



FastDFS Workflow

(1) Upload

Selecting a tracker server

Trackers in a cluster are peers, so a client can pick any tracker when uploading a file.

Selecting a storage group

(2) Download

A FastDFS-based Cluster

(1) Architecture

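Application server (i.e. the website):

The application server runs an LNMP environment. FastDFS is also installed on it, but in the tracker role, and the FastDFS PHP API extension (php_client) must be installed as well. /etc/fdfs/client.conf is the client configuration file; it is referenced from php.ini and must be configured correctly. Put simply, the PHP extension functions use client.conf to connect PHP to the tracker. The tracker and the storage servers then communicate with each other, and finally PHP establishes a connection to a storage server and transfers the data (this is the FastDFS workflow described above).


Storage group cluster:

Each storage machine runs its own FastDFS storage server, together with nginx and the fastdfs-nginx-module. nginx serves HTTP access, downloads in particular. The fastdfs-nginx-module exists to handle the case where a file is requested from a replica in the same group before synchronization has finished copying it there, which would otherwise produce an error. With the module in place, if the requested file cannot be found locally, the request is directed to the file's source server instead, which avoids this synchronization-delay problem.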
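The decision the module has to make can be sketched as follows. This is a simplified model, not the module's real code: the function `resolve_download`, its return values, and the idea that the source server's IP is already known to the caller are all assumptions for the example.

```python
import os


def resolve_download(local_path, source_ip, local_ip, exists=os.path.exists):
    """Decide how a storage node should answer an HTTP download request.

    Returns an (action, target) tuple:
      ("serve_local", path)    the file is already on this node
      ("redirect", source_ip)  not synced here yet, hand the request to the source node
      ("not_found", None)      this node is the source and the file is missing
    """
    if exists(local_path):
        return ("serve_local", local_path)
    if source_ip != local_ip:
        return ("redirect", source_ip)
    return ("not_found", None)


# Hypothetical request arriving at 192.168.1.12 for a file first uploaded to 192.168.1.11.
print(resolve_download(
    "/usr/local/fastdfs/data/00/00/example.jpg",  # illustrative path
    source_ip="192.168.1.11",
    local_ip="192.168.1.12",
    exists=lambda path: False,  # pretend the replica has not arrived yet
))
# ('redirect', '192.168.1.11')
```

The `exists` parameter is injectable only so the example can be run without touching the real file system.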

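Servers in different groups are independent of one another. Servers within a group do communicate with each other, mainly to keep hot backups by replicating files among themselves, so every storage server in a group holds the same data. The capacity of each server in a group is effectively capped at that of the smallest server in the group, so it is recommended that the servers in a group be given the same capacity.

When the smallest server in a group has nearly run out of space, no more files can be stored in that group (FastDFS normally keeps some space in reserve, around 4 GB or so; the threshold is the reserved_storage_space setting in tracker.conf). Storage then has to be scaled out by adding a group, that is, by adding storage servers. When a file is subsequently uploaded, the tracker chooses a group to store it in according to the store_lookup rule in tracker.conf.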
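The reserved-space rule can be illustrated with a short Python sketch based on the comments in tracker.conf: a group stops accepting uploads as soon as the free space of any of its storage servers is at or below reserved_storage_space. The function name `group_accepts_uploads` and the sample numbers are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def group_accepts_uploads(storages, reserved="10%"):
    """Return True if every storage in the group has more free space than the reserve.

    storages: list of (total_bytes, free_bytes) tuples, one per storage server.
    reserved: a reserved_storage_space-style value such as "10%", "4G", "512M" or "1024".
    """
    reserved = reserved.strip()
    for total, free in storages:
        if reserved.endswith("%"):
            limit = total * float(reserved[:-1]) / 100   # ratio of the disk
        elif reserved[-1].upper() in UNITS:
            limit = float(reserved[:-1]) * UNITS[reserved[-1].upper()]
        else:
            limit = float(reserved)                      # plain bytes
        if free <= limit:
            return False
    return True


GB = 1024 ** 3
group1 = [(100 * GB, 50 * GB), (100 * GB, 9 * GB)]  # illustrative (total, free) pairs
print(group_accepts_uploads(group1, "10%"))  # False: one server has only 9 GB free of 100 GB
print(group_accepts_uploads(group1, "4G"))   # True: both servers have more than 4 GB free
```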

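So adding servers within a group can be called vertical scaling, and adding groups can be called horizontal scaling. The two serve different purposes.

Vertical scaling: provides hot backup and data redundancy, and also spreads the access load. Because the storage servers in a group are peers holding the same files, adding storage servers to a group lets requests be spread across any of them, which greatly reduces the load on each one.


Horizontal scaling: adds storage capacity. Because a group's storage space is determined by its smallest server, the only way to keep adding capacity is to add groups.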
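A small worked example makes the difference between the two kinds of scaling visible. The helper names and disk sizes below are hypothetical; the arithmetic simply applies the rule stated above, that a group's capacity equals that of its smallest server.

```python
def group_capacity(storage_sizes_gb):
    """Usable capacity of one group: the size of its smallest storage server."""
    return min(storage_sizes_gb)


def cluster_capacity(groups):
    """Usable capacity of the cluster: the sum of the capacities of its groups."""
    return sum(group_capacity(sizes) for sizes in groups.values())


cluster = {"group1": [500, 400]}   # hypothetical disk sizes in GB
print(cluster_capacity(cluster))   # 400

cluster["group1"].append(800)      # vertical scaling: one more replica in group1
print(cluster_capacity(cluster))   # 400 (more redundancy, same capacity)

cluster["group2"] = [600, 600]     # horizontal scaling: a new group
print(cluster_capacity(cluster))   # 1000
```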


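Installing and Configuring FastDFS

The walkthrough below uses one application server and three storage servers.

Application server: 192.168.1.10

Three storage servers: 192.168.1.11, 192.168.1.12, 192.168.1.13

1. Installing FastDFS

Install FastDFS on the application server and on all three storage servers.

(1) Install libfastcommon on all four machines

(2) Install FastDFS on all four machines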

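(3) Tracker configuration

The tracker runs on 192.168.1.10, so configure the tracker on 192.168.1.10.

After FastDFS is installed, there are three sample configuration files under /etc/fdfs/: client.conf.sample, storage.conf.sample, and tracker.conf.sample.

Copy these three files to client.conf, storage.conf, and tracker.conf: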

cp client.conf.sample client.conf

cp storage.conf.sample storage.conf

cp tracker.conf.sample tracker.conf

Now configure tracker.conf:

# is this config file disabled
# false for enabled
# true for disabled
disabled=false

# bind an address of this host
# empty for bind all addresses of this host
bind_addr=

# the tracker server port
port=22122

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# the base path to store data and log files
base_path=/usr/local/fastdfs
# max concurrent connections this server supported
max_connections=256

# accept thread count
# default value is 1
# since V4.07
accept_threads=1

# work thread count, should <= max_connections
# default value is 4
# since V2.00
work_threads=4

# the method of selecting group to upload files
# 0: round robin
# 1: specify group
# 2: load balance, select the max free space group to upload file
store_lookup=2

# which group to upload file
# when store_lookup set to 1, must set store_group to the group name
store_group=group1

# which storage server to upload file
# 0: round robin (default)
# 1: the first server order by ip address
# 2: the first server order by priority (the minimal)
store_server=0

# which path(means disk or mount point) of the storage server to upload file
# 0: round robin
# 2: load balance, select the max free space path to upload file
store_path=0

# which storage server to download file
# 0: round robin (default)
# 1: the source storage server which the current file uploaded to
download_server=0

# reserved storage space for system or other applications.
# if the free(available) space of any storage server in
# a group <= reserved_storage_space,
# no file can be uploaded to this group.
# bytes unit can be one of follows:
### G or g for gigabyte(GB)
### M or m for megabyte(MB)
### K or k for kilobyte(KB)
### no unit for byte(B)
### XX.XX% as ratio such as reserved_storage_space = 10%
reserved_storage_space = 10%

#standard log level as syslog, case insensitive, value list:
### emerg for emergency
### alert
### crit for critical
### error
### warn for warning
### notice
### info
### debug
log_level=info

#unix group name to run this program,
#not set (empty) means run by the group of current user
run_by_group=

#unix username to run this program,
#not set (empty) means run by current user
run_by_user=

# allow_hosts can occur more than once, host can be hostname or ip address,
# "*" (only one asterisk) means match all ip addresses
# we can use CIDR ips like 192.168.5.64/26
# and also use range like these: 10.0.1.[0-254] and host[01-08,20-25].domain.com
# for example:
# allow_hosts=10.0.1.[1-15,20]
# allow_hosts=host[01-08,20-25].domain.com
# allow_hosts=192.168.5.64/26
allow_hosts=*

# sync log buff to disk every interval seconds
# default value is 10 seconds
sync_log_buff_interval = 10

# check storage server alive interval seconds
check_active_interval = 120

# thread stack size, should >= 64KB
# default value is 64KB
thread_stack_size = 64KB

# auto adjust when the ip address of the storage server changed
# default value is true
storage_ip_changed_auto_adjust = true

# storage sync file max delay seconds
# default value is 86400 seconds (one day)
# since V2.00
storage_sync_file_max_delay = 86400

# the max time of storage sync a file
# default value is 300 seconds
# since V2.00
storage_sync_file_max_time = 300

# if use a trunk file to store several small files
# default value is false
# since V3.00
use_trunk_file = false

# the min slot size, should <= 4KB
# default value is 256 bytes
# since V3.00
slot_min_size = 256

# the max slot size, should > slot_min_size
# store the upload file to trunk file when it's size <= this value
# default value is 16MB
# since V3.00
slot_max_size = 16MB

# the trunk file size, should >= 4MB
# default value is 64MB
# since V3.00
trunk_file_size = 64MB

# if create trunk file advancely
# default value is false
# since V3.06
trunk_create_file_advance = false

# the time base to create trunk file
# the time format: HH:MM
# default value is 02:00
# since V3.06
trunk_create_file_time_base = 02:00

# the interval of create trunk file, unit: second
# default value is 86400 (one day)
# since V3.06
trunk_create_file_interval = 86400

# the threshold to create trunk file
# when the free trunk file size less than the threshold, will create
# the trunk files
# default value is 0
# since V3.06
trunk_create_file_space_threshold = 20G

# if check trunk space occupying when loading trunk free spaces
# the occupied spaces will be ignored
# default value is false
# since V3.09
# NOTICE: set this parameter to true will slow the loading of trunk spaces
# when startup. you should set this parameter to true when necessary.
trunk_init_check_occupying = false

# if ignore storage_trunk.dat, reload from trunk binlog
# default value is false
# since V3.10
# set to true once for version upgrade when your version less than V3.10
trunk_init_reload_from_binlog = false

# the min interval for compressing the trunk binlog file
# unit: second
# default value is 0, 0 means never compress
# FastDFS compress the trunk binlog when trunk init and trunk destroy
# recommend to set this parameter to 86400 (one day)
# since V5.01
trunk_compress_binlog_min_interval = 0

# if use storage ID instead of IP address
# default value is false
# since V4.00
use_storage_id = false

# specify storage ids filename, can use relative or absolute path
# since V4.00
storage_ids_filename = storage_ids.conf

# id type of the storage server in the filename, values are:
## ip: the ip address of the storage server
## id: the server id of the storage server
# this parameter is valid only when use_storage_id set to true
# default value is ip
# since V4.03
id_type_in_filename = ip

# if store slave file use symbol link
# default value is false
# since V4.01
store_slave_file_use_link = false

# if rotate the error log every day
# default value is false
# since V4.02
rotate_error_log = false

# rotate error log time base, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
# default value is 00:00
# since V4.02
error_log_rotate_time=00:00

# rotate error log when the log file exceeds this size
# 0 means never rotates log file by log file size
# default value is 0
# since V4.02
rotate_error_log_size = 0

# keep days of the log files
# 0 means do not delete old log files
# default value is 0
log_file_keep_days = 0

# if use connection pool
# default value is false
# since V4.05
use_connection_pool = false

# connections whose the idle time exceeds this time will be closed
# unit: second
# default value is 3600
# since V4.05
connection_pool_max_idle_time = 3600

# HTTP port on this tracker server
http.server_port=8080

# check storage HTTP server alive interval seconds
# <= 0 for never check
# default value is 30
http.check_alive_interval=30

# check storage HTTP server alive type, values are:
# tcp : connect to the storage server with HTTP port only,
# do not request and get response
# http: storage check alive url must return http status 200
# default value is tcp
http.check_alive_type=tcp

# check storage HTTP server alive uri/url
# NOTE: storage embed HTTP server support uri: /status.html
http.check_alive_uri=/status.html

Usually only base_path needs to be set. The other options can be left at their defaults or adjusted to suit your environment.
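Before starting the tracker, it can be handy to sanity-check the file. The following Python sketch parses key=value lines and applies a few rules taken from the comments in the configuration above (base_path must be set, work_threads should be <= max_connections, store_lookup=1 needs store_group). The function names are hypothetical, and the fallback values 4 and 256 are assumed defaults used only when a key is absent.

```python
def parse_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def check_tracker_conf(settings):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not settings.get("base_path"):
        problems.append("base_path is not set")
    if int(settings.get("work_threads", 4)) > int(settings.get("max_connections", 256)):
        problems.append("work_threads should be <= max_connections")
    if settings.get("store_lookup") == "1" and not settings.get("store_group"):
        problems.append("store_lookup=1 requires store_group")
    return problems


SAMPLE = """\
# the base path to store data and log files
base_path=/usr/local/fastdfs
max_connections=256
work_threads=4
store_lookup=2
store_group=group1
"""

print(check_tracker_conf(parse_conf(SAMPLE)))              # []
print(check_tracker_conf(parse_conf("store_lookup=1\n")))
# ['base_path is not set', 'store_lookup=1 requires store_group']
```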

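Then start the tracker:    /usr/local/fdfs/bin/fdfs_trackerd /etc/fdfs/tracker.conf

This brings up one thing to watch out for: the make.sh file in the FastDFS source package. It sets the FastDFS installation location and the location of the executables. By default the executables are installed under /usr/bin/.

These locations can therefore be changed before installing, by editing make.sh: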

TARGET_PREFIX=$DESTDIR/usr/local/fdfs
TARGET_CONF_PATH=$DESTDIR/etc/fdfs
TARGET_INIT_PATH=$DESTDIR/etc/init.d


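(4) Configuring storage

Configure storage.conf on all three servers. The steps are the same on each one; 192.168.1.11 is used as the example here.

As before, there are three configuration files under /etc/fdfs/; copy them to client.conf, storage.conf, and tracker.conf.

Edit storage.conf: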

# is this config file disabled
# false for enabled
# true for disabled
disabled=false

# the name of the group this storage server belongs to
#
# comment or remove this item for fetching from tracker server,
# in this case, use_storage_id must set to true in tracker.conf,
# and storage_ids.conf must be configed correctly.
group_name=group1

# bind an address of this host
# empty for bind all addresses of this host
bind_addr=

# if bind an address of this host when connect to other servers
# (this storage server as a client)
# true for binding the address configed by above parameter: "bind_addr"
# false for binding any address of this host
client_bind=true

# the storage server port
port=23000

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# heart beat interval in seconds
heart_beat_interval=30

# disk usage report interval in seconds
stat_report_interval=60

# the base path to store data and log files
base_path=/usr/local/fastdfs

# max concurrent connections the server supported
# default value is 256
# more max_connections means more memory will be used
max_connections=256

# the buff size to recv / send data
# this parameter must more than 8KB
# default value is 64KB
# since V2.00
buff_size = 256KB

# accept thread count
# default value is 1
# since V4.07
accept_threads=1

# work thread count, should <= max_connections
# work thread deal network io
# default value is 4
# since V2.00
work_threads=4

# if disk read / write separated
##  false for mixed read and write
##  true for separated read and write
# default value is true
# since V2.00
disk_rw_separated = true

# disk reader thread count per store base path
# for mixed read / write, this parameter can be 0
# default value is 1
# since V2.00
disk_reader_threads = 1

# disk writer thread count per store base path
# for mixed read / write, this parameter can be 0
# default value is 1
# since V2.00
disk_writer_threads = 1

# when no entry to sync, try read binlog again after X milliseconds
# must > 0, default value is 200ms
sync_wait_msec=50

# after sync a file, usleep milliseconds
# 0 for sync successively (never call usleep)
sync_interval=0

# storage sync start time of a day, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
sync_start_time=00:00

# storage sync end time of a day, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
sync_end_time=23:59

# write to the mark file after sync N files
# default value is 500
write_mark_file_freq=500

# path(disk or mount point) count, default value is 1
store_path_count=1

# store_path#, based 0, if store_path0 not exists, it's value is base_path
# the paths must be exist
store_path0=/usr/local/fastdfs
#store_path1=/usr/local/fastdfs2

# subdir_count  * subdir_count directories will be auto created under each
# store_path (disk), value can be 1 to 256, default value is 256
subdir_count_per_path=256

# tracker_server can occur more than once, and tracker_server format is
#  "host:port", host can be hostname or ip address
tracker_server=192.168.1.10:22122

#standard log level as syslog, case insensitive, value list:
### emerg for emergency
### alert
### crit for critical
### error
### warn for warning
### notice
### info
### debug
log_level=info

#unix group name to run this program,
#not set (empty) means run by the group of current user
run_by_group=

#unix username to run this program,
#not set (empty) means run by current user
run_by_user=

# allow_hosts can occur more than once, host can be hostname or ip address,
# "*" (only one asterisk) means match all ip addresses
# we can use CIDR ips like 192.168.5.64/26
# and also use range like these: 10.0.1.[0-254] and host[01-08,20-25].domain.com
# for example:
# allow_hosts=10.0.1.[1-15,20]
# allow_hosts=host[01-08,20-25].domain.com
# allow_hosts=192.168.5.64/26
allow_hosts=*

# the mode of the files distributed to the data path
# 0: round robin(default)
# 1: random, distributed by hash code
file_distribute_path_mode=0

# valid when file_distribute_to_path is set to 0 (round robin),
# when the written file count reaches this number, then rotate to next path
# default value is 100
file_distribute_rotate_count=100

# call fsync to disk when write big file
# 0: never call fsync
# other: call fsync when written bytes >= this bytes
# default value is 0 (never call fsync)
fsync_after_written_bytes=0

# sync log buff to disk every interval seconds
# must > 0, default value is 10 seconds
sync_log_buff_interval=10

# sync binlog buff / cache to disk every interval seconds
# default value is 60 seconds
sync_binlog_buff_interval=10

# sync storage stat info to disk every interval seconds
# default value is 300 seconds
sync_stat_file_interval=300

# thread stack size, should >= 512KB
# default value is 512KB
thread_stack_size=512KB

# the priority as a source server for uploading file.
# the lower this value, the higher its uploading priority.
# default value is 10
upload_priority=10

# the NIC alias prefix, such as eth in Linux, you can see it by ifconfig -a
# multi aliases split by comma. empty value means auto set by OS type
# default values is empty
if_alias_prefix=

# if check file duplicate, when set to true, use FastDHT to store file indexes
# 1 or yes: need check
# 0 or no: do not check
# default value is 0
check_file_duplicate=0

# file signature method for check file duplicate
## hash: four 32 bits hash code
## md5: MD5 signature
# default value is hash
# since V4.01
file_signature_method=hash

# namespace for storing file indexes (key-value pairs)
# this item must be set when check_file_duplicate is true / on
key_namespace=FastDFS

# set keep_alive to 1 to enable persistent connection with FastDHT servers
# default value is 0 (short connection)
keep_alive=0

# you can use "#include filename" (not include double quotes) directive to
# load FastDHT server list, when the filename is a relative path such as
# pure filename, the base path is the base path of current/this config file.
# must set FastDHT server list when check_file_duplicate is true / on
# please see INSTALL of FastDHT for detail
##include /home/yuqing/fastdht/conf/fdht_servers.conf

# if log to access log
# default value is false
# since V4.00
use_access_log = false

# if rotate the access log every day
# default value is false
# since V4.00
rotate_access_log = false

# rotate access log time base, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
# default value is 00:00
# since V4.00
access_log_rotate_time=00:00

# if rotate the error log every day
# default value is false
# since V4.02
rotate_error_log = false

# rotate error log time base, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
# default value is 00:00
# since V4.02
error_log_rotate_time=00:00

# rotate access log when the log file exceeds this size
# 0 means never rotates log file by log file size
# default value is 0
# since V4.02
rotate_access_log_size = 0

# rotate error log when the log file exceeds this size
# 0 means never rotates log file by log file size
# default value is 0
# since V4.02
rotate_error_log_size = 0

# keep days of the log files
# 0 means do not delete old log files
# default value is 0
log_file_keep_days = 0

# if skip the invalid record when sync file
# default value is false
# since V4.02
file_sync_skip_invalid_record=false

# if use connection pool
# default value is false
# since V4.05
use_connection_pool = false

# connections whose the idle time exceeds this time will be closed
# unit: second
# default value is 3600
# since V4.05
connection_pool_max_idle_time = 3600

# use the ip address of this storage server if domain_name is empty,
# else this domain name will occur in the url redirected by the tracker server
http.domain_name=

# the port of the web server on this storage server
http.server_port=8888

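Start the storage service: /usr/local/fdfs/bin/fdfs_storaged /etc/fdfs/storage.conf

Pay special attention to group_name=group1. To put storage servers in the same group, just give them the same group_name; the tracker automatically works out which servers belong to the same group.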

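(5) Installing nginx on the storage servers

Next, copy the module's configuration file to /etc/fdfs/ as well, for easier management:

cp fastdfs-nginx-module/src/mod_fastdfs.conf /etc/fdfs/

touch /var/log/mod_fdfs.log    # location of the log produced by the module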


Configure nginx.conf:

server {
    listen 80;
    server_name 192.168.1.11;
    location /group1/M00 {
        root /usr/local/fdfs;
        ngx_fastdfs_module;
    }
}

Configure mod_fastdfs.conf:

Configure mod_fastdfs.conf on every storage server following this pattern, setting group_name according to the group the server belongs to.
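To show how the /group1/M00 location relates to what is on disk, here is a Python sketch that splits a FastDFS file ID into its parts and derives a download URL and a local path. It assumes the usual file ID layout of group name, store path marker (M00 for store_path0), two directory levels, and file name. The file ID in the example is made up, the helper names are hypothetical, and reading the store path index as hexadecimal is an assumption.

```python
def split_file_id(file_id):
    """Split 'group1/M00/00/0C/name.jpg' into (group, store path index, 'NN/NN/name')."""
    group, store, rest = file_id.split("/", 2)
    if len(store) != 3 or not store.startswith("M"):
        raise ValueError(f"unexpected store path marker: {store}")
    return group, int(store[1:], 16), rest


def download_url(file_id, host, port=80):
    """URL served by nginx on a storage server for this file ID."""
    netloc = host if port == 80 else f"{host}:{port}"
    return f"http://{netloc}/{file_id}"


def local_path(file_id, store_paths):
    """Path of the file on disk, given the list [store_path0, store_path1, ...]."""
    _group, index, rest = split_file_id(file_id)
    return f"{store_paths[index]}/data/{rest}"


file_id = "group1/M00/00/0C/example.jpg"  # made-up file ID for illustration
print(download_url(file_id, "192.168.1.11"))
# http://192.168.1.11/group1/M00/00/0C/example.jpg
print(local_path(file_id, ["/usr/local/fastdfs"]))
# /usr/local/fastdfs/data/00/0C/example.jpg
```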


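(6) Installing the php_client API

Install the PHP extension on the application server, i.e. the server where the tracker runs. The unpacked FastDFS source package contains a php_client directory; change into it and install following the README: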

phpize
./configure --with-php-config=/usr/local/php/bin/php-config
make
make install

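After installation, modules/fastdfs_client.so is automatically copied to /usr/lib/php5/20090626. You only need to copy fastdfs_client.ini to /etc/php5/conf.d and run php fastdfs_test.php to test. php5 -m will also list the fastdfs_client module. The README also documents the related PHP functions.

Once installed, configure the client program and php.ini as appropriate: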

vim /etc/fdfs/client.conf 

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# the base path to store log files
base_path=/usr/local/php_client
# tracker_server can ocur more than once, and tracker_server format is
# "host:port", host can be hostname or ip address
tracker_server=192.168.1.10:22122

#standard log level as syslog, case insensitive, value list:
### emerg for emergency
### alert
### crit for critical
### error
### warn for warning
### notice
### info
### debug
log_level=info

# if use connection pool
# default value is false
# since V4.05
use_connection_pool = false

# connections whose the idle time exceeds this time will be closed
# unit: second
# default value is 3600
# since V4.05
connection_pool_max_idle_time = 3600

# if load FastDFS parameters from tracker server
# since V4.05
# default value is false
load_fdfs_parameters_from_tracker=false

# if use storage ID instead of IP address
# same as tracker.conf
# valid only when load_fdfs_parameters_from_tracker is false
# default value is false
# since V4.05
use_storage_id = false

# specify storage ids filename, can use relative or absolute path
# same as tracker.conf
# valid only when load_fdfs_parameters_from_tracker is false
# since V4.05
storage_ids_filename = storage_ids.conf

#HTTP settings
http.tracker_server_port=80

#use "#include" directive to include HTTP other settings
##include http.conf

Next, edit php.ini:

extension = fastdfs_client.so
fastdfs_client.tracker_group_count = 1
fastdfs_client.tracker_group0 = /etc/fdfs/client.conf

With this in place, the PHP API is connected to the tracker through client.conf.

Descriptions of the basic API functions can be found in the php_client folder.
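Since client.conf is the link between the client and the tracker, a quick way to check it is to list the tracker endpoints it declares. The Python sketch below does this with a regular expression. The function name `tracker_endpoints` is hypothetical, and the second tracker address in the sample is invented to show that tracker_server may appear more than once.

```python
import re

# Matches uncommented lines of the form "tracker_server = host:port".
TRACKER_RE = re.compile(
    r"^[ \t]*tracker_server[ \t]*=[ \t]*([^:\s#]+):(\d+)[ \t]*$",
    re.MULTILINE,
)


def tracker_endpoints(conf_text):
    """Return the (host, port) pairs declared by tracker_server lines."""
    return [(host, int(port)) for host, port in TRACKER_RE.findall(conf_text)]


SAMPLE = """\
# tracker_server can occur more than once
tracker_server=192.168.1.10:22122
tracker_server = 192.168.1.20:22122
"""

print(tracker_endpoints(SAMPLE))
# [('192.168.1.10', 22122), ('192.168.1.20', 22122)]
```

An empty result would suggest that the file PHP is pointed at by fastdfs_client.tracker_group0 does not name any tracker.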