----------------------- Brick Failure Detection ----------------------- Detecting failures on the filesystem that a brick uses makes it possible to handle errors that are caused from outside of the Gluster environment. :: [2014-05-19 3:39:53]# brick="/mnt/lv4/vol4";gluster volume create vol4 eins:$brick zwei:$brick drei:$brick vier:$brick fuenf:$brick sechs:$brick volume create: vol4: success: please start the volume to access data [2014-05-19 3:40:22]# gluster volume start vol4 volume start: vol4: success # gluster volume set vol4 storage.health-check-interval 10 volume set: success [2014-05-19 3:51:16]# gluster volume info vol4 Volume Name: vol4 Type: Distribute Volume ID: 706122a9-44fc-4d1d-8c3b-97482d98b95c Status: Started Number of Bricks: 6 Transport-type: tcp Bricks: Brick1: eins:/mnt/lv4/vol4 Brick2: zwei:/mnt/lv4/vol4 Brick3: drei:/mnt/lv4/vol4 Brick4: vier:/mnt/lv4/vol4 Brick5: fuenf:/mnt/lv4/vol4 Brick6: sechs:/mnt/lv4/vol4 Options Reconfigured: storage.health-check-interval: 10 [sechs]# dmsetup table vg0-swift: 0 209715200 linear 8:7 838862848 vg0-cinder: 0 209715200 linear 8:7 419432448 vg0-lv4: 0 209715200 linear 8:7 1468008448 vg0-lv3: 0 209715200 linear 8:7 1258293248 vg0-lv2: 0 209715200 linear 8:7 1048578048 vg0-lv1: 0 209715200 linear 8:7 209717248 vg0-lv0: 0 209715200 linear 8:7 2048 vg0-glance: 0 209715200 linear 8:7 629147648 # echo 0 209715200 error > dmsetup-error-target [2014-05-19 3:43:43]# dmsetup load vg0-lv4 dmsetup-error-target [2014-05-19 3:44:46]# dmsetup resume vg0-lv4 [2014-05-19 3:45:06]# dmsetup table vg0-swift: 0 209715200 linear 8:7 838862848 vg0-cinder: 0 209715200 linear 8:7 419432448 vg0-lv4: 0 209715200 error vg0-lv3: 0 209715200 linear 8:7 1258293248 vg0-lv2: 0 209715200 linear 8:7 1048578048 vg0-lv1: 0 209715200 linear 8:7 209717248 vg0-lv0: 0 209715200 linear 8:7 2048 vg0-glance: 0 209715200 linear 8:7 629147648 brick ----- :: [2014-05-18 18:49:53.720594] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed [2014-05-18 18:50:04.238239] W [posix-helpers.c:1294:posix_health_check_thread_proc] 0-vol4-posix: stat() on /mnt/lv4/vol4 returned: Input/output error [2014-05-18 18:50:04.238328] M [posix-helpers.c:1314:posix_health_check_thread_proc] 0-vol4-posix: health-check failed, going down Message from syslogd@sechs at May 19 03:50:04 ... glusterfsd: [2014-05-18 18:50:04.238328] M [posix-helpers.c:1314:posix_health_check_thread_proc] 0-vol4-posix: health-check failed, going down [2014-05-18 18:50:34.238551] M [posix-helpers.c:1319:posix_health_check_thread_proc] 0-vol4-posix: still alive! -> SIGTERM Message from syslogd@sechs at May 19 03:50:34 ... glusterfsd: [2014-05-18 18:50:34.238551] M [posix-helpers.c:1319:posix_health_check_thread_proc] 0-vol4-posix: still alive! -> SIGTERM [2014-05-18 18:50:34.238910] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f1144ebab7d] (-->/lib64/libpthread.so.0(+0x79d1) [0x7f114554d9d1] (-->/usr/local/glusterfs-3.5.0/sbin/glusterfsd(glusterfs_sigwaiter+0xf0) [0x4085af]))) 0-: received signum (15), shutting down syslog ------ :: May 19 03:49:55 sechs kernel: XFS (dm-7): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 4096 May 19 03:49:57 sechs kernel: XFS (dm-7): metadata I/O error: block 0x6400108 ("xlog_iodone") error 5 buf count 4096 May 19 03:49:57 sechs kernel: XFS (dm-7): xfs_do_force_shutdown(0x2) called from line 1062 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa04dd131 May 19 03:49:57 sechs kernel: XFS (dm-7): Log I/O Error Detected. Shutting down filesystem May 19 03:49:57 sechs kernel: XFS (dm-7): Please umount the filesystem and rectify the problem(s) May 19 03:50:04 sechs glusterfsd: [2014-05-18 18:50:04.238328] M [posix-helpers.c:1314:posix_health_check_thread_proc] 0-vol4-posix: health-check failed, going down Message from syslogd@sechs at May 19 03:50:04 ... glusterfsd: [2014-05-18 18:50:04.238328] M [posix-helpers.c:1314:posix_health_check_thread_proc] 0-vol4-posix: health-check failed, going down May 19 03:50:27 sechs kernel: XFS (dm-7): xfs_log_force: error 5 returned. Message from syslogd@sechs at May 19 03:50:34 ... glusterfsd: [2014-05-18 18:50:34.238551] M [posix-helpers.c:1319:posix_health_check_thread_proc] 0-vol4-posix: still alive! -> SIGTERM May 19 03:50:34 sechs glusterfsd: [2014-05-18 18:50:34.238551] M [posix-helpers.c:1319:posix_health_check_thread_proc] 0-vol4-posix: still alive! -> SIGTERM May 19 03:50:57 sechs kernel: XFS (dm-7): xfs_log_force: error 5 returned. gluster volume status --------------------- :: [2014-05-19 3:52:41]# gluster volume status vol4 Status of volume: vol4 Gluster process Port Online Pid ------------------------------------------------------------------------------ Brick eins:/mnt/lv4/vol4 49160 Y 2925 Brick zwei:/mnt/lv4/vol4 49159 Y 440 Brick drei:/mnt/lv4/vol4 49152 Y 32500 Brick vier:/mnt/lv4/vol4 49152 Y 32657 Brick fuenf:/mnt/lv4/vol4 49152 Y 24517 Brick sechs:/mnt/lv4/vol4 N/A N N/A NFS Server on localhost 2049 Y 29535 NFS Server on zwei N/A N N/A NFS Server on vier N/A N N/A NFS Server on drei N/A N N/A NFS Server on eins N/A N N/A NFS Server on fuenf N/A N N/A NFS Server on sechs N/A N N/A Task Status of Volume vol4 ------------------------------------------------------------------------------ There are no active volume tasks process ------- :: # ps -ef | grep glusterfsd | grep -v grep | wc -l 0 service glusterd restart ------------------------ :: [2014-05-18 18:58:17.197872] I [glusterfsd.c:1959:main] 0-/usr/local/glusterfs-3.5.0/sbin/glusterfsd: Started running /usr/local/glusterfs-3.5.0/sbin/glusterfsd version 3.5git (/usr/local/glusterfs-3.5.0/sbin/glusterfsd -s sechs --volfile-id vol4.sechs.mnt-lv4-vol4 -p /var/lib/glusterd/vols/vol4/run/sechs-mnt-lv4-vol4.pid -S /var/run/23afc72b5ceddccd28b405b1cdf5b4df.socket --brick-name /mnt/lv4/vol4 -l /usr/local/glusterfs-3.5.0/var/log/glusterfs/bricks/mnt-lv4-vol4.log --xlator-option *-posix.glusterd-uuid=0765d288-a59b-4ccf-90ae-c3332c83dbf4 --brick-port 49152 --xlator-option vol4-server.listen-port=49152) [2014-05-18 18:58:17.205310] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled [2014-05-18 18:58:17.205486] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread [2014-05-18 18:58:17.205880] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled [2014-05-18 18:58:17.205949] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread [2014-05-18 18:58:18.834910] I [graph.c:254:gf_add_cmdline_options] 0-vol4-server: adding option 'listen-port' for volume 'vol4-server' with value '49152' [2014-05-18 18:58:18.834976] I [graph.c:254:gf_add_cmdline_options] 0-vol4-posix: adding option 'glusterd-uuid' for volume 'vol4-posix' with value '0765d288-a59b-4ccf-90ae-c3332c83dbf4' [2014-05-18 18:58:18.837332] I [rpcsvc.c:2064:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64 [2014-05-18 18:58:18.837510] W [options.c:848:xl_opt_validate] 0-vol4-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction [2014-05-18 18:58:18.837572] I [socket.c:3561:socket_init] 0-tcp.vol4-server: SSL support is NOT enabled [2014-05-18 18:58:18.837601] I [socket.c:3576:socket_init] 0-tcp.vol4-server: using system polling thread [2014-05-18 18:58:18.838445] E [common-utils.c:93:mkdir_p] 0-: Failed due to reason Input/output error [2014-05-18 18:58:18.838505] I [mem-pool.c:539:mem_pool_destroy] 0-vol4-changelog: size=108 max=0 total=0 [2014-05-18 18:58:18.838533] E [xlator.c:403:xlator_init] 0-vol4-changelog: Initialization of volume 'vol4-changelog' failed, review your volfile again [2014-05-18 18:58:18.838561] E [graph.c:307:glusterfs_graph_init] 0-vol4-changelog: initializing translator failed [2014-05-18 18:58:18.838610] E [graph.c:502:glusterfs_graph_activate] 0-graph: init failed [2014-05-18 18:58:18.839480] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/local/glusterfs-3.5.0/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x1b5) [0x7f2981c837d8] (-->/usr/local/glusterfs-3.5.0/sbin/glusterfsd(mgmt_getspec_cbk+0x36a) [0x40cf77] (-->/usr/local/glusterfs-3.5.0/sbin/glusterfsd(glusterfs_process_volfp+0x18a) [0x408bf2]))) 0-: received signum (0), shutting down