Diffstat (limited to 'health/health.d/bcache.conf')
-rw-r--r-- | health/health.d/bcache.conf | 14
1 files changed, 8 insertions, 6 deletions
diff --git a/health/health.d/bcache.conf b/health/health.d/bcache.conf
index f0da9ac5..d5fccf4f 100644
--- a/health/health.d/bcache.conf
+++ b/health/health.d/bcache.conf
@@ -1,13 +1,14 @@
 
 template: bcache_cache_errors
       on: disk.bcache_cache_read_races
-  lookup: sum -10m unaligned absolute
+  lookup: sum -1m unaligned absolute
    units: errors
    every: 1m
     warn: $this > 0
-    crit: $this > ( ($status >= $CRITICAL) ? (0) : (10) )
-   delay: down 1h multiplier 1.5 max 2h
-    info: the number of times bcache had issues using the cache, during the last 10 mins (this usually means your SSD cache is failing)
+   delay: up 2m down 1h multiplier 1.5 max 2h
+    info: number of times data was read from the cache, \
+          the bucket was reused and invalidated in the last 10 minutes \
+          (when this occurs the data is reread from the backing device)
       to: sysadmin
 
 template: bcache_cache_dirty
@@ -16,7 +17,8 @@ template: bcache_cache_dirty
    units: %
    every: 1m
     warn: $this > ( ($status >= $WARNING ) ? ( 70 ) : ( 90 ) )
-    crit: $this > ( ($status >= $CRITICAL) ? ( 90 ) : ( 95 ) )
+    crit: $this > ( ($status == $CRITICAL) ? ( 90 ) : ( 95 ) )
    delay: up 1m down 1h multiplier 1.5 max 2h
-    info: the percentage of cache space used for dirty and metadata (this usually means your SSD cache is too small)
+    info: percentage of cache space used for dirty data and metadata \
+          (this usually means your SSD cache is too small)
       to: sysadmin
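
The warn and crit expressions of bcache_cache_dirty pick their threshold from the alarm's current status. Once the alarm is raised, a lower threshold applies, so the value has to drop further before the alarm clears. The following is a minimal Python sketch of that hysteresis, not the health engine's actual evaluation code. The function names and the numeric values of CLEAR, WARNING and CRITICAL are placeholders chosen for illustration; only their ordering is assumed.

```python
# Placeholder status levels (assumption): only the ordering
# CLEAR < WARNING < CRITICAL matters for this sketch.
CLEAR, WARNING, CRITICAL = 0, 1, 2


def dirty_warn_raised(value, status):
    """Mirror of: warn: $this > ( ($status >= $WARNING) ? (70) : (90) )"""
    # Already in WARNING or worse: stay raised until the value falls to 70.
    threshold = 70 if status >= WARNING else 90
    return value > threshold


def dirty_crit_raised(value, status):
    """Mirror of: crit: $this > ( ($status == $CRITICAL) ? (90) : (95) )"""
    # Already in CRITICAL: stay raised until the value falls to 90.
    threshold = 90 if status == CRITICAL else 95
    return value > threshold


if __name__ == "__main__":
    # 80% dirty does not raise a new warning, but keeps an existing one raised.
    print(dirty_warn_raised(80, CLEAR))
    print(dirty_warn_raised(80, WARNING))
    # 92% dirty keeps an existing critical alarm raised, but does not start one.
    print(dirty_crit_raised(92, WARNING))
    print(dirty_crit_raised(92, CRITICAL))
```

With these thresholds, a value that hovers around 90% does not flip the warning on and off on every one-minute evaluation.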