[Solaris] zfs_params

Below is a list of the parameters used for tuning ZFS.


arc_reduce_dnlc_percent

If the ARC detects low memory (via arc_reclaim_needed()), it calls arc_kmem_reap_now() and subsequently dnlc_reduce_cache(), which reduces the number of DNLC entries by 3% (ARC_REDUCE_DNLC_PERCENT). When diagnosing this path, dnlc_nentries is worth watching, especially if it is much smaller than ncsize.

Default: 0x3

How to change:

# echo arc_reduce_dnlc_percent/W0t2 | mdb -kw

zfs_arc_max, zfs_arc_min (deprecated in 11.2)

Determines the maximum/minimum size of the ZFS Adaptive Replacement Cache (ARC). Solaris 11.2 deprecates the zfs_arc_max kernel parameter in favor of user_reserve_hint_pct.


How to change:
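No mdb one-liner is listed here; on Solaris these limits are traditionally pinned with an /etc/system entry. The sizes below are purely illustrative, not recommendations:

```
# /etc/system -- takes effect after the next reboot
set zfs:zfs_arc_max = 0x400000000    # example: cap the ARC at 16 GB
set zfs:zfs_arc_min = 0x100000000    # example: keep at least 4 GB
```

On 11.2 and later, prefer user_reserve_hint_pct as noted above.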


arc_shrink_shift

This variable controls the amount of RAM that arc_shrink will try to reclaim per shrink event. By default it is set to 5, which equates to shrinking by 1/32 (1/2^5) of arc_max. We tuned this to 11, i.e. 1/2048 (1/2^11) of arc_max; with that setting each shrink event releases about 100MB rather than roughly 6GB of RAM.

Every second a process runs which checks whether data can be removed from the ARC and evicts it. By default at most 1/32nd of the ARC can be evicted at a time; the limit exists because evicting large amounts of data from the ARC stalls all other processes. Back when 8GB was a lot of memory, 1/32nd meant at most 256MB at a time. With 196GB of memory, 1/32nd is about 6.1GB, which can cause 20-30 seconds of unresponsiveness (depending on the record size).
A shift of 11 means 1/2^11, i.e. 1/2048th; 10 means 1/2^10, i.e. 1/1024th; and so on. Choose the value based on the amount of RAM in your system.
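The arithmetic above can be checked directly: the bytes reclaimed per shrink event are simply arc_max right-shifted by arc_shrink_shift. A quick sketch, using the 196GB figure from the example (the numbers are illustrative):

```shell
# bytes evicted per shrink event = arc_max >> arc_shrink_shift
arc_max=$((196 * 1024 * 1024 * 1024))
echo $((arc_max >> 5))    # default shift of 5: 6576668672 bytes (~6.1 GB)
echo $((arc_max >> 11))   # shift of 11: 102760448 bytes (~98 MB)
```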

Default: 0x5

How to change:

# echo arc_shrink_shift/W0xa | mdb -kw


zfs_mdcomp_disable

This parameter controls compression of ZFS metadata (indirect blocks only). ZFS data block compression is controlled by the ZFS compression property, which can be set per file system.

Default: 0

How to change:

# echo zfs_mdcomp_disable/W0t1 | mdb -kw


zfs_prefetch_disable

This parameter controls a file-level prefetching mechanism called zfetch. The mechanism looks at the pattern of reads to files and issues some reads ahead of time, thereby reducing application wait times.

Default: 0

How to change:

# echo zfs_prefetch_disable/W0t1 | mdb -kw


metaslab_aliquot

Metaslab granularity, in bytes. This is roughly similar to what would be called the "stripe size" in traditional RAID arrays. In normal operation, ZFS tries to write this amount of data to a top-level vdev before moving on to the next one.

Traditional vdev space re-balancing works by means of a bias based on a 512K metaslab_aliquot and the number of vdev children. This bias mechanism does not function correctly with large allocation sizes, so an alternate method may need to be devised to allow effective re-balancing when streams of large allocations occur.

Intel is currently working on an alternate re-balancing solution for large blocks.

Default: 0x80000

How to change:

# echo metaslab_aliquot/W0x90000 | mdb -kw


The number of DVAs (data virtual addresses) in a block pointer, the so-called ditto blocks.

Default: 0x3

How to change:


Defines the mode in which a given zpool is initialized internally by ZFS, typically READ/WRITE mode.

Default: 0x3

How to change:


zfs_flags

Sets additional debugging flags:

flag value  symbolic name                 description
0x1         ZFS_DEBUG_DPRINTF             Enable dprintf entries in the debug log
0x2         ZFS_DEBUG_DBUF_VERIFY         Enable extra dbuf verifications
0x4         ZFS_DEBUG_DNODE_VERIFY        Enable extra dnode verifications
0x8         ZFS_DEBUG_SNAPNAMES           Enable snapshot name verification
0x10        ZFS_DEBUG_MODIFY              Check for illegally modified ARC buffers
0x20        ZFS_DEBUG_SPA                 Enable spa_dbgmsg entries in the debug log
0x40        ZFS_DEBUG_ZIO_FREE            Enable verification of block frees
0x80        ZFS_DEBUG_HISTOGRAM_VERIFY    Enable extra spacemap histogram verifications
0x100       ZFS_DEBUG_METASLAB_VERIFY     Verify that on-disk space accounting matches in-core range_trees
0x200       ZFS_DEBUG_SET_ERROR           Enable SET_ERROR and dprintf entries in the debug log

Default: 0x0

How to change:

# echo zfs_flags/W0x8 | mdb -kw
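The flag values form a bitmask, so several can be OR'ed into one setting. A small sketch combining two of the flags from the table (the combined value is then written with the same mdb pattern as above):

```shell
# ZFS_DEBUG_DBUF_VERIFY (0x2) | ZFS_DEBUG_MODIFY (0x10) = 0x12
printf '0x%x\n' $((0x2 | 0x10))
# then: echo zfs_flags/W0x12 | mdb -kw
```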


zfs_txg_synctime_ms

This sets how often (in milliseconds) the cache is flushed to disk (txg sync).

Default: 0x1388

How to change:

# echo zfs_txg_synctime_ms/W0x2000 | mdb -kw
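Note that the defaults in this list are hexadecimal, while mdb's 0t prefix writes decimal values. A quick conversion confirms the default here is 5 seconds:

```shell
# 0x1388 milliseconds in decimal
printf '%d\n' 0x1388   # 5000 ms = 5 seconds
```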


zfs_ssd_txg_synctime_ms

This sets how often (in milliseconds) the cache is flushed to disk (txg sync) for SSD devices only.

Default: 0x2170

How to change:

# echo zfs_ssd_txg_synctime_ms/W0x21700 | mdb -kw


zfs_txg_timeout

Seconds between transaction group (txg) commits (the delay between the ZIL committing changes to the pool).

Default: 0x5

How to change:

# echo zfs_txg_timeout/W0t120 | mdb -kw


Min txg write limit

Default:  0x800000

How to change:


Max txg write limit

Default: 0xff98dc00

How to change:


log2(fraction of memory) per txg (int)

Default: 0x3

How to change:


zfs_write_limit_override

Override the txg write limit.

Default: 0x0

How to change:

# echo zfs_write_limit_override/W0t402653184 | mdb -kw


zfs_no_write_throttle

Disable write throttling.

Default: 0x0

How to change:

# echo zfs_no_write_throttle/W0t1 | mdb -kw


Setting this essentially disables the vdev cache, since random I/Os are not going to be smaller than this size.

Default: 0x4000

How to change:


Total size of the per-disk cache

Default: 0x0

How to change:


The base-2 logarithm of the size used for vdev cache reads (default 0x10, i.e. 64KB reads).

Default: 0x10

How to change:


zfs_vdev_max_pending

This parameter controls how many I/O requests can be pending per vdev. For example, with 100 disks visible to your OS and zfs:zfs_vdev_max_pending set to 2, you have at most 200 requests outstanding. With 100 disks hidden behind a storage controller that presents a single LUN, you will have at most 2 pending requests.

Default: 0xa

How to change:

# echo zfs_vdev_max_pending/W0t35 | mdb -kw
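The ceiling described above scales with the number of vdevs the OS can see, so the same arithmetic from the example can be sketched as:

```shell
# outstanding-I/O ceiling = visible vdevs * zfs_vdev_max_pending
disks=100
max_pending=2
echo $((disks * max_pending))   # 200 requests outstanding at most
```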


Same as above.

Default: 0x4

How to change:


Maximum number of scrub/resilver I/Os per leaf vdev

Default: 0xa

How to change:


Deadline time shift for vdev I/O

Default: 0x6

How to change:


Exponential I/O issue ramp-up rate

Default: 0x2

How to change:


Max vdev I/O aggregation size

Default: 0x20000

How to change:


This parameter controls ZFS write cache flushes for the entire system. Oracle's Sun hardware should not require tuning this parameter. If you need to tune cache flushing, consider tuning it per hardware device; see the general instructions below. Contact your storage vendor for instructions on how to tell the storage devices to ignore the cache flushes sent by ZFS.

Default: 0x1

How to change:


Disable intent log replay. Can be used for recovery from a corrupted ZIL. If zil_replay_disable = 1, then when a volume or file system is brought online, no attempt is made to replay the ZIL and any existing ZIL is destroyed. This can result in loss of data without notice.

Default: 0x0

How to change:


The minimum free space, in percent, which must be available in a space map to continue allocations in a first-fit fashion. Once the space map’s free space drops below this level we dynamically switch to using best-fit allocations.

Default: 0x100000

How to change:


Percentage free space in metaslab

Default: 0x4

How to change:


Enable fault injection.
To handle fault injection, we keep track of a series of zinject_record_t structures which describe which logical block(s) should be injected with a fault. These are kept in a global list. Each record corresponds to a given spa_t and maintains a special hold on the spa_t so that it cannot be deleted or exported while the injection record exists.

Device-level injection is done using the 'zi_guid' field. If this is set, it means that the error is destined for a particular device, not a piece of data. This is a rather poor data structure and algorithm, but we don't expect more than a few faults at any one time, so it should be sufficient for our needs.

Default: 0x0

How to change:


Limit on the data size being sent to the ZIL. (Synchronous writes are either written directly to the pool or go to the slog. The default is 32k; write operations larger than this value are committed directly to the pool.)

Default: 0x8000

How to change:


Bytes to read per chunk

Default: 0x100000

How to change:


A factor used to trigger I/O starvation avoidance behavior. Used in conjunction with zfs_vdev_max_pending to track the earliest I/O that has been issued. If more than zfs_vdev_max_queue_wait full pending queues have been issued since then, that I/O is being starved: no more I/Os are accepted, and the pending queue drains until the starved I/O is processed.

Default: 0x4

How to change:


Max number of streams per zfetch (prefetch streams per file)

Default: 0x8


Min time before an active prefetch stream can be reclaimed

Default: 0x2


Max number of blocks to prefetch at a time

Default: 0x100


If prefetching is enabled, disable prefetching for reads larger than this size.

Default: 0x100000


Set for no scrub I/O. Use 1 for yes and 0 for no (default).

Default: 0x0


Set for no scrub prefetching. Use 1 for yes and 0 for no (default).

Default: 0x0


  • 11.3

fzap_default_block_shift = 0xe
metaslab_gang_threshold = 0x100001
vdev_mirror_shift = 0x15
zvol_immediate_write_sz = 0x8000
zfs_no_scan_io = 0x0
zfs_no_scan_prefetch = 0x0
zfetch_maxbytes_ub = 0x2000000
zfetch_maxbytes_lb = 0x400000
zfetch_target_blks = 0x100
zfetch_throttle_interval = 0xa
zfetch_num_hash_buckets = 0x400000
zfetch_ageout = 0xa
zfetch_ageout_sleep_time = 0x2
zfs_default_bs = 0x9
zfs_default_ibs = 0xe
zfs_vdev_future_reads = 0x2
zfs_vdev_future_read_bytes = 0x40000
zfs_vdev_future_writes = 0x2
zfs_vdev_future_write_bytes = 0x40000

  • 11.2 / 11.1

zfs_vdev_future_pending = 0xa

