[Solaris] zfs_params

The ZFS tuning parameters in use are listed below.

arc_reduce_dnlc_percent

If the ARC detects low memory (via arc_reclaim_needed()), it calls arc_kmem_reap_now() and subsequently dnlc_reduce_cache(), which reduces the number of DNLC entries by 3% (ARC_REDUCE_DNLC_PERCENT).

When tuning this, dnlc_nentries is worth watching, especially if it is much smaller than ncsize.

Default: 0x3

How to change:

# echo arc_reduce_dnlc_percent/W0t2 | mdb -kw

zfs_arc_max, zfs_arc_min (deprecated in 11.2)

Determines the maximum/minimum size of the ZFS Adjustable Replacement Cache (ARC). Solaris 11.2 deprecates the zfs_arc_max kernel parameter in favor of user_reserve_hint_pct.

Default:

How to change:
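A sketch of a persistent setting via /etc/system (the values below are illustrative only; size them to your system's RAM and verify that your release still honors these parameters):

```
* /etc/system fragment -- illustrative values, not recommendations
set zfs:zfs_arc_max = 0x400000000
set zfs:zfs_arc_min = 0x100000000
```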

arc_shrink_shift

This variable controls the amount of RAM that arc_shrink will try to reclaim. By default it is set to 5, which equates to shrinking by 1/32 (2^-5) of arc_max. We tuned this to 11, which is 1/2048 (2^-11) of arc_max. With that setting we would be shrinking the ARC by about 100 MB per shrink event, rather than 6 GB of RAM.

Every second a process runs that checks whether data can be removed from the ARC and evicts it. By default at most 1/32nd of the ARC can be evicted at a time; this is limited because evicting large amounts of data from the ARC stalls all other processes. Back when 8 GB was a lot of memory, 1/32nd meant at most 256 MB at a time. With 196 GB of memory, 1/32nd is 6.3 GB, which can cause up to 20-30 seconds of unresponsiveness (depending on the record size).
(A shift of 11 means 2^11, i.e. 1/2048th; 10 means 2^10, i.e. 1/1024th, and so on. Choose it depending on the amount of RAM in your system.)
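To make the arithmetic above concrete, here is the reclaim size per shrink event for the 196 GB example at different shift values (plain shell arithmetic, not a Solaris command):

```shell
# ARC reclaim per shrink event = arc_max >> arc_shrink_shift
arc_max_mb=$((196 * 1024))                          # 196 GB expressed in MB
echo "shift=5 reclaims $((arc_max_mb >> 5)) MB"     # default 1/32   -> 6272 MB
echo "shift=11 reclaims $((arc_max_mb >> 11)) MB"   # tuned  1/2048  -> 98 MB
```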

Default:

How to change:
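A sketch of an online change in the same mdb style as the document's other examples, using the tuned value of 11 discussed above (verify that the symbol exists on your kernel build before writing to it):

```
# echo arc_shrink_shift/W0t11 | mdb -kw
```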

zfs_mdcomp_disable

This parameter controls compression of ZFS metadata (indirect blocks only). ZFS data block compression is controlled by the ZFS compression property that can be set per file system.

Default: 0

How to change:

# echo zfs_mdcomp_disable/W0t1 | mdb -kw

zfs_prefetch_disable

This parameter controls a file-level prefetching mechanism called zfetch. This mechanism looks at the pattern of reads to files and anticipates some reads, thereby reducing application wait times.

Default: 0

How to change:

# echo zfs_prefetch_disable/W0t1 | mdb -kw

metaslab_aliquot 

Metaslab granularity, in bytes. This is roughly similar to what would be referred to as the "stripe size" in traditional RAID arrays. In normal operation, ZFS will try to write this amount of data to a top-level vdev before moving on to the next one.

The traditional VDEV space re-balancing occurred by means of a bias based on a 512K metaslab_aliquot and the number of VDEV children.  This bias mechanism will not function correctly with large allocation sizes. An alternate method may need to be devised to allow effective re-balancing when streams of large allocations occur.

Intel is currently working on an alternate re-balancing solution for large blocks.

Default: 0x80000

How to change:
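The default of 0x80000 works out to the 512K aliquot mentioned in the re-balancing note above (plain shell arithmetic to decode the hex value):

```shell
# metaslab_aliquot default: 0x80000 bytes
echo "$((0x80000 / 1024)) KB"   # -> 512 KB
```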

spa_max_replication_override

The number of DVAs (data virtual addresses) in a block pointer, the so-called ditto blocks.

Default: 0x3

How to change:

spa_mode_global

Defines the mode in which a given zpool can be opened internally by ZFS, typically READ/WRITE mode.

Default: 0x3

How to change:

zfs_flags

Set additional debugging flags

flag value symbolic name description
0x1 ZFS_DEBUG_DPRINTF Enable dprintf entries in the debug log
0x2 ZFS_DEBUG_DBUF_VERIFY Enable extra dnode verifications
0x4 ZFS_DEBUG_DNODE_VERIFY Enable extra dnode verifications
0x8 ZFS_DEBUG_SNAPNAMES Enable snapshot name verification
0x10 ZFS_DEBUG_MODIFY Check for illegally modified ARC buffers
0x20 ZFS_DEBUG_SPA Enable spa_dbgmsg entries in the debug log
0x40 ZFS_DEBUG_ZIO_FREE Enable verification of block frees
0x80 ZFS_DEBUG_HISTOGRAM_VERIFY Enable extra spacemap histogram verifications
0x100 ZFS_DEBUG_METASLAB_VERIFY Verify space accounting on disk matches in-core range_trees
0x200 ZFS_DEBUG_SET_ERROR Enable SET_ERROR and dprintf entries in the debug log

Default: 0x0

How to change:

# echo zfs_flags/W0x8 | mdb -kw
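The flags form a bitmask, so several can be enabled at once by OR-ing their values; for example, combining ZFS_DEBUG_MODIFY (0x10) with ZFS_DEBUG_METASLAB_VERIFY (0x100):

```shell
# combine debug flags by OR-ing their bit values
printf '0x%x\n' $((0x10 | 0x100))   # -> 0x110
# then write the combined mask, e.g.:
# echo zfs_flags/W0x110 | mdb -kw
```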

zfs_txg_synctime_ms

This sets how often (in milliseconds) the cache is flushed to disk (txg sync).

Default: 0x1388

How to change:
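The default of 0x1388 decodes to 5000 ms, i.e. a txg sync every five seconds; the mdb line below is a sketch with a hypothetical value (verify the symbol on your build):

```shell
# 0x1388 == 5000 ms, i.e. a txg sync every 5 seconds by default
echo $((0x1388))
# hypothetical online change to 1000 ms:
# echo zfs_txg_synctime_ms/W0t1000 | mdb -kw
```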

zfs_txg_timeout 

Seconds between transaction group (txg) commits.

Default: 0x5

How to change:

# echo "zfs_txg_timeout/Z 3c" | mdb -kw

zfs_write_limit_min

Minimum txg write limit.

Default:  0x800000

How to change:

zfs_write_limit_max

Maximum txg write limit.

Default: 0xff98dc00

How to change:

zfs_write_limit_shift

log2(fraction of memory) per txg (int)

Default: 0x3

How to change:

zfs_write_limit_override

Override txg write limit

Default: 0x0

How to change:

# echo zfs_write_limit_override/W0t402653184 | mdb -kw
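The value written in the example above, 402653184, is 384 MiB (plain shell arithmetic to decode it):

```shell
echo "$((402653184 / 1024 / 1024)) MiB"   # -> 384 MiB
```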

zfs_no_write_throttle

Disable write throttling

Default: 0x0

How to change:

# echo zfs_no_write_throttle/W 1 | mdb -kw

zfs_vdev_cache_max

Maximum size of an I/O served through the vdev cache. Tuning this down essentially disables the vdev cache, since random I/Os are not going to be smaller than XXX.

Default: 0x4000

How to change:

zfs_vdev_cache_size

Total size of the per-disk cache

Default: 0x0

How to change:

zfs_vdev_cache_bshift

The base-2 logarithm of the size used to read disks.

Default: 0x10

How to change:
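Since this is a base-2 logarithm, the default of 0x10 (decimal 16) means device reads are inflated to 2^16 bytes, i.e. 64 KB; the mdb line is a sketch with a hypothetical value:

```shell
echo "$(( (1 << 16) / 1024 )) KB"   # 2^16 bytes -> 64 KB
# hypothetical change to 8 KB reads (shift of 13):
# echo zfs_vdev_cache_bshift/W0t13 | mdb -kw
```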

zfs_vdev_max_pending

This parameter controls how many I/O requests can be pending per vdev. For example, with 100 disks visible to your OS and zfs:zfs_vdev_max_pending set to 2, you have at most 200 outstanding requests. When those 100 disks are hidden behind a storage controller that presents a single LUN, you will have, as you might guess, at most 2 pending requests.

Default: 0xa

How to change:

# echo zfs_vdev_max_pending/W0t35 | mdb -kw

zfs_vdev_min_pending

The same as above, but for the minimum number of pending I/Os per vdev.

Default: 0x4

How to change:

zfs_scrub_limit

maximum number of scrub/resilver I/O per leaf vdev

Default: 0xa

How to change:

zfs_vdev_time_shift

Deadline time shift for vdev I/O

Default: 0x6

How to change:

zfs_vdev_ramp_rate

Exponential I/O issue ramp-up rate

Default: 0x2

How to change:

zfs_vdev_aggregation_limit

Max vdev I/O aggregation size

Default: 0x20000

How to change:

zfs_nocacheflush

This parameter controls ZFS write cache flushes for the entire system. Oracle's Sun hardware should not require tuning this parameter. If you need to tune cache flushing, consider tuning it per hardware device. See the general instructions below. Contact your storage vendor for instructions on how to tell the storage devices to ignore the cache flushes sent by ZFS.

Default: 0x1

How to change:
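A sketch of an online change in the document's usual mdb style (verify the symbol on your release, and prefer per-device tuning as noted above):

```
# echo zfs_nocacheflush/W0t1 | mdb -kw
```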

zil_replay_disable

Disable intent logging replay

Default: 0x0

How to change:

metaslab_df_alloc_threshold

The minimum free space, in bytes, which must be available in a metaslab's space map to continue allocations in a first-fit fashion. Once the space map's free space drops below this level, ZFS dynamically switches to best-fit allocations.

Default: 0x100000

How to change:

metaslab_df_free_pct

Percentage free space in metaslab

Default: 0x4

How to change:

zio_injection_enabled

Enable fault injection.
To handle fault injection, we keep track of a series of zinject_record_t structures, which describe which logical block(s) should be injected with a fault. These are kept in a global list. Each record corresponds to a given spa_t and maintains a special hold on the spa_t so that it cannot be deleted or exported while the injection record exists.

Device-level injection is done using the 'zi_guid' field. If this is set, it means that the error is destined for a particular device, not a piece of data.

This is a rather poor data structure and algorithm, but we don't expect more than a few faults at any one time, so it should be sufficient for our needs.

Default: 0x0

How to change:

zfs_immediate_write_sz

Limit on the size of data being sent to the ZIL. (Synchronous writes are either written directly to the pool or written to the slog. The default is 32k; writes larger than this value are performed directly in the pool.)

Default: 0x8000

How to change:
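The default of 0x8000 decodes to the 32 KB threshold mentioned above (plain shell arithmetic):

```shell
echo "$((0x8000 / 1024)) KB"   # default threshold -> 32 KB
```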

zfs_read_chunk_size

Bytes to read per chunk

Default: 0x100000

How to change:

zfs_vdev_max_queue_wait

A factor used to trigger I/O starvation-avoidance behavior. It is used in conjunction with zfs_vdev_max_pending to track the earliest I/O that has been issued: if more than zfs_vdev_max_queue_wait full pending queues have been issued since then, that I/O is being starved, and no more I/Os are accepted until the pending queue drains enough for the starved I/O to be processed.

Default: 0x4

How to change:
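Reading the description above literally (an interpretation, not an official formula), the starvation window with the defaults here (0x4) and for zfs_vdev_max_pending (0xa) works out to roughly:

```shell
# starvation window ~= max_queue_wait full queues * max_pending per queue
echo "$((0x4 * 0xa)) I/Os"   # -> 40 I/Os
```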

zfetch_max_streams

Max number of streams per zfetch (prefetch streams per file).

Default: 0x8

zfetch_min_sec_reap

Min time before an active prefetch stream can be reclaimed

Default: 0x2

zfetch_block_cap

Max number of blocks to prefetch at a time

Default: 0x100

zfetch_array_rd_sz

If prefetching is enabled, disable prefetching for reads larger than this size.

Default: 0x100000

zfs_no_scrub_io

Set for no scrub I/O. Use 1 for yes and 0 for no (default).

Default: 0x0

zfs_no_scrub_prefetch

Set for no scrub prefetching. Use 1 for yes and 0 for no (default).

Default: 0x0

Unknown:

  • 11.3

fzap_default_block_shift = 0xe
metaslab_gang_threshold = 0x100001
vdev_mirror_shift = 0x15
zvol_immediate_write_sz = 0x8000
zfs_no_scan_io = 0x0
zfs_no_scan_prefetch = 0x0
zfetch_maxbytes_ub = 0x2000000
zfetch_maxbytes_lb = 0x400000
zfetch_target_blks = 0x100
zfetch_throttle_interval = 0xa
zfetch_num_hash_buckets = 0x400000
zfetch_ageout = 0xa
zfetch_ageout_sleep_time = 0x2
zfs_default_bs = 0x9
zfs_default_ibs = 0xe
zfs_vdev_future_reads = 0x2
zfs_vdev_future_read_bytes = 0x40000
zfs_vdev_future_writes = 0x2
zfs_vdev_future_write_bytes = 0x40000

  • 11.2 / 11.1

zfs_vdev_future_pending = 0xa
