I have a ZFS mirror of two 5TB SMR drives here, attached over USB3. It is 90% full (4.10T / 4.55T). A scrub takes 13 hours, which I find more than reasonable. Also good to realize:
Scrub and resilver use exactly the same code
From the video linked below.
Moreover, the problem you describe did indeed apply to ZFS for a long time. With sequential scrub and resilver, however, it is much less of an issue. In recent ZFS versions you will also notice that the scrub output differs from what you were used to:
root@pve:/# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: scrub in progress since Fri Apr 17 20:23:18 2020
        4.93G scanned at 505M/s, 2.70G issued at 276M/s, 4.93G total
        0B repaired, 54.68% done, 0 days 00:00:08 to go
The difference is the new split between scanned and issued, which older versions did not have.
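With that split, the progress percentage and ETA follow from the issued figures, not the scan rate. A minimal sketch of that arithmetic (my own back-of-the-envelope calculation, not ZFS code), using the values from the zpool status output above:

```python
# Values as displayed in the zpool status output above (rounded by zpool,
# so the results match only approximately).
total_gib = 4.93       # total amount to scrub
issued_gib = 2.70      # scrub I/O already issued
issue_rate_mib = 276   # current issue rate in MiB/s

# Remaining time is based on what still has to be *issued*, not scanned.
remaining_mib = (total_gib - issued_gib) * 1024
eta_seconds = remaining_mib / issue_rate_mib

# Likewise, "done" is issued / total.
done_pct = issued_gib / total_gib * 100

print(f"~{done_pct:.1f}% done, about {eta_seconds:.0f} s to go")
```

This reproduces the "54.68% done, 0 days 00:00:08 to go" line within rounding error, since zpool prints rounded sizes.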
This is, by the way, a ZFS mirror of SSDs. The old output looked like this (a random example from the internet):
# zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub in progress since Sat Dec 8 08:06:36 2012
        32.0M scanned out of 48.5M at 16.0M/s, 0h0m to go
        0 repaired, 65.99% done
Explanation from the commit in ZoL:
Currently, scrubs and resilvers can take an extremely long time to complete. This is largely due to the fact that zfs scans process pools in logical order, as determined by each block's bookmark. This makes sense from a simplicity perspective, but blocks in zfs are often scattered randomly across disks, particularly due to zfs's copy-on-write mechanisms.
This patch improves performance by splitting scrubs and resilvers into a metadata scanning phase and an IO issuing phase. The metadata scan reads through the structure of the pool and gathers an in-memory queue of I/Os, sorted by size and offset on disk. The issuing phase will then issue the scrub I/Os as sequentially as possible, greatly improving performance.
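Why sorting by offset helps can be illustrated with a toy model (my own sketch, not the actual ZFS implementation): issuing the same set of reads sorted by disk offset instead of in logical (bookmark) order shrinks the total head movement enormously.

```python
import random

# Toy model of the two-phase scrub: the metadata scan yields I/Os in
# logical order, which (due to copy-on-write) land at effectively random
# disk offsets; the issuing phase sorts them by offset before reading.
random.seed(42)
offsets = [random.randrange(0, 10**9) for _ in range(10_000)]  # logical order

def total_seek(order):
    # Sum of head movement between consecutive reads.
    return sum(abs(b - a) for a, b in zip(order, order[1:]))

random_order = total_seek(offsets)          # old: issue in logical order
sorted_order = total_seek(sorted(offsets))  # new: issue sorted by offset

print(f"seek distance reduced roughly {random_order / sorted_order:.0f}x")
```

In this toy model the sorted pass sweeps the disk once (its total seek distance is just max offset minus min offset), while the logical-order pass jumps back and forth across the whole disk, which is exactly the behavior that made scrubs thrash drives.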
This patch also updates and cleans up some of the scan code which has not been updated in several years.
https://github.com/openzf...df6d0ae33196f5b5decbc48fd
https://github.com/openzfs/zfs/pull/6256

How Has This Been Tested?
Initial performance tests show scrubs taking 5-6x less time with this patch on a 47TB pool consisting of data mimicking our production environment. On zfs version 0.6.8, the scrub took a little over 125 hours, while it only took 22.5 hours with this patch. @skiselkov's presentation at the developer summit cited a 16x performance improvement for a worst-case scenario.
As this patch solidifies we will add more formalized automated tests.
PDF of the original PoC of this idea:
https://drive.google.com/...e4cdmVU91cml1N0pKYTQ/view
with a presentation:
https://www.youtube.com/w...wv8BdBj4&feature=youtu.be
Presentation from the OpenZFS Developer Summit:
https://www.youtube.com/watch?v=upn9tYh917s
So it is already in ZoL 0.8 and FreeNAS 11.1:
https://www.ixsystems.com...ntial-scrub-and-resilver/
My experience is shared by other users:
Both the scrub and resilver performance is MANY times faster with the new sequential scrub/resilver code. On my 6TB WD ES drives, it starts off slow at ~80-100 MB/s, then after about 15 minutes or so ramps up into the hundreds of MB/s and stays there until it's done. It's also much quieter. Scrubs and resilvers used to thrash drives terribly, and now they thrash for the first minute or so, then go almost silent unless you try to access the array during the resilver process.
https://arstechnica.com/c....php?p=37591199#p37591199
[Comment edited by CurlyMo on 17 April 2020 22:05]