SCST cache options

By steve, 21 October, 2015

Having built a number of iSCSI target systems over the years, I have consistently come back to SCST as my preferred implementation. My reasons have varied: IET crashed under load (it does not handle error conditions correctly), and LIO crashed in early releases or did not support naa_id (which VMware needs in order to mount a filesystem if the LUN ID changes, or is different for different clusters).

Having said the above, I have always found the correct nv_cache and write_through options difficult to choose from a data integrity point of view, so I will have a go at describing them:

write_through=1

  • This causes SCST to flush all data to the backing storage device immediately.
  • This is the safest option to choose from a data consistency point of view, since it does not depend on the initiator sending flush commands to force data to be written to disk. It may cause slower data access as a result.
  • From a technical standpoint, setting this flag causes the file/block device to be opened with the O_DSYNC flag, which is how the above is achieved.
  • The official documentation notes that you should make sure your backing disk does not secretly do write-back caching, since enabling this option also turns any sync operation sent from an initiator into a no-op.
  • This also causes the iSCSI device to tell the initiator that it has a write-through cache, which reduces the chance that the initiator will send a flush command (since all data received is meant to be committed before the write is acknowledged).
  • What is not clear from the official documentation is that, if you have a battery-backed RAID controller with write-back cache enabled, you may get better performance by setting write_through=1, since the RAID controller has its own write-back cache. I have just finished debugging a system whose IO was stalling on large bursts of write activity, caused by cache flushes from the Linux OS, because we had write_through=0 (the default value).
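
To make the O_DSYNC point concrete, here is a minimal user-space sketch of what opening a backing file with that flag means. This is not SCST's code (SCST does this inside the kernel); the function names and the backing.img file name are made up for illustration, and it assumes a Unix-like system where Python exposes os.O_DSYNC.

```python
import os
import tempfile


def write_dsync(path: str, data: bytes) -> int:
    """Write data to path through a descriptor opened with O_DSYNC.

    With O_DSYNC, each write() returns only after the OS has pushed the
    data to the storage device. Whether the device itself then holds it
    in a volatile cache is outside the OS's control.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    try:
        return os.write(fd, data)  # number of bytes written
    finally:
        os.close(fd)


def demo() -> int:
    """Write one 4 KiB block to a throwaway file (hypothetical backing file)."""
    with tempfile.TemporaryDirectory() as tmp:
        return write_dsync(os.path.join(tmp, "backing.img"), b"\0" * 4096)
```

Because every write waits for the device, no separate flush call is needed afterwards, which matches the behaviour described above where sync requests from the initiator have nothing left to do.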

nv_cache=1

  • With this option, SCST never guarantees that data has been flushed to the backing storage device.
  • As per the official documentation, if you set this option, then write_through is forced to 0 (even if you explicitly try to set write_through=1).
  • Setting this option causes SCST to ignore any requests to flush data to disk.

nv_cache=0 and write_through=0

  • Unless o_direct is selected, this combination uses the Linux OS memory as a write-back cache, but guarantees that data is on the backing storage as soon as a flush command is received from the iSCSI initiator.
  • This is equivalent to having the write cache enabled on a SAS or SATA hard drive, and depends on the filesystem that is mounted on the iSCSI initiator handling this correctly.
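
The three combinations can be summarised as a small lookup. The sketch below is only a model of the behaviour as described in this post, not anything taken from SCST itself; the Behaviour type and effective_behaviour function are hypothetical names, and the o_direct case is deliberately left out.

```python
from typing import NamedTuple


class Behaviour(NamedTuple):
    write_through: int  # value actually in effect after any override
    writes: str         # how an incoming write is handled
    flush: str          # what happens to a flush request from the initiator


def effective_behaviour(nv_cache: int, write_through: int) -> Behaviour:
    """Model the option combinations described above (o_direct not modelled)."""
    if nv_cache:
        # nv_cache=1 forces write_through to 0 and flush requests are ignored.
        return Behaviour(0, "cached", "ignored")
    if write_through:
        # Writes are committed immediately, so a flush becomes a no-op.
        return Behaviour(1, "synchronous", "no-op")
    # Default: Linux page cache acts as a write-back cache until a flush arrives.
    return Behaviour(0, "cached", "flushed to backing storage")
```

For example, effective_behaviour(1, 1) reports write_through as 0, which is the override that is easy to miss when both options are set.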
