We run several iSCSI setups on a Linux Pacemaker cluster, and I have always had trouble adding more resources. The current setup is as follows:
- IP Primitive - One for each IP address
- Target Primitive - One for the system
- LUN Primitive - One per LUN
- Device Primitive - One per device (may include a clone primitive)
These are controlled through the following mechanisms:
- group groupname target_primitive lun_primitive ... ip_primitive - One per server
- colocation colo_name inf: groupname device_primitive - One per server
- order order_name Mandatory: device_primitive groupname - One per server
The issue is that whenever a new LUN is added, the group has to be updated so that all the LUNs are present before the IP address comes up, and that update causes a brief outage. My idea is to set things up as follows instead:
- IP Primitive - One for each IP address
- Target Primitive - One for each IP address
- LUN Primitive - One per LUN/IP address combination
- Device Primitive - One per device (may include a clone primitive)
- group groupname target_primitive lun_primitive ... ip_primitive - One per IP address (makes sure the target-LUN-IP primitives are colocated and start in that order)
- colocation colo_name inf: groupname1 groupname2 ... device_primitive - One per server (makes sure all resources run on the same host, since iSCSI locks do not work across hosts)
- order order_name Mandatory: device_primitive groupname - One per IP address (makes sure each group starts after the device is ready)
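To make the proposed layout concrete, here is a minimal Python sketch that builds the group, order, and colocation statements for a given set of IP addresses and LUNs. The helper `build_constraints` and every resource name it produces (`g_…`, `p_target_…`, `p_device`, and so on) are made up for illustration. The statement shapes only mirror the ones listed above, and the primitives themselves are assumed to be defined elsewhere.

```python
def build_constraints(ip_names, lun_names, device="p_device"):
    """Return crm-style constraint lines for the per-IP layout.

    ip_names  -- hypothetical short labels, one per IP address (e.g. "ip1")
    lun_names -- hypothetical short labels, one per LUN (e.g. "lun0")
    device    -- hypothetical name of the device primitive (or its clone)
    """
    lines = []
    groups = []
    for ip in ip_names:
        group = f"g_{ip}"
        groups.append(group)
        # Group order: target first, then one LUN primitive per
        # LUN/IP combination, and the IP address last.
        members = [f"p_target_{ip}"]
        members += [f"p_{lun}_{ip}" for lun in lun_names]
        members.append(f"p_{ip}")
        lines.append(f"group {group} " + " ".join(members))
        # One order constraint per IP: the group starts after the device.
        lines.append(f"order o_{ip} Mandatory: {device} {group}")
    # One colocation per server: every group runs where the device runs.
    lines.append("colocation c_all inf: " + " ".join(groups) + f" {device}")
    return lines


if __name__ == "__main__":
    for line in build_constraints(["ip1", "ip2"], ["lun0", "lun1"]):
        print(line)
```

With this shape, adding a LUN only changes the group for the IP address being updated, so the groups for the other IP addresses are left alone.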
We would need to make sure the LUN add/remove code checks for existing resources before running, and only removes the LUN if it is the last one. This requires a custom version of the iSCSI primitive scripts.
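The check described here is essentially reference counting. Below is a rough Python sketch of that logic only, reading "the last one" as the last LUN primitive still using a given backing device. The class `SharedLunRegistry` and its method names are hypothetical; a real resource agent would be a shell script and would have to query the target's actual state rather than keep an in-memory table.

```python
class SharedLunRegistry:
    """Track which LUN primitives currently use each backing device.

    Illustration only: the state lives in memory, whereas a real
    resource agent would inspect the running iSCSI target instead.
    """

    def __init__(self):
        self._users = {}  # backing device path -> set of primitive names

    def start(self, path, primitive):
        """Register a primitive; return True if the LUN must be created."""
        users = self._users.setdefault(path, set())
        must_create = not users  # nobody was using it before this call
        users.add(primitive)
        return must_create

    def stop(self, path, primitive):
        """Unregister a primitive; return True if no users remain."""
        users = self._users.get(path, set())
        users.discard(primitive)
        if users:
            return False  # another primitive still needs the LUN
        self._users.pop(path, None)
        return True  # last user is gone, safe to remove the LUN


if __name__ == "__main__":
    reg = SharedLunRegistry()
    print(reg.start("/dev/example0", "p_lun0_ip1"))
    print(reg.start("/dev/example0", "p_lun0_ip2"))
    print(reg.stop("/dev/example0", "p_lun0_ip1"))
    print(reg.stop("/dev/example0", "p_lun0_ip2"))
```

Only the first `start` and the last `stop` for a given device report that real work is needed; every call in between is a no-op as far as the underlying LUN is concerned.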
The advantage of the second setup is that a LUN can be added to each IP address independently, without disrupting the initiators (they will all fail over to the other path).