LSI Logic has published a new BIOS/firmware release for the MegaRAID SCSI 320-X family. It applies to the new generation of PCI-X RAID adapters, not to the older models with an IOP303 or IOP302 processor and a 64-bit 66MHz PCI interface. The new version contains a number of bug fixes and small improvements. The changes are discussed in this Word document.
Bug Fix #1
Symptom: FW hangs when doing a sequential read on a degraded array with an error
Background: When a read request crosses the stripe size boundary, the request is carried out by issuing more than one read message. If multiple messages were returned with read errors for the same request / command, the firmware did not handle them properly, because the read-error recovery logic was designed to handle a single read-error message per command at a time. A fix was made to queue the remaining read-error messages for the command if the first message was in the read-error recovery process. Completion of recovery for the first message would then start the next message.
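As a rough illustration of why one request can produce several read messages, the sketch below splits a block range at stripe boundaries. The type `read_msg_t` and the function `split_read` are hypothetical names chosen for this example; the release notes do not describe the firmware's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one per-stripe read message. */
typedef struct {
    uint64_t start_lba;   /* first block covered by this message */
    uint32_t num_blocks;  /* number of blocks covered by this message */
} read_msg_t;

/*
 * Split the range [lba, lba + count) into messages that never cross a
 * stripe boundary. Returns the number of messages written to out,
 * which is at most max_msgs.
 */
size_t split_read(uint64_t lba, uint32_t count, uint32_t stripe_blocks,
                  read_msg_t *out, size_t max_msgs)
{
    size_t n = 0;

    if (stripe_blocks == 0) {
        return 0;  /* invalid stripe size: nothing can be split */
    }
    while (count > 0 && n < max_msgs) {
        /* Blocks left before the next stripe boundary. */
        uint32_t room = stripe_blocks - (uint32_t)(lba % stripe_blocks);
        uint32_t len = count < room ? count : room;

        out[n].start_lba = lba;
        out[n].num_blocks = len;
        n++;

        lba += len;
        count -= len;
    }
    return n;
}
```

With a stripe of 128 blocks, a read of 20 blocks starting at LBA 120 becomes two messages (8 blocks, then 12 blocks), and each of them can independently come back with a read error.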
Problem: A command could be held up in the queue, causing the OS to time out. This arose from two problems in the implementation of that fix.
- 1. The fix was intended only for RAID 1 read recovery. RAID 5 read recovery does not need it because it allocates alarm commands to carry out recovery. When an IO request hits a medium error, the firmware sets activeRdErMsgNo to the message number for that command and starts a recovery process to read the data from the other drives. If the other drive also has a medium error, the recovery fails and the original IO request is completed with failure, but activeRdErMsgNo was not cleared for that command ID. When the driver reuses the same command ID and the firmware hits a medium error again, the FW concludes that a previous message is still active in the read recovery state, since activeRdErMsgNo for that command is already set. The FW then queues the new message, expecting the previous message to wake it up after recovery completes. Since no message is actually in read recovery, the queued message is never processed and stays in the queue forever.
- 2. The fix causes the same problem on a degraded logical drive. A degraded logical drive cannot do read recovery, so the fix is not appropriate for it.
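The stale-marker failure described in item 1 can be reproduced in a small model. This is a sketch under assumptions: only the field name `activeRdErMsgNo` comes from the release notes, while `cmd_slot_t`, `on_read_error`, and `complete_failed_buggy` are hypothetical names, and the real firmware logic is not shown in the source.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CMDS 8

/* Hypothetical per-command slot, indexed by command ID. */
typedef struct {
    uint8_t activeRdErMsgNo;  /* non-zero: a message is believed to be in recovery */
    uint8_t queued;           /* messages parked until a wake-up arrives */
} cmd_slot_t;

cmd_slot_t slots[MAX_CMDS];

/*
 * Handle a read error on message msg_no (1-based) of command cmd_id.
 * Returns true if recovery starts, false if the message is parked
 * behind a message that is assumed to be in recovery already.
 */
bool on_read_error(unsigned cmd_id, uint8_t msg_no)
{
    cmd_slot_t *s = &slots[cmd_id];

    if (s->activeRdErMsgNo != 0) {
        s->queued++;          /* waits for a wake-up from the active message */
        return false;
    }
    s->activeRdErMsgNo = msg_no;
    return true;
}

/*
 * Models the faulty completion path: recovery failed and the command
 * was completed with an error, but the marker is left set.
 */
void complete_failed_buggy(unsigned cmd_id)
{
    (void)cmd_id;             /* activeRdErMsgNo is deliberately not cleared */
}
```

After a failed recovery on a command ID, the next read error on the reused ID is parked even though nothing is in recovery, so no wake-up ever arrives. That matches the hang described in the notes.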
How to Validate: Create a RAID 1 array. Perform a slow initialization followed by a check consistency to ensure that all drives are consistent and the array is optimal. Fail pd1. Using the serial monitor, invoke the debug menu for BBM to create an error on pd0 at LBA 500. Use IOMeter to run a 100% sequential read test. When it hits the bad block at LBA 500, the firmware should not hang.
Fix:
- 1. A message is inserted into the queue (by setting activeRdErMsgNo) only for a non-degraded (optimal) RAID 1.
- 2. For each command, activeRdErMsgNo, rdErMsgWaitHead, and rdErMsgWaitTail are initialized to zero.
- 3. When a command completes, the FW checks whether activeRdErMsgNo, rdErMsgWaitHead, or rdErMsgWaitTail is non-zero. A non-zero value indicates that some messages are still waiting to be processed, and the FW breaks into the debugger.
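The three points above can be sketched as follows. The three field names come from the release notes; the enums, the struct layout, and the function names are assumptions made for this example, and a boolean return value stands in for the firmware's break into the debugger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enums; the firmware's real definitions are not given. */
typedef enum { LD_OPTIMAL, LD_DEGRADED } ld_state_t;
typedef enum { RAID1 = 1, RAID5 = 5 } raid_level_t;

/* Per-command read-error bookkeeping, using the field names from the notes. */
typedef struct {
    uint8_t activeRdErMsgNo;
    uint8_t rdErMsgWaitHead;
    uint8_t rdErMsgWaitTail;
} cmd_rd_err_t;

/* Point 2: start every command with all three fields set to zero. */
void cmd_rd_err_init(cmd_rd_err_t *c)
{
    c->activeRdErMsgNo = 0;
    c->rdErMsgWaitHead = 0;
    c->rdErMsgWaitTail = 0;
}

/* Point 1: queueing is allowed only on an optimal RAID 1 logical drive. */
bool may_queue_rd_err(raid_level_t level, ld_state_t state)
{
    return level == RAID1 && state == LD_OPTIMAL;
}

/*
 * Point 3: completion-time check. Returns true when nothing is left
 * pending; on false, the firmware would break into the debugger.
 */
bool cmd_rd_err_clean(const cmd_rd_err_t *c)
{
    return c->activeRdErMsgNo == 0 &&
           c->rdErMsgWaitHead == 0 &&
           c->rdErMsgWaitTail == 0;
}
```

Gating the queue on the array state keeps degraded RAID 1 and RAID 5 requests out of the wait queue entirely, while the completion check turns a silent hang into an immediate, visible failure during testing.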
Bug Fix #2
Symptom: Fix Macro SET_LDRV_OP_STATUS
Fix: The macro used an uninitialized variable that was passed as a macro argument. This caused problems where the variable held a meaningless value or referenced itself. This was observed on Dobson builds.
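The release notes do not show the definition of SET_LDRV_OP_STATUS, so the following is only a generic illustration of one way a macro argument can end up self-referencing: a temporary declared inside the macro has the same name as the caller's variable. The names `ldrv_t`, `SET_STATUS_BAD`, and `set_status` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical logical-drive record with an operation status field. */
typedef struct {
    uint8_t op_status;
} ldrv_t;

/*
 * Hazardous pattern, shown only as a comment so it is never compiled:
 *
 *   #define SET_STATUS_BAD(ld, st) \
 *       do { uint8_t status = (st); (ld)->op_status = status; } while (0)
 *
 *   uint8_t status = 3;
 *   SET_STATUS_BAD(&ld, status);
 *
 * The call expands to "uint8_t status = (status);", so the new local
 * variable is initialized from itself and holds an indeterminate value.
 */

/*
 * Safer alternative: an inline function evaluates its argument in the
 * caller's scope before any local variable exists, so no name can be
 * captured.
 */
static inline void set_status(ldrv_t *ld, uint8_t st)
{
    ld->op_status = st;
}
```

Where a macro is unavoidable, giving its internal temporaries names that cannot collide with caller variables avoids the same capture.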
Details: Reverted to the original MAKEFILE to avoid stamping the ROM image as BETA.
How to validate: BIOS POST will display accordingly.
Purpose: Fix warning messages during compilation.
Details: LogNvramEvent() is called only if F_MYLEX_PORTING is enabled
How to validate: NA
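A compile-time guard of the kind described for LogNvramEvent() might look like the sketch below. The flag name F_MYLEX_PORTING and the function name come from the notes; the function's signature and body, the wrapper macro `LOG_NVRAM_EVENT`, the counter, and the caller `report_event` are all assumptions for illustration.

```c
#include <stdio.h>

/* Counts log calls so the effect of the build flag can be observed. */
int g_nvram_events = 0;

#ifdef F_MYLEX_PORTING
/* Stand-in body; the real signature and behavior are not in the notes. */
static void LogNvramEvent(int event_code)
{
    g_nvram_events++;
    printf("NVRAM event %d\n", event_code);
}
#define LOG_NVRAM_EVENT(code) LogNvramEvent(code)
#else
/* Flag disabled: evaluate the argument, make no call, raise no warning. */
#define LOG_NVRAM_EVENT(code) ((void)(code))
#endif

/* Hypothetical caller that logs through the guarded wrapper. */
int report_event(int event_code)
{
    LOG_NVRAM_EVENT(event_code);
    return g_nvram_events;
}
```

Routing every call through one guarded wrapper means that builds without the flag contain neither the function nor any reference to it, which is a common way to remove unused-symbol warnings.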