ipmiutil sensor not working for Huawei or DELL hardware (driver)
The ipmiutil package is compatible with any IPMI-compliant firmware/hardware. I see that you are running Windows, so either the Intel or Microsoft IPMI driver is a prerequisite. See ipmiutil UserGuide section 5.1 https://ipmiutil.sourceforge.net/docs/UserGuide. Due to some unresolved bugs in the Microsoft driver, the Intel driver is preferred.
ipmiutil sensor not working for Huawei or DELL hardware
Hi, this is only a Debian bug. It can be closed here. CU Jörg
Apologies, Andy. The above file may not be helpful in your debugging. I had to clear the old SEL logs because the free record count was zero. I have attached the ipmiutil_event.log that was captured earlier. See if it helps.
please find attached.
SMBIOS entry point
ipmi_cmd (GET_SEL_ENTRY) returns response with incorrect id.
Could you provide the 'ipmiutil sel' output (without debug) from this system, so that I can see the record ids in order?
ipmiutil command return error -3
To avoid this issue, you could load one of the IPMI drivers. For the BMC timeout with the driverless method resulting in -3, ipmiutil is recovering from the BMC timeout as well as can be expected and the retry succeeds.
Invalid data field request error
Fan speed sensor is not able to come back to OK state
Thanks for your response, Andy.
What you describe implies that the firmware has the fan sensor locked in that state until a reboot or power cycle. This may be on purpose, i.e. the firmware persists the threshold state to force attention to it, or it may be a firmware bug. The software (in this case ipmiutil) is reporting the state that the firmware sets. So this would be a question for the firmware vendor.
That makes sense. Thanks for doing the research to present it this way. We would need to leverage /sys/firmware/efi/systab to support this method, and that file exists on Linux only, not on Windows or other platforms.
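As a rough illustration of the systab approach, here is a minimal sketch that pulls the SMBIOS entry address out of the text of /sys/firmware/efi/systab. It assumes the usual NAME=0xADDRESS line format. The function names are hypothetical, and preferring an SMBIOS3 entry over SMBIOS is an assumption of this sketch, not something ipmiutil is known to do.

```python
def parse_efi_systab(text):
    """Map table names (e.g. 'SMBIOS') to integer addresses.

    Assumes each useful line looks like NAME=0xADDRESS.
    """
    tables = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a NAME=VALUE line
        try:
            tables[name] = int(value, 16)
        except ValueError:
            continue  # value is not a hex address, skip it
    return tables


def smbios_entry_address(text):
    """Return the SMBIOS entry point address, or None if not listed."""
    tables = parse_efi_systab(text)
    # Assumption: prefer the 64-bit SMBIOS3 entry when both are present.
    for key in ("SMBIOS3", "SMBIOS"):
        if key in tables:
            return tables[key]
    return None


# Illustrative content only; real addresses differ per machine.
sample = "ACPI20=0x7ff7e014\nACPI=0x7ff7e000\nSMBIOS=0x7ff7d000\n"
address = smbios_entry_address(sample)
```

On a live system the text would come from reading /sys/firmware/efi/systab, which normally requires root.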
SMBIOS entry point
Fan speed sensor is not able to come back to OK state
Thank you for your response, Andy. But no, that was not the case. If you look closely at the log below: get_sel(0) rv=0 cc=0 id=b4 next=2200 sel ok, id=0 next=2200. We are requesting recid=0, and the usual response should be 1 or ffff, so why 2200? (There was no SEL record identified by 2200 when this was logged.) And in the log below: get_sel(dff) rv=0 cc=0 id=dff next=7 sel ok, id=dff next=7. The recid is dff, so the next record is supposed to be e00 or ffff (LAST_REC), but why is it going back to 7 and relogging...
OK, this seems to be occurring when the recid requested is a large number (e.g. 2200) but in the interim, the SEL has been cleared, so the current last recid is smaller (e.g. 0144). It sounds like we need a method for getevt to notice that the SEL has been cleared and reset its last recid. If the getevt service were restarted, it would get back to normal, but it needs to self-correct.
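A minimal sketch of the self-correction idea, assuming the reader can learn the id of the newest record currently in the SEL. The function name and both parameters are hypothetical, and this is not the getevt implementation. It only shows the comparison that would notice a cleared SEL.

```python
def resume_record_id(saved_id, newest_id):
    """Pick the record id to resume reading from.

    saved_id:  last record id this reader processed (hypothetical state)
    newest_id: id of the newest record now in the SEL, 0 if the SEL is empty
    """
    if saved_id > newest_id:
        # The saved id no longer exists, so the SEL was presumably cleared.
        # Record id 0 asks the BMC for the first entry.
        return 0
    return saved_id


# The case from this thread: saved id 2200, but the SEL now ends at 0144.
start = resume_record_id(0x2200, 0x0144)
```

With this check the reader would restart from the first record instead of requesting an id that is gone.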
ipmi_cmd (GET_SEL_ENTRY) returns response with incorrect id.
Issues with windows msi installer
After uninstall - wrong path: Ouch. I need to build a new one with that fixed. Problem with 2 eay dll files: This works if installed as administrator and those 2 DLLs get put into Windows\system32, but could fail to copy those if the install user does not have write privileges to system32. The right answer is to change it to copy the 2 DLLs into the ipmiutil directory instead.
Issues with windows msi installer
Yes, the -3 error is showing that there was no response from the BMC within the timeout (300ms). This usually means that the BMC was busy with some event.
Hi Andy, Thanks for the response. By "the firmware", do you mean the BMC firmware? Does that mean the error is coming from the BMC side? Thanks, Arvind
ipmiutil command return error -3
Notice that it first tries to open the OpenIPMI driver, then the imb driver, then the direct/driverless method. Using one of the drivers allows various software to access the firmware. The direct/driverless method depends on being the only client accessing the firmware. It will get errors if either some other program is accessing the firmware, or if the firmware is busy with an event. However, it has a loop to try again, and it succeeds when trying again.
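The probe order and the retry on -3 described above can be sketched roughly as follows. All names here are hypothetical stand-ins, not ipmiutil functions. The only value taken from this thread is -3 as the "no response from the BMC" return code.

```python
import time

RECV_FAIL = -3  # return code quoted in this thread for a BMC receive timeout


def open_first_available(openers):
    """Try each (name, opener) pair in order; return the first name that opens."""
    for name, opener in openers:
        if opener():
            return name
    return None


def send_with_retry(send, retries=3, delay=0.0):
    """Call send() until it returns something other than RECV_FAIL."""
    rv = RECV_FAIL
    for _ in range(retries):
        rv = send()
        if rv != RECV_FAIL:
            break  # success or a different error: stop retrying
        time.sleep(delay)  # give a busy BMC a moment before the next try
    return rv


# Hypothetical probe: no kernel driver loaded, so driverless is chosen.
chosen = open_first_available([
    ("open", lambda: False),
    ("imb", lambda: False),
    ("driverless", lambda: True),
])
```

The retry count and delay are arbitrary here. The thread only says that a retry loop exists and that the retry succeeds.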
ipmiutil command return error -3
Can't find SMBIOS address entry point.
So the bottom line is that this AMI CSM_Support setting appears to disable the server management firmware if it is set to disabled. Not much that ipmiutil can do about that.
"Out of curiosity, who is the BIOS/platform vendor that has this setting?" AMI. I have run "dmidecode --no-sysfs" and found no difference in the output when "CSM_SUPPORT" is on or off, except 2 bytes in "OEM specific type".
OK, so apparently this "CSM_Support" setting controls all of the Server Management firmware stuff. So when it is turned off, the IPMI driver cannot load, because the firmware is disabled. Nothing we can do about that. Out of curiosity, who is the BIOS/platform vendor that has this setting?
Thanks, Andy, for the prompt response. Please find answers inline below. 1) Did this work OK before you disabled CSM_SUPPORT in the BIOS? [Arvind]: Yes, if I enable CSM_SUPPORT from BIOS setup it starts working again. 2) Were you running this as root? If not, please try that. If so, please include the results with '-x' for debug. [Arvind]: Yes, I am running it as root. root@8190:~# icmd -x 0 20 18 1 ipmiutil ver 2.98 icmd ver 2.98 This is a test tool to compose IPMI commands. Do not use without knowledge...
The SMBIOS part in util/mem_if.c is looking for the string "SM" in the physical memory starting at 0xF000. Perhaps disabling CSM_SUPPORT affects this (?). However, the key issue appears to be with the IPMI firmware. You don't have a driver loaded (best option), but ipmiutil would have tried to use driverless, which would be sufficient if you are running as root. Driverless is intended for maintenance (single user) mode. Two questions: 1) Did this work OK before you disabled CSM_SUPPORT in the BIOS?...
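For readers unfamiliar with the legacy lookup, here is a small sketch of an anchor scan over a memory image. It follows the SMBIOS specification's description (a four-byte "_SM_" anchor on a 16-byte boundary in the 0xF0000-0xFFFFF range). Those values are assumptions relative to the wording above and have not been checked against util/mem_if.c. The function name and defaults are hypothetical.

```python
def find_smbios_anchor(region, base=0xF0000, anchor=b"_SM_", step=16):
    """Return the physical address of the anchor, or None if absent.

    region: bytes copied from physical memory starting at `base`
    step:   alignment on which the anchor is expected to start
    """
    for offset in range(0, len(region) - len(anchor) + 1, step):
        if region[offset:offset + len(anchor)] == anchor:
            return base + offset
    return None


# Tiny fake memory image with the anchor placed at offset 0x20.
image = bytearray(64)
image[32:36] = b"_SM_"
found = find_smbios_anchor(bytes(image))
```

If a BIOS setting moves or hides the entry point, a scan like this returns None, which would be consistent with the "Can't find SMBIOS address entry point" message.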
Can't find SMBIOS address entry point.
getevt exits with return code ( rv = -3)
ipmiutil.exe dependency
Question answered. Problem resolved
if we do not get a response with the above information by 6/30, we will assume that Pankaj resolved it via step 1 or 2 above.
Resolved the issue. The FRU data is ok now. No change to ipmiutil is warranted.
The show_fru() stops printing in the middle of processing Board Info area with a single byte data for a field
Thank you, Andy. Actually I saw that "0xC1 0x20" had been set in the "Board Product Name" field in the Board Info area. And I noticed that ipmiutil stopped printing the Board Info area at "Board Product Name", so that field and the remaining fields were not shown.
Actually, the 0xC1 is empirically the end of the FRU data. Keep in mind that the length portion of this byte (0x01) includes the current byte. So if there were a one byte FRU data field with 'x', it would be 0xC2 0x78. There are several firmware vendors who have garbage after the FRU_END byte, so it is needful to exit the loop when 0xC1 is encountered. Over the years, several firmware FRU data flaws have been handled via ChkOverflow() and ValidTL(), but the FRU_END 0xC1 cannot be changed. Also, the...
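To make the byte values above easier to read, this sketch splits a FRU type/length byte into its two parts (type code in bits 7:6, count in bits 5:0, per the IPMI FRU storage layout) and tests for the 0xC1 end marker. The function names are hypothetical. It does not walk a whole area and is not the show_fru() code.

```python
FRU_END = 0xC1  # end-of-fields marker discussed above


def split_type_length(b):
    """Return (type_code, count) from one FRU type/length byte."""
    return (b >> 6) & 0x03, b & 0x3F


def is_fru_end(b):
    """True when the byte is the end-of-fields marker."""
    return b == FRU_END


end_parts = split_type_length(0xC1)   # type code 3, count 1
next_parts = split_type_length(0xC2)  # type code 3, count 2
```

A parser that honors the marker stops at the first 0xC1 it meets in a type/length position, no matter what bytes follow it.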
The show_fru() stops printing in the middle of processing Board Info area with a single byte data for a field
In the 3.1.8 code, when the isensor module exits, it always frees the SDR buffer that was allocated. I notice that your screenshot shows memory being left only sometimes. I'm wondering if perhaps the 'ipmiutil sensor' code is being killed or aborted in those cases before it can free the memory. We do know that MS Windows has gotten worse about memory cleanup with the later versions. 1) I'm wondering if there is a Windows function that we can configure to be called when a kill/abort event occurs,...
Coming back to this, you have probably already handled it, but you have a few alternatives: 1) Contact the OpenIPMI driver project to resolve the slowness with kernel 5.4.0. 2) Create a bash script to run the ipmiutil sel via driverless with only one session, which would need to save the last event, and only take action if the last event does not match the saved event (i.e. one or more new events). 3) In ipmiutil, we could create an option to igetevent.c which would cause it to create a fresh session...
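Alternative 2 could look roughly like this, written in Python rather than bash. It assumes the caller has already captured the event lines from one 'ipmiutil sel' run as text, with any banner or status lines removed. The state file path and function names are hypothetical.

```python
from pathlib import Path


def last_event_line(sel_output):
    """Return the last non-empty line of the captured event listing."""
    lines = [ln for ln in sel_output.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def has_new_event(sel_output, state_file):
    """Compare the newest event with the saved one and update the saved copy.

    Returns True only when the newest event differs from what was saved.
    """
    latest = last_event_line(sel_output)
    path = Path(state_file)
    previous = path.read_text() if path.exists() else ""
    if latest == previous:
        return False  # nothing new since the last run
    path.write_text(latest)  # remember the newest event for next time
    return True
```

A cron job or loop would call has_new_event() after each 'ipmiutil sel' run and take action only when it returns True.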
For this function, it looks like the right approach is fixing the firmware on this motherboard. The command 'ipmiutil health -x' will show the make and firmware version from the get_deviceid function as hex codes, and will show known makes. The firmware version is 1.15 as shown above, but the make/manufacturer isn't shown by default. However, to enter a support request you will probably have to go through the server manufacturer. The problem is that the firmware doesn't handle more than 2 requests...
Invalid data field request error
Since this seems to be specific to that firmware, we need more information. 1) What happens if you try this with -Flan instead of -Flan2? 2) Can you get the IPMI LAN configuration from the server at 192.168.1.171? Do you know which firmware vendor it is?
It appears that the firmware will not accept a lanplus connection session. There are two possible causes that I can think of: 1) This firmware only accepts IPMI LAN 1.x and not LAN 2.0 (lanplus), which seems unlikely but is easy to check by adding option "-Flan" 2) The IPMI LAN configuration is incorrect on that server somehow. Running 'ipmiutil lan' and 'ifconfig' (or ipconfig) locally on the 192.168.1.171 server will show all of the configuration parameters for review.
Hi Andy, We are using the ipmi-3.1.8 binary in our monitoring tool. We have observed that the memory leak is still present in this binary version on Windows Server 2019. Please take a look at this issue. Note: this issue is observed only on Windows Server 2019.
Okay! Thanks for your input. History: We recently upgraded the Linux kernel version and all the packages. After the upgrade, ipmiutil access via the OpenIPMI driver has become extremely slow. To speed it up, we experimented with 'driverless' mode, and the speed increased. OLD: Linux kernel 4.4.110, ipmiutil ver 2.79 (driver type = open). NEW: Linux kernel 5.4.0-42-generic, ipmiutil ver 3.15 (driver type = kcs, Using driverless method). If the 'driverless' method does not have multi-user support, using the...
Hmmm. I'm surprised. This is Intel firmware on the Intel S4600LH motherboard. BMC manufacturer = 000157 (Intel), product = 005c (S4600LH) BMC version = 1.15.4159 (Boot 1.13), IPMI v2.0 BIOS Version = SE5C600.86B.01.07.0002.030620132047 That means this isn't a one-off behavior, so ipmiutil needs to handle it more robustly. I guess the best approach then is to modify igetevent.c to initiate a new session for each pass.
Hi Andy Cress, Thank you for your inputs. I went through the sample script 'ipmiutil_evt' script, it exits if the mode is driverless. And our appliance also uses ipmiutil in driverless way: NS9100_35# ipmiutil cmd -k IPMI access is ok, driver type = kcs Using driverless method ipmiutil cmd, completed successfully I have uploaded the 'ipmiutil health -x' output file 'health_out' to the thread. As an alternative in the meantime, you could create a shell script to run 'ipmiutil sel' in a loop and take...
If not, perhaps we could add special-case handling of -3 in igetevent.c for this firmware vendor to keep waiting if it gets this error. [Jyothi] I modified the code to continue the loop in case of return code -3. Now ipmiutil getevt exits with a different error code, '0xc7'. got event id 002b, sensor_type = 13 event data: 2b 00 02 a7 96 fb 61 33 00 04 13 05 71 a0 03 18 002b 02/03/22 08:47:35 MIN Bios Critical Interrupt #05 PCIe Cor Sensor PCIe Warn Receiver Error on (03:03.0) 71 [a0 03 18] Waiting...
Hi Andy Cress, Thank you very much for your quick response. I forgot to mention that I tried 'GetMessage' too, but it resulted in a crash as shown below. NS9100_35# ipmiutil getevt -t 0 ipmiutil getevent ver 3.15 -- BMC version 1.15, IPMI version 2.0 event receiver sa = 20 lun = 00 bmc enables = 0f igetevent reading sensors ... Get IPMI events from kcs driver igetevent waiting for events via method 2 (GetMessage) Waiting 0 seconds for an event ... buffer overflow detected : terminated Aborted (core...
Invalid data field request error
getevt exits with return code ( rv = -3)
That error means that the receive failed (LAN_ERR_RECV_FAIL = -3 in ipmicmd.h). There are several methods to get the events, mainly the SEL_events method and the GetMessage method. This by default uses the SEL_events method, and should continue to read the last SEL event until a new event occurs. The fact that it fails to read the SEL event that it was able to read twice before implies some firmware anomaly. Which firmware vendor is this? It may work better in this case to use -m instead of -s. If...
[supermicro] invalid DIMM location decoding from SMBIOS
This was included in ipmiutil-3.1.8: 11/05/2021 ARCress ipmiutil-3.1.8 changes (iver 3.18) util/oem_supermicro.c - disable DIMM decoding from SMBIOS for SuperMicro (albertlav)
IPMI memory issue
The ipmiutil-3.1.8 release was posted on 11/05/2021
Oops. This change did not get included in ipmiutil-3.1.8. It will go into the next release.
Issue starting ipmiutil.exe on Windows 10
ipmiutil.exe dependency
Yes, the two openssl DLLs (libeay32.dll and ssleay32.dll) must be in the PATH somewhere for ipmiutil.exe in order to support the lanplus protocol which uses SSL. This could be: - in \Windows\system32, - in the same directory as ipmiutil.exe, or - somewhere in the %PATH%
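A quick way to check the locations listed above is sketched here. It looks in the directory holding ipmiutil.exe and then in each directory of a PATH string (\Windows\system32 is normally on the PATH already). The function name and parameters are hypothetical. The two DLL names are the ones given in this thread.

```python
import os

REQUIRED_DLLS = ("libeay32.dll", "ssleay32.dll")


def missing_dlls(exe_dir, path_value, names=REQUIRED_DLLS):
    """Return the DLL names not found in exe_dir or any PATH directory."""
    dirs = [exe_dir] + [d for d in path_value.split(os.pathsep) if d]
    missing = []
    for name in names:
        found = any(os.path.isfile(os.path.join(d, name)) for d in dirs)
        if not found:
            missing.append(name)
    return missing
```

Typical use would be missing_dlls(os.path.dirname(exe_path), os.environ.get("PATH", "")), where exe_path is wherever ipmiutil.exe was installed. An empty result means both DLLs were located.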
getevt exits with return code ( rv = -3)
ipmiutil.exe dependency
I believe so, but we won’t know for sure until you validate it.
@arcress Hey, does this release address the issue mentioned in the following ticket? https://sourceforge.net/p/ipmiutil/support-requests/45/
ipmiutil-3.1.8 is released
updates for 3.1.8
updates for ipmiutil-3.1.8
Hi Andy, Bravo! The binary you gave worked. I really appreciate your prompt help in providing the fix. Can you tell me when we can expect this fix in a release?
It has worked!! Thanks Andy for fixing this :)
Thank you :) Will try this out :)
Finally I got a fresh clean build of openssl 64bit and this should either run clean or give better debug. http://ipmiutil.sourceforge.net/FILES/ipmiutil-317q.zip (ipmiutil-317q64.exe and *eay32.dll)
Thank you for the fast answer :) Of course I fully understand that!
Sorry not yet. I have been consumed with my day job. Planning for time to get it done this week.
Do you have any update for me on this? :)
"you were able to do lanplus previously with ipmiutil-3.1.2" Yes, I downloaded all releases from your sites. The ipmiutil 3.1.2 release (64-bit Windows), found under IPMIUTIL FILES, has always worked for me (and so have older versions). Only the newer ones since 3.1.3 were not working!
That confirms that the error is in the openssl code, so I do need to rebuild/modify that. For several of these, you were able to do lanplus previously with ipmiutil-3.1.2. Is this one of those, or has this one never worked?
Sorry attached the wrong txt :)
So as you can see, the error must lie somewhere else, as I got the same error with the 3.1.2a version you sent!
I will try that out and report :) Thank you!