IS-8.0.2 Update2 (Unix) on AIX platform (2024)

Update ID: UPD912628

Version: 8.0.2.1800

Platform: Unix

Release date: 2024-06-01

Summary

IS-8.0.2 Update2 (Unix) on AIX platform

 * * * READ ME * * *
 * * * InfoScale 8.0.2 * * *
 * * * Patch 1800 * * *

Patch Date: 2024-05-31

This document provides the following information:

* PATCH NAME
* OPERATING SYSTEMS SUPPORTED BY THE PATCH
* PACKAGES AFFECTED BY THE PATCH
* BASE PRODUCT VERSIONS FOR THE PATCH
* SUMMARY OF INCIDENTS FIXED BY THE PATCH
* DETAILS OF INCIDENTS FIXED BY THE PATCH
* INSTALLATION PRE-REQUISITES
* INSTALLING THE PATCH
* REMOVING THE PATCH
* KNOWN ISSUES

PATCH NAME
----------
InfoScale 8.0.2 Patch 1800

OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
AIX 7.2
AIX 7.3

PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSaslapm
VRTScavf
VRTScps
VRTSdbac
VRTSdbed
VRTSgab
VRTSllt
VRTSodm
VRTSsfcpi
VRTSsfmh
VRTSvbs
VRTSvcs
VRTSvcsag
VRTSvxfen
VRTSvxfs
VRTSvxvm

BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
* InfoScale Availability 8.0.2
* InfoScale Enterprise 8.0.2
* InfoScale Foundation 8.0.2
* InfoScale Storage 8.0.2

SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: -8.0.2.1500
* 4119267 (4113582) In VVR environments, reboot on VVR primary nodes results in the RVG going into passthru mode.
* 4123065 (4113138) 'vradmin repstatus' invoked on the secondary site shows stale information.
* 4123069 (4116609) VVR secondary logowner change is not reflected with virtual hostnames.
* 4123080 (4111789) VVR does not utilize the network provisioned for it.
* 4124291 (4111254) vradmind dumps core while associating an rlink to an RVG because of a NULL pointer reference.
* 4124794 (4114952) With virtual hostnames, the pause replication operation fails.
* 4124796 (4108913) vradmind dumps core because of memory corruption.
* 4124889 (4090828) Enhancement to track plex attach/recovery data synced in the past, to debug corruption.
* 4125392 (4114193) 'vradmin repstatus' incorrectly shows the replication status as inconsistent.
* 4125811 (4090772) vxconfigd/vx commands hang if fdisk opened a secondary volume and the secondary logowner panicked.
* 4128086 (4128087) System
panic at dmp_nvme_do_dummy_read.
* 4128127 (4132265) A machine attached with NVMe devices may panic.
* 4128835 (4127555) Unable to configure replication using a diskgroup id.
* 4129664 (4129663) Generate and add a changelog in the vxvm and aslapm rpms.
* 4129765 (4111978) Replication failed to start due to vxnetd threads not running on the secondary site.
* 4129766 (4128380) With virtual hostnames, the 'vradmin resync' command may fail if invoked from the DR site.
* 4130854 (4066785) Create a new option, usereplicatedev=only, to import only the replicated LUNs.
* 4130858 (4128351) System hang observed when switching the log owner.
* 4130861 (4122061) Hang observed after a resync operation; vxconfigd was waiting for the slaves' response.
* 4130947 (4124725) With virtual hostnames, the 'vradmin delpri' command may hang.
* 4133312 (4128451) A hardware replicated disk group fails to be auto-imported after reboot.
* 4133315 (4130642) A node failed to rejoin the cluster after it switched from master to slave, due to the failure of the replicated diskgroup import.
* 4133930 (4100646) Recoveries of DCL objects not happening because the ATT, RELOCATE flags are set on DCL subdisks.
* 4133946 (3972344) vxrecover returns an error: 'ERROR V-5-1-11150 Volume <vol_name> not found'.
* 4135127 (4134023) vxconfigrestore (diskgroup configuration restoration) for a H/W replicated diskgroup failed.
* 4135388 (4131202) In a VVR environment, the changeip command may fail.
* 4136419 (4089696) In an FSS environment with a DCO log attached to the VVR SRL volume, reboot of the cluster may result in a panic on the CVM master node.
* 4136428 (4131449) In a CVR environment, the restriction of four RVGs per diskgroup has been removed.
* 4136429 (4077944) In a VVR environment, application I/O operations may hang.
* 4136859 (4117568) vradmind dumps core due to invalid memory access.
* 4136866 (4090476) The SRL is not draining to the secondary.
* 4136868 (4120068) A standard disk was added to a cloned diskgroup successfully, which is not expected.
* 4136870 (4117957) During a phased reboot
of a two node Veritas Access cluster, mounts would hang.
* 4137508 (4066310) Added the BLK-MQ feature for the DMP driver.
* 4137615 (4087628) CVM goes into a faulted state when a slave node of the primary is rebooted.
* 4137753 (4128271) In a CVR environment, a node is not able to join the CVM cluster if RVG recovery is taking place.
* 4137757 (4136458) In a CVR environment, the DCM resync may hang with 0% sync remaining.
* 4137986 (4133793) vxsnap restore failed with DCO I/O errors when the operation was run in a loop for multiple VxVM volumes.
* 4138051 (4090943) The VVR primary rlink cannot connect as the secondary reports that the SRL log is full.
* 4138069 (4139703) Panic due to wrong use of an OS API (HUNZA issue).
* 4138075 (4129873) In a CVR environment, if the CVM slave node is acting as logowner, I/Os may hang when a data volume is grown.
* 4138224 (4129489) With VxVM installed in an AWS cloud environment, disk devices may intermittently disappear from 'vxdisk list' output.
* 4138236 (4134069) VVR replication was not using the VxFS SmartMove feature if the filesystem was not mounted on the RVG logowner node.
* 4138237 (4113240) In a CVR environment with hostname binding configured, the rlink on the VVR secondary may have an incorrect VVR primary IP.
* 4138251 (4132799) No detailed error messages when joining CVM fails.
* 4138348 (4121564) A memory leak for volcred_t could be observed in vxio.
* 4138537 (4098144) vxtask list shows the parent process, without any sub-tasks, which never progresses for the SRL volume.
* 4138538 (4085404) Huge performance drop after Veritas Volume Replicator (VVR) entered Data Change Map (DCM) mode, when a large Storage Replicator Log (SRL) size is configured.
* 4140598 (4141590) Some incidents do not appear in the changelog because their cross-references are not properly processed.
* 4143580 (4142054) The primary master panicked with a TED assert during the run.
* 4146550 (4108235) System-wide hang due to a memory leak in the VVR vxio kernel module.
* 4150459 (4150160) Panic due to less memory being allocated than required.
* 4153570 (4134305)
Collecting ilock stats for admin SIOs causes a buffer overrun.
* 4153597 (4146424) CVM failed to join after a power off and power on from the ILO.
* 4154104 (4142772) The error mask NM_ERR_DCM_ACTIVE on an rlink may not be cleared, resulting in the rlink being unable to get into DCM again.
* 4154107 (3995831) System hang: a large number of SIOs got queued in FMR.
* 4155719 (4154921) The system is stuck in zio_wait() in an FC-IOV environment after rebooting the primary control domain when dmp_native_support is on.
* 4158517 (4159199) AIX 7.3 TL2 - memory fault (coredump) while running "./scripts/admin/vxtune/vxdefault.tc".
* 4158920 (4159680) set_proc_oom_score: not found while /usr/lib/vxvm/bin/vxconfigbackupd gets executed.
* 4161646 (4149528) Cluster-wide hang after faulting nodes one by one.

Patch ID: VRTSaslapm 8.0.2.1500
* 4125322 (4119950) Security vulnerabilities exist in third-party components [curl and libxml].
* 4133009 (4133010) Generate and add a changelog in the aslapm rpm.
* 4137995 (4117350) An import operation on a disk group created on Hitachi ShadowImage (SI) disks is failing.

Patch ID: VRTSsfcpi-8.0.2.1200
* 4115603 (4115601) On Solaris, the publisher list gets displayed during the InfoScale start, stop, and uninstall processes, and a unique publisher list is not displayed during install and upgrade.
* 4115707 (4126025) While performing a full upgrade of the secondary site, SRL missing & RLINK dissociated errors were observed.
* 4115874 (4124871) Configuration of vxfen fails for a three-node cluster on VMs in different AZs.
* 4116368 (4123645) During rolling upgrade response file creation, CPI asks to unmount the VxFS filesystem.
* 4116406 (4123654) Removed an unnecessary swap space message.
* 4116879 (4126018) During addnode, the installer fails to mount resources.
* 4116995 (4123657) The installer retries upgrading the protocol version post-upgrade.
* 4117956 (4104627) Providing multiple-patch support for up to 10 patches.
* 4121961 (4123908) The installer does not register InfoScale hosts to the VIOM Management Server after InfoScale
configuration.
* 4122442 (4122441) CPI displays licensing information after starting the product through the response file.
* 4122749 (4122748) On Linux, the had service fails to start during a rolling upgrade from InfoScale 7.4.1 or lower to a higher InfoScale version.
* 4126470 (4130003) The installer failed to start vxfs_replication while configuring Enterprise on OEL 9.2.
* 4127111 (4127117) On a Linux system, you can configure GCO (Global Cluster Option) with a hostname by using the InfoScale installer.
* 4130377 (4131703) The installer performs DMP include/exclude operations if /etc/vx/vxvm.exclude is present on the system.
* 4130996 (4130000) The installer failed to start veki and llt if the Active state is failed.
* 4131315 (4131314) Environment="VCS_ENABLE_PUBSEC_LOG=0" is added by CPI in the Install section of the service file instead of the Service section.
* 4131684 (4131682) On SunOS, the installer prompts the user to install the 'bourne' package if it is not available.
* 4132411 (4139946) Rolling upgrade fails if the recommended upgrade path is not followed.
* 4133019 (4135602) The installer failed to update the main.cf file with the VCS user when reconfiguring a secured cluster to a non-secured cluster.
* 4133469 (4136432) Adding a node to a higher-version InfoScale node fails.
* 4135015 (4135014) The CPI installer should not ask to install InfoScale after "./installer -precheck" is done.
* 4136211 (4139940) The installer failed to get the package version and failed due to a PADV mismatch.
* 4139609 (4142877) The missing HF list is not displayed during an upgrade by using the patch release.
* 4140512 (4140542) Rolling upgrade failed for the patch installer.
* 4155155 (4156038) File permission changes for EO logging.
* 4157440 (4158841) VRTSrest version changes support.
* 4159940 (4159942) The installer will not update existing file permissions.
* 4160348 (4163907) While applying an InfoScale patch, check whether the previous InfoScale update is installed.
* 4161937 (4160983) On Solaris, the vxfs modules are getting removed from the current BE
while upgrading InfoScale to the ABE.

Patch ID: -8.0.2.500
* 4160665 (4160661) sfmh for IS 7.4.2 U7.

Patch ID: VRTSdbac-8.0.2.1100
* 4161967 (4157901) The vcsmmconfig.log file permission is hardcoded, but the permission should be set as per the EO tunable VCS_ENABLE_PUBSEC_LOG_PERM.

Patch ID: VRTSgab-8.0.2.1100
* 4160642 (4160640) AIX vxfen requires AIX llt and gab of the same version.

Patch ID: VRTSvcsag-8.0.2.1100
* 4156630 (4156628) Getting the message "Uninitialized value $version in string eq at /opt/VRTSvcs/bin/NIC/monitor line 317" constantly.

Patch ID: VRTSllt-8.0.2.1100
* 4160641 (4160640) AIX vxfen requires AIX llt and gab of the same version.

Patch ID: VRTScps-8.0.2.1100
* 4152885 (4152882) The vxcpserv process received a SIGABRT signal due to invalid pointer access in acvsc_lib while writing logs.
* 4156113 (4156112) EO changes file permission tunable.
* 4157674 (4156112) EO changes file permission tunable.
* 4162878 (4136146) Update security component libraries.

Patch ID: VRTSdbed-8.0.2.1200
* 4163136 (4136146) Update security component libraries.

Patch ID: VRTSvbs-8.0.2.1100
* 4163135 (4136146) Update security component libraries.

Patch ID: VRTSvxfen-8.0.2.1100
* 4125891 (4113847) Support for an even number of coordination disks for CVM-based disk-based fencing.
* 4125895 (4108561) Reading the vxfen reservation not working.
* 4156076 (4156075) EO changes file permission tunable.
* 4156379 (4156075) EO changes file permission tunable.

Patch ID: VRTSvcs-8.0.2.1100
* 4124106 (4129493) A Tenable security scan kills the Notifier resource.
* 4157581 (4157580) There are security vulnerabilities in the current version of the third-party component OpenSSL that is utilized by VCS.

Patch ID: -8.0.2.1700
* 4162683 (4153873) The deport decision was dependent on the local system only, not on all systems in the cluster.
* 4118153 (4118154) The system may panic in simple_unlock_mem() when errcheckdetail is enabled.
* 4138361 (4134884) Unable to deport a diskgroup.
Volume or plex device is open or attached.
* 4138381 (4117342) The system might panic due to a hard lockup detected on a CPU.
* 4138384 (4134194) vxfs/glm worker thread panic with a kernel NULL pointer dereference.
* 4138430 (4137037) Fixing miscellaneous issues present in the vxfstaskd schedule.
* 4138444 (4118840) Incorrect in-core values if a failure occurs while flushing the metadata on-disk.
* 4142554 (4142555) Modify the reference to running fsck -y in the mount.vxfs error message and update the fsck_vxfs manpage.
* 4142810 (4149899) Veritas File Replication failover fails to perform the failover of a job to the target cluster.
* 4152221 (4151836) Invalid memory access during mount in an error scenario.
* 4153560 (4089199) A dynamic reconfiguration operation for CPUs takes a lot of time.
* 4154052 (4119281) Higher page-in requests on Solaris 11 SPARC.
* 4154058 (4136235) Includes a module parameter for changing the pnlct merge frequency.
* 4154119 (4126943) Create the lost+found directory in a VxFS file system with default ACL permissions of 700.
* 4134040 (3979756) kfcntl/vx_cfs_ifcntllock performance is very bad on CFS.

DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: -8.0.2.1500

* 4119267 (Tracking ID: 4113582)

SYMPTOM:
In VVR environments, reboot on VVR primary nodes results in the RVG going into passthru mode.

DESCRIPTION:
Reboot of primary nodes resulted in missing write completions of updates on the primary SRL volume. After the node came up, the last update received by the VVR secondary was incorrectly compared with the missing updates.

RESOLUTION:
Fixed the check to correctly compare the last received update by the VVR secondary.

* 4123065 (Tracking ID: 4113138)

SYMPTOM:
In CVR environments configured with a virtual hostname, after node reboots on the VVR primary and secondary, 'vradmin repstatus' invoked on the secondary site shows stale information with the following warning message:
VxVM VVR vradmin INFO V-5-52-1205 Primary is unreachable or RDS has configuration error.
Displayed status information is from Secondary and can be out-of-date.

DESCRIPTION:
This issue occurs when there is an explicit RVG logowner set on the CVM master, due to which the old connection of vradmind with its remote peer disconnects and a new connection is not formed.

RESOLUTION:
Fixed the issue with the vradmind connection to its remote peer.

* 4123069 (Tracking ID: 4116609)

SYMPTOM:
In CVR environments where replication is configured using virtual hostnames, vradmind on the VVR primary loses its connection with its remote peer after a planned RVG logowner change on the VVR secondary site.

DESCRIPTION:
vradmind on the VVR primary was unable to detect an RVG logowner change on the VVR secondary site.

RESOLUTION:
Enabled the primary vradmind to detect an RVG logowner change on the VVR secondary site.

* 4123080 (Tracking ID: 4111789)

SYMPTOM:
In VVR/CVR environments, VVR would use any IP/NIC/network to replicate the data and may not utilize the high-performance NIC/network configured for VVR.

DESCRIPTION:
The default value of the tunable was set to 'any_ip'.

RESOLUTION:
The default value of the tunable is set to 'replication_ip'.

* 4124291 (Tracking ID: 4111254)

SYMPTOM:
vradmind dumps core with the following stack:
#3 0x00007f3e6e0ab3f6 in __assert_fail () from /root/cores/lib64/libc.so.6
#4 0x000000000045922c in RDS::getHandle ()
#5 0x000000000056ec04 in StatsSession::addHost ()
#6 0x000000000045d9ef in RDS::addRVG ()
#7 0x000000000046ef3d in RDS::createDummyRVG ()
#8 0x000000000044aed7 in PriRunningState::update ()
#9 0x00000000004b3410 in RVG::update ()
#10 0x000000000045cb94 in RDS::update ()
#11 0x000000000042f480 in DBMgr::update ()
#12 0x000000000040a755 in main ()

DESCRIPTION:
vradmind was trying to access a NULL pointer (Remote Host Name) in an rlink object, as the Remote Host attribute of the rlink hadn't been set.

RESOLUTION:
The issue has been fixed by making code changes.

* 4124794 (Tracking ID: 4114952)

SYMPTOM:
With VVR configured with a virtual hostname, after node reboots on the DR site, the 'vradmin pauserep'
command failed with the following error:
VxVM VVR vradmin ERROR V-5-52-421 vradmind server on host <host> not responding or hostname cannot be resolved.

DESCRIPTION:
The virtual host mapped to multiple IP addresses, and vradmind was using an incorrectly mapped IP address.

RESOLUTION:
Fixed by using the correct mapping of the IP address from the virtual host.

* 4124796 (Tracking ID: 4108913)

SYMPTOM:
vradmind dumps core with the following stacks:
#3 0x00007f2c171be3f6 in __assert_fail () from /root/coredump/lib64/libc.so.6
#4 0x00000000005d7a90 in VList::concat () at VList.C:1017
#5 0x000000000059ae86 in OpMsg::List2Msg () at Msg.C:1280
#6 0x0000000000441bf6 in OpMsg::VList2Msg () at ../../include/Msg.h:389
#7 0x000000000043ec33 in DBMgr::processStatsOpMsg () at DBMgr.C:2764
#8 0x00000000004093e9 in process_message () at srvmd.C:418
#9 0x000000000040a66d in main () at srvmd.C:733

#0 0x00007f4d23470a9f in raise () from /root/core.Jan18/lib64/libc.so.6
#1 0x00007f4d23443e05 in abort () from /root/core.Jan18/lib64/libc.so.6
#2 0x00007f4d234b3037 in __libc_message () from /root/core.Jan18/lib64/libc.so.6
#3 0x00007f4d234ba19c in malloc_printerr () from /root/core.Jan18/lib64/libc.so.6
#4 0x00007f4d234bba9c in _int_free () from /root/core.Jan18/lib64/libc.so.6
#5 0x00000000005d5a0a in ValueElem::_delete_val () at Value.C:491
#6 0x00000000005d5990 in ValueElem::~ValueElem () at Value.C:480
#7 0x00000000005d7244 in VElem::~VElem () at VList.C:480
#8 0x00000000005d8ad9 in VList::~VList () at VList.C:1167
#9 0x000000000040a71a in main () at srvmd.C:743

#0 0x000000000040b826 in DList::head () at ../include/DList.h:82
#1 0x00000000005884c1 in IpmHandle::send () at Ipm.C:1318
#2 0x000000000056e101 in StatsSession::sendUCastStatsMsgToPrimary () at StatsSession.C:1157
#3 0x000000000056dea1 in StatsSession::sendStats () at StatsSession.C:1117
#4 0x000000000046f610 in RDS::collectStats () at RDS.C:6011
#5 0x000000000043f2ef in DBMgr::collectStats () at DBMgr.C:2799
#6 0x00007f98ed9131cf in start_thread () from
/root/core.Jan26/lib64/libpthread.so.0
#7 0x00007f98eca4cdd3 in clone () from /root/core.Jan26/lib64/libc.so.6

DESCRIPTION:
There is a race condition in vradmind that may cause memory corruption and unpredictable results. vradmind periodically forks a child thread to collect VVR statistics data and send it to the remote site. The main thread may also be sending data using the same handler object; thus, member variables in the handler object are accessed in parallel from multiple threads and may become corrupted.

RESOLUTION:
The code changes have been made to fix the issue.

* 4124889 (Tracking ID: 4090828)

SYMPTOM:
Dumped fmrmap data for better debuggability of corruption issues.

DESCRIPTION:
The vxplex att / vxvol recover CLIs internally fetch fmrmaps from the kernel using the existing ioctl before starting the attach operation, get the data in binary format, and dump it to a file stored with a specific name format like volname_taskid_date.

RESOLUTION:
The changes now dump the fmrmap data into a binary file.

* 4125392 (Tracking ID: 4114193)

SYMPTOM:
The 'vradmin repstatus' command showed the replication data status incorrectly as 'inconsistent'.

DESCRIPTION:
vradmind was relying on the replication data status from both the primary as well as the DR site.

RESOLUTION:
Fixed the replication data status to rely on the primary data status.

* 4125811 (Tracking ID: 4090772)

SYMPTOM:
vxconfigd/vx commands hang on the secondary site in a CVR environment.

DESCRIPTION:
Due to a window with unmatched SRL positions, any application (e.g. fdisk) trying to open the secondary RVG volume will acquire a lock and wait for the SRL positions to match. During this, any VxVM transaction kicked in will also have to wait for the same lock. Further, the logowner node panicked, which triggered the logownership change protocol, which hung as the earlier transaction was stuck. As the logowner change protocol could not complete, in the absence of a valid logowner the SRL positions could not match, which caused a deadlock.
That led to the vxconfigd and vx command hang.

RESOLUTION:
Added changes to allow read operations on the volume even if the SRL positions are unmatched. Write I/Os are still blocked and only the open() call is allowed for read-only operations, so there will not be any data consistency or integrity issues.

* 4128086 (Tracking ID: 4128087)

SYMPTOM:
System panic at dmp_nvme_do_dummy_read with the following stack:
#0 machine_kexec
#1 __crash_kexec
#2 crash_kexec
#3 oops_end
#4 die
#5 do_trap
#6 do_invalid_op
#7 invalid_op [exception RIP: kfree+316]
#8 dmp_free
#9 dmp_nvme_do_dummy_read
#10 dmp_nvme_analyze_error
#11 gen_analyze_error
#12 dmp_analyze_error
#13 dmp_process_errbp
#14 dmp_daemons_loop
#15 kthread
#16 ret_from_fork_nospec_begin

DESCRIPTION:
In some VxDMP error-processing code paths, an incorrect size parameter is passed to the kernel memory free function, which can cause a system panic.

RESOLUTION:
The code has been changed to use the correct size parameter.

* 4128127 (Tracking ID: 4132265)

SYMPTOM:
A machine with NVMe disks panics with the following stack:
blk_update_request
blk_mq_end_request
dmp_kernel_nvme_ioctl
dmp_dev_ioctl
dmp_send_nvme_passthru_cmd_over_node
dmp_pr_do_nvme_read
dmp_pgr_read
dmpioctl
dmp_ioctl
blkdev_ioctl
__x64_sys_ioctl
do_syscall_64

DESCRIPTION:
The issue was applicable to setups with NVMe devices which do not support SCSI3-PR, as an ioctl was called without correctly checking whether SCSI3-PR was supported.

RESOLUTION:
Fixed the check to avoid calling the ioctl on devices which do not support SCSI3-PR.

* 4128835 (Tracking ID: 4127555)

SYMPTOM:
While adding a secondary site using the 'vradmin addsec' command, the command fails with the following error if a diskgroup id is used in place of the diskgroup name:
VxVM vxmake ERROR V-5-1-627 Error in field remote_dg=<dgid>: name is too long

DESCRIPTION:
Diskgroup names can be 32 characters long, whereas diskgroup ids can be 64 characters long.
This was not handled by the vradmin commands.

RESOLUTION:
Fixed the vradmin commands to handle the case where longer diskgroup ids are used in place of diskgroup names.

* 4129664 (Tracking ID: 4129663)

SYMPTOM:
The vxvm and aslapm rpms do not have a changelog.

DESCRIPTION:
A changelog in the rpm helps to find missing incidents with respect to other versions.

RESOLUTION:
A changelog is generated and added to the vxvm and aslapm rpms.

* 4129765 (Tracking ID: 4111978)

SYMPTOM:
Replication failed to start due to vxnetd threads not running on the secondary site.

DESCRIPTION:
vxnetd was waiting to start the "nmcomudpsrv" and "nmcomlistenserver" threads. Due to a race condition on a resource shared between those two threads, vxnetd was stuck in a dead loop until the maximum retry count was reached.

RESOLUTION:
Code changes have been made to add lock protection to avoid the race condition.

* 4129766 (Tracking ID: 4128380)

SYMPTOM:
If VVR is configured using a virtual hostname and the 'vradmin resync' command is invoked from a DR site node, it fails with the following error:
VxVM VVR vradmin ERROR V-5-52-405 Primary vradmind server disconnected.

DESCRIPTION:
In case the virtual hostname maps to multiple IPs, the vradmind service on the DR site was not able to reach the VVR logowner node on the primary site because an incorrect IP address mapping was used.

RESOLUTION:
Fixed vradmind to use the correctly mapped IP address of the primary vradmind.

* 4130854 (Tracking ID: 4066785)

SYMPTOM:
When the replicated disks are in SPLIT mode, importing their disk group failed with "Device is a hardware mirror".

DESCRIPTION:
When the replicated disks are in SPLIT mode, they are readable and writable, but importing their disk group failed with "Device is a hardware mirror". The third party doesn't expose a disk attribute to show when a disk is in SPLIT mode.
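The import workflow for replicated LUNs can be sketched as follows. This is a minimal, hedged sketch: the disk group name `repdg` is a placeholder, and the import option is taken from the vxdg hint quoted elsewhere in this document ("Try vxdg [-o usereplicatedev=only] import option with -c[s]"); confirm the exact flags against the vxdg(1M) manpage for your release before using them on a production cluster.

```shell
# Placeholder disk group name: repdg.
# List all disks and the disk groups they belong to, including
# deported (importable) ones, to confirm the replicated devices are visible.
vxdisk -o alldgs list

# Import only the replicated LUNs of the disk group. The vxdg error text
# additionally suggests the -c[s] options for this scenario.
vxdg -o usereplicatedev=only import repdg
```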
With this new enhancement, the replicated disk group can be imported with the option `-o usereplicatedev=only`.

RESOLUTION:
The code is enhanced to import the replicated disk group with the option `-o usereplicatedev=only`.

* 4130858 (Tracking ID: 4128351)

SYMPTOM:
A system hang is observed when switching the log owner.

DESCRIPTION:
VVR mdship SIOs might be throttled due to reaching the maximum allocation count, etc. These SIOs hold the I/O count. When a log owner change kicks in and quiesces the RVG, the VVR log owner change SIO waits for the iocount to drop to zero before proceeding further. VVR mdship requests from the log client are returned with EAGAIN as the RVG is quiesced. The throttled mdship SIOs need to be driven by the upcoming mdship requests; hence the deadlock, which caused the system hang.

RESOLUTION:
Code changes have been made to flush the mdship queue before the VVR log owner change SIO waits for the I/O drain.

* 4130861 (Tracking ID: 4122061)

SYMPTOM:
A hang was observed after a resync operation; vxconfigd was waiting for the slaves' response.

DESCRIPTION:
The VVR logowner was in a transaction and returned VOLKMSG_EAGAIN to CVM_MSG_GET_METADATA, which is expected. Once the client received VOLKMSG_EAGAIN, it would sleep 10 jiffies and retry the kmsg. In a busy cluster, it might happen that the retried kmsgs plus the new kmsgs built up and hit the kmsg flowcontrol before the VVR logowner transaction completed, after which the client refused any kmsgs due to the flowcontrol.
The transaction on the VVR logowner might then get stuck, because it required a kmsg response from all the slave nodes.

RESOLUTION:
Code changes have been made to increase the kmsg flowcontrol and to not let the kmsg receiver fall asleep, but instead handle the kmsg in a restart function.

* 4130947 (Tracking ID: 4124725)

SYMPTOM:
With VVR configured using virtual hostnames, the 'vradmin delpri' command could hang after doing the RVG cleanup.

DESCRIPTION:
The 'vradmin delsec' command used prior to the 'vradmin delpri' command had left the cleanup in an incomplete state, resulting in the next cleanup command hanging.

RESOLUTION:
Fixed to make sure that the 'vradmin delsec' command executes its workflow correctly.

* 4133312 (Tracking ID: 4128451)

SYMPTOM:
A hardware replicated disk group fails to be auto-imported after reboot.

DESCRIPTION:
Currently, standard diskgroups and cloned diskgroups are supported with auto-import. Hardware replicated disk groups aren't supported yet.

RESOLUTION:
Code changes have been made to support hardware replicated disk groups with auto-import.

* 4133315 (Tracking ID: 4130642)

SYMPTOM:
A node failed to rejoin the cluster after it switched from master to slave, due to the failure of the replicated diskgroup import. The below error message can be found in /var/VRTSvcs/log/CVMCluster_A.log:
CVMCluster:cvm_clus:monitor:vxclustadm nodestate return code: [101] with output: [state: out of cluster
reason: Replicated dg record is found: retry to add a node failed]

DESCRIPTION:
The flag which shows that the diskgroup was imported with usereplicatedev=only failed to be marked since the last time the diskgroup got imported.
The missing flag caused the failure of the replicated diskgroup import, which further caused the node rejoin failure.

RESOLUTION:
The code changes have been done to flag the diskgroup after it has been imported with usereplicatedev=only.

* 4133930 (Tracking ID: 4100646)

SYMPTOM:
Recoveries of DCL objects are not happening because the ATT, RELOCATE flags are set on DCL subdisks.

DESCRIPTION:
Due to multiple reasons, a stale tutil may remain stamped on DCL subdisks, which may cause subsequent vxrecover instances to be unable to recover the DCL plex.

RESOLUTION:
The issue is resolved by the vxattachd daemon intelligently detecting these stale tutils, clearing them, and triggering recoveries after a 10-minute interval.

* 4133946 (Tracking ID: 3972344)

SYMPTOM:
After reboot of a node on a setup where multiple diskgroups / volumes within diskgroups are present, sometimes the error 'vxrecover ERROR V-5-1-11150 Volume <volume_name> does not exist' is logged in /var/log/messages.

DESCRIPTION:
In the volume_startable function (volrecover.c), dgsetup is called to set the current default diskgroup. This does not update the current_group variable, leading to inappropriate mappings. Volumes are searched for in an incorrect diskgroup, which is logged in the error message. The vxrecover command works fine if the diskgroup name associated with the volume is specified. [vxrecover -g <dg_name> -s]

RESOLUTION:
Changed the code to use switch_diskgroup() instead of dgsetup. current_group is updated and the current_dg is set. Thus vxrecover finds the volume correctly.

* 4135127 (Tracking ID: 4134023)

SYMPTOM:
vxconfigrestore (diskgroup configuration restoration) for a H/W replicated diskgroup failed with the below error:
# vxconfigrestore -p LINUXSRDF
VxVM vxconfigrestore INFO V-5-2-6198 Diskgroup LINUXSRDF configuration restoration started ......
VxVM vxdg ERROR V-5-1-0 Disk group LINUXSRDF: import failed:
Replicated dg record is found.
Did you want to import hardware replicated LUNs?
Try vxdg [-o usereplicatedev=only] import option with -c[s]
Please refer to system log for details....
...
VxVM vxconfigrestore ERROR V-5-2-3706 Diskgroup configuration restoration for LINUXSRDF failed.

DESCRIPTION:
A H/W replicated diskgroup can be imported only with the option "-o usereplicatedev=only". vxconfigrestore didn't do the H/W replicated diskgroup check; without the proper import option, the diskgroup import failed.

RESOLUTION:
The code changes have been made to do the H/W replicated diskgroup check in vxconfigrestore.

* 4135388 (Tracking ID: 4131202)

SYMPTOM:
In a VVR environment, 'vradmin changeip' would fail with the following error message:
VxVM VVR vradmin ERROR V-5-52-479 Host <host> not reachable.

DESCRIPTION:
An existing heartbeat to the new secondary host is assumed, whereas it starts only after the changeip operation.

RESOLUTION:
The heartbeat assumption is fixed.

* 4136419 (Tracking ID: 4089696)

SYMPTOM:
In an FSS environment, with a DCO log attached to the VVR SRL volume, a reboot of the cluster may result in a panic on the CVM master node as follows:
voldco_get_mapid
voldco_get_detach_mapid
voldco_get_detmap_offset
voldco_recover_detach_map
volmv_recover_dcovol
vol_mv_fmr_precommit
vol_mv_precommit
vol_ktrans_precommit_parallel
volobj_ktrans_sio_start
voliod_iohandle
voliod_loop

DESCRIPTION:
If a DCO is configured with the SRL volume, and both the SRL volume plexes and the DCO plexes get an I/O error, this panic occurs in the recovery path.

RESOLUTION:
The recovery path is fixed to manage this condition.

* 4136428 (Tracking ID: 4131449)

SYMPTOM:
In CVR environments, there was a restriction to configure up to four RVGs per diskgroup, as more RVGs resulted in degradation of I/O performance in case of VxVM transactions.

DESCRIPTION:
In CVR environments, VxVM transactions on an RVG also impacted I/O operations on other RVGs in the same diskgroup, resulting in I/O performance degradation when a higher number of RVGs is configured in a diskgroup.

RESOLUTION:
The VxVM transaction impact has been isolated to each RVG, resulting in the ability to scale beyond four RVGs in a diskgroup.

* 4136429 (Tracking ID: 4077944)

SYMPTOM:
In a VVR environment, when I/O
throttling gets activated and deactivated by VVR, it may result in an application I/O hang.

DESCRIPTION:
When VVR throttles and unthrottles I/O, the driving of throttled I/O is not done in one of the cases.

RESOLUTION:
Resolved the issue by making sure the throttled application I/Os get driven in all the cases.

* 4136859 (Tracking ID: 4117568)

SYMPTOM:
vradmind dumps core with the following stack:
#1 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7ffdc380d810, __str=<error reading variable: Cannot access memory at address 0x3736656436303563>)
#2 0x000000000040e02b in ClientMgr::closeStatsSession
#3 0x000000000040d0d7 in ClientMgr::client_ipm_close
#4 0x000000000058328e in IpmHandle::~IpmHandle
#5 0x000000000057c509 in IpmHandle::events
#6 0x0000000000409f5d in main

DESCRIPTION:
After terminating vrstat, the StatSession in vradmind was closed and the corresponding Client object was deleted. When closing the IPM object of vrstat, it tried to access the removed Client, hence the core dump.

RESOLUTION:
Code changes have been made to fix the issue.

* 4136866 (Tracking ID: 4090476)

SYMPTOM:
The Storage Replicator Log (SRL) is not draining to the secondary.
The rlink status shows that the outstanding writes never got reduced over several hours:
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL

DESCRIPTION:
In a poor network environment, VVR appears not to be syncing. Another reconfigure happened before the VVR state became clean, and the VVR atomic window got set to a large size. VVR couldn't complete all the atomic updates before the next reconfigure, and kept sending atomic updates from the VVR pending position. Hence VVR appears to be stuck.

RESOLUTION:
Code changes have been made to update the VVR pending position accordingly.

* 4136868 (Tracking ID: 4120068)

SYMPTOM:
A standard disk was added to a cloned diskgroup successfully, which is not expected.

DESCRIPTION:
When a disk is added to a disk group, a pre-check is made to avoid ending up with a mixed diskgroup. In a cluster, the local node might fail to use the latest record for the pre-check, which caused a mixed diskgroup in the cluster and further caused node join failure.

RESOLUTION:
Code changes have been made to use the latest record for the mixed diskgroup pre-check.

* 4136870 (Tracking ID: 4117957)

SYMPTOM:
During a phased reboot of a two node Veritas Access cluster, mounts would hang.
Transaction aborted waiting for io drain:
VxVM vxio V-5-3-1576 commit: Timedout waiting for Cache XXXX to quiesce, iocount XX msg 0

DESCRIPTION:
The transaction on the Cache object failed because there were IOs waiting on the cache object. Those queued IOs could not proceed due to the missing flag VOLOBJ_CACHE_RECOVERED on the cache object. A transaction might have kicked in while the old cache was doing recovery; therefore the new cache object might fail to inherit VOLOBJ_CACHE_RECOVERED, which further caused the IO hang.

RESOLUTION:
Code changes have been made to fail the new cache creation if the old cache is doing recovery.

* 4137508 (Tracking ID: 4066310)

SYMPTOM:
New feature for performance improvement.

DESCRIPTION:
The Linux block subsystem has two types of block drivers: 1) block multiqueue drivers and 2) bio-based block drivers. DMP has been a bio-based driver since day one; block multiqueue support is now added for DMP.

RESOLUTION:
Block multiqueue support has been added to DMP.

* 4137615 (Tracking ID: 4087628)

SYMPTOM:
When DCM is in replication mode with volumes mounted having large regions for DCM to sync, and a slave node reboot is triggered, CVM may go into a faulted state.

DESCRIPTION:
During resiliency tests, the following sequence of operations was performed:
1. On an AWS FSS-CVR setup, replication is started across the sites for 2 RVGs.
2. The logowner service groups for both the RVGs are online on a slave node.
3. Another slave node, where the logowner is not online, is rebooted.
4. After the slave node comes back from reboot, it is unable to join the CVM cluster.
5. vx commands are also hung/stuck on the CVM master and the logowner slave node.

RESOLUTION:
In the RU SIO, before requesting vxfs_free_region(), drop the IO count and hold it again after.
Because the transaction has been locked (vol_ktrans_locked = 1) right before calling vxfs_free_region(), we don't need the iocount to hold the rvg from being removed.

* 4137753 (Tracking ID: 4128271)

SYMPTOM:
In a CVR environment, a node is not able to join the CVM cluster if RVG recovery is taking place.

DESCRIPTION:
If there has been an SRL overflow, RVG recovery takes more time, as it is loaded with more work than required because the recovery-related metadata was not updated.

RESOLUTION:
Updated the metadata correctly to reduce the RVG recovery time.

* 4137757 (Tracking ID: 4136458)

SYMPTOM:
In a CVR environment, if the CVM slave node is acting as logowner, the DCM resync issued after snapshot restore may hang showing 0% sync remaining.

DESCRIPTION:
The DCM resync completion is not correctly communicated to the CVM master, resulting in a hang.

RESOLUTION:
The DCM resync operation is enhanced to correctly communicate resync completion to the CVM master.

* 4137986 (Tracking ID: 4133793)

SYMPTOM:
DCO experiences IO errors while doing a vxsnap restore on VxVM volumes.

DESCRIPTION:
The dirty flag was getting set in the context of an SIO with the flag VOLSIO_AUXFLAG_NO_FWKLOG set. This led to transaction errors while running the vxsnap restore command in a loop for VxVM volumes, causing a transaction abort. As a result, VxVM tries to clean up by removing the newly added BMs. VxVM then tries to access the deleted BMs, which it cannot do since they were deleted previously.
This ultimately leads to the DCO IO error.

RESOLUTION:
Skip first-write klogging in the context of an IO with the flag VOLSIO_AUXFLAG_NO_FWKLOG set.

* 4138051 (Tracking ID: 4090943)

SYMPTOM:
On the Primary, the RLink is continuously getting connected/disconnected, with the below message seen in the secondary syslog:
VxVM VVR vxio V-5-3-0 Disconnecting replica <rlink_name> since log is full on secondary.

DESCRIPTION:
When the RVG logowner node panics, RVG recovery happens in 3 phases. At the end of the 2nd phase of recovery, the in-memory and on-disk SRL positions remain incorrect, and if there is a logowner change during this time, the Rlink won't get connected.

RESOLUTION:
Handled the in-memory and on-disk SRL positions correctly.

* 4138069 (Tracking ID: 4139703)

SYMPTOM:
The system panics in an RHEL 9.2 AWS environment while registering the PGR key.

DESCRIPTION:
On RHEL 9.2, a panic is observed while reading PGR keys on an AWS VM with NVMe devices (kernel 5.14.0-284.11.1.el9_2.x86_64). Reproduction: run "/etc/vx/diag.d/vxdmppr read /dev/vx/dmp/ip-10-20-2-49_nvme4_0" on an AWS NVMe RHEL 9.2 setup. Failure signature:
PID: 8250 TASK: ffffa0e882ca1c80 CPU: 1 COMMAND: "vxdmppr"
 #0 machine_kexec
 #1 __crash_kexec
 #2 crash_kexec
 #3 oops_end
 #4 do_trap
 #5 do_error_trap
 #6 exc_invalid_op
 #7 asm_exc_invalid_op
    [exception RIP: kfree+1074]
 #8 blk_update_request
 #9 blk_mq_end_request
#10 dmp_kernel_nvme_ioctl [vxdmp]
#11 dmp_dev_ioctl [vxdmp]
#12 dmp_send_nvme_passthru_cmd_over_node [vxdmp]
#13 dmp_pr_do_nvme_read.constprop.0 [vxdmp]
#14 dmp_pr_read [vxdmp]
#15 dmpioctl [vxdmp]
#16 dmp_ioctl [vxdmp]
#17 blkdev_ioctl
#18 __x64_sys_ioctl
#19 do_syscall_64
#20 entry_SYSCALL_64_after_hwframe

RESOLUTION:
The panic has been resolved.

* 4138075 (Tracking ID: 4129873)

SYMPTOM:
In a CVR environment, application I/O may hang if the CVM slave node is acting as RVG logowner and a data volume grow operation is triggered, followed by a logclient node leaving the cluster.

DESCRIPTION:
When the logowner is not the CVM master and a data volume grow operation is taking place, the CVM master controls the region locking for IO operations. In case a logclient node leaves the cluster, the I/O operations initiated by it are not cleaned up correctly due to a lack of coordination between the CVM master and the RVG logowner node.

RESOLUTION:
Coordination between the CVM master and the RVG logowner node is fixed to manage the I/O cleanup correctly.

* 4138224 (Tracking ID: 4129489)

SYMPTOM:
With VxVM installed in an AWS cloud environment, disk devices may intermittently disappear from 'vxdisk list' output.

DESCRIPTION:
There was an issue with disk discovery at the OS and DDL layer.

RESOLUTION:
The integration issue with disk discovery was resolved.

* 4138236 (Tracking ID: 4134069)

SYMPTOM:
VVR replication was not using the VxFS SmartMove feature if the filesystem was not mounted on the RVG logowner node.

DESCRIPTION:
Initial synchronization and DCM replay of VVR required the filesystem to be mounted locally on the logowner node, as VVR did not have the capability to fetch the required information from a remotely mounted filesystem mount point.

RESOLUTION:
VVR is updated to fetch the required SmartMove-related information from a remotely mounted filesystem mount point.

* 4138237 (Tracking ID: 4113240)

SYMPTOM:
In a CVR environment with hostname binding configured, the Rlink on the VVR secondary may have an incorrect VVR primary IP.

DESCRIPTION:
The VVR Secondary Rlink
picks up a wrong IP randomly, as the replication is configured using a virtual host which maps to multiple IPs.

RESOLUTION:
Corrected the VVR Primary IP on the VVR Secondary Rlink.

* 4138251 (Tracking ID: 4132799)

SYMPTOM:
If GLM is not loaded, starting CVM fails with the following errors:
# vxclustadm -m gab startnode
VxVM vxclustadm INFO V-5-2-9687 vxclustadm: Fencing driver is in disabled mode -
VxVM vxclustadm ERROR V-5-1-9743 errno 3

DESCRIPTION:
Only the error number, not the error message, is printed when joining CVM fails.

RESOLUTION:
The code changes have been made to fix the issue.

* 4138348 (Tracking ID: 4121564)

SYMPTOM:
A memory leak for volcred_t could be observed in vxio.

DESCRIPTION:
A memory leak could occur if some private region IOs hang on a disk and there are duplicate entries for the disk in vxio.

RESOLUTION:
Code has been changed to avoid the memory leak.

* 4138537 (Tracking ID: 4098144)

SYMPTOM:
vxtask list shows the parent process without any sub-tasks, which never progresses, for the SRL volume.

DESCRIPTION:
vxtask remains stuck since the parent process doesn't exit. It was seen that all children completed, but the parent is not able to exit.
(gdb) p active_jobs
$1 = 1
Active jobs are reduced as and when children complete. Somehow one count is pending, and it is not known which child exited without decrementing the count. Instrumentation messages are added to capture the issue.

RESOLUTION:
Added code that creates a log file in /etc/vx/log/. This file is deleted when vxrecover exits successfully. The file will be present when the vxtask parent hang issue is seen.

* 4138538 (Tracking ID: 4085404)

SYMPTOM:
Huge performance drop after Veritas Volume Replicator (VVR) entered Data Change Map (DCM) mode, when a large size of Storage Replicator Log (SRL) is configured.

DESCRIPTION:
The active map flush caused RVG serialization. Once the RVG gets serialized, all IOs are queued in the restart queue till the active map flush is finished.
The too-frequent active map flush caused the huge IO drop during flushing of the SRL to the DCM.

RESOLUTION:
The code is modified to adjust the frequency of the active map flush and balance the application IO and SRL flush.

* 4140598 (Tracking ID: 4141590)

SYMPTOM:
Some incidents do not appear in the changelog because their cross-references are not properly processed.

DESCRIPTION:
Not every cross-reference is parent-child. In such cases, 'top' will not be present and the changelog script ends execution.

RESOLUTION:
All cross-references are traversed to find parent-child relationships only if present, and then to find the top.

* 4143580 (Tracking ID: 4142054)

SYMPTOM:
The system panicked in the following stack:
[ 9543.195915] Call Trace:
[ 9543.195938] dump_stack+0x41/0x60
[ 9543.195954] panic+0xe7/0x2ac
[ 9543.195974] vol_rv_inactive+0x59/0x790 [vxio]
[ 9543.196578] vol_rvdcm_flush_done+0x159/0x300 [vxio]
[ 9543.196955] voliod_iohandle+0x294/0xa40 [vxio]
[ 9543.197327] ? volted_getpinfo+0x15/0xe0 [vxio]
[ 9543.197694] voliod_loop+0x4b6/0x950 [vxio]
[ 9543.198003] ? voliod_kiohandle+0x70/0x70 [vxio]
[ 9543.198364] kthread+0x10a/0x120
[ 9543.198385] ? set_kthread_struct+0x40/0x40
[ 9543.198389] ret_from_fork+0x1f/0x40

DESCRIPTION:
- From the SIO stack, we can see that it is a case of "done" being called twice.
- Looking at vol_rvdcm_flush_start(), we can see that when a child SIO is created, it is directly added to the global SIO queue.
- This can cause a child SIO to start while vol_rvdcm_flush_start() is still in the process of generating other child SIOs.
- It means that, say, the first child SIO gets done: it can find the children count going to zero and calls done.
- The next child SIO also independently finds the children count to be zero and calls done.

RESOLUTION:
The code changes have been done to fix the problem.

* 4146550 (Tracking ID: 4108235)

SYMPTOM:
System-wide hang causing all application and config IOs to hang.

DESCRIPTION:
Memory pools are used in the vxio driver for managing kernel memory for different purposes.
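The double-completion race in incident 4143580 above is a classic reference-counting pitfall: if the parent does not hold its own count while child SIOs are being generated, an early-finishing child can see the count reach zero and fire the completion before all children even exist. A minimal single-threaded sketch of the fix pattern (illustrative only, not InfoScale code; the names `parent_t`, `complete`, and `start_children` are hypothetical, and real SIO code would use atomic operations):

```c
#include <assert.h>

/* Hypothetical sketch of the completion-count pattern behind incident
 * 4143580: the parent takes one extra reference before generating child
 * SIOs, so no child can see the count drop to zero (and call "done")
 * while children are still being created. */
typedef struct {
    int pending;        /* children not yet complete, plus the parent's hold */
    int done_calls;     /* how many times the completion routine ran */
} parent_t;

static void complete(parent_t *p) {
    if (--p->pending == 0)
        p->done_calls++;            /* "done" must run exactly once */
}

static void start_children(parent_t *p, int nchildren) {
    p->pending = 1;                 /* parent's own hold, taken first */
    for (int i = 0; i < nchildren; i++) {
        p->pending++;               /* one reference per child */
        complete(p);                /* child may even finish immediately */
    }
    complete(p);                    /* drop the parent's hold last */
}
```

With the parent's hold in place, the count can only reach zero after the generation loop has finished, so "done" runs exactly once no matter how quickly the children complete.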
One of the pools, called the 'NMCOM pool', used on the VVR secondary, was causing a memory leak. The memory leak was not getting detected from the pool stats, as the metadata referring to the pool itself was getting freed.

RESOLUTION:
The bug causing the memory leak is fixed. There was a race condition in the VxVM transaction code path on the secondary side of VVR where memory was not getting freed when certain conditions were hit.

* 4150459 (Tracking ID: 4150160)

SYMPTOM:
The system panics in the DMP code path.

DESCRIPTION:
The CMDS-fsmigadm test hits "Oops: 0003 [#1] PREEMPT SMP PTI". Reproduction: running the cmds-fsmigadm test with VRTSvxvm 8.0.3.0000-0716_RHEL9.

RESOLUTION:
The buggy code was removed and fixed.

* 4153570 (Tracking ID: 4134305)

SYMPTOM:
Illegal memory access is detected when an admin SIO is trying to lock a volume.

DESCRIPTION:
While locking a volume, an admin SIO is converted to an incompatible SIO, on which collecting ilock stats causes a memory overrun.

RESOLUTION:
The code changes have been made to fix the problem.

* 4153597 (Tracking ID: 4146424)

SYMPTOM:
A CVM node join operation may hang with vxconfigd on the master node stuck in the following code path:
ioctl ()
kernel_ioctl ()
kernel_get_cvminfo_all ()
send_slaves ()
master_send_dg_diskids ()
dg_balance_copies ()
client_abort_records ()
client_abort ()
dg_trans_abort ()
dg_check_kernel ()
vold_check_signal ()
request_loop ()
main ()

DESCRIPTION:
During vxconfigd-level communication between the master and slave nodes, if GAB returns EAGAIN, the vxconfigd code does a poll on the GAB fd. In normal circumstances, GAB will return the poll call with an appropriate return value. If, however, the poll timeout occurs (poll returning 0), it was erroneously treated as success, and the caller assumed that the message was sent when in fact it had failed. This resulted in the hang in the message exchange between the master and slave vxconfigd.

RESOLUTION:
The fix is to retry the send operation on the GAB fd after some delay if the poll times out in the context of an EAGAIN or ENOMEM error. The fix is applicable to both master-side and slave-side functions.

* 4154104 (Tracking ID: 4142772)

SYMPTOM:
In case SRL overflow frequently happens, the SRL reaches 99% filled but the rlink is unable to get into DCM mode.

DESCRIPTION:
When starting DCM mode, a check is needed on whether the error mask NM_ERR_DCM_ACTIVE has been set, to prevent duplicate triggers. This flag should have been reset after DCM mode was activated by reconnecting the rlink. As there is a race condition, the rlink reconnect may complete before DCM is activated, hence the flag cannot be cleared.

RESOLUTION:
The code changes have been made to fix the issue.

* 4154107 (Tracking ID: 3995831)

SYMPTOM:
System hung: a large number of SIOs got queued in FMR.

DESCRIPTION:
When the IO load is high, there may not be enough chunks available. In that case, the DRL flushsio needs to drive the fwait queue, which may get some available chunks. Due to a race condition and a bug inside DRL, DRL may queue the flushsio and fail to trigger flushsio again; DRL then ends up in a permanently hung situation, not able to flush the dirty regions.
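The poll-timeout bug in incident 4153597 above can be illustrated with a generic retry loop: a poll() return value of 0 means the descriptor never became writable within the timeout, so the only safe action is to retry the send, never to report success. A hedged sketch of the pattern (the function name `send_with_retry` is illustrative and this is not the actual vxconfigd/GAB code):

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical sketch of the send-retry pattern from incident 4153597:
 * when write() fails with EAGAIN, poll() for writability; poll()
 * returning 0 is a TIMEOUT, not success, so the send must be retried
 * rather than assumed delivered. */
static ssize_t send_with_retry(int fd, const void *buf, size_t len) {
    for (;;) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0)
            return n;                      /* actually sent */
        if (errno != EAGAIN && errno != ENOMEM)
            return -1;                     /* real, unrecoverable error */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int r = poll(&pfd, 1, 1000 /* ms */);
        if (r < 0 && errno != EINTR)
            return -1;
        /* r == 0 (timeout) or r > 0 (writable): loop and retry the
         * write; the old bug treated the timeout as a successful send. */
    }
}
```

The key point is that only a successful write() terminates the loop with success; both poll() outcomes lead back to another send attempt.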
The queued SIOs fail to be driven further; hence the system hangs.

RESOLUTION:
Code changes have been made to drive the SIOs which got queued in FMR.

* 4155719 (Tracking ID: 4154921)

SYMPTOM:
The system is stuck in zio_wait() in an FC-IOV environment after rebooting the primary control domain when dmp_native_support is on.

DESCRIPTION:
For various reasons, DMP might disable its subpaths. In a particular scenario, DMP might fail to reset the IO QUIESCE flag on its subpaths, which caused IOs to get queued in the DMP defer queue. If the upper layer, like ZFS, kept waiting for the IOs to complete, this bug might hang the whole system.

RESOLUTION:
Code changes have been made to reset the IO quiesce flag properly after a DMP path is disabled.

* 4158517 (Tracking ID: 4159199)

SYMPTOM:
A coredump was generated while running the TC "./scripts/admin/vxtune/vxdefault.tc" on AIX 7.3 TL2:
gettimeofday(??, ??) at 0xd02a7dfc
get_exttime(), line 532 in "vm_utils.c"
cbr_cmdlog(argc = 2, argv = 0x2ff224e0, a_client_id = 0), line 275 in "cbr_cmdlog.c"
main(argc = 2, argv = 0x2ff224e0), line 296 in "vxtune.c"

DESCRIPTION:
Passing a NULL parameter to the gettimeofday function was causing the coredump.

RESOLUTION:
Code changes have been made to pass a timeval parameter instead of NULL to the gettimeofday function.

* 4158920 (Tracking ID: 4159680)

SYMPTOM:
0 Fri Apr 5 20:32:30 IST 2024 + read bd_dg bd_dgid
0 Fri Apr 5 20:32:30 IST 2024 +
0 Fri Apr 5 20:32:30 IST 2024 first_time=1+ clean_tempdir
0 Fri Apr 5 20:32:30 IST 2024 + whence -v set_proc_oom_score
0 Fri Apr 5 20:32:30 IST 2024 set_proc_oom_score not found
0 Fri Apr 5 20:32:30 IST 2024 +
0 Fri Apr 5 20:32:30 IST 2024 1> /dev/null+ set_proc_oom_score 17695012
0 Fri Apr 5 20:32:30 IST 2024 /usr/lib/vxvm/bin/vxconfigbackupd[295]: set_proc_oom_score: not found
0 Fri Apr 5 20:32:30 IST 2024 + vxnotify

DESCRIPTION:
type set_proc_oom_score &>/dev/null && set_proc_oom_score $$
Here the stdout and stderr streams are not getting redirected to /dev/null.
This is because "&>" is incompatible with POSIX. ">out 2>&1" is a POSIX-compliant way to redirect both standard output and standard error to out. It also works in pre-POSIX Bourne shells.

RESOLUTION:
The code changes have been done to fix the problem.

* 4161646 (Tracking ID: 4149528)

SYMPTOM:
vxconfigd and vx commands hang. The vxconfigd stack is seen as follows:
volsync_wait
volsiowait
voldco_read_dco_toc
voldco_await_shared_tocflush
volcvm_ktrans_fmr_cleanup
vol_ktrans_commit
volconfig_ioctl
volsioctl_real
vols_ioctl
vols_unlocked_ioctl
do_vfs_ioctl
ksys_ioctl
__x64_sys_ioctl
do_syscall_64
entry_SYSCALL_64_after_hwframe

DESCRIPTION:
There is a hang in CVM reconfig and the DCO-TOC protocol. This causes vxconfigd and VxVM commands to hang. In case of overlapping reconfigs, it is possible that the rebuild seqno on master and slave end up having different values. At this point, if some DCO-TOC protocol is also in progress, the protocol gets hung due to the difference in the rebuild seqno (messages are dropped). One can find messages similar to the following in /etc/vx/log/logger.txt on the master node. We can see the mismatch in the rebuild seqno in the two messages; look at the strings "rbld_seq: 1" and "fsio-rbld_seqno: 0". The seqno received from the slave is 1 and the one present on the master is 0.
Jan 16 11:57:56:329170 1705386476329170 38ee FMR dco_toc_req: mv: masterfsvol1-1 rcvd req with old_seq: 0 rbld_seq: 1
Jan 16 11:57:56:329171 1705386476329171 38ee FMR dco_toc_req: mv: masterfsvol1-1 pend rbld, retry rbld_seq: 1 fsio-rbld_seqno: 0 old: 0 cur: 3 new: 3 flag: 0xc10d st

RESOLUTION:
Instead of using the rebuild seqno to determine whether the DCO-TOC protocol is running in the same reconfig, the reconfig seqno is used as the rebuild seqno.
Since the reconfig seqno on all nodes in the cluster is the same, the DCO-TOC protocol will find a consistent rebuild seqno during CVM reconfig and will not result in some node dropping the DCO-TOC protocol messages. A CVM protocol version check was added while using the reconfig seqno as the rebuild seqno; the new functionality comes into effect only if the CVM protocol version is >= 300.

Patch ID: VRTSaslapm 8.0.2.1500

* 4125322 (Tracking ID: 4119950)

SYMPTOM:
Vulnerabilities have been reported in third-party components [curl and libxml] that are used by VxVM.

DESCRIPTION:
Third-party components [curl and libxml] in their current versions, as used by VxVM, have been reported with security vulnerabilities which need to be addressed.

RESOLUTION:
[curl and libxml] have been upgraded to newer versions in which the reported security vulnerabilities have been addressed.

* 4133009 (Tracking ID: 4133010)

SYMPTOM:
The aslapm rpm does not have a changelog.

DESCRIPTION:
A changelog in the rpm will help to find missing incidents with respect to other versions.

RESOLUTION:
The changelog is generated and added to the aslapm rpm.

* 4137995 (Tracking ID: 4117350)

SYMPTOM:
The below error is observed when trying to import:
# vxdg -n SVOL_SIdg -o useclonedev=on -o updateid import SIdg
VxVM vxdg ERROR V-5-1-0 Disk group SIdg: import failed:
Replicated dg record is found.
Did you want to import hardware replicated LUNs?
Try vxdg [-o usereplicatedev=only] import option with -c[s]
Please refer to system log for details.

DESCRIPTION:
The REPLICATED flag is used to identify a hardware-replicated device, so to import a dg on REPLICATED disks, the usereplicatedev option must be used. As that was not provided, the issue was observed.

RESOLUTION:
The REPLICATED flag has been removed for Hitachi ShadowImage (SI) disks.

Patch ID: VRTSsfcpi-8.0.2.1200

* 4115603 (Tracking ID: 4115601)

SYMPTOM:
On Solaris, the publisher list gets displayed during InfoScale start, stop, and uninstall, and does not display a unique publisher list during install and upgrade.
A publisher list gets displayed during install and upgrade which is not unique.

DESCRIPTION:
On Solaris, the publisher list gets displayed during InfoScale start, stop, and uninstall, and the publisher list displayed during install and upgrade is not unique.

RESOLUTION:
Installer code modified to skip the publisher list during the start, stop, and uninstall process and to get a unique publisher list during install and upgrade.

* 4115707 (Tracking ID: 4126025)

SYMPTOM:
While performing a full upgrade of the secondary site, SRL missing and RLINK dissociated errors are observed.

DESCRIPTION:
While performing a full upgrade of the secondary site, SRL missing and RLINK dissociated errors are observed. The SRL volume[s] is[are] in the recovery state, which leads to failure in associating the SRL volume with the RVG.

RESOLUTION:
Installer code modified to wait for recovery tasks to complete on all volumes and then proceed with associating the SRL with the RVG.

* 4115874 (Tracking ID: 4124871)

SYMPTOM:
Configuration of vxfen fails for a three-node cluster on VMs in different AZs.

DESCRIPTION:
Configuration of vxfen fails for a three-node cluster on VMs in different AZs.

RESOLUTION:
Added the set-strictsrc 0 tunable to the llttab file in the installer.

* 4116368 (Tracking ID: 4123645)

SYMPTOM:
During rolling upgrade response file creation, CPI asks to unmount the VxFS filesystem.

DESCRIPTION:
During rolling upgrade response file creation, CPI asks to unmount the VxFS filesystem.

RESOLUTION:
Installer code modified to exclude the VxFS filesystem unmount process.

* 4116406 (Tracking ID: 4123654)

SYMPTOM:
The installer gives a null swap space message.

DESCRIPTION:
The installer gives a null swap space message, as the swap space requirement is not needed anymore.

RESOLUTION:
Installer code modified to remove the swap space message.

* 4116879 (Tracking ID: 4126018)

SYMPTOM:
During addnode, the installer fails to mount resources.

DESCRIPTION:
During addnode, the installer fails to mount resources, as the new node system was not added to the child SGs of the resources.

RESOLUTION:
Installer code modified to add the new node to all SGs.

* 4116995 (Tracking ID: 4123657)

SYMPTOM:
While performing an upgrade, the installer does not upgrade the Cluster Protocol version post full upgrade.

DESCRIPTION:
While performing a full upgrade, the installer does not upgrade the Cluster Protocol version post full upgrade.

RESOLUTION:
Installer code modified to retry the upgrade to complete the Cluster Protocol version upgrade.

* 4117956 (Tracking ID: 4104627)

SYMPTOM:
The installer supports a maximum of 5 patches. The user is not able to provide more than 5 patches for installation.

DESCRIPTION:
The latest bundle package installer supports a maximum of 5 patches. The user is not able to provide more than 5 patches for installation.

RESOLUTION:
The installer code modified to support a maximum of 10 patches for installation.

* 4121961 (Tracking ID: 4123908)

SYMPTOM:
The installer does not register InfoScale hosts to the VIOM Management Server after InfoScale configuration.

DESCRIPTION:
The installer does not register InfoScale hosts to the VIOM Management Server after InfoScale configuration.

RESOLUTION:
The installer is enhanced to register InfoScale hosts to the VIOM Management Server by using both the menu-driven program and the responsefile. To register InfoScale hosts to the VIOM Management Server by using the responsefile, the $CFG{gendeploy_path} parameter must be used.
The value for $CFG{gendeploy_path} is the absolute path of the gendeploy script on the local node.

* 4122442 (Tracking ID: 4122441)

SYMPTOM:
When the product starts through the response file, the installer displays keyless licensing information on screen.

DESCRIPTION:
When the product starts through the response file, the installer displays keyless licensing information on screen.

RESOLUTION:
Licensing code modified to skip the licensing information during the product start process.

* 4122749 (Tracking ID: 4122748)

SYMPTOM:
On Linux, the had service fails to start during rolling upgrade from InfoScale 7.4.1 or lower to a higher InfoScale version.

DESCRIPTION:
The VCS protocol version is supported from InfoScale 7.4.2 onwards. During the rolling upgrade process from 7.4.1 or lower to a higher InfoScale version, due to wrong release matrix data, the installer tries to perform a single-phase rolling upgrade instead of a two-phase rolling upgrade, and the had service fails to start.

RESOLUTION:
The installer is enhanced to perform a two-phase rolling upgrade if the installed InfoScale version is 7.4.1 or older.

* 4126470 (Tracking ID: 4130003)

SYMPTOM:
The installer failed to start vxfs_replication while performing configuration of Enterprise on OEL 9.2.

DESCRIPTION:
The installer failed to start vxfs_replication while performing configuration of Enterprise on OEL 9.2.

RESOLUTION:
Installer code modified to retry starting vxfs_replication during configuration of Enterprise.

* 4127111 (Tracking ID: 4127117)

SYMPTOM:
On a Linux system, the InfoScale installer configures the GCO (Global Cluster Option) only with a virtual IP address.

DESCRIPTION:
On a Linux system, you can configure the GCO (Global Cluster Option) with a hostname by using the InfoScale installer on a different cloud platform.

RESOLUTION:
The installer prompts for the hostname to configure the GCO.

* 4130377 (Tracking ID: 4131703)

SYMPTOM:
The installer performs dmp include/exclude operations if /etc/vx/vxvm.exclude is present on the system.

DESCRIPTION:
The installer performs dmp include/exclude operations if /etc/vx/vxvm.exclude is present on the system, which is not required.

RESOLUTION:
Removed unnecessary dmp include/exclude operations which were launched after starting services in the container environment.

* 4130996 (Tracking ID: 4130000)

SYMPTOM:
The installer failed to start veki and llt if the Active state is failed.

DESCRIPTION:
The installer failed to start veki and llt if the Active state is failed.

RESOLUTION:
Installer code modified to start veki and llt by checking the Active state.

* 4131315 (Tracking ID: 4131314)

SYMPTOM:
Environment="VCS_ENABLE_PUBSEC_LOG=0" is added by CPI in the [Install] section of the service file instead of the [Service] section.

DESCRIPTION:
Environment="VCS_ENABLE_PUBSEC_LOG=0" is added by CPI in the [Install] section of the service file instead of the [Service] section.

RESOLUTION:
Environment="VCS_ENABLE_PUBSEC_LOG=0" is added in the [Service] section of the service file.

* 4131684 (Tracking ID: 4131682)

SYMPTOM:
On SunOS, the installer prompts the user to install the 'bourne' package if it is not available.

DESCRIPTION:
The installer had a dependency on '/usr/sunos/bin/sh', which is from the 'bourne' package. The 'bourne' package is deprecated with the latest SRUs.

RESOLUTION:
Installer code is updated to use '/usr/bin/sh' instead of '/usr/sunos/bin/sh', thus removing the bourne package dependency.

* 4132411 (Tracking ID: 4139946)

SYMPTOM:
Rolling upgrade fails if the recommended upgrade path is not followed.

DESCRIPTION:
Rolling upgrade fails if the recommended upgrade path is not followed.

RESOLUTION:
Installer code fixed to resolve rolling upgrade issues if the recommended upgrade path is not followed.

* 4133019 (Tracking ID: 4135602)

SYMPTOM:
The installer failed to update the main.cf file with the VCS user while reconfiguring a secured cluster to a non-secured cluster.

DESCRIPTION:
The installer failed to update the main.cf file with the VCS user while reconfiguring a secured cluster to a non-secured cluster.

RESOLUTION:
Installer code checks modified to update the VCS user in the main.cf file during reconfiguration of a cluster from secured to non-secured.

* 4133469 (Tracking ID: 4136432)

SYMPTOM:
The installer failed to add a node to a higher-version InfoScale node.

DESCRIPTION:
The installer failed to add a node to a higher-version InfoScale node.

RESOLUTION:
Installer code modified to enable adding a node to a higher-version InfoScale node.

* 4135015 (Tracking ID: 4135014)

SYMPTOM:
The CPI installer asks "Would you like to install InfoScale" after "./installer -precheck" is done.

DESCRIPTION:
The CPI installer asks "Would you like to install InfoScale" after "./installer -precheck" is done.
So it should not ask for installation after precheck is completed.RESOLUTION:Installer code modified to skip the question for installation after precheck is completed.* 4136211 (Tracking ID: 4139940)SYMPTOM:Installer failed to get the package version and failed due to PADV missmatch.DESCRIPTION:Installer failed to get the package version and failed due to PADV missmatch.RESOLUTION:Installer code modified to retrieve proper package version.* 4139609 (Tracking ID: 4142877)SYMPTOM:Missing HF list not displayed during upgrade by using the patch release.DESCRIPTION:Missing HF list not displayed during upgrade by using the patch release.RESOLUTION:Add prechecks in installer for identifying missing HFs and accept action from customer.* 4140512 (Tracking ID: 4140542)SYMPTOM:Installer failed to rolling upgrade for patchDESCRIPTION:Rolling upgrade failed for the patch installer for the build version during mixed ru checkRESOLUTION:Installer code modified to handle build version during mixed ru check.* 4155155 (Tracking ID: 4156038)SYMPTOM:The installer supports the EO logging file permission changes.DESCRIPTION:The installer now supports EO logging file permission changes.RESOLUTION:The installer code has been modified to enable the support for EO logging file permission changes.* 4157440 (Tracking ID: 4158841)SYMPTOM:The installer supports VRTSrest version changes.DESCRIPTION:The installer now supports VRTSrest version changes.RESOLUTION:The installer code has been modified to enable the support for VRTSrest version changes.* 4159940 (Tracking ID: 4159942)SYMPTOM:The installer is used to update file permission.DESCRIPTION:The installer is used to update file permission.RESOLUTION:The installer code has been modified so that it will not update existing file permissions.* 4160348 (Tracking ID: 4163907)SYMPTOM:While applying a infoscale-patch, CPI does not check whether previous infoscale-update is installed.DESCRIPTION:While applying a infoscale-patch, check whether previous 
infoscale-update is installed.RESOLUTION:The installer code has been modified to check whether the previous infoscale update is installed or not.* 4161937 (Tracking ID: 4160983)SYMPTOM:In Solaris, after upgrading the Infoscale to ABE if we boot the current BE then the vxfs modules are not loading properly.DESCRIPTION:In Solaris, the vxfs modules are getting removed from current BE while upgrading the Infoscale to ABE.RESOLUTION:Installer code modified.Patch ID: -8.0.2.500* 4160665 (Tracking ID: 4160661)SYMPTOM:NADESCRIPTION:NARESOLUTION:NAPatch ID: VRTSdbac-8.0.2.1100* 4161967 (Tracking ID: 4157901)SYMPTOM:vcsmmconfig.log does not show file permissions 600 if EO-tunable VCS_ENABLE_PUBSEC_LOG_PERM is set to 0.DESCRIPTION:vcsmmconfig.log does not show file permissions 600 if EO-tunable VCS_ENABLE_PUBSEC_LOG_PERM is set to 0.RESOLUTION:Changes done in order to set file permission of vcsmmconfig.log as per EO-tunable VCS_ENABLE_PUBSEC_LOG_PERM.Patch ID: VRTSgab-8.0.2.1100* 4160642 (Tracking ID: 4160640)SYMPTOM:vxfen fails to install. Asks llt and gab of same version.DESCRIPTION:vxfen fails to install. 
It asks for llt and gab of the same version.

RESOLUTION:
Use llt and gab of the same version on AIX.

Patch ID: VRTSvcsag-8.0.2.1100

* 4156630 (Tracking ID: 4156628)

SYMPTOM:
The message "Uninitialized value $version in string eq at /opt/VRTSvcs/bin/NIC/monitor line 317" is reported constantly.

DESCRIPTION:
The following messages are constantly reported in NIC_A.log because $version is not initialized:

2024/02/05 15:32:00 VCS INFO V-16-2-13716 Thread(1312) Resource(csgnic): Output of the completed operation (monitor)
==============================================
Use of uninitialized value $version in string eq at /opt/VRTSvcs/bin/NIC/monitor line 317, <IFCONFIG> line 1.
Use of uninitialized value $version in string eq at /opt/VRTSvcs/bin/NIC/monitor line 317, <IFCONFIG> line 2.
Use of uninitialized value $version in string eq at /opt/VRTSvcs/bin/NIC/monitor line 317, <IFCONFIG> line 3.

RESOLUTION:
$version is not initialized while the ping test is performed, so the code has been updated to handle this case.

Patch ID: VRTSllt-8.0.2.1100

* 4160641 (Tracking ID: 4160640)

SYMPTOM:
vxfen fails to install and asks for llt and gab of the same version.

DESCRIPTION:
vxfen fails to install and asks for llt and gab of the same version.

RESOLUTION:
Use llt and gab of the same version on AIX.

Patch ID: VRTScps-8.0.2.1100

* 4152885 (Tracking ID: 4152882)

SYMPTOM:
Access to the CP servers is intermittently lost.

DESCRIPTION:
Logs are written to each log file (vxcpserve_[A|B|C].log) up to maxlen; when a file grows beyond that length, a new file is opened and the old one is closed.
At this point in the stack, fptr still uses the old pointer, which results in an fwrite() to a closed FILE pointer.

RESOLUTION:
The new file is now opened before fptr is assigned, so that fptr points to a valid FILE pointer.

* 4156113 (Tracking ID: 4156112)

SYMPTOM:
Changes to the EO file permission tunable.

DESCRIPTION:
Changes to the EO file permission tunable.

RESOLUTION:
Changes have been made for the EO file permission tunable.

* 4157674 (Tracking ID: 4156112)

SYMPTOM:
Changes to the EO file permission tunable.

DESCRIPTION:
Changes to the EO file permission tunable.

RESOLUTION:
Changes have been made for the EO file permission tunable.

* 4162878 (Tracking ID: 4136146)

SYMPTOM:
The old version, v6.1.14.26, is in use.

DESCRIPTION:
A new version, v6.1.14.27, is available.

RESOLUTION:
Use the new v6.1.14.27 libraries.

Patch ID: VRTSdbed-8.0.2.1200

* 4163136 (Tracking ID: 4136146)

SYMPTOM:
The old version, v6.1.14.26, is in use.

DESCRIPTION:
A new version, v6.1.14.27, is available.

RESOLUTION:
Use the new v6.1.14.27 libraries.

Patch ID: VRTSvbs-8.0.2.1100

* 4163135 (Tracking ID: 4136146)

SYMPTOM:
The old version, v6.1.14.26, is in use.

DESCRIPTION:
A new version, v6.1.14.27, is available.

RESOLUTION:
Use the new v6.1.14.27 libraries.

Patch ID: VRTSvxfen-8.0.2.1100

* 4125891 (Tracking ID: 4113847)

SYMPTOM:
An even number of coordination point (CP) disks is not supported by design. This enhancement is part of AFA, wherein a faulted disk needs to be replaced as soon as the number of coordination disks becomes even while fencing is up and running.

DESCRIPTION:
To handle regular split-brain / network partitioning, the number of disks must be odd. Even-number CP support is provided through cp_count; with fewer than cp_count/2+1 disks, fencing is not allowed to come up. Also, if cp_count is not defined in the vxfenmode file, a minimum of 3 CP disks is required by default; otherwise vxfen does not start.

RESOLUTION:
In case of an even number of CP disks, another disk is added.
This keeps the number of CP disks odd, so fencing keeps running.

* 4125895 (Tracking ID: 4108561)

SYMPTOM:
The internal vxfen print-keys utility was not working because of an internal array overrun.

DESCRIPTION:
The internal vxfen print-keys utility does not work if the number of keys exceeds 8; it then returns garbage values. The 8-byte array keylist[i].key is overrun at byte offset 8 using an index that evaluates to 8.

RESOLUTION:
The internal loop has been restricted to VXFEN_KEYLEN. Reading reservations now works correctly.

* 4156076 (Tracking ID: 4156075)

SYMPTOM:
Changes to the EO file permission tunable.

DESCRIPTION:
Changes to the EO file permission tunable.

RESOLUTION:
Changes have been made for the EO file permission tunable.

* 4156379 (Tracking ID: 4156075)

SYMPTOM:
Changes to the EO file permission tunable.

DESCRIPTION:
Changes to the EO file permission tunable.

RESOLUTION:
Changes have been made for the EO file permission tunable.

Patch ID: VRTSvcs-8.0.2.1100

* 4124106 (Tracking ID: 4129493)

SYMPTOM:
A Tenable security scan kills the Notifier resource.

DESCRIPTION:
When an nmap port scan is performed on port 14144 (on which the notifier process is listening), the notifier is killed by the connection request.

RESOLUTION:
The required code changes have been made to prevent the Notifier agent from crashing when an nmap port scan is performed on notifier port 14144.

* 4157581 (Tracking ID: 4157580)

SYMPTOM:
Security vulnerabilities have been identified in the current version of the third-party component OpenSSL, which is used by VCS.

DESCRIPTION:
Security vulnerabilities are present in the current version of the third-party component OpenSSL that is used by VCS.

RESOLUTION:
VCS has been updated to use a newer version of OpenSSL in which the security vulnerabilities have been addressed.

Patch ID: -8.0.2.1700

* 4162683 (Tracking ID: 4153873)

SYMPTOM:
A CVM master reboot resulted in volumes being disabled on the slave node.

DESCRIPTION:
The InfoScale stack exhibits unpredictable behaviour during reboots: sometimes the node hangs while coming online, the working node goes into the faulted state, or CVM does not start on the
rebooted node.

RESOLUTION:
A mechanism for making deport decisions has been added, and the code has been integrated with an offline routine.

* 4118153 (Tracking ID: 4118154)

SYMPTOM:
The system may panic in simple_unlock_mem() when errcheckdetail is enabled, with a stack trace as follows:

simple_unlock_mem()
odm_io_waitreq()
odm_io_waitreqs()
odm_request_wait()
odm_io()
odm_io_stat()
vxodmioctl()

DESCRIPTION:
odm_io_waitreq() takes a lock and waits for the I/O request to complete, but it is interrupted by odm_iodone(), which performs the I/O and unlocks the lock taken by odm_io_waitreq(). When odm_io_waitreq() then tries to unlock the lock, the system panics because the lock has already been unlocked.

RESOLUTION:
The code has been modified to resolve this issue.

* 4138361 (Tracking ID: 4134884)

SYMPTOM:
After the file system is unmounted, an attempted disk group deport fails with the following error:

vxvm:vxconfigd: V-5-1-16251 Disk group deport of testdg failed with error 70 - Volume or plex device is open or attached

DESCRIPTION:
During the mount of a dirty file system, a VxVM device open count is leaked, and the deport of the VxVM disk group consequently fails. During a VxFS mount operation, the corresponding VxVM device is opened. If the file system is not clean, the mount performs a log replay; once the log replay completes, the mount succeeds. However, the leaked device open count causes the subsequent disk group deport to fail.

RESOLUTION:
Code changes have been made to address the device open count leak.

* 4138381 (Tracking ID: 4117342)

SYMPTOM:
The system might panic because of a hard lockup detected on a CPU.

DESCRIPTION:
When dentries are purged, a possible race can lead to a corrupted vnode flag.
Because of this corrupted flag, VxFS tries to purge the dentry again and gets stuck waiting for the vnode lock, which was taken in the current thread context; this leads to a deadlock/soft lockup.

RESOLUTION:
The code has been modified to protect the vnode flag with the vnode lock.

* 4138384 (Tracking ID: 4134194)

SYMPTOM:
A vxfs/glm worker thread panics with a kernel NULL pointer dereference.

DESCRIPTION:
In vx_dir_realloc(), when a directory block is full, the block is reallocated into a larger extent to fit the new file entry. When the new extent is allocated, the old cbuf becomes part of the new extent, but the old cbuf is not invalidated during dir_realloc, which leaves a stale cbuf in the cache. This stale buffer can cause a buffer overflow.

RESOLUTION:
Code changes have been made to invalidate the cbuf immediately after the realloc.

* 4138430 (Tracking ID: 4137037)

SYMPTOM:
The binary terminates because of a segmentation fault.

DESCRIPTION:
1. The basename() routine is removed because it is not safe.
2. The snprintf() length parameter is corrected.
3. A bug in the check for the presence of an existing job is fixed.
4. A double close() and an fd leak inside run_command are fixed.
5. A regex is introduced while parsing the vxsnap snapshot list for more accurate results.

RESOLUTION:
Code changes have been made in the vxfstaskd binary to avoid the issues mentioned above.

* 4138444 (Tracking ID: 4118840)

SYMPTOM:
Abnormal behaviour, e.g. WORM gets set even though the on-disk WORM flag is not set.

DESCRIPTION:
In-core file system changes were not undone if flushing to disk returned an error.

RESOLUTION:
Code changes have been made in the offending kernel function to avoid this.

* 4142554 (Tracking ID: 4142555)

SYMPTOM:
The reference to running fsck -y in the mount.vxfs error message needs to be modified, and the fsck_vxfs manpage needs to be updated.

DESCRIPTION:
When a corrupted file system is mounted, the error message prompts the user to run fsck -y.
Running fsck -y on a file system without understanding the implications can lead to data loss.

RESOLUTION:
The mount.vxfs error messages have been updated with a recommendation to refer to the fsck_vxfs manpage. In addition, a message has been added to the fsck_vxfs manpage advising users to contact Veritas support for assistance in collecting more debug logs before running fsck -y.

* 4142810 (Tracking ID: 4149899)

SYMPTOM:
Veritas file replication failover fails to fail over the job to the target cluster.

DESCRIPTION:
As part of a Veritas file replication failover operation, the target file system needs to be taken offline and brought back online to become the new source. After this operation, the file system's protected flag must be turned off, because this site becomes the new source and applications can write to it. To update the flag, the current code performs a second remount of the file system, which takes time because file set removal is processed as part of the first offline (umount) and online (mount).

RESOLUTION:
Code changes have been made to ensure that all operations, including setting the protected flag, happen during a single offline and online processing of the file system.

* 4152221 (Tracking ID: 4151836)

SYMPTOM:
Out-of-bounds kernel memory is accessed while logging the event if mount encounters metadata corruption.

DESCRIPTION:
Out-of-bounds kernel memory is accessed while logging the event if mount encounters metadata corruption.

RESOLUTION:
Code changes have been made in the offending kernel function to avoid accessing invalid memory.

* 4153560 (Tracking ID: 4089199)

SYMPTOM:
A dynamic reconfiguration (DR) operation for CPUs takes a long time. A temporary I/O hang is also observed during DR.

DESCRIPTION:
DR processing in VxFS is done for each CPU change notified by the kernel. DR processing involves a VxFS reinit and a cluster-wide file system freeze. If the processor has SMT enabled, the cluster-wide file system freeze happens for each SMT thread per virtual CPU.
This causes the slowness and temporary I/O hangs during CPU DR operations.

RESOLUTION:
The DR code has been optimised to process several CPU DR events together.

* 4154052 (Tracking ID: 4119281)

SYMPTOM:
Higher page-in requests on Solaris 11 SPARC.

DESCRIPTION:
After upgrading InfoScale, page-in requests are much higher. The "vmstat" output looks normal, but the "sar" output looks abnormal, showing high page-in requests. "sar" is taking absolute samples for some reason; it is not supposed to use these values.

RESOLUTION:
Code changes have been made to solve this issue.

* 4154058 (Tracking ID: 4136235)

SYMPTOM:
A system with a higher number of attribute inodes and pnlct inodes may see a higher number of I/Os even when idle.

DESCRIPTION:
A system with a higher number of attribute inodes and pnlct inodes may see a higher number of I/Os on an idle CFS. Reducing the pnlct merge frequency may therefore show some performance improvement.

RESOLUTION:
A module parameter has been added to change the pnlct merge frequency.

* 4154119 (Tracking ID: 4126943)

SYMPTOM:
The lost+found directory in a VxFS file system should be created with default ACL permissions of 700.

DESCRIPTION:
For security reasons, there was a request to create the lost+found directory in a VxFS file system with default ACL permissions of 700, so that no users other than root can access files under the lost+found directory.

RESOLUTION:
Creating a VxFS file system with the mkfs command now results in a lost+found directory with default ACL permissions of 700.

* 4134040 (Tracking ID: 3979756)

SYMPTOM:
Multiple fcntl F_GETLK calls take longer to complete on CFS than on a local mount (LM). Each call adds more delay and contributes to what is later seen as performance degradation.

DESCRIPTION:
F_SETLK utilizes the lock caches while taking or invalidating locks, so it does not need to broadcast messages to peer nodes. F_GETLK, in contrast, does not utilize the caches and broadcasts messages to all the peer nodes.
Therefore, F_SETLK operations are not penalized, but F_GETLK operations are, by almost a factor of 2 when used on CFS as compared to LM.

RESOLUTION:
A cache has been added for the F_GETLK operation as well, so that broadcast messages are avoided, which saves time.

INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch, perform the following steps on at least one node in the cluster:

1. Copy the patch infoscale-aix-Patch-8.0.2.1800.tar.gz to /tmp.

2. Untar infoscale-aix-Patch-8.0.2.1800.tar.gz to /tmp/hf:
   # mkdir /tmp/hf
   # cd /tmp/hf
   # gunzip /tmp/infoscale-aix-Patch-8.0.2.1800.tar.gz
   # tar xf /tmp/infoscale-aix-Patch-8.0.2.1800.tar

3. Install the hotfix (note that the installation of this P-Patch will cause downtime):
   # pwd
   /tmp/hf
   # ./installVRTSinfoscale802P1800 [<host1> <host2>...]

You can also install this patch together with the 8.0.2 base release using Install Bundles:

1. Download this patch and extract it to a directory.

2.
Change to the Veritas InfoScale 8.0.2 directory and invoke the installer script with the -patch_path option, where -patch_path should point to the patch directory:
   # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
Manual installation is not recommended.

REMOVING THE PATCH
------------------
Run the Uninstaller script to automatically remove the patch:
------------------------------------------------------------
To uninstall the patch, perform the following step on at least one node in the cluster:
   # /opt/VRTS/install/uninstallVRTSinfoscale802P1800 [<host1> <host2>...]

Remove the patch manually:
-------------------------
Manual uninstallation is not recommended.

KNOWN ISSUES
------------
* Tracking ID: 4163517
  SYMPTOM: DMP cannot be enabled for AIX rootvg.
  WORKAROUND: No workaround available.

SPECIAL INSTRUCTIONS
--------------------
1. If internet access is not available, the patch must be installed together with the latest CPI patch downloaded from the Download Center.
2. When an uninstallation operation is performed through CPI, ASLAPM fails to uninstall. ASLAPM is currently shipped in package format instead of patch format, which causes this issue. Customers who hit this issue need to uninstall ASLAPM manually.
3. When the Collector Service is started manually on the AIX platform, it might fail to start on the host machine. There is no functional impact, and support for Telemetry Collector Service operations is not available in InfoScale 8.0.2. Consequently, any warnings associated with it can safely be disregarded.

OTHERS
------
NONE
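For convenience, the automated installation steps above can be condensed into a shell sketch. The host names are placeholders, and the installer invocations are shown as comments because they require a live cluster; the staging step is guarded so it only runs if the tarball has already been copied to /tmp as instructed.

```shell
# Stage the patch under /tmp/hf (archive name from the steps above)
PATCH=/tmp/infoscale-aix-Patch-8.0.2.1800.tar.gz
mkdir -p /tmp/hf
cd /tmp/hf
if [ -f "$PATCH" ]; then
    gunzip "$PATCH"
    tar xf /tmp/infoscale-aix-Patch-8.0.2.1800.tar
fi

# Install the patch (causes downtime; run on at least one cluster node):
#   ./installVRTSinfoscale802P1800 host1 host2
#
# Or install together with the 8.0.2 base release using Install Bundles:
#   ./installer -patch_path /tmp/hf host1 host2
#
# To remove the patch later:
#   /opt/VRTS/install/uninstallVRTSinfoscale802P1800 host1 host2
```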

Applies to the following product releases

IS-8.0.2 Update2 (Unix) on AIX platform (2024)
