DNF install old packages
This mostly happens when your NVIDIA graphics card is too old to be supported by the most recent driver version. Older driver versions are still available in the corresponding repository (for example, cuda-rhel9.repo carries driver branches from 515 through 580), but you only get the latest version when doing
dnf install nvidia-driver
Then after installation, you will get
$ nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
This shows that the installed driver is not compatible with your hardware.
If you attempt to specify the version you want, dnf will complain that the version and/or its dependencies are filtered out by modular filtering.
To overcome this, you will need to switch to the correct module stream. To do so, first you’d run:
dnf module list nvidia-driver
This will show the available streams like:
cuda-rhel9-x86_64
Name           Stream            Profiles                  Summary
nvidia-driver  latest            default [d], fm, ks       Nvidia driver for latest branch
nvidia-driver  latest-dkms       default [d], fm, ks       Nvidia driver for latest-dkms branch
nvidia-driver  open-dkms [d][e]  default [d], fm, ks       Nvidia driver for open-dkms branch
nvidia-driver  515               default [d], fm, ks, src  Nvidia driver for 515 branch
nvidia-driver  515-dkms          default [d], fm, ks       Nvidia driver for 515-dkms branch
nvidia-driver  515-open          default [d], fm, ks, src  Nvidia driver for 515-open branch
nvidia-driver  520               default [d], fm, ks, src  Nvidia driver for 520 branch
nvidia-driver  520-dkms          default [d], fm, ks       Nvidia driver for 520-dkms branch
nvidia-driver  520-open          default [d], fm, ks, src  Nvidia driver for 520-open branch
nvidia-driver  525               default [d], fm, ks, src  Nvidia driver for 525 branch
nvidia-driver  525-dkms          default [d], fm, ks       Nvidia driver for 525-dkms branch
nvidia-driver  525-open          default [d], fm, ks, src  Nvidia driver for 525-open branch
nvidia-driver  530               default [d], fm, ks, src  Nvidia driver for 530 branch
nvidia-driver  530-dkms          default [d], fm, ks       Nvidia driver for 530-dkms branch
nvidia-driver  530-open          default [d], fm, ks, src  Nvidia driver for 530-open branch
nvidia-driver  535               default [d], fm, ks, src  Nvidia driver for 535 branch
nvidia-driver  535-dkms          default [d], fm, ks       Nvidia driver for 535-dkms branch
nvidia-driver  535-open          default [d], fm, ks, src  Nvidia driver for 535-open branch
nvidia-driver  545               default [d], fm, ks, src  Nvidia driver for 545 branch
nvidia-driver  545-dkms          default [d], fm, ks       Nvidia driver for 545-dkms branch
nvidia-driver  545-open          default [d], fm, ks, src  Nvidia driver for 545-open branch
nvidia-driver  550               default [d], fm, ks, src  Nvidia driver for 550 branch
nvidia-driver  550-dkms          default [d], fm, ks       Nvidia driver for 550-dkms branch
nvidia-driver  550-open          default [d], fm, ks, src  Nvidia driver for 550-open branch
nvidia-driver  555               default [d], fm, ks, src  Nvidia driver for 555 branch
nvidia-driver  555-dkms          default [d], fm, ks       Nvidia driver for 555-dkms branch
nvidia-driver  555-open          default [d], fm, ks, src  Nvidia driver for 555-open branch
nvidia-driver  560               default [d], fm, ks, src  Nvidia driver for 560 branch
nvidia-driver  560-dkms          default [d], fm, ks       Nvidia driver for 560-dkms branch
nvidia-driver  560-open          default [d], fm, ks, src  Nvidia driver for 560-open branch
nvidia-driver  565               default [d], fm, ks, src  Nvidia driver for 565 branch
nvidia-driver  565-dkms          default [d], fm, ks       Nvidia driver for 565-dkms branch
nvidia-driver  565-open          default [d], fm, ks, src  Nvidia driver for 565-open branch
nvidia-driver  570               default [d], fm, ks       Nvidia driver for 570 branch
nvidia-driver  570-dkms          default [d], fm, ks       Nvidia driver for 570-dkms branch
nvidia-driver  570-open          default [d], fm, ks       Nvidia driver for 570-open branch
nvidia-driver  575               default [d], fm, ks       Nvidia driver for 575 branch
nvidia-driver  575-dkms          default [d], fm, ks       Nvidia driver for 575-dkms branch
nvidia-driver  575-open          default [d], fm, ks       Nvidia driver for 575-open branch
nvidia-driver  580-dkms          default [d], fm, ks       Nvidia driver for 580-dkms branch
nvidia-driver  580-open          default [d], fm, ks       Nvidia driver for 580-open branch
Then you can pick the branch you prefer, for example 570-open:
dnf module switch-to nvidia-driver:570-open
This will install the correct version.
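With this many streams, it can help to reduce the listing to the stream names before choosing one. The following is a minimal sketch, assuming the listing has the column layout shown above; list_streams is a hypothetical helper name, not a dnf feature.

```shell
# list_streams: hypothetical helper (not part of dnf).
# Reads `dnf module list nvidia-driver` output on stdin and prints only
# the stream column, assuming the stream name is the second field.
list_streams() {
    awk '$1 == "nvidia-driver" { print $2 }'
}

# Intended use (the switch itself needs root):
#   dnf -q module list nvidia-driver | list_streams
#   dnf module switch-to nvidia-driver:570-open
```

The header and repository lines are skipped because their first field is not nvidia-driver, and markers such as [d][e] are dropped because they form a separate field.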
nvidia-smi shows API mismatch
Sometimes, after updating the NVIDIA driver and CUDA on a Rocky Linux system, running nvidia-smi reports a kernel driver version mismatch. If you run
dmesg
It will show:
NVRM: API mismatch: the client has the version aaa.bbb, but
NVRM: this kernel module has the version ccc.ddd. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
Here aaa.bbb is not the same as ccc.ddd.
This happens when the corresponding NVIDIA driver was not properly registered by dkms.
Other solutions suggest rebooting the server, reinstalling the drivers, recreating the initramfs, or removing the loaded nvidia modules with rmmod. These methods sometimes work. When none of them does, you can try
dkms install -m nvidia -v 570.144
replacing 570.144 with the version of your most recently installed NVIDIA driver. Then reboot the server. This should work.
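To see at a glance which two versions disagree, the NVRM lines can be reduced to just the version numbers. This is a small sketch that assumes the message wording quoted above; nvrm_versions is a hypothetical helper name.

```shell
# nvrm_versions: hypothetical helper.
# Reads kernel log text on stdin and prints the two versions named in an
# NVRM "API mismatch" message, assuming the wording shown above.
nvrm_versions() {
    sed -n \
        -e 's/.*client has the version \([0-9.]*[0-9]\).*/client \1/p' \
        -e 's/.*kernel module has the version \([0-9.]*[0-9]\).*/module \1/p'
}

# Intended use (reading the kernel log may need root):
#   dmesg | nvrm_versions
#   dkms status        # shows which nvidia versions dkms knows about
```

The client version is the one the user-space tools were built for, so that is normally the version to pass to dkms install.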
When a dnf kernel update fails to generate the initramfs
This sometimes happens when the automatic NVIDIA kernel module installation fails.
The workaround is:
- Boot into an older kernel;
- Run ls /boot/ and find the vmlinuz file that has no corresponding initramfs. For example, it could be vmlinuz-6.14.3-300.fc42.x86_64. Your kernel version is then 6.14.3-300.fc42.x86_64, in the form major.mid.minor-nnn.osver.arch. Let's call it $KVER.
- Check that /lib/modules/$KVER/ exists by running ls on it. If it exists, run "depmod -v $KVER". This creates modules.dep in the folder /lib/modules/$KVER/.
- Run "dracut --force --kver $KVER". If that does not work, use a lower version of gcc, like "CC=gcc-14 dracut --force --kver $KVER".
- Reboot into the newest kernel. Usually it will not contain the NVIDIA driver kernel module.
- Run the NVIDIA driver installer downloaded from the NVIDIA driver site, like NVIDIA-Linux-x86_64-570.144.run. This installs the NVIDIA driver kernel module into the kernel. If that does not work, use a lower version of gcc, like "CC=gcc-14 ./NVIDIA-Linux-x86_64-570.144.run".
- Now you have a good kernel with a proper graphics driver kernel module.
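Finding the kernel that lacks an initramfs can be scripted instead of read from the ls output. The sketch below assumes the usual vmlinuz-$KVER and initramfs-$KVER.img naming in /boot; kernels_without_initramfs is a hypothetical helper name.

```shell
# kernels_without_initramfs: hypothetical helper.
# Prints the version of every vmlinuz-* file in a boot directory
# (default /boot) that has no matching initramfs-<version>.img.
kernels_without_initramfs() {
    bootdir=${1:-/boot}
    for k in "$bootdir"/vmlinuz-*; do
        [ -e "$k" ] || continue            # no kernels found at all
        kver=${k##*/vmlinuz-}              # strip directory and prefix
        [ -e "$bootdir/initramfs-$kver.img" ] || printf '%s\n' "$kver"
    done
}

# Intended use, as root, following the steps above:
#   KVER=$(kernels_without_initramfs | tail -n 1)
#   depmod -v "$KVER" && dracut --force --kver "$KVER"
```

If more than one kernel is listed, each of them needs its own depmod and dracut run.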
Setting firewalld to allow nodes on intranet to access internet
firewall-cmd --zone=public --add-interface=<internet interface> --permanent
firewall-cmd --zone=internal --add-interface=<intranet interface as gateway> --permanent
firewall-cmd --set-default-zone=public --permanent
firewall-cmd --reload
firewall-cmd --get-default-zone
firewall-cmd --new-policy internal-public --permanent
firewall-cmd --reload
firewall-cmd --policy internal-public --add-ingress-zone=internal --permanent
firewall-cmd --policy internal-public --add-egress-zone=public --permanent
firewall-cmd --policy internal-public --set-target=ACCEPT --permanent
firewall-cmd --reload
firewall-cmd --info-policy internal-public
When upgrading the OS, dnf and rpm fail on SHA1 packages
First, use
rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n'
to identify keys from obsolete repositories, then use
rpm -e gpg-pubkey-xxxxxxxx-yyyyyyyy
to remove the keys that were imported from the SHA1 era.
Then identify the offending packages, so that they can be removed, with
rpm -q --nosignature --querybynumber xxxx
where you can get the xxxx from the stderr messages of
rpm -qa >/dev/null
Grub force install to boot sector
When upgrading the OS or replacing drives, the UEFI boot loader may not get installed automatically on the boot drive, so the boot menu cannot be updated to include the new kernels.
When you run grub2-install <your boot device>, it errors out with the message:
Installing for x86_64-efi platform.
grub2-install: error: This utility should not be used for EFI platforms because it does not support UEFI Secure Boot. If you really wish to proceed, invoke the --force option.
Make sure Secure Boot is disabled before proceeding.
Do not worry. Just force it,
grub2-install --force <your boot device>
then update the grub menu,
grub2-mkconfig -o /boot/grub2/grub.cfg
and it will work.
Linux ssh login is super slow
It was found that systemd-logind had malfunctioned.
Restarting it with
systemctl restart systemd-logind
resolved the problem.
tcsh script behavior change
Recently I noticed a tcsh behavior change.
On CentOS 7, Rocky 8, Ubuntu 20.04, Fedora 41, and Linux Mint, if you have a string variable in tcsh,
like
set a="mystring"
and attempt to get its path while treating it as a filename:
set b="$a:h"
it will return $a itself.
Note that the expected behavior when a="mypath/mystring" is that
set b="$a:h"
sets $b to mypath.
However, on Rocky 9, $b will be an empty string when $a does not contain any slash.
This behavior caused some of our tcsh scripts to malfunction.
The reason is still under investigation.
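Until the cause is known, one possible workaround is to stop relying on :h for strings that may lack a slash. The following POSIX sh sketch reproduces the older behavior for the two cases described above only; head_of is a hypothetical name, and other inputs (trailing slashes, a file directly under /) are not handled.

```shell
#!/bin/sh
# head_of: hypothetical helper mimicking the older tcsh ":h" result
# for the two cases described above.
head_of() {
    case $1 in
        */*) printf '%s\n' "${1%/*}" ;;   # mypath/mystring -> mypath
        *)   printf '%s\n' "$1" ;;        # mystring        -> mystring
    esac
}

# When saved as an executable script, print the head of the first argument.
[ "$#" -eq 0 ] || head_of "$1"
```

Saved as an executable file (say, head_of on the PATH, an assumed location), a tcsh script could then call it through backticks in place of "$a:h".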
Windows 11 cannot install Asian keyboard
When I attempted to install a Chinese/Japanese/Korean input method on my Windows 11 box, installing "Basic typing" failed after about 30 seconds of attempted download, with the error
"Sorry, we're having trouble installing this feature. You can try again later. Error code 0x0".
All other components, like Handwriting, Text-to-speech, and Speech recognition, also failed. The installed Asian keyboard did not work either, reporting that the feature was not ready.
I tried the tricks in this thread: [https://answers.microsoft.com/en-us/windows/forum/all/windows-11-unable-to-download-language-packs/b78b04da-2c75-45d8-a828-f553441b220f] but none of them worked.
The workaround I found is to download
26100.1.240331-1435.ge_release_amd64fre_CLIENT_LOF_PACKAGES_OEM.iso
from https://files.rg-adguard.net/file/025cfc5d-f5fa-7d00-246e-76c04a40e210
and extract the corresponding language pack .cab files like
Microsoft-Windows-Client-Language-Pack_x64_zh-cn.cab
Microsoft-Windows-LanguageFeatures-Basic-zh-cn-Package~31bf3856ad364e35~amd64~~.cab
Microsoft-Windows-LanguageFeatures-Handwriting-zh-cn-Package~31bf3856ad364e35~amd64~~.cab
Microsoft-Windows-LanguageFeatures-Speech-zh-cn-Package~31bf3856ad364e35~amd64~~.cab
Microsoft-Windows-LanguageFeatures-TextToSpeech-zh-cn-Package~31bf3856ad364e35~amd64~~.cab
and install them one by one in PowerShell with administrator privileges, like:
Add-WindowsPackage -Online -PackagePath ".\Microsoft-Windows-LanguageFeatures-Basic-zh-cn-Package~31bf3856ad364e35~amd64~~.cab"
After all of this, "Basic typing" is still shown as not available, but the input method works.
When dnf/yum update gets stuck on cleaning up…
Sometimes when you run dnf/yum update, the progress may stall for hours at the last step, cleaning up packages, if you have a very large data drive. This may be caused by an install script that mistakenly attempts to scan through millions of files on a data drive that is not mounted in a regular location. If this is the case, you can do the following:
Open another terminal and use "top" to find out which process is still working; something like texlua will show up at the top.
Then run "lsof | grep <process_name>" to find out which drive this process is scanning through.
When you find it, for example "/data/home", run "umount -l <volume_name>" (here "umount -l /data/home"), wait 10 seconds, then run "mount /data/home" to remount it. The process that was scanning the drive will then think there are no more files and quit.
This allows dnf/yum to finish without any error.
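The lsof step can be narrowed to a single process so the output is easier to read. This sketch assumes lsof's field output mode (-Fn), which prefixes each file name with the letter n; open_paths is a hypothetical helper name and texlua is just the example process from above.

```shell
# open_paths: hypothetical helper.
# Reads `lsof -Fn` output on stdin and prints the absolute paths the
# process has open, one per line, without duplicates.
open_paths() {
    sed -n 's/^n\(\/.*\)/\1/p' | sort -u
}

# Intended use (as root), with texlua as the example process:
#   lsof -p "$(pgrep -n texlua)" -Fn | open_paths
#   findmnt -T /data/home      # shows which mount a listed path is on
```

The path that keeps changing between runs is usually the directory being scanned, and findmnt then gives the mount point to pass to umount -l.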