Mellanox MSN2100 Switch fan tolerance

By: | Comments: No Comments

Posted in categories: Computer Tips, Work related

My Mellanox MSN2100 switch has had its “system status” LED lit red ever since I bought it.

Checking its system health shows why:

show system-health detail

System status summary

System status LED  red
Services:
  Status: OK
Hardware:
  Status: Not OK
  Reasons: Failed to get speed tolerance for fan4
           Failed to get speed tolerance for fan3
           Failed to get speed tolerance for fan2
           Failed to get speed tolerance for fan1

System services and devices monitor list

Name Status Type
--------------------- -------- ----------
sonic OK System
rsyslog OK Process
root-overlay OK Filesystem
var-log OK Filesystem
routeCheck OK Program
dualtorNeighborCheck OK Program
diskCheck OK Program
container_checker OK Program
vnetRouteCheck OK Program
memory_check OK Program
container_memory_snmp OK Program
container_memory_gnmi OK Program
container_eventd OK Program
database:redis OK Process
syncd:syncd OK Process
bgp:zebra OK Process
bgp:staticd OK Process
bgp:bgpd OK Process
bgp:fpmsyncd OK Process
bgp:bgpcfgd OK Process
teamd:teammgrd OK Process
teamd:teamsyncd OK Process
teamd:tlm_teamd OK Process
swss:orchagent OK Process
swss:portsyncd OK Process
swss:neighsyncd OK Process
swss:fdbsyncd OK Process
swss:vlanmgrd OK Process
swss:intfmgrd OK Process
swss:portmgrd OK Process
swss:buffermgrd OK Process
swss:vrfmgrd OK Process
swss:nbrmgrd OK Process
swss:vxlanmgrd OK Process
swss:coppmgrd OK Process
swss:tunnelmgrd OK Process
eventd:eventd OK Process
snmp:snmpd OK Process
snmp:snmp-subagent OK Process
lldp:lldpd OK Process
lldp:lldp-syncd OK Process
lldp:lldpmgrd OK Process
gnmi:gnmi-native OK Process
fan1 Not OK Fan
fan2 Not OK Fan
fan3 Not OK Fan
fan4 Not OK Fan
ASIC OK ASIC
PSU 1 OK PSU
PSU 2 OK PSU

System services and devices ignore list

Name Status Type
--------------- -------- ------
psu.voltage Ignored Device
psu.temperature Ignored Device

It shows that it cannot obtain fan tolerance data from the database.

Digging through the system files, I found that /usr/local/lib/python3.9/dist-packages/sonic_platform/fan.py hard-codes the fan speed tolerance to 50%. However, that value never reaches /usr/local/lib/python3.9/dist-packages/health_checker/hardware_checker.py, which looks it up with data_dict.get('speed_tolerance', None).

There is a simple fix: comment out line 105 of hardware_checker.py and replace it with a hard-coded value:

speed_tolerance = 50

And the system status LED turns green.
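The idea behind the fix can be illustrated with a minimal Python sketch. The function name resolve_speed_tolerance and the constant DEFAULT_SPEED_TOLERANCE are hypothetical and not part of SONiC; only the data_dict.get('speed_tolerance', None) lookup is taken from the file above. The sketch falls back to 50 only when the field is missing or unusable, rather than overriding it unconditionally, so a tolerance that the platform does publish would still be honored.

```python
# Hypothetical helper, not SONiC code: fall back to a fixed tolerance
# when the fan entry carries no usable 'speed_tolerance' field.
DEFAULT_SPEED_TOLERANCE = 50  # percent, same value as the hard-coded fix


def resolve_speed_tolerance(data_dict, default=DEFAULT_SPEED_TOLERANCE):
    """Return the fan speed tolerance in percent, or the default if missing."""
    value = data_dict.get('speed_tolerance', None)
    if value is None:
        return default
    try:
        return float(value)
    except (TypeError, ValueError):
        # Field present but not numeric: treat it like a missing field.
        return default


if __name__ == "__main__":
    print(resolve_speed_tolerance({'speed': '60'}))            # field missing
    print(resolve_speed_tolerance({'speed_tolerance': '30'}))  # field present
```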

Using SONiC to run MSN2100 switch

Posted in categories: Computer Tips, Work related

The Mellanox/NVidia MSN2100 switch came with a bare ONIE system and is not plug and play. NVidia support cannot provide setup assistance without an ONYX/Cumulus/SONiC entitlement. I installed the community build of SONiC, sonic-mellanox.bin, from sonic.software.

After the installation, the switch is automatically configured with all 16 ports as routed ports, each with a preset IP address in the range 10.0.0.0 to 10.0.0.30. This makes the switch useless as a plain switch, since no two ports can talk to each other.

The following steps resolved the issue and turned it into a normal dumb switch:

Once logged in to the management interface:

Step 1: sudo bash # open a root shell;

Step 2: config vlan add 100 # create VLAN 100;

Step 3: config interface ip remove Ethernet0 10.0.0.0 # repeat for all 16 interfaces, each with its own preset IP, to remove their routed-port status;

Step 4: config vlan member add 100 -u Ethernet0 # repeat for all 16 interfaces to put every port in the VLAN as an untagged member;

Step 5: config clock timezone America/New_York # set the time zone of the switch;

Step 6: config clock date YYYY-MM-DD HH:MM:SS # set the date and time;

Step 7: cp /etc/sonic/config_db.json ~/ # back up the initial configuration;

Step 8: config save -y # save the current configuration;

Step 9: reboot

Then you will have a working switch.

Of course, do not forget to change the management password and set the IP address of the management port (eth0).
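Typing steps 3 and 4 sixteen times is tedious, so here is a small Python sketch that only prints the commands for review. It rests on two assumptions that are not stated above and should be checked on the switch first (for example with show ip interfaces): that the ports are named Ethernet0, Ethernet4, ... in steps of 4, and that the preset addresses are 10.0.0.0, 10.0.0.2, ... in steps of 2. The function name and its parameters are made up for this example. Depending on the SONiC build, the address may also need its prefix length appended.

```python
def dumb_switch_commands(ports=16, vlan=100, name_step=4, ip_step=2):
    """Build the SONiC CLI commands from steps 2-4 for every port.

    Assumed (verify on your switch): port i is named Ethernet{i*name_step}
    and carries the preset address 10.0.0.{i*ip_step}.
    """
    commands = [f"config vlan add {vlan}"]
    for i in range(ports):
        name = f"Ethernet{i * name_step}"
        address = f"10.0.0.{i * ip_step}"
        commands.append(f"config interface ip remove {name} {address}")
        commands.append(f"config vlan member add {vlan} -u {name}")
    return commands


if __name__ == "__main__":
    # Print only; review the list before pasting it into the root shell.
    for command in dumb_switch_commands():
        print(command)
```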

Enjoy.

6020095 · 2^6972593 – 1 is prime

Posted in categories: Uncategorized

It has 2098967 decimal digits. See the record page here.

3^1681130 + 3^445781 + 1 is prime.

Posted in categories: Fun Stuffs, Prime Search

Proof file: here. Record page: here.

3^1681130+3^445781+1 has 802,103 digits
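The digit count can be cross-checked from logarithms without building the number. The sketch below (the function name is made up for this example) counts the digits of the leading term 3^1681130. The lower terms 3^445781+1 are far too small to change the count, because the fractional part of the logarithm is nowhere near 1. The result agrees with the 802103 digits reported by the CHG run further down.

```python
import math


def decimal_digits_of_power(base, exponent):
    """Number of decimal digits of base**exponent, computed via log10."""
    return math.floor(exponent * math.log10(base)) + 1


if __name__ == "__main__":
    log_value = 1681130 * math.log10(3)
    print("digits:", decimal_digits_of_power(3, 1681130))
    # A fractional part well below 1 means adding the small terms
    # cannot carry into an extra digit.
    print("fractional part of log10:", log_value - math.floor(log_value))
```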

Factoring p-1:

3^1681130+3^445781 = 3^445781*(3^1235349+1)

For 3^1235349+1:

Divisors of 1235349: {1, 3, 9, 317, 433, 951, 1299, 2853, 3897, 137261, 411783, 1235349}

For each divisor m in this list, Cyclotomic[2m,3] divides 3^1235349+1. Since 1235349 is odd, 3^1235349+1 is exactly the product of Phi[2m,3] over all m in the list.

Phi denotes the Cyclotomic function in Mathematica.
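For readers without Mathematica, the decomposition can be sanity-checked in plain Python. This is only an illustrative sketch, not the tooling used for the proof. It evaluates Phi[n,x] through the Möbius product formula, confirms the divisor list above, and checks the identity 3^n+1 = product of Phi[2m,3] over m dividing n on a small odd n (45) rather than on the full-size number. All function names are made up for this example.

```python
def divisors(n):
    """All positive divisors of n in increasing order."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]


def mobius(n):
    """Moebius function mu(n), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def cyclotomic_value(n, x):
    """Phi_n(x) for integer x > 1, as prod over d|n of (x^d - 1)^mu(n/d)."""
    numerator, denominator = 1, 1
    for d in divisors(n):
        mu = mobius(n // d)
        if mu == 1:
            numerator *= x**d - 1
        elif mu == -1:
            denominator *= x**d - 1
    return numerator // denominator


def product_of_phi_2m(n, x):
    """Product of Phi_{2m}(x) over all divisors m of n."""
    product = 1
    for m in divisors(n):
        product *= cyclotomic_value(2 * m, x)
    return product


if __name__ == "__main__":
    print(divisors(1235349))
    print([cyclotomic_value(k, 3) for k in (2, 6, 18)])
    print(product_of_phi_2m(45, 3) == 3**45 + 1)
```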

Phi[2,3] = 4 = 2^2

Phi[6,3] = 7

Phi[18,3] = 703 = 19*37

Phi[634,3] = 4419546979734297356356282440566337101076156046659615868982274149497888767412192670918210853094818317044028461371947192040491503227984348283966155849541 = 76824655095930309016347566008213507 *
57527716515168805301518081894378217354338616431161859310399702927298973147371258119477233264821539853770687247299863

Phi[866,3] = 98049030167138868235272337490364772443278341679136020523778017973332584069900487425071613559614188734321215787769494533118401761096043098454024530209394531738061292786842020371958081509924820005117783644881 = 5197 * 25981 * 10986786121 * 2545641110123181683038777028652229 * 4078872265954752990198205964113561121059 * 48719802170502964448205757292294067215837963275487416471 * 130653562236213178463808841348368553389712956028240802263033

Phi[1902,3] = 454579 * 36324397 * 4242031875148351 * 110636908354198084399 * 549512115983126575855924649876601277813687417 * cp_1681130_+3_445781_c1_+1_P1902_c209[ecm50]

Phi[2598,3] = 49363 * 69183924850105534244247847 * 590120330956087803199 * 24518336107758964004567688595327267 * 72294725439409852600619119163684989 *
cp_1681130_+3_445781_c1_+1_P2598_c292[ecm40]

Phi[5706,3] = 958609 * 1061317 * 105429763 * 65641802577791850271 * 466581320825998331248111 * 161770166547973894056628560367 * 4121313630402929050856881 * 35041165225208415265755372540106879 * 145166850946125723698625301282936985977 *
cp_1681130_+3_445781_c1_+1_P5706_c715[ecm40]

Phi[7794,3] = 2369645352847 * 18271583019418565050835779232047 *
cp_1681130_+3_445781_c1_+1_P7794_c1194[ecm40]

Phi[274522,3] = cp_1681130_+3_445781_c1_+1_P274522_c65133[ecm30 – done 20220828]
Phi[823566,3] = 48031581409153 * cp_1681130_+3_445781_c1_+1_P823566_c130252[ecm20]

Phi[2470698,3] = 59296753 * 2841667923519757 * cp_1681130_+3_445781_c1_+1_P2470698_c390774[ecmpm]

Factors of p-1:
2^2
3^445781
7
19
37
5197
25981
49363
454579
958609
1061317
36324397
59296753
105429763
10986786121
2369645352847
48031581409153
2841667923519757
4242031875148351
65641802577791850271
110636908354198084399
590120330956087803199
466581320825998331248111
4121313630402929050856881
69183924850105534244247847
161770166547973894056628560367
18271583019418565050835779232047
2545641110123181683038777028652229
24518336107758964004567688595327267
35041165225208415265755372540106879
72294725439409852600619119163684989
76824655095930309016347566008213507
145166850946125723698625301282936985977
4078872265954752990198205964113561121059
549512115983126575855924649876601277813687417
48719802170502964448205757292294067215837963275487416471
130653562236213178463808841348368553389712956028240802263033
57527716515168805301518081894378217354338616431161859310399702927298973147371258119477233264821539853770687247299863

Factors of p-1: [2^2*3^445781*7*19*37*5197*25981*49363*454579*958609*1061317*36324397*59296753*105429763*10986786121*2369645352847*48031581409153*2841667923519757*4242031875148351*65641802577791850271*110636908354198084399*590120330956087803199*466581320825998331248111*4121313630402929050856881*69183924850105534244247847*161770166547973894056628560367*18271583019418565050835779232047*2545641110123181683038777028652229*24518336107758964004567688595327267*35041165225208415265755372540106879*72294725439409852600619119163684989*76824655095930309016347566008213507*145166850946125723698625301282936985977*4078872265954752990198205964113561121059*549512115983126575855924649876601277813687417*48719802170502964448205757292294067215837963275487416471*130653562236213178463808841348368553389712956028240802263033*57527716515168805301518081894378217354338616431161859310399702927298973147371258119477233264821539853770687247299863]

$ ./pfgw -tc -k -h"cp_1681130_+3_445781_c1_+1.helper" cp_1681130_+3_445781_c1_+1
Primality testing 3^1681130+3^445781+1 [N-1/N+1, Brillhart-Lehmer-Selfridge]
Reading factors from helper file cp_1681130_+3_445781_c1_+1.helper
Running N-1 test using base 2
Running N+1 test using discriminant 5, base 5+sqrt(5)
3^1681130+3^445781+1 is Fermat and Lucas PRP! (72918.0430s+0.0527s)

$ gp
Reading GPRC: /etc/gprc …Done.

GP/PARI CALCULATOR Version 2.11.3 (released)
amd64 running linux (x86-64/GMP-6.1.2 kernel) 64-bit version
compiled: Apr  6 2020, gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC)
threading engine: single
(readline v7.0 enabled, extended help enabled)

Copyright (C) 2000-2018 The PARI Group

PARI/GP is free software, covered by the GNU General Public License, and comes WITHOUT ANY WARRANTY WHATSOEVER.

Type ? for help, \q to quit.
Type ?17 for how to get moral (and possibly technical) support.

parisize = 8000000, primelimit = 500000
? \r CHG.GP
*** Warning: new stack size = 17179869184 (16384.000 Mbytes).
realprecision = 350003 significant digits (350000 digits displayed)

Welcome to the CHG primality prover!

Input file is: cp_1681130_+3_445781_c1_+1.in
Certificate file is: cp_1681130_+3_445781_c1_+1.out
Found values of n, F and G.
Number to be tested has 802103 digits.
Modulus has 213408 digits.
Modulus is 26.605944113118165830% of n.

NOTICE: This program assumes that n has passed
a BLS PRP-test with n, F, and G as given. If
not, then any results will be invalid!

Square test passed for F >> G. Using modified right endpoint.

Search for factors congruent to 1.
Running CHG with h = 16, u = 7. Right endpoint has 161883 digits.
Done! Time elapsed: 810732137ms.
Running CHG with h = 16, u = 7. Right endpoint has 155873 digits.
Done! Time elapsed: 825907227ms.
Running CHG with h = 15, u = 6. Right endpoint has 149984 digits.
Done! Time elapsed: 367063741ms.
Running CHG with h = 15, u = 6. Right endpoint has 141864 digits.
Done! Time elapsed: 403218994ms.
Running CHG with h = 13, u = 5. Right endpoint has 135499 digits.
Done! Time elapsed: 167649019ms.
Running CHG with h = 13, u = 5. Right endpoint has 126589 digits.
Done! Time elapsed: 166344631ms.
Running CHG with h = 11, u = 4. Right endpoint has 115818 digits.
Done! Time elapsed: 62966916ms.
Running CHG with h = 9, u = 3. Right endpoint has 102257 digits.
Done! Time elapsed: 21474045ms.
Running CHG with h = 9, u = 3. Right endpoint has 86036 digits.
Done! Time elapsed: 18674057ms.
Running CHG with h = 7, u = 2. Right endpoint has 57119 digits.
Done! Time elapsed: 4457878ms.
A certificate has been saved to the file: cp_1681130_+3_445781_c1_+1.out

Running David Broadhurst’s verifier on the saved certificate…

Testing a PRP called “cp_1681130_+3_445781_c1_+1.in”.

Pol[1, 1] with [h, u]=[7, 2] has ratio=3.292364120998444652 E-98758 at X, ratio=3.718585580771175970 E-194048 at Y, witness=2.
Pol[2, 1] with [h, u]=[9, 3] has ratio=6.126565950304076461 E-9475 at X, ratio=1.1209288730275909543 E-86752 at Y, witness=2.
Pol[3, 1] with [h, u]=[9, 3] has ratio=4.686378714776338379 E-48664 at X, ratio=4.686378714776338379 E-48664 at Y, witness=2.
Pol[4, 1] with [h, u]=[11, 4] has ratio=1.1328882856917741933 E-83833 at X, ratio=4.446338007106078208 E-54243 at Y, witness=2.
Pol[5, 1] with [h, u]=[10, 5] has ratio=1.3491587713343525002 E-21542 at X, ratio=2.1142563956459774573 E-53855 at Y, witness=2.
Pol[6, 1] with [h, u]=[10, 5] has ratio=2.1142563956459774573 E-53855 at X, ratio=7.365995469140945291 E-44554 at Y, witness=2.
Pol[7, 1] with [h, u]=[15, 6] has ratio=2.101399014710577216 E-89107 at X, ratio=2.654191320795251087 E-38189 at Y, witness=2.
Pol[8, 1] with [h, u]=[13, 6] has ratio=3.530699982729085833 E-8120 at X, ratio=1.9371573230603122010 E-48717 at Y, witness=2.
Pol[9, 1] with [h, u]=[16, 7] has ratio=5.410890206996781797 E-22058 at X, ratio=1.0071182035514125678 E-41222 at Y, witness=2.
Pol[10, 1] with [h, u]=[16, 7] has ratio=5.814090534021862076 E-14617 at X, ratio=8.275867222128713413 E-42071 at Y, witness=2.

Validated in 116 sec.

Congratulations! n is prime!
Goodbye!

NFS_v4 stall: how to rescue it without a reboot

Posted in categories: Computer Tips, Work related

Sometimes an NFS mount stops responding. You cannot unmount it normally, and if you use umount -l to unmount it lazily, you cannot mount it again because the new mount attempt stalls too. The usual suggestion on the internet is simply to reboot the client, but you may have important jobs or services running on it and really do not want to.

Here is a rescue:

After unmounting the volume with umount -l, edit /etc/fstab and add vers=3 in front of “defaults,_netdev” (or whatever is in the mount options column for the stalled volume), then save the file. Try mounting the volume again, and this time it mounts.

This is not the end of the story. NFS_v4 has some important features that you do not want to give up by falling back to NFS_v3 permanently, and someday the vers=3 mount may stall as well.

Do not worry. As long as the NFS server is still working, the calls that caused the stall will slowly resolve or time out. Wait a day or two, then go back to the NFS client node and unmount the volume that was mounted with vers=3; if it is in use, use umount -l. Edit /etc/fstab again, remove the vers=3 option, and save it. Then mount the volume again. As long as all the pending calls that caused the original stall have been resolved, the volume will mount with the default version, NFS_v4, again.

This process can be repeated if a similar issue happens again, so you are no longer forced to reboot the NFS client box.
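The fstab edit itself is mechanical, so here is a minimal Python sketch of it. The function name and the example server, export and mount point are hypothetical. It only rewrites one line of text and does not touch /etc/fstab, so the result can be reviewed before being pasted in.

```python
def set_nfs_vers3(fstab_line, enable):
    """Add (enable=True) or remove (enable=False) vers=3 on an NFS fstab line.

    Comment lines, short lines, and non-NFS entries are returned unchanged.
    """
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return fstab_line
    if not fields[2].startswith("nfs"):
        return fstab_line
    # Drop any existing vers=3 first so the option never appears twice.
    options = [opt for opt in fields[3].split(",") if opt != "vers=3"]
    if enable:
        options.insert(0, "vers=3")
    fields[3] = ",".join(options)
    return " ".join(fields)


if __name__ == "__main__":
    # Hypothetical entry; substitute the line for the stalled volume.
    original = "server:/export /data nfs defaults,_netdev 0 0"
    forced_v3 = set_nfs_vers3(original, enable=True)
    print(forced_v3)
    print(set_nfs_vers3(forced_v3, enable=False))
```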

dnf install/update fails with error rpmdbNextIterator skipping Header V3 RSA/SHA1 Signature BAD

Posted in categories: Computer Tips, Work related

Sometimes, when a dnf update is terminated unexpectedly, for example by a power failure, the system is left with an inconsistent rpm database. In most cases this can be fixed by rebuilding the rpm database (rpmdb --rebuilddb) and then running (package-cleanup --cleandupes).

In some severe cases, even after both steps, running (dnf update --allowerasing) still shows errors like this:

Running transaction check
error: rpmdbNextIterator: skipping h# ***
Header V3 RSA/SHA1 Signature, key ID ********: BAD
Header SHA256 digest: OK
Header SHA1 digest: OK

And the install/update cannot go through.

If you run (rpm -qa | grep <affected-package>), it shows that the package is not installed. Reinstalling the package with (rpm -Uvh <downloaded package name>) does not help.

This means that the package was erased during the interrupted dnf install/update, but its record was not removed from the rpm database. You will need to run

rpm -e --justdb --noscripts --nosignature <package name>

to remove it.

However, sometimes the name of the blocking package cannot be found in the rpm database.

Here is a clue: when you run dnf install, it prints errors like

Error: transaction check vs depsolve:
libxxx.so.x()(64bit) is needed by <package name>

This shows which library file is missing.

Search the internet for libxxx.so.x to find out which package for your Linux distro provides the missing library file. Then run

rpm -e --justdb --noscripts --nosignature <package name>

to remove the stale record of that package.

After this, dnf should work properly again.
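When the error output is long, the missing libraries can be pulled out of it with a few lines of Python. This is just a sketch: the function name is made up, the sample text uses hypothetical library and package names, and it assumes the "is needed by" lines look like the ones shown above. The extracted capability string can also be passed to dnf provides as an alternative to a web search, if dnf is working well enough to answer.

```python
import re

# Matches lines of the form "<capability> is needed by <package>".
NEEDED_RE = re.compile(r"^\s*(\S+)\s+is needed by\s+(\S+)\s*$")


def missing_capabilities(dnf_output):
    """Return (capability, requiring_package) pairs found in dnf error text."""
    pairs = []
    for line in dnf_output.splitlines():
        match = NEEDED_RE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


if __name__ == "__main__":
    # Hypothetical sample; paste the real dnf error text instead.
    sample = (
        "Error: transaction check vs depsolve:\n"
        "libfoo.so.1()(64bit) is needed by bar-1.0-1.el8.x86_64\n"
    )
    for capability, package in missing_capabilities(sample):
        print(capability, "<-", package)
```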

ssh command does not exit when X11 forwarding is enabled

Posted in categories: Computer Tips, Work related

If ForwardX11 and ForwardX11Trusted are both enabled in /etc/ssh/ssh_config, “ssh host” invokes X11 forwarding by default. This is convenient when you frequently need to run GUI programs on a remote host. However, if you simply run a text-mode command or script via ssh, such as “ssh node ls” to list files on a remote host, the ssh command may not exit properly while X11 forwarding is active.

To avoid this, the simplest way is to disable X11 forwarding for that command with the -x option: use “ssh -x node command” instead of “ssh node command”.

ssh pass password to sudo

Posted in categories: Computer Tips, Work related

echo <your password> | ssh <node name> "sudo -S <command>"
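The same pipe can be driven from a script. The sketch below is one possible way to do it with Python's subprocess module, feeding the password to sudo -S on standard input so that it does not appear on the command line. The function names and the host name in the usage note are hypothetical.

```python
import shlex
import subprocess


def sudo_ssh_argv(host, command_argv):
    """Build the local ssh argument list that runs 'sudo -S <command>' on host."""
    remote_command = "sudo -S " + shlex.join(command_argv)
    return ["ssh", host, remote_command]


def run_sudo_over_ssh(host, command_argv, password):
    """Run the command remotely, writing the password to sudo's stdin."""
    return subprocess.run(
        sudo_ssh_argv(host, command_argv),
        input=password + "\n",
        text=True,
        capture_output=True,
        check=False,
    )
```

A call such as run_sudo_over_ssh("node01", ["ls", "/root"], getpass.getpass()) would prompt for the password locally and return a CompletedProcess with the remote output in its stdout attribute.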

Recovering LVM from disk image

Posted in categories: Computer Tips, Work related

For a RAID NAS box, when the NAS box itself dies, the data on the drives is still intact. It can be retrieved with the following steps:

  1. dd disks containing LVM to hard drive; # create images of the hard drives
  2. losetup /dev/loopx disks.img; # make the images appear as loop devices to the OS
  3. partx -v --add /dev/loopx; # make the partitions on the loop devices available to the OS
  4. vgscan, vgimport -a; # import the Volume Groups
  5. vgdisplay; # show the imported VG names
  6. vgchange -a y vgname; # activate the VG
  7. fsck /dev/mapper/vgname; # check the file system on the VG
  8. fuseext2 -o ro -o sync_read /dev/mapper/vgname /mounting_point; # mount the VG read-only
  9. rsync -av --progress /mounting_point/ /destination/ # copy the data out

Then you will have everything at your destination, and you can safely remove the images.
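The steps above can be collected into a reviewable plan. The Python sketch below only builds and prints the commands and runs nothing. The function name and every device, image and directory name in the example call are placeholders to replace with your own.

```python
import shlex


def lvm_recovery_plan(disk, image, loop_dev, vg_name, mount_point, destination):
    """Return the recovery commands from the steps above, in order."""
    vg_path = f"/dev/mapper/{vg_name}"
    return [
        ["dd", f"if={disk}", f"of={image}"],            # 1. image the disk
        ["losetup", loop_dev, image],                   # 2. attach the image
        ["partx", "-v", "--add", loop_dev],             # 3. expose partitions
        ["vgscan"],                                     # 4. find the VGs
        ["vgimport", "-a"],                             #    and import them
        ["vgdisplay"],                                  # 5. list VG names
        ["vgchange", "-a", "y", vg_name],               # 6. activate the VG
        ["fsck", vg_path],                              # 7. check file system
        ["fuseext2", "-o", "ro", "-o", "sync_read",     # 8. mount read-only
         vg_path, mount_point],
        ["rsync", "-av", "--progress",                  # 9. copy the data out
         f"{mount_point}/", f"{destination}/"],
    ]


if __name__ == "__main__":
    # Placeholder names; replace with your own devices and paths.
    plan = lvm_recovery_plan("/dev/sdX", "disks.img", "/dev/loop0",
                             "vgname", "/mounting_point", "/destination")
    for argv in plan:
        print(shlex.join(argv))
```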

bash script does not run command

Posted in categories: Computer Tips, Work related

Sometimes commands in a bash script that run fine while you are debugging it interactively fail to run when the script is launched by cron or as a startup script. This may be because that restricted environment does not provide a proper search path (PATH). To avoid this, always use the full path to the executables.
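One way to collect the full paths to hard-code is to resolve them from an interactive session, where PATH is set up properly. The Python sketch below does that with shutil.which from the standard library. The function name and the list of commands are only examples.

```python
import shutil


def full_paths(commands):
    """Map each command name to the path found on PATH, or None if absent."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    # Example command names; list the ones your script actually calls.
    for name, path in full_paths(["sh", "rsync", "mount"]).items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

Run it in the shell where the script works, then put the printed paths into the script in place of the bare command names.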