Steve's Blog

Multiple Watchdog handler

Recently, I’ve been having a problem with kernel panics on kernels newer than 6.3.7, which cause a hard hang of the system.

So, the first thing to do was set up a watchdog to reset the system after 60 seconds with nothing feeding it. At that point, the system would reset and wouldn’t need me to manually reboot it each time.

The problem is, the default watchdog daemon can only handle a single watchdog - and I want to activate two.

Sounds like time for another simple perl script!

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(nice);
$|++;

my @watchdogs = glob ("/dev/watchdog?");

## Find the lowest timeout...
print "Finding lowest watchdog timeout...\n";
my $sleep_timeout = 60;
my @wd_timeouts = glob("/sys/class/watchdog/*/timeout");
for my $wd_timeout ( @wd_timeouts ) {
	open my $fh, '<', $wd_timeout or die "Cannot open $wd_timeout: $!\n";
	my $timeout = do { local $/; <$fh> };
	close $fh;
	print "Timeout $wd_timeout = $timeout";
	if ( $timeout < $sleep_timeout ) {
		$sleep_timeout = $timeout;
	}
}

## Half the timeout to ensure reliability
$sleep_timeout = $sleep_timeout / 2;
print "Using final timeout of $sleep_timeout\n";

nice(-19);
$SIG{INT}  = \&signal_handler;
$SIG{TERM} = \&signal_handler;

## Open the file handles...
my @fhs;
for my $watchdog ( @watchdogs ) {
	print "Opening: $watchdog\n";
	open(my $fh, ">", $watchdog) or die "Cannot open $watchdog: $!\n";
	$fh->autoflush(1);
	my $device = {
		device	=> $watchdog,
		fh	=> $fh,
	};
	push @fhs, $device;
}

## Start feeding the watchdogs.
while (1) {
    for my $watchdog ( @fhs ) {
        #print "Feeding: " . $watchdog->{"device"} . "\n";
        my $fh = $watchdog->{"fh"};
        print $fh ".\n";
    }
    #print "Sleeping $sleep_timeout seconds...\n";
    sleep $sleep_timeout;
}

sub signal_handler {
    for my $watchdog ( @fhs ) {
        print "Sending STOP to " . $watchdog->{"device"} . "\n";
        my $fh = $watchdog->{"fh"};
        print $fh "V";
    }
    exit 0;
}

This script will scan for the lowest timeout across all watchdogs installed in the system (capped at 60 seconds), and then feed them all at an interval of half that timeout.
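As a rough illustration of that calculation, here’s a minimal shell sketch. The feed_interval function name and the sample timeouts are made up for this example, and it assumes awk is available to do the division:

```shell
#!/bin/bash
# Print the feed interval for a set of watchdog timeouts (in seconds):
# half of the lowest timeout, starting from the same 60 second ceiling
# that the Perl script uses.
feed_interval() {
    local lowest=60 t
    for t in "$@"; do
        if [ "$t" -lt "$lowest" ]; then
            lowest=$t
        fi
    done
    # awk keeps the fraction for odd timeouts (15 -> 7.5),
    # which integer shell arithmetic would truncate.
    awk -v t="$lowest" 'BEGIN { print t / 2 }'
}

# Hypothetical timeouts of 60, 30 and 10 seconds.
feed_interval 60 30 10
```

On a real system, the arguments could come straight from sysfs, for example feed_interval $(cat /sys/class/watchdog/*/timeout).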

It can be started with a simple systemd unit:

[Unit]
Description=Run watchdog feeder

[Service]
Type=simple
ExecStart=/usr/local/bin/watchdog.pl
Restart=always
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=99

[Install]
WantedBy=multi-user.target

When the program stops, it writes the magic close character (V) to each watchdog so a stopped service won’t trigger a system reset.

Nice and simple.

Setting up MariaDB Replicas - the easy way

The MariaDB page on replication is good, but I feel it lacks a few details to make things easier. Specifically, it skims over moving the data between the master and slave so that you can get the replica running with as little effort as possible.

If we assume that the Master has been configured with the steps in the MariaDB Guide, we can then look at how to get data to the slave for the initial replication to happen.

In my configuration, I use a master server already configured with SSL - you should really do the same for your master BEFORE you set up any replication. I use a LetsEncrypt certificate and this reference.

Using the script below, we can skip the Getting the Master’s Binary Log Co-ordinates step, export the GTID in a dump, and import that into the new slave in one command. When running, mariadb-dump will automatically lock the database tables, and unlock them again after the transfer has completed.

#!/bin/bash
declare -x MYSQL_PWD="MySqlRootPassword"
declare -x MARIA_MASTER="my.master.server.example.com"
declare -x MARIA_REPL_USER="replication_user"
declare -x MARIA_REPL_PASS="replication_password"

echo "Stopping the local slave (if running)..."
mysql -e "stop slave;"

echo "Transferring initial dataset... Please wait..."
mariadb-dump -A --gtid \
	--add-drop-database \
	--compress=true \
	--master-data=1 \
	--include-master-host-port \
	--ssl=true \
	-h $MARIA_MASTER \
	-u root | mysql

echo "Configuring slave..."
mysql -e "
CHANGE MASTER TO
    MASTER_HOST=\"$MARIA_MASTER\",
    MASTER_USER=\"$MARIA_REPL_USER\",
    MASTER_PASSWORD=\"$MARIA_REPL_PASS\",
    MASTER_PORT=3306,
    MASTER_SSL=1,
    MASTER_SSL_CA=\"/etc/pki/tls/certs/ca-bundle.trust.crt\",
    MASTER_SSL_VERIFY_SERVER_CERT=1;
start slave;
"

After this script completes, you can then check the status of the slave - and confirm the values of Slave_IO_Running and Slave_SQL_Running with:

SHOW SLAVE STATUS \G

Keep this script handy, as if replication breaks for whatever reason, you can run it again to resync to the master server, and the existing databases on the slave will get dropped as the import happens. Keep in mind though that it won’t drop databases that don’t exist on the master anymore.
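If you want to script the "is replication broken?" check, something like the following sketch could work. The replica_healthy function name and the sample text are made up for illustration, and it assumes the status output uses the usual "Name: value" layout of the \G format:

```shell
#!/bin/bash
# Read the output of SHOW SLAVE STATUS \G on stdin and succeed only
# if both replication threads report "Yes".
replica_healthy() {
    awk -F': ' '
        $1 ~ /^ *Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Hypothetical status excerpt where the SQL thread has stopped.
sample='             Slave_IO_Running: Yes
            Slave_SQL_Running: No'

if printf '%s\n' "$sample" | replica_healthy; then
    echo "replication OK"
else
    echo "replication broken, consider a resync"
fi
```

Against a live replica, the input would come from mysql -e "SHOW SLAVE STATUS \G" | replica_healthy instead of the sample text.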

NOTE: If you have a large or busy database, you might be better served using the mariabackup tool. This tool will make a local export of all the data to allow you to transfer it out-of-band and therefore reduce the amount of time the master database is locked. MariaDB have a guide to using this tool here. While it’s more steps, your locking time will be greatly reduced.

I also use the following on the replica in /etc/my.cnf.d/replication.cnf to configure the slave:

[mariadb]
slave_compressed_protocol = 1
log-basename = <slave hostname>

Change <slave hostname> to the hostname of the configured slave. This will use compression for the slave, which is helpful for replication over WAN connections, and setting log-basename will ensure that replication won’t break if the slave host changes its name at some point in the future.

Automating Secondary DNS servers

When running several name-servers, it can be difficult to configure which domains end up on them. There are multiple ways - copying config files, getting a config snippet from a web site regularly, or having a deployment script - all of which will break at some point in time and leave you with a semi-functional name server.

Wouldn’t it be great if we could use DNS to configure DNS?

This is a great use for TXT records - or misuse - depending on how pure you want to be ;)

How good would it be to be able to have a TXT record that contains what zone files a secondary DNS should have in its config file by referencing a DNS entry?

So here’s a script to do just that.

#!/usr/bin/perl
# vim: set ts=4:
use strict;
use warnings;
use Net::DNS;

my $outputfile = "/etc/named/secondary_domains.conf";
my $output = "";

my $header = '
masters dns-masters {
	1.2.3.4;
};
';

my $entry_template = '
zone "ZONE" IN {
	type		slave;
	file		"/var/named/slaves/FILE";
	masters		{ dns-masters; };
};
';

my $resolver = Net::DNS::Resolver->new;
$resolver->nameservers("8.8.8.8");
my $reply = $resolver->query("secondary_domains.example.com", "TXT");

if ($reply) {
	$output = $header;
	foreach my $rr ($reply->answer) {
		next unless $rr->type eq "TXT";	## Skip CNAMEs etc. that have no txtdata
		foreach my $txt ( $rr->txtdata ) {
			my $entry = $entry_template;
			$entry =~ s/ZONE/$txt/g;

			## Have a sane filename...
			$txt =~ s@/@_@g;
			$entry =~ s/FILE/$txt/g;
			$output = $output . $entry;
		}
	}

	## Write file to disk.
	open(FH, '>', $outputfile) or die $!;
	print FH $output;
	close(FH);

	## Find which systemd unit we use...
	my $service = "named-chroot.service";
	if ( -f "/etc/systemd/system/multi-user.target.wants/named.service" ) {
		$service = "named.service";
	}
	system("systemctl reload $service");
} else {
	warn "query failed: ", $resolver->errorstring, "\n";
}

Then in your /etc/named.conf config file, include the generated /etc/named/secondary_domains.conf with the following at the bottom of the file.

include "/etc/named/secondary_domains.conf";

Get cron or a systemd timer to run the perl script once an hour or so, and you’ll be quickly adding / removing entire zones from your secondary DNS servers with ease.

As a side benefit, because this queries the 8.8.8.8 name-server for the TXT record, your secondary will be able to generate a new configuration file as long as one DNS server can respond with the correct entry.

On your primary (normally a hidden master), you will add a TXT record to the zone file as follows:

secondary_domains.example.com.   1800 IN  TXT "domain1.com" "domain2.com" "domain3.com"

You can adjust your TTL, paths and other items to reflect your implementations. This is also simple enough that it will allow you to run a secondary DNS server on the free-tier cloud platforms like GCP.

What are the limitations of this approach? Well, once you get over 64KB worth of domain names, you’ll have to either split the TXT records and implement something like an "include:record_b.example.com" and loop over that as well for another 64KB worth of text, or loop over a counter like secondary-1.example.com, secondary-2.example.com until you get an NXDOMAIN reply.
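The counter variant might look something like this in shell. It’s only a sketch: collect_zones and fake_lookup are names invented for this example, and the lookup is a stand-in with made-up data rather than a real DNS query, so the loop logic can be seen on its own:

```shell
#!/bin/bash
# Collect zone names from secondary-1, secondary-2, ... until a lookup fails.
# $1 is the name of a lookup function that prints one zone per line and
# returns non-zero when the record does not exist (e.g. NXDOMAIN).
collect_zones() {
    local lookup=$1 domain=$2 n=1 zones
    while zones=$("$lookup" "secondary-$n.$domain"); do
        [ -n "$zones" ] && printf '%s\n' "$zones"
        n=$((n + 1))
    done
}

# Stand-in lookup with made-up data, used here instead of a real DNS query.
fake_lookup() {
    case "$1" in
        secondary-1.example.com) printf 'domain1.com\ndomain2.com\n' ;;
        secondary-2.example.com) printf 'domain3.com\n' ;;
        *) return 1 ;;
    esac
}

collect_zones fake_lookup example.com
```

A real lookup function would have to turn "no such name" into a non-zero return itself, otherwise the loop never ends, so a sanity limit on the counter is probably a good idea too.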

You could also change the domain name to be something that only exists on the hidden master server - and not on the wider internet and query the master directly to get the list of domains. This has the advantage that the list of domain names included can’t become public.

There’s probably more variations that can be done to further fine-tune this approach, but this is a good, functional start.

Netgear GS728TP CLI and scripting

It’s been a long time between posts. I can assure you I’m not dead. At least not that I know of. I’ve been doing some neat stuff, but have been too lazy to post any of it - because as any dev will tell you - doing documentation is boring.

So, after upgrading my access points to use a Cisco Virtual Wireless Controller, I was using the web interface on my Netgear GS728TP - but it really sucks. It’s slow. It’s really slow - but it has a fully functioning CLI under the hood. However, the SSH server is old as well, so it needs a whole heap of legacy enablement in the SSH client to function. It is an EoL product, being first released in November 2017 from what I can tell. However, they’re cheap to find second hand, and have 8x PoE+ ports - which means they’ll run your high power PoE kit.

I did some work with expect - and came up with this script to do some pretty common functions.

Save as ~/bin/switch.

#!/usr/bin/expect
set switch_password "<switch password>"
set switch_login "admin@<switch ip>"
set TFTPSERVER "<tftp server ip>"
set switch_ssh_opts "-oPubKeyAcceptedAlgorithms=ssh-rsa -oRequiredRSASize=1024 -oKexAlgorithms=diffie-hellman-group1-sha1 -oHostKeyAlgorithms=ssh-rsa,ssh-dss -oCiphers=aes128-cbc,aes128-ctr,aes256-ctr"
set commands(0) "terminal datadump"
set commands(1) "terminal width 0"
set timeout 120

set func [lindex $argv 0]
switch $func {
	if {
		set arg [lindex $argv 1]
		switch $arg {
			description {
				append commands([array size commands]) "show interfaces description"
			}
			detail {
				set port [lindex $argv 2]
				append commands([array size commands]) "show interfaces switchport ge $port"
			}
			reset {
				set port [lindex $argv 2]
				append commands([array size commands]) "config"
				append commands([array size commands]) "interface GigabitEthernet $port"
				append commands([array size commands]) "shutdown"
				append commands([array size commands]) "no shutdown"
				append commands([array size commands]) "exit\rexit\r"
			}
			summary {
				append commands([array size commands]) "show interfaces status detailed"
			}
			default {
				puts "Subcommands:\r"
				puts "\tdescription\r"
				puts "\tdetail <port>\r"
				puts "\treset <port>\r"
				puts "\tsummary\r"
				exit 0
			}
		}
	}
	poe {
		set arg [lindex $argv 1]
		switch $arg {
			consumption {
				append commands([array size commands]) "show power inline consumption"
			}	
			off {
				set port [lindex $argv 2]
				append commands([array size commands]) "config"
				append commands([array size commands]) "interface GigabitEthernet $port"
				append commands([array size commands]) "power inline never"
				append commands([array size commands]) "exit\rexit\r"
			}
			on {
				set port [lindex $argv 2]
				append commands([array size commands]) "config"
				append commands([array size commands]) "interface GigabitEthernet $port"
				append commands([array size commands]) "power inline auto"
				append commands([array size commands]) "exit\rexit\r"
			}
			reset {
				set port [lindex $argv 2]
				append commands([array size commands]) "config"
				append commands([array size commands]) "interface GigabitEthernet $port"
				append commands([array size commands]) "power inline never"
				append commands([array size commands]) "power inline auto"
				append commands([array size commands]) "exit\rexit\r"
			}
			summary {
				append commands([array size commands]) "show power inline"
			}	
			default {
				puts "Subcommands:\r"
				puts "\tconsumption\r"
				puts "\toff <port>\r"
				puts "\ton <port>\r"
				puts "\treset <port>\r"
				puts "\tsummary\r"
				exit 0
			}
		}
	}
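	reboot {
		# Log in first: send/expect need a spawned session to talk to.
		eval spawn ssh $switch_ssh_opts $switch_login
		expect "assword:"
		send -- "$switch_password\r"
		expect "#"
		send -- "reload\r"
		expect "(Y/N)"
		send -- "y\r"
		interact
		exit 0
	}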
	saveconfig {
		set DATE [clock format [clock seconds] -format {%Y-%m-%d_%H%M}]
		append commands([array size commands]) "copy running-config tftp://$TFTPSERVER/gs728tp/running-config-$DATE.txt"
	}
	vlan {
		append commands([array size commands]) "show vlan"
	}
	default {
		puts "Commands:\r"
		puts "\tif <cmd>\r"
		puts "\tpoe <cmd>\r"
		puts "\treboot\r"
		puts "\tsaveconfig\r"
		puts "\tvlan\r"
		exit
	}
}

eval spawn ssh $switch_ssh_opts $switch_login

expect "assword:"
send -- "$switch_password\r"
for { set index 0 }  { $index < [array size commands] }  { incr index } {
	expect "#"
	send -- "$commands($index)\r"
}
expect "\r\nconsole#"
send -- "exit\r"
expect eof

This gives me a nice little easy to use command set for some very basic stuff - however, if you’re like me with a million scripts, you’ll also forget the options :)

As such, I took a dive into the bash-completion universe and managed to come up with this.

Save as ~/.bash_completion.

#!/usr/bin/env bash
_switch_completions()
{
  local cur prev

  COMPREPLY=()
  cur=${COMP_WORDS[COMP_CWORD]}
  prev=${COMP_WORDS[COMP_CWORD-1]}

  if [ $COMP_CWORD -eq 1 ]; then
    COMPREPLY=( $(compgen -W "if poe reboot saveconfig vlan" -- $cur) )
  elif [ $COMP_CWORD -eq 2 ]; then
    case "$prev" in
      "if")
        COMPREPLY=( $(compgen -W "description detail reset summary" -- $cur) )
        ;;
      "poe")
        COMPREPLY=( $(compgen -W "consumption off on reset summary" -- $cur) )
        ;;
      *)
        ;;
    esac
  fi

  return 0
} &&
complete -F _switch_completions switch

As an example, if I run switch if summary, I’ll get the following output:

spawn ssh -oPubKeyAcceptedAlgorithms=ssh-rsa -oRequiredRSASize=1024 -oKexAlgorithms=diffie-hellman-group1-sha1 -oHostKeyAlgorithms=ssh-rsa,ssh-dss -oCiphers=aes128-cbc,aes128-ctr,aes256-ctr admin@<switch-ip>
admin@<switch-ip>'s password:

console#terminal datadump
console#terminal width 0
console#show interfaces status detailed
                                             Flow Link          Back   Mdix
Port     Type         Duplex  Speed Neg      ctrl State       Pressure Mode
-------- ------------ ------  ----- -------- ---- ----------- -------- -------
g1       1G-Copper    Full    1000  Disabled Off  Up          Disabled On     
g2       1G-Copper      --      --     --     --  Down           --     --    
g3       1G-Copper      --      --     --     --  Down           --     --    
g4       1G-Copper      --      --     --     --  Down           --     --    
g5       1G-Copper      --      --     --     --  Down           --     --    
g6       1G-Copper      --      --     --     --  Down           --     --    
g7       1G-Copper      --      --     --     --  Down           --     --    
g8       1G-Copper      --      --     --     --  Down           --     --    
g9       1G-Copper    Full    1000  Disabled Off  Up          Disabled On     
g10      1G-Copper    Full    1000  Disabled Off  Up          Disabled On     
g11      1G-Copper    Full    1000  Disabled Off  Up          Disabled On     
g12      1G-Copper    Full    1000  Disabled Off  Up          Disabled On     
g13      1G-Copper    Full    100   Enabled  Off  Up          Disabled On     
g14      1G-Copper    Full    100   Enabled  Off  Up          Disabled Off    
g15      1G-Copper      --      --     --     --  Down           --     --    
g16      1G-Copper      --      --     --     --  Down           --     --    
g17      1G-Copper      --      --     --     --  Down           --     --    
g18      1G-Copper      --      --     --     --  Down           --     --    
g19      1G-Copper      --      --     --     --  Down           --     --    
g20      1G-Copper      --      --     --     --  Down           --     --    
g21      1G-Copper      --      --     --     --  Down           --     --    
g22      1G-Copper      --      --     --     --  Down           --     --    
g23      1G-Copper    Full    1000  Enabled  Off  Up          Disabled On     
g24      1G-Copper    Full    100   Enabled  Off  Up          Disabled Off    
g25      1G-Fiber       --      --     --     --  Down           --     --    
g26      1G-Fiber       --      --     --     --  Down           --     --    
g27      1G-Fiber       --      --     --     --  Down           --     --    
g28      1G-Fiber       --      --     --     --  Down           --     --    

                                          Flow    Link        
Ch       Type    Duplex  Speed  Neg      control  State       
-------- ------- ------  -----  -------- -------  ----------- 
po 1        --     --      --      --       --    Not Present 
po 2        --     --      --      --       --    Not Present 
po 3        --     --      --      --       --    Not Present 
po 4        --     --      --      --       --    Not Present 
po 5        --     --      --      --       --    Not Present 
po 6        --     --      --      --       --    Not Present 
po 7        --     --      --      --       --    Not Present 
po 8        --     --      --      --       --    Not Present 
console#exit

So now, not only do I have an easy way to bring up basic details of the switch, but when I forget which options do what, I can just use the TAB auto-completion in bash. Yay, yet another thing I don’t have to remember in the day-to-day :)

Upgrading an Ender 3 hotbed

For a long time now, I’ve been annoyed at how long it takes for the stock 24v / 120W hotbed to get to 100C to start printing ABS.

Normally, you set the bed heating up, then wait a while, then download your model, slice it, send it to the printer and you’d be almost at printing temperature.

This needs to be faster.

I got a 500W 240v AC powered heater and to control it, some SSR-D3808 solid state relays.

The SSR-D3808 is good for 8A at 24-380vAC - which is waaaaay more than what the hotbed will ever draw - but it’s only $0.10USD more expensive than the 5A version. More is better.
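To put a rough number on "waaaaay more", the current is just power divided by voltage. This is a quick sketch using the nominal figures from above; the heater_current function name is made up, it assumes awk is available, and it ignores real-world mains variation:

```shell
#!/bin/bash
# Current drawn by a resistive heater in amps: I = P / V.
heater_current() {
    awk -v p="$1" -v v="$2" 'BEGIN { printf "%.2f\n", p / v }'
}

heater_current 500 240   # new mains powered bed
heater_current 120 24    # stock bed
```

That works out to a little over 2A for the new bed against the relay’s 8A rating, compared to 5A for the stock bed on the 24v side.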

Before I go any further, I have to give this warning. This modification plays with mains voltage power. It can kill you. It can also hurt you the entire time that it’s killing you. If you’re not comfortable with that, read the rest of this page, go “huh, that’s cool” and get someone else to do it for you.

To wire this up, we want to run the + / - from the output of the control board to the + / - terminals on the solid state relay. In a nutshell, this:

[Image: Basic Circuit Diagram]

You’ll then need to do some modifications in Marlin’s Configuration - or use my trusty Firmware Builder!

Set:

  • Hotbed Thermistor Type (TEMP_SENSOR_BED) to 11 (100k beta 3950 1% thermistor)
  • Enable PIDTEMPBED

This will ensure you are able to accurately control temperatures with the additional power. Using the stock BANG BANG method, I overshot target temperatures by 6 or more degrees.

When you’ve flashed that firmware, make sure you do a PID Tune on the bed using M303 E-1 C10 S100. This will cycle around the 100C target temperature 10 times and then give you some Kp, Ki, and Kd values. Set these in Marlin via M304 P35.63 I6.94 D121.86 - but remember to replace the values here with ones for your setup.
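If you’d rather not retype the numbers, a small helper could build the M304 line for you. This is only a sketch: the m304_from_autotune name is made up, and it assumes the tune results arrive as lines shaped like "#define DEFAULT_bedKp 35.63", so check what your firmware actually prints first:

```shell
#!/bin/bash
# Turn autotune result lines on stdin into an M304 command.
# Assumed input shape: "#define DEFAULT_bedKp 35.63" (one line per constant).
m304_from_autotune() {
    awk '
        $2 == "DEFAULT_bedKp" { p = $3 }
        $2 == "DEFAULT_bedKi" { i = $3 }
        $2 == "DEFAULT_bedKd" { d = $3 }
        END {
            # Fail rather than emit a half-filled command.
            if (p == "" || i == "" || d == "") exit 1
            printf "M304 P%s I%s D%s\n", p, i, d
        }
    '
}

# The example values from above, in the assumed output format.
printf '%s\n' \
    '#define DEFAULT_bedKp 35.63' \
    '#define DEFAULT_bedKi 6.94' \
    '#define DEFAULT_bedKd 121.86' | m304_from_autotune
```

The function returns non-zero if any of the three constants is missing, so a truncated log won’t produce a bogus command.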

Finally, save your configuration to EEPROM using M500.

Enjoy the faster heating speeds :)

Octoprint