Friday, April 17, 2026

Managing kubeconfig files + download from Rancher

Managing kubeconfig files across multiple Rancher clusters

When you manage several Kubernetes clusters through Rancher, you quickly end up with a pile of kubeconfig files. Downloading them by hand from the Rancher UI is tedious, keeping track of which ones are loaded is error-prone, and testing connectivity after a VPN change or token rotation is a chore.

Here is the setup I landed on: a download script, a directory convention, auto-discovery in the shell, and a parallel connectivity tester.

  1. Directory layout
  2. Downloading kubeconfigs from Rancher
  3. Auto-discovery in .bashrc
  4. Testing connectivity
  5. Putting it all together

Directory layout

~/.kube/
  config                        # default kubectl config (GKE, kind, etc.)
  clusters/
    rancher_kubeconfig_dl.sh    # download script
    test-kubeconfigs.sh         # connectivity tester
    local.yaml                  # kubeconfig files live here
    cluster-us-east-1.yaml
    cluster-us-west-2.yaml
    cluster-eu-west-1.yaml
    cluster-ap-south-1.yaml

The scripts and kubeconfig files all live in ~/.kube/clusters/. The test script filters by kind: Config so it ignores non-kubeconfig files like itself.
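The filter is a plain grep; a minimal sketch (using a hypothetical temp dir) shows why scripts sitting next to the kubeconfigs are ignored:

```shell
# Only files containing "kind: Config" pass the filter, so scripts
# in the same directory are skipped (demo in a throwaway temp dir).
demo_dir=$(mktemp -d)
printf 'apiVersion: v1\nkind: Config\nclusters: []\n' > "$demo_dir/local.yaml"
printf '#!/usr/bin/env bash\necho helper\n' > "$demo_dir/test-kubeconfigs.sh"

for f in "$demo_dir"/*; do
  if grep -q 'kind: Config' "$f"; then
    echo "kubeconfig: $(basename "$f")"
  fi
done
# prints only: kubeconfig: local.yaml
rm -rf "$demo_dir"
```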

Downloading kubeconfigs from Rancher

Rancher exposes a generateKubeconfig action on its v3 API. The following script lists all clusters your token has access to and downloads each kubeconfig into a directory.

First, create an API key in Rancher: top-right avatar, Account & API Keys, Create API Key. You get a token like token-xxxxx:yyyyyyy.
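If the script misbehaves, the jq extraction step can be checked in isolation against a mocked /v3/clusters response (the cluster IDs below are made up):

```shell
# Mocked Rancher /v3/clusters payload; the jq filter emits "id<TAB>name"
# lines, exactly what the download loop consumes.
clusters_json='{"data":[{"id":"c-abc12","name":"cluster-us-east-1"},{"id":"local","name":"local"}]}'
echo "$clusters_json" | jq -r '.data[] | .id + "\t" + .name'
```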

#!/usr/bin/env bash
# rancher_kubeconfig_dl.sh
# Downloads all kubeconfig YAML files from a Rancher instance.
#
# Usage:
#   RANCHER_TOKEN=token-xxxxx:yyyyyyy ./rancher_kubeconfig_dl.sh
#
# To get a token manually:
#   Rancher UI -> top-right avatar -> Account & API Keys -> Create API Key

set -euo pipefail

RANCHER_URL="${RANCHER_URL:-https://rancher.example.com}"
RANCHER_TOKEN="${RANCHER_TOKEN:-}"
OUTPUT_DIR="${OUTPUT_DIR:-.}"

# -- Auth --
if [[ -z "$RANCHER_TOKEN" ]]; then
  echo "ERROR: Set RANCHER_TOKEN (e.g. token-xxxxx:yyyyyyy)"
  echo "       Rancher UI -> Avatar -> Account & API Keys -> Create API Key"
  exit 1
fi

CURL_OPTS=(-sSf -H "Authorization: Bearer ${RANCHER_TOKEN}")

# -- Fetch cluster list --
echo "-> Fetching cluster list from ${RANCHER_URL} ..."
clusters_json=$(curl "${CURL_OPTS[@]}" "${RANCHER_URL}/v3/clusters")

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
echo "$clusters_json" | jq -r '.data[] | .id + "\t" + .name' > "$tmp"

count=$(wc -l < "$tmp" | tr -d ' ')
[[ "$count" -eq 0 ]] && { echo "No clusters found (check your token permissions)."; exit 1; }
echo "-> Found ${count} cluster(s)"
mkdir -p "$OUTPUT_DIR"

# -- Download kubeconfig per cluster --
while IFS=$'\t' read -r id name; do
  safe=$(echo "$name" | tr -cs '[:alnum:]-_.' '-' | sed 's/-$//')
  out="${OUTPUT_DIR}/${safe}.yaml"
  echo "  ${name} (${id}) -> ${out}"
  curl "${CURL_OPTS[@]}" \
    -X POST \
    "${RANCHER_URL}/v3/clusters/${id}?action=generateKubeconfig" \
    | jq -r '.config' \
    > "$out"
done < "$tmp"

echo ""
echo "Done. Kubeconfigs saved to: ${OUTPUT_DIR}/"
ls -lh "$OUTPUT_DIR"

Run it once, or whenever you add a new cluster to Rancher:

RANCHER_TOKEN=token-xxxxx:yyyyyyy ./rancher_kubeconfig_dl.sh

Each cluster gets its own file, named after the cluster. No manual copy-pasting from the Rancher UI.
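The tr/sed pipeline in the script turns an arbitrary cluster name into a safe filename; a quick sketch of its behavior (the names are hypothetical):

```shell
# Unsafe characters collapse to single dashes; a trailing dash is stripped.
sanitize() { echo "$1" | tr -cs '[:alnum:]-_.' '-' | sed 's/-$//'; }

sanitize "cluster-us-east-1"   # -> cluster-us-east-1
sanitize "prod cluster (eu)"   # -> prod-cluster-eu
```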

Auto-discovery in .bashrc

kubectl merges all files listed in the KUBECONFIG environment variable (colon-separated). Instead of maintaining a hardcoded list that goes stale every time you add or remove a cluster, use find to discover them at shell startup:

# In ~/.bashrc or ~/.zshrc
export KUBECONFIG=~/.kube/config:$(find ~/.kube/clusters -name '*.yaml' 2>/dev/null | tr '\n' ':' | sed 's/:$//')

This starts with the default ~/.kube/config (for GKE, kind, minikube, or anything else) and appends every YAML file found in the clusters directory. find, tr, and sed are all POSIX — this works on macOS, Linux, and BSDs. Avoid find -printf which is a GNU extension and won't work on macOS.

Drop a new file in, open a new terminal, and kubectx sees it immediately. Remove a file and it disappears. No editing required.
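The colon-joining itself is easy to see in isolation (hypothetical temp dir standing in for ~/.kube/clusters; sorted here only so the demo output is deterministic):

```shell
# Build the colon-separated list the way the export line does.
dir=$(mktemp -d)
touch "$dir/a.yaml" "$dir/b.yaml"
extra=$(find "$dir" -name '*.yaml' | sort | tr '\n' ':' | sed 's/:$//')
echo "$extra"   # -> <dir>/a.yaml:<dir>/b.yaml, no trailing colon
rm -rf "$dir"
```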

Testing connectivity

After a VPN reconnect, a token rotation, or just to check that everything is reachable, run the test script. It finds all kubeconfig files in a directory, extracts every context from each file, and tests them all in parallel:

#!/usr/bin/env bash
# test-kubeconfigs.sh
# Test connectivity for all kubeconfig YAML files in a directory.
# Usage: ./test-kubeconfigs.sh [directory]
#   directory: path containing kubeconfig YAML files (default: script directory)

set -euo pipefail
shopt -s nullglob

dir="${1:-$(dirname "$0")}"

if [[ ! -d "$dir" ]]; then
  echo "Error: '$dir' is not a directory" >&2
  exit 1
fi

test_context() {
  local f="$1" ctx="$2" name
  name="$(basename "$f")"
  if kubectl --kubeconfig="$f" --context="$ctx" cluster-info --request-timeout=5s &>/dev/null; then
    echo "OK    $name  context=$ctx"
  else
    echo "FAIL  $name  context=$ctx"
  fi
}

pids=()
found=0

for f in "$dir"/*.yaml "$dir"/*.yml; do
  [[ -f "$f" ]] || continue
  grep -q 'kind: Config' "$f" 2>/dev/null || continue
  found=$((found + 1))

  # "|| true" so a broken kubeconfig yields a SKIP instead of killing
  # the whole script via set -e
  contexts=$(kubectl --kubeconfig="$f" config get-contexts -o name 2>/dev/null || true)
  if [[ -z "$contexts" ]]; then
    echo "SKIP  $(basename "$f")  (no contexts found)"
    continue
  fi

  for ctx in $contexts; do
    test_context "$f" "$ctx" &
    pids+=($!)
  done
done

if [[ $found -eq 0 ]]; then
  echo "No kubeconfig files (kind: Config) found in $dir"
  exit 1
fi

fail=0
for pid in "${pids[@]}"; do
  wait "$pid" || fail=$((fail + 1))
done

pass=$(( ${#pids[@]} - fail ))
echo ""
echo "Results: $pass ok, $fail failed (from $found files)"
[[ $fail -eq 0 ]]

Each test runs as a background job, so checking six clusters takes as long as the slowest one (typically the 5-second timeout for an unreachable cluster), not six times that.

$ ./test-kubeconfigs.sh
OK    cluster-us-east-1.yaml   context=cluster-us-east-1
OK    cluster-us-west-2.yaml   context=cluster-us-west-2
OK    cluster-eu-west-1.yaml   context=cluster-eu-west-1
FAIL  cluster-ap-south-1.yaml  context=cluster-ap-south-1
OK    local.yaml               context=local

Results: 4 ok, 1 failed (from 5 files)
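The collect-and-wait bookkeeping the script relies on can be seen in isolation, with a mock check function standing in for kubectl:

```shell
# Run checks in the background, collect PIDs, then count failures
# from the exit codes that wait reports.
mock_check() { [ "$1" -eq 0 ]; }   # stand-in for the kubectl probe

pids=()
for rc in 0 1 0; do
  mock_check "$rc" &
  pids+=($!)
done

fail=0
for pid in "${pids[@]}"; do
  wait "$pid" || fail=$((fail + 1))
done
echo "ok=$(( ${#pids[@]} - fail )) fail=$fail"   # -> ok=2 fail=1
```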

Putting it all together

The workflow is:

  1. Run rancher_kubeconfig_dl.sh once (or after adding clusters in Rancher) to download all kubeconfigs into ~/.kube/clusters/.
  2. Open a new terminal. The find-based KUBECONFIG export picks up all files automatically. kubectx lists every context.
  3. Run test-kubeconfigs.sh to verify connectivity to all clusters in parallel.

No manual list to maintain, no UI clicking, and a quick way to verify everything is reachable.

Wednesday, April 1, 2026

Reverse-engineering Ropvacnic S1 vacuum for homeassistant integration

Integrating the Ropvacnic S1 into Home Assistant via LocalTuya

The Ropvacnic S1 is a Tuya-based robot vacuum. Getting it to work locally in Home Assistant requires some digging — the DP numbers aren't documented anywhere, and LocalTuya has a bug in its locate function. This guide documents everything I found.

Why local control? Faster response, no cloud dependency, works if Tuya servers go down. The device still sends data to Tuya's cloud, but Home Assistant commands go directly over your LAN.

Setup: Home Assistant OS 2026.3.4 · LocalTuya 5.2.3 · Protocol 3.4 · Region: Western America


Step 1 — Tuya IoT Platform Setup

Create an account at iot.tuya.com (not tuya.com — click "Developer Platform" at the bottom). Create a Cloud Project with these settings:

  • Industry: Smart Home
  • Development Method: Smart Home
  • Data Center: Western America (for Canada/USA accounts)

Important: The Data Center must match your Tuya Smart app region. A mismatch causes a "Data center inconsistency" error when scanning the QR code.

Under Devices → Link Tuya App Account, scan the QR code with your Tuya Smart app. Then note:

  • Access ID — from the Overview tab
  • Access Secret — from the Overview tab
  • UID — from the linked account (starts with az...)

Step 2 — Discover DPs with tinytuya

A DP (Data Point) is a numbered channel that controls one specific function of the device. Install tinytuya on your computer to identify them all:

pip3 install tinytuya
python3 -m tinytuya wizard
# Enter your Access ID, Secret, UID and region (us)
 
python3 -m tinytuya snapshot
# Poll: Y → shows all live DP values

Complete DP Map — Ropvacnic S1

DP   Code                  Type     Values
1    switch                Boolean  true / false
2    switch_go             Boolean  true / false
3    mode                  Enum     standby, random, wall_follow, spiral, chargego
4    direction_control     Enum     forward, backward, turn_left, turn_right, stop
5    status                Enum     standby, smart_clean, goto_charge, charging, charge_done, paused
6    residual_electricity  Integer  0–100 (%)
7    clean_time            Integer  minutes
9    clean_area            Integer
13   seek (locate)         Boolean  true / false
14   suction               Enum     gentle, normal, strong
17   edge_brush            Integer  0–100 (%)
18   filter                Integer  0–100 (%)
20   water_control         Enum     closed, low, middle, high

Note: DP 13 (seek/locate) is not visible in the standard tinytuya output. It was found by brute-force testing DPs 10–16 using d.set_value(dp, True) and listening for the vacuum's beep.


Step 3 — LocalTuya Entity Configuration

Install LocalTuya via HACS, restart HA, then go to Settings → Devices & Services → Add Integration → LocalTuya.

Set Manual DPS to: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25

Configure the entity with these values:

Field                  Value
ID (status DP)         5
Power DP (powergo_dp)  1
Mode DP                3
Battery DP             6
Fan Speed DP           14
Clean Time DP          7
Clean Area DP          9
Locate DP              13
Fault DP               (leave empty)
Idle Status            standby,sleep
Docked Status          charging,chargecompleted,charge_done
Returning Status       goto_charge
Modes list             standby,random,wall_follow,spiral,chargego
Return home mode       chargego
Fan speeds list        gentle,normal,strong
Pause state            paused
Stop status            standby

Step 4 — Fix vacuum.py (LocalTuya bug)

LocalTuya 5.2.3 has a bug in async_locate — it sends an empty string "" instead of True, so the locate button does nothing. Fix it via the Terminal add-on:

sed -i 's/set_dp("", self._config\[CONF_LOCATE_DP\])/set_dp(True, self._config[CONF_LOCATE_DP])/g' \
  /config/custom_components/localtuya/vacuum.py
 
# Verify
grep -A3 "async_locate" /config/custom_components/localtuya/vacuum.py

Step 5 — Fix config via Terminal (if GUI doesn't save)

The LocalTuya GUI sometimes doesn't persist certain changes (notably mode_dp and id). Edit the config file directly:

# Backup first!
cp /config/.storage/core.config_entries \
   /config/.storage/core.config_entries.backup
 
# Status DP = 5 (reads DP 5 for state)
sed -i 's/"id":1,"idle_status_value":"standby,sleep"/"id":5,"idle_status_value":"standby,sleep"/g' \
  /config/.storage/core.config_entries
 
# powergo_dp must be 1 (start/pause)
sed -i 's/"powergo_dp":5/"powergo_dp":1/g' \
  /config/.storage/core.config_entries
 
# mode_dp = 3 (stop + return to base)
sed -i 's/"mode_dp":0/"mode_dp":3/g' \
  /config/.storage/core.config_entries
 
# locate_dp = 13
sed -i 's/"locate_dp":0/"locate_dp":13/g' \
  /config/.storage/core.config_entries
 
# Remove fault_dp entirely (causes false errors)
sed -i 's/,"fault_dp":[0-9]*//g' \
  /config/.storage/core.config_entries
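Since a mangled core.config_entries can break Home Assistant, it's worth previewing the substitutions on a mock snippet first (the DP values mirror the edits above):

```shell
# Dry-run the sed edits on a fake config fragment instead of the real file.
cfg='{"id":1,"powergo_dp":5,"mode_dp":0,"locate_dp":0,"fault_dp":12}'
echo "$cfg" | sed -e 's/"powergo_dp":5/"powergo_dp":1/' \
                  -e 's/"mode_dp":0/"mode_dp":3/' \
                  -e 's/"locate_dp":0/"locate_dp":13/' \
                  -e 's/,"fault_dp":[0-9]*//'
# -> {"id":1,"powergo_dp":1,"mode_dp":3,"locate_dp":13}
```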

After edits, reload LocalTuya: Settings → Devices & Services → LocalTuya → 3 dots → Reload — no full restart needed.


Final Result

After completing all steps, the Ropvacnic S1 is fully controllable from Home Assistant with local-only communication:

  • State correctly shows Docked, Cleaning, Returning, and Paused
  • Start, Pause, Stop, and Return to Base all work
  • Locate (beep) works after the vacuum.py fix
  • Battery percentage displayed in real time
  • Fan speed (gentle / normal / strong) selectable from HA
  • Clean time and area tracked per session

Friday, November 21, 2025

UN / UNODC e-learning



 https://elearningunodc.org/local/pages/?id=3

Thursday, September 11, 2025

ImageMagick: reduce size of JPG files by 50%

 

# -resize 50% halves each pixel dimension; the output dir must exist
mkdir -p resized
for file in *.jpg; do magick "$file" -resize 50% "resized/$file"; done
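Note that -resize 50% halves each pixel dimension, which usually cuts the file size by much more than half. To target an actual byte size instead, ImageMagick's jpeg:extent define lowers the JPEG quality until the output fits; a sketch (the 250kb target is arbitrary):

```shell
# Cap the output *file size* instead of halving dimensions.
# Assumes ImageMagick 7's "magick"; with IM 6, use "convert" instead.
mkdir -p resized
for file in *.jpg; do
  magick "$file" -define jpeg:extent=250kb "resized/$file"
done
```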

Monday, September 8, 2025

Ansible : GCP Google Cloud (Compute) ansible dynamic inventory with cache

Inventory definition for GCP Compute. Note that the inventory file name must end in gcp.yml or gcp_compute.yml, otherwise the plugin refuses to parse it.

---
plugin: google.cloud.gcp_compute
# https://docs.ansible.com/ansible/latest/collections/google/cloud/gcp_compute_inventory.html
# https://gitlab.com/gitlab-org/gitlab-environment-toolkit/-/blob/main/docs/environment_configure.md#google-cloud-platform-gcp

projects:
  - project_id

auth_kind: serviceaccount
# must match `ansible_user` below, cf. other article on how to set this up
service_account_file: ./gcp-sa.json

filters:
  # only return running instances, we won't be able to connect to stopped instances
  - status = RUNNING
  # for example, only return compute instances with label foo = foobar
  - labels.foo = foobar

keyed_groups:
  - key: labels
    prefix: label

hostnames:
  - name
  - public_ip
  - private_ip

compose:
  # <ansible variable to be set>: <data from gcp discovery>
  # Set an inventory parameter to use the Public IP address to connect to the host
  # ansible_host: public_ip
  ansible_host: networkInterfaces[0].accessConfigs[0].natIP
  ansible_user: "'sa_115528571027174573787'"

  # GCP compute label "activate_this" value => ansible variable "run_this" value
  run_this: labels['activate_this']
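The title promises caching, which the inventory above doesn't configure yet. The plugin supports Ansible's standard inventory-cache options; a minimal block using the jsonfile cache plugin (path and timeout are arbitrary choices) would be:

```yaml
# Cache discovery results so repeated runs don't hit the GCP API every time
cache: true
cache_plugin: ansible.builtin.jsonfile
cache_connection: /tmp/ansible_gcp_inventory_cache
cache_timeout: 600   # seconds before a forced refresh
```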


Thursday, August 28, 2025

Systemd healthcheck with a side service and monotonic timer, auto-healing

What: work around systemd's lack of a built-in healthcheck

  • systemd service "what-service.service"
  • systemd timer "what-service-healthcheck.timer"
    • triggers a systemd service "what-service-healthcheck.service",
      which launches the script "service_health_check.sh"
    • the script:
      • curl's the health check URL "HEALTH_CHECK_URL"
      • if KO, exits non-zero so the targeted service is started again



what-service-healthcheck.timer

[Unit]
Description=Run health check every 15 seconds
[Timer]
# Wait 1 minute after boot before the first check
OnBootSec=1min
# Run the check 15 seconds after the last time it finished
OnUnitActiveSec=15s
[Install]
WantedBy=timers.target


By default a timer triggers the service unit with the same name, so there is no need to specify it.

what-service-healthcheck.service.j2

[Unit]
Description=Health Check for {{ what_service }}
Requires={{ what_service }}.service
# OnFailure= belongs in [Unit], not [Service]; when the check fails,
# systemd starts the target service again. Note that Restart= is not
# allowed for Type=oneshot units, so it must not be set here.
OnFailure={{ what_service }}.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/service_health_check.sh

service_health_check.sh

#!/bin/bash
# The health check endpoint
HEALTH_CHECK_URL="http://localhost:{{ running_port }}/health_check"
# Use curl to check the endpoint.
# --fail: Makes curl exit with a non-zero status code on server errors (4xx or 5xx).
# --silent: Hides the progress meter.
# --output /dev/null: Discards the response body.
if ! curl --silent --fail --max-time 2 --output /dev/null "$HEALTH_CHECK_URL"; then
  echo "Health check failed for {{ service_name }}. Restarting..."
  # Restart is performed on failure from healthcheck service
  exit 1
fi
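The curl exit-status logic can be exercised without a live endpoint by using file:// URLs as a stand-in (health_ok is a hypothetical wrapper, not part of the deployed script):

```shell
# An existing file:// URL makes curl exit 0 ("healthy"); a missing one
# makes it exit non-zero, like an unreachable HTTP health endpoint.
health_ok() { curl --silent --fail --max-time 2 --output /dev/null "$1"; }

probe=$(mktemp)
if health_ok "file://$probe"; then echo "healthy"; fi
if ! health_ok "file://$probe.missing"; then echo "unhealthy -> exit 1 -> OnFailure"; fi
rm -f "$probe"
```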



Deploying it with Ansible:

<role>/tasks/main.yml

---
- name: Generate what-service systemd file
  ansible.builtin.template:
    src: what-service.service.j2
    dest: /etc/systemd/system/what-service.service
    mode: "0644"
  notify: Restart what-service

# the script contains Jinja ({{ running_port }}, {{ service_name }}),
# so it must go through template, not copy
- name: Template the health check script
  ansible.builtin.template:
    src: service_health_check.sh
    dest: /usr/local/bin/service_health_check.sh
    owner: root
    group: root
    mode: "0755"
  vars:
    service_name: what-service

- name: Template the health check systemd service file
  ansible.builtin.template:
    src: what-service-healthcheck.service.j2
    dest: /etc/systemd/system/what-service-healthcheck.service
    owner: root
    group: root
    mode: "0644"
  notify: Reload systemd

- name: Copy the health check systemd timer file
  ansible.builtin.copy:
    src: what-service-healthcheck.timer
    dest: /etc/systemd/system/what-service-healthcheck.timer
    owner: root
    group: root
    mode: "0644"
  notify: Reload systemd

- name: Enable and start the health check timer
  ansible.builtin.systemd:
    name: what-service-healthcheck.timer
    state: started
    enabled: yes
    daemon_reload: yes # Ensures systemd is reloaded before starting


<role>/handlers/main.yml

---
# ansible.builtin.service has no daemon_reload option; use the systemd module
- name: Restart what-service
  ansible.builtin.systemd:
    name: what-service
    state: restarted
    daemon_reload: true

- name: Reload systemd
  ansible.builtin.systemd:
    daemon_reload: true

- name: Restart what-service-healthcheck.timer
  ansible.builtin.systemd:
    name: what-service-healthcheck.timer
    state: restarted
    daemon_reload: true