Initial commit: Backup Restore Orchestrator
Windmill flow + restore.sh for the automated daily backup verification system. Direct Windmill sync via `wmill sync push` is supported. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,236 @@
# Backup Restore Orchestrator

Automated daily backup restore test system built on [Windmill](https://windmill.dev).

Every day at **00:11**, all PBS backups are restored on several restore servers, checked for bootability, packed into encrypted 7z archives, and transferred to a central backup server. Results are stored in a MySQL database and reported via Nextcloud Talk.

---
## Repository structure

```
├── f/Backup/                                         ← Windmill workspace (sync-ready)
│   ├── backup_restore_orchestrator__flow/            ← Main flow (orchestrator)
│   │   ├── flow.yaml
│   │   ├── aktive_datastores_aus_db_holen.my.sql     ─ Step F
│   │   ├── job_initialisieren_&_backup-queue_...py   ─ Step A
│   │   ├── alle_freien_restore-server_holen.py       ─ Step B
│   │   ├── ssh-credentials_fuer_alle_...py           ─ Step G
│   │   ├── script_deployen_&_pbs-datastores_...py    ─ Step C
│   │   ├── alte_restore-ordner_...py                 ─ Step H
│   │   ├── ersten_restore_pro_server_starten.py      ─ Step D
│   │   └── webhook_verarbeiten_&_...py               ─ Step E
│   ├── backup_restore_report___nextcloud_talk__flow/ ← Daily report (08:00)
│   ├── folder.meta.yaml
│   ├── nextcloud_talk_room.variable.yaml
│   ├── nextcloud_talk_url.variable.yaml
│   └── restore_version.variable.yaml
└── restore-worker/
    └── restore.sh                                    ← Restore script (runs on the restore servers)
```
---
## Windmill Sync

```bash
# One-time: install the Windmill CLI
npm install -g windmill-cli

# Configure the workspace
wmill workspace add <workspace-id> https://windmill.stines.de

# Push from the repo into Windmill
wmill sync push --workspace <workspace-id>

# Pull from Windmill into the repo
wmill sync pull --workspace <workspace-id>
```

> **Note:** With `skipSecrets: true`, secrets are not synchronized. Variables that hold sensitive values must be set manually in Windmill after the push.
---
## System architecture

```
Windmill (schedule 00:11)
    │
    ├─► PBS backup.stines.de:8007            ← backup source
    │
    ├─► STI-PROX01 (max 200 GB) ── restore.sh ──► 7z ──► rsync ──► backup server
    ├─► ITD-PROX01 (max 100 GB) ── restore.sh ──► 7z ──► rsync ──► backup server
    └─► STI-BAC01  (min 250 GB) ── restore.sh ──► 7z (local, no rsync)
                                       │
                                       ▼
                             Webhook → Windmill Step E
                             → start next backup
```
### Restore servers

| Hostname   | max_backup_size_gb | min_backup_size_gb | Rsync                |
|------------|--------------------|--------------------|----------------------|
| STI-PROX01 | 200                | NULL               | yes                  |
| ITD-PROX01 | 100                | NULL               | yes                  |
| STI-BAC01  | NULL               | 250                | no (mounted locally) |

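
The size limits decide which server may take which backup. The following is a minimal sketch of that check, modeled on the `find_backup_for_server` helper in Step D. The names `fits` and `eligible_servers` are illustrative assumptions, and `None` stands in for SQL `NULL` (no limit). Under these limits, a backup between 200 and 250 GB would match no server.

```python
GIB = 1024 ** 3  # the flow converts the *_gb limits with 1024^3

# Limits copied from the restore server table; None means "no limit" (SQL NULL).
SERVERS = {
    "STI-PROX01": {"max_backup_size_gb": 200, "min_backup_size_gb": None},
    "ITD-PROX01": {"max_backup_size_gb": 100, "min_backup_size_gb": None},
    "STI-BAC01": {"max_backup_size_gb": None, "min_backup_size_gb": 250},
}


def fits(server: dict, size_bytes: int) -> bool:
    """Return True if a backup of size_bytes lies within the server's size class."""
    max_gb = server.get("max_backup_size_gb")
    min_gb = server.get("min_backup_size_gb")
    if max_gb is not None and size_bytes > max_gb * GIB:
        return False
    if min_gb is not None and size_bytes < min_gb * GIB:
        return False
    return True


def eligible_servers(size_bytes: int) -> list[str]:
    """List the hostnames whose limits admit a backup of the given size."""
    return [name for name, limits in SERVERS.items() if fits(limits, size_bytes)]
```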
---
## Flow sequence: F → A → B → G → C → H → D → E

| Step | ID  | Language | Function |
|------|-----|----------|----------|
| F | `f` | MySQL  | Fetch active datastores from the DB |
| A | `a` | Python | Create the job, fetch PBS snapshots, build the queue (largest first) |
| B | `b` | Python | Fetch free restore servers |
| G | `g` | Python | Fetch SSH credentials from Bitwarden |
| C | `c` | Python | Deploy the script, register PBS storages, store the session |
| H | `h` | Python | Delete old backup folders on the backup server |
| D | `d` | Python | Start the first restore per server |
| E | `e` | Python | Process the webhook, start the next restore |

### Two modes

**Schedule path** (daily at 00:11):
Steps F → A → B → G → C → H → D run sequentially. Step D starts the first restore per server over SSH without blocking and returns `waiting_for_webhook`.

**Webhook path** (after every `restore.sh` run):
The flow input contains `job_uuid`, so Step A recognizes the webhook call. Steps B, G, C, H, and D return immediately without doing any work. Step E processes the result, writes it to the DB, and immediately starts the next matching backup on the same server.

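
The mode switch can be sketched as follows. It mirrors the `trigger_type` expression in `flow.yaml` and the early return at the top of the schedule-only Python steps. The function names `detect_mode` and `example_step` and the `example_step_done` key are hypothetical and exist only for this illustration.

```python
def detect_mode(flow_input: dict) -> str:
    """A webhook call from restore.sh carries job_uuid; a scheduled run does not."""
    return "webhook" if flow_input.get("job_uuid") else "schedule"


def example_step(prev: dict) -> dict:
    """Pattern of a schedule-only step: webhook payloads pass through untouched."""
    if prev.get("mode") == "webhook":
        return prev
    # Schedule path: do the step's work, then extend the accumulated result.
    return {**prev, "example_step_done": True}
```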
---
## restore.sh: sequence

The script runs on the restore servers at `/opt/windmill-restore/restore.sh`.
It is started by Step D and Step E via SSH + `nohup` (non-blocking).

```
[0]   Fetch the 7z password from the PBS (password_7z.txt via rsync)
[1]   Space check: free_space >= backup_size * 1.5
[2]   Determine IDs: original ID from the backup path, restore ID starting at 1000
[2.5] Check whether the ZIP already exists → on a hit: success webhook + exit
[3]   qmrestore (VM) / pct restore (CT)
[4]   Determine IMAGE_DIR dynamically from the PVE storage path
[5]   Check images (empty → failed)
[6]   Prepare: remove network cards, enable the QEMU agent
[7]   Start VM/CT, check bootability (agent ping 120s / pct exec)
[8]   VM: qm shutdown --timeout 120, CT: pct stop
[9]   Save config (qemu-server.conf / lxc.conf)
[10]  Create the encrypted 7z archive (mx=0, mhe=on)
[11]  Rsync to the backup server (3 attempts + size check)
      OR store locally (STI-BAC01: SKIP_RSYNC=1)
[12]  Destroy VM/CT, delete the ZIP (except on STI-BAC01)
[13]  Webhook → Windmill Step E
```
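
Steps [1] and [2] come down to simple arithmetic. The sketch below is in Python for illustration only, since the real logic lives in `restore.sh`. The function names and the "first unused ID at or above 1000" rule are assumptions; the source only states that restore IDs start at 1000.

```python
def has_enough_space(free_space_bytes: int, backup_size_bytes: int) -> bool:
    """Step [1]: require 1.5 times the backup size as free space."""
    return free_space_bytes >= backup_size_bytes * 1.5


def next_restore_id(used_ids: set[int], start: int = 1000) -> int:
    """Step [2], assumed rule: pick the first VM/CT ID >= start that is not in use."""
    candidate = start
    while candidate in used_ids:
        candidate += 1
    return candidate
```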
### Script deployment

Step C deploys the script to all restore servers automatically. To roll out a new version:

1. Update `restore.sh` in this repo (folder `restore-worker/`)
2. Push to Gitea: `http://172.17.1.251:8080/sebastian.serfling/BackupScript.git`
3. Increase the Windmill variable `f/Backup/restore_version` (e.g. `1.0.27`)
4. On the next flow run, Step C detects the version difference and deploys automatically

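
The version check in step 4 could look like the sketch below. Step C's actual comparison is not shown in this part of the commit, so treating it as a plain string inequality is an assumption, and `needs_deploy` is a hypothetical name. The `script_deployed` and `script_version` fields are the columns of `bronze.restore.server`.

```python
def needs_deploy(server: dict, target_version: str) -> bool:
    """Deploy when the script is missing or its recorded version differs from the target."""
    if not server.get("script_deployed"):
        return True
    return server.get("script_version") != target_version
```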
---
## Windmill variables

| Variable | Content |
|----------|---------|
| `f/Backup/pbs_variable` | JSON: host, port, user, password, fingerprint |
| `f/Backup/mysql_config` | JSON: MySQL connection settings |
| `f/Backup/bitwarden_api_login` | JSON: bw_clientid, bw_clientsecret, bw_masterpassword |
| `f/Backup/gitea_token` | Gitea access token |
| `f/Backup/restore_version` | Current script version, e.g. `1.0.26` |
| `f/Backup/backup_server_host` | Hostname/IP of the backup server |
| `f/Backup/backup_server_ssh_password` | SSH password of the backup server |
| `f/Backup/windmill_webhook_url` | Webhook URL for restore.sh callbacks |
| `f/Backup/windmill_webhook_token` | Bearer token |
| `f/Backup/nextcloud_talk_url` | https://nextcloud.stines.de |
| `f/Backup/nextcloud_talk_room` | Room token |
| `f/Backup/nextcloud_talk_user` | User name |
| `f/Backup/nextcloud_talk_password` | App password |
---
## Database schema (MySQL: `Kunden`)

### `bronze.restore.jobs`

| Column | Type | Description |
|--------|------|-------------|
| job_uuid | VARCHAR(64) PK | Unique job ID |
| started_at | DATETIME | Start time |
| finished_at | DATETIME | End time |
| status | VARCHAR(20) | running / completed / failed |
| total_backups | INT | Number of backups in the queue |
| restored_count | INT | Completed successfully |
| failed_count | INT | Failed |

### `bronze.restore.result`

| Column | Type | Description |
|--------|------|-------------|
| job_uuid | VARCHAR(64) | Reference to the job |
| client_name | VARCHAR(128) | e.g. `tnp-Invest-GmbH:vm/100` |
| backup_path | VARCHAR(256) | Full path including timestamp |
| vm_name | VARCHAR(128) | Hostname of the VM/CT |
| restore_server | VARCHAR(128) | Hostname of the restore server |
| status | VARCHAR(20) | restoring / done / failed |
| restore_duration_sec | INT | Restore duration in seconds |
| zip_size_bytes | BIGINT | Size of the 7z archive |
| rsync_ok | TINYINT | Rsync succeeded |
| qm_agent_ok | TINYINT | Boot check succeeded |
| error_message | TEXT | Error message if failed |

### `bronze.backup.queue`

| Column | Type | Description |
|--------|------|-------------|
| job_uuid | VARCHAR(64) | Reference to the job |
| client_name | VARCHAR(128) | Backup label |
| backup_path | VARCHAR(256) | Full path including timestamp |
| backup_size_bytes | BIGINT | Compressed PBS size |
| priority | INT | 0 = largest (highest priority) |
| rsync_target | VARCHAR(256) | Target path on the backup server |
| pbs_storage_id | VARCHAR(128) | e.g. `pbs-firma-gmbh` |
| status | VARCHAR(20) | queued / assigned / done / failed / obsolete |

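
The `priority` column follows from how Step A builds the queue: it keeps only the newest snapshot per backup group and ranks the groups by size, largest first. A condensed sketch of that logic without the database writes is shown here. `build_queue` is an illustrative name; the snapshot keys (`backup-type`, `backup-id`, `backup-time`, `size`, `_datastore`) are the ones Step A reads.

```python
def build_queue(snapshots: list[dict]) -> list[dict]:
    """Keep the newest snapshot per backup group and rank groups by size, largest first."""
    latest: dict[str, dict] = {}
    for snap in snapshots:
        key = f"{snap['_datastore']}/{snap['backup-type']}/{snap['backup-id']}"
        if key not in latest or snap["backup-time"] > latest[key]["backup-time"]:
            latest[key] = snap

    ranked = sorted(latest.values(), key=lambda s: s.get("size", 0), reverse=True)
    return [
        {
            "client_name": f"{s['_datastore']}:{s['backup-type']}/{s['backup-id']}",
            "backup_size_bytes": s.get("size", 0),
            "priority": idx,  # 0 = largest backup
        }
        for idx, s in enumerate(ranked)
    ]
```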
### `bronze.restore.server`

| Column | Type | Description |
|--------|------|-------------|
| hostname | VARCHAR(128) PK | Server hostname |
| ip | VARCHAR(45) | IP address |
| is_active | TINYINT | 1 = active |
| free_space_gb | INT | Free space (updated regularly) |
| restore_mount | VARCHAR(128) | e.g. `/mnt/BTRFS` |
| restore_path | VARCHAR(128) | PVE storage name |
| current_job_uuid | VARCHAR(64) | NULL = free |
| max_backup_size_gb | INT | NULL = no limit |
| min_backup_size_gb | INT | NULL = no limit |
| script_deployed | TINYINT | Script is present |
| script_version | VARCHAR(20) | Current script version |
---
## SQL reset when things go wrong

```sql
-- Full reset for a new test run
UPDATE Kunden.`bronze.restore.jobs` SET status='failed', finished_at=NOW() WHERE status='running';
UPDATE Kunden.`bronze.restore.server` SET current_job_uuid=NULL;
DELETE FROM Kunden.`bronze.restore.session`;
UPDATE Kunden.`bronze.backup.queue` SET status='obsolete' WHERE status IN ('queued','assigned');

-- Release a single server
UPDATE Kunden.`bronze.restore.server` SET current_job_uuid=NULL WHERE hostname='ITD-PROX01';
```
---
## Known issues

| Problem | Cause | Fix |
|---------|-------|-----|
| Wrong date in logs | Server set to UTC instead of CET | `timedatectl set-timezone Europe/Berlin` |
| `/var/tmp` full (ITD-PROX01) | Proxmox writes temporary files to the root partition | `mount --bind /mnt/BTRFS/tmp /var/tmp` |
| Server stays occupied after the last backup | `current_job_uuid` was not reset | `` UPDATE Kunden.`bronze.restore.server` SET current_job_uuid=NULL WHERE hostname=... `` |
@@ -0,0 +1 @@
SELECT datastore FROM Kunden.`bronze.backup.server.datastore` AS kbbsd WHERE kbbsd.restore = 1
@@ -0,0 +1,17 @@
# py: 3.12
anyio==4.12.1
bcrypt==5.0.0
certifi==2026.2.25
cffi==2.0.0
cryptography==46.0.5
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
invoke==2.2.1
mysql-connector-python==9.6.0
paramiko==4.0.0
pycparser==3.0
pynacl==1.6.2
typing-extensions==4.15.0
wmill==1.657.2
@@ -0,0 +1,40 @@
import wmill, mysql.connector, json

def main(prev: dict):
    # Webhook runs skip server discovery and pass the payload through unchanged.
    if prev.get("mode") == "webhook":
        return prev

    db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
    conn = mysql.connector.connect(**db_cfg)
    cur = conn.cursor(dictionary=True)

    # The size limits are selected too; Step D needs them to match backups to servers.
    cur.execute("""
        SELECT hostname, ip, free_space_gb,
               script_deployed, script_version,
               restore_mount, restore_path,
               max_backup_size_gb,
               min_backup_size_gb
        FROM Kunden.`bronze.restore.server`
        WHERE is_active = 1 AND current_job_uuid IS NULL
        ORDER BY free_space_gb DESC
    """)
    servers = cur.fetchall()
    cur.close(); conn.close()

    if not servers:
        raise Exception("No free restore server available!")

    for s in servers:
        if not s.get("restore_mount"):
            raise Exception(
                f"restore_mount for '{s['hostname']}' is not configured!"
            )
        if not s.get("restore_path"):
            raise Exception(
                f"restore_path for '{s['hostname']}' is not configured!"
            )

    print(f"{len(servers)} free restore servers: "
          f"{[s['hostname'] for s in servers]}")
    return {**prev, "target_servers": servers}
@@ -0,0 +1,17 @@
# py: 3.12
anyio==4.13.0
bcrypt==5.0.0
certifi==2026.2.25
cffi==2.0.0
cryptography==46.0.6
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
invoke==2.2.1
mysql-connector-python==9.6.0
paramiko==4.0.0
pycparser==3.0
pynacl==1.6.2
typing-extensions==4.15.0
wmill==1.666.0
@@ -0,0 +1,83 @@
import wmill, json, paramiko, mysql.connector, re, io


def main(prev: dict):
    if prev.get("mode") == "webhook":
        return prev

    db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
    conn = mysql.connector.connect(**db_cfg)
    cur = conn.cursor(dictionary=True)

    cur.execute("""
        SELECT datastore, rsync_target, retention_days
        FROM Kunden.`bronze.backup.datastore.config`
        WHERE rsync_target IS NOT NULL AND rsync_target != ''
    """)
    configs = cur.fetchall()
    cur.close(); conn.close()

    if not configs:
        print("No datastore configs - cleanup skipped.")
        return prev

    backup_server = wmill.get_variable("f/Backup/backup_server_host")
    ip_match = re.search(r'https?://([0-9.]+)', backup_server)
    ip = ip_match.group(1) if ip_match else backup_server
    bw_pass = wmill.get_variable("f/Backup/backup_server_ssh_password")

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, username="root", password=bw_pass)

    # Build the cleanup script.
    # Folder names (YYYY-MM-DD) are compared against a cutoff date instead of using -mtime.
    script_lines = ["#!/bin/bash", "set -euo pipefail", ""]

    for cfg in configs:
        datastore = cfg["datastore"]
        target = cfg["rsync_target"]
        retention_days = cfg.get("retention_days") or 7

        script_lines.append(
            f"echo '[{datastore}] Cleanup {target} (folders older than {retention_days} days)'"
        )
        script_lines.append(f"if [ -d '{target}' ]; then")
        script_lines.append(f"  cutoff=$(date -d '{retention_days} days ago' +%Y-%m-%d)")
        script_lines.append(f"  find '{target}' -maxdepth 1 -type d -name '????-??-??' | while read -r dir; do")
        script_lines.append(f"    folder_date=$(basename \"$dir\")")
        script_lines.append(f"    if [[ \"$folder_date\" < \"$cutoff\" ]]; then")
        script_lines.append(f"      echo \"  Deleting: $dir (date: $folder_date < cutoff: $cutoff)\"")
        script_lines.append(f"      rm -rf \"$dir\"")
        script_lines.append(f"    else")
        script_lines.append(f"      echo \"  Keeping: $dir\"")
        script_lines.append(f"    fi")
        script_lines.append(f"  done")
        script_lines.append(f"else")
        script_lines.append(f"  echo '  WARNING: directory not found: {target}'")
        script_lines.append(f"fi")
        script_lines.append("")

    script_lines.append("echo 'Cleanup finished.'")
    cleanup_script = "\n".join(script_lines)

    # Upload the script via SFTP instead of a heredoc.
    sftp = ssh.open_sftp()
    sftp.putfo(
        io.BytesIO(cleanup_script.encode()),
        "/tmp/cleanup_restore.sh"
    )
    sftp.close()

    _, chmod_out, _ = ssh.exec_command("chmod +x /tmp/cleanup_restore.sh")

    chmod_out.channel.recv_exit_status()  # wait for chmod instead of sleeping a fixed second

    ssh.exec_command(
        "nohup /tmp/cleanup_restore.sh > /tmp/cleanup_restore.log 2>&1 &"
    )

    ssh.close()
    print(f"Cleanup on {ip} started in the background.")
    print(f"Log: /tmp/cleanup_restore.log")
    return {**prev, "cleanup": "started_background"}
@@ -0,0 +1,17 @@
# py: 3.12
anyio==4.13.0
bcrypt==5.0.0
certifi==2026.2.25
cffi==2.0.0
cryptography==46.0.5
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
invoke==2.2.1
mysql-connector-python==9.6.0
paramiko==4.0.0
pycparser==3.0
pynacl==1.6.2
typing-extensions==4.15.0
wmill==1.664.0
@@ -0,0 +1,158 @@
import wmill, json, paramiko, mysql.connector


def start_restore(server, backup, job_uuid, webhook_url, webhook_tok):
    """Start restore.sh on a server without blocking."""
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(
        server["ip"],
        username=server["ssh_creds"]["user"],
        password=server["ssh_creds"]["password"]
    )
    safe_client = backup["client_name"].replace("/", "_").replace(":", "_")
    srv_hostname = server["hostname"]
    bk_size = backup.get("backup_size_bytes") or 0
    cmd = (
        f"nohup /opt/windmill-restore/restore.sh"
        f" --job-uuid '{job_uuid}'"
        f" --backup-path '{backup['backup_path']}'"
        f" --client '{backup['client_name']}'"
        f" --restore-mount '{server['restore_mount']}'"
        f" --restore-path '{server['restore_path']}'"
        f" --rsync-target '{backup['rsync_target']}'"
        f" --pbs-storage '{backup['pbs_storage_id']}'"
        f" --webhook-url '{webhook_url}'"
        f" --webhook-token '{webhook_tok}'"
        f" --server-hostname '{srv_hostname}'"
        f" --backup-size '{bk_size}'"
        f" > /opt/windmill-restore/logs/{safe_client}.log 2>&1 &"
    )
    ssh.exec_command(cmd)
    ssh.close()


def find_backup_for_server(server, backups):
    """
    Find the first backup in the list that fits the server's size class.
    max_backup_size_gb = NULL -> no upper limit
    min_backup_size_gb = NULL -> no lower limit
    Returns (index, backup), or (None, None) if nothing fits.
    """
    max_gb = server.get("max_backup_size_gb")
    min_gb = server.get("min_backup_size_gb")
    max_bytes = max_gb * 1024 * 1024 * 1024 if max_gb is not None else None
    min_bytes = min_gb * 1024 * 1024 * 1024 if min_gb is not None else None

    for i, backup in enumerate(backups):
        size = backup.get("backup_size_bytes") or 0
        if max_bytes is not None and size > max_bytes:
            continue
        if min_bytes is not None and size < min_bytes:
            continue
        return i, backup
    return None, None


def main(prev: dict):
    if prev.get("mode") == "webhook":
        return prev

    servers = prev.get("target_servers", [])
    backups = list(prev.get("backups", []))
    job_uuid = prev["job_uuid"]

    if not backups:
        return {**prev, "status": "no_backups"}

    db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
    conn = mysql.connector.connect(**db_cfg)
    cur = conn.cursor()

    webhook_url = wmill.get_variable("f/Backup/windmill_webhook_url")
    webhook_tok = wmill.get_variable("f/Backup/windmill_webhook_token")

    servers_sorted = sorted(
        servers,
        key=lambda s: s.get("max_backup_size_gb") or 999999,
        reverse=True
    )

    started = []

    for server in servers_sorted:
        if not backups:
            break

        idx, backup = find_backup_for_server(server, backups)

        if backup is None:
            print(f"No matching backup for '{server['hostname']}' "
                  f"(max: {server.get('max_backup_size_gb')} GB)")
            continue

        backups.pop(idx)

        def fail_backup(msg, bk=backup, sv=server):
            cur.execute("""
                INSERT INTO Kunden.`bronze.restore.result`
                    (job_uuid, client_name, backup_path,
                     backup_size_bytes, restore_server, status, error_message)
                VALUES (%s, %s, %s, %s, %s, 'failed', %s)
            """, (job_uuid, bk["client_name"], bk["backup_path"],
                  bk.get("backup_size_bytes", 0), sv["hostname"], msg))
            cur.execute("""
                UPDATE Kunden.`bronze.restore.jobs`
                SET failed_count=failed_count+1 WHERE job_uuid=%s
            """, (job_uuid,))
            conn.commit()

        if not backup.get("rsync_target"):
            fail_backup("rsync_target missing"); continue
        if not backup.get("pbs_storage_id"):
            fail_backup("pbs_storage_id missing"); continue
        if not server.get("restore_mount"):
            fail_backup("restore_mount missing"); continue
        if not server.get("restore_path"):
            fail_backup("restore_path missing"); continue

        client_like = f"{backup['client_name']}/%"  # trailing "/" keeps vm/100 from also matching vm/1000

        cur.execute("""
            INSERT INTO Kunden.`bronze.restore.result`
                (job_uuid, client_name, backup_path,
                 backup_size_bytes, restore_server, status, started_at)
            VALUES (%s, %s, %s, %s, %s, 'restoring', NOW())
        """, (job_uuid, backup["client_name"], backup["backup_path"],
              backup.get("backup_size_bytes", 0), server["hostname"]))

        cur.execute("""
            UPDATE Kunden.`bronze.restore.server`
            SET current_job_uuid=%s WHERE hostname=%s
        """, (job_uuid, server["hostname"]))

        cur.execute("""
            UPDATE Kunden.`bronze.backup.queue`
            SET status='assigned'
            WHERE job_uuid=%s AND backup_path LIKE %s
        """, (job_uuid, client_like))
        conn.commit()

        start_restore(server, backup, job_uuid, webhook_url, webhook_tok)
        size_gb = (backup.get("backup_size_bytes") or 0) / 1024 / 1024 / 1024
        print(f"Restore started: {backup['client_name']} "
              f"({size_gb:.1f} GB) on {server['hostname']} "
              f"(max: {server.get('max_backup_size_gb')} GB)")
        started.append({
            "client": backup["client_name"],
            "server": server["hostname"],
        })

    cur.close(); conn.close()

    return {
        **prev,
        "status": "restore_started",
        "started": started,
        "backups": backups,
    }
@@ -0,0 +1,121 @@
summary: Backup Restore Orchestrator
description: |
  Runs daily at 00:11. Fetches the backup list directly via
  proxmox-backup-client, writes the queue to the DB sorted by size,
  registers the PBS datastores on all free restore servers, and
  starts restores in parallel on several servers via webhook chaining.
value:
  modules:
    - id: f
      summary: Aktive Datastores aus DB holen
      value:
        type: rawscript
        content: '!inline aktive_datastores_aus_db_holen.my.sql'
        input_transforms:
          database:
            type: static
            value: $res:u/sebastianserfling/fascinating_mysql
        lock: ''
        language: mysql
    - id: a
      summary: Job initialisieren & Backup-Queue aus PBS aufbauen
      value:
        type: rawscript
        content: '!inline job_initialisieren_&_backup-queue_aus_pbs_aufbauen.py'
        input_transforms:
          datastores:
            type: javascript
            expr: results.f
          trigger_type:
            type: javascript
            expr: "flow_input.job_uuid ? 'webhook' : 'schedule'"
          webhook_data:
            type: javascript
            expr: 'flow_input.job_uuid ? flow_input : {}'
        lock: '!inline job_initialisieren_&_backup-queue_aus_pbs_aufbauen.lock'
        language: python3
    - id: b
      summary: Alle freien Restore-Server holen
      value:
        type: rawscript
        content: '!inline alle_freien_restore-server_holen.py'
        input_transforms:
          prev:
            type: javascript
            expr: results.a
        lock: '!inline alle_freien_restore-server_holen.lock'
        language: python3
    - id: g
      summary: SSH-Credentials fuer alle Restore-Server aus Bitwarden
      value:
        type: rawscript
        content: '!inline ssh-credentials_fuer_alle_restore-server_aus_bitwarden.py'
        input_transforms:
          bw_url:
            type: static
            value: https://bitwarden.stines.de
          prev:
            type: javascript
            expr: results.b
        lock: '!inline ssh-credentials_fuer_alle_restore-server_aus_bitwarden.lock'
        language: python3
    - id: c
      summary: Script deployen & PBS-Datastores auf allen Servern registrieren
      value:
        type: rawscript
        content: '!inline
          script_deployen_&_pbs-datastores_auf_allen_servern_registrieren.py'
        input_transforms:
          bw_result:
            type: javascript
            expr: results.g
          datastores:
            type: javascript
            expr: results.f
          prev:
            type: javascript
            expr: results.g
        lock: '!inline
          script_deployen_&_pbs-datastores_auf_allen_servern_registrieren.lock'
        language: python3
    - id: h
      summary: Alte Restore-Ordner auf Backup-Server loeschen
      value:
        type: rawscript
        content: '!inline alte_restore-ordner_auf_backup-server_loeschen.py'
        input_transforms:
          prev:
            type: javascript
            expr: results.c
        lock: '!inline alte_restore-ordner_auf_backup-server_loeschen.lock'
        language: python3
    - id: d
      summary: Ersten Restore pro Server starten
      value:
        type: rawscript
        content: '!inline ersten_restore_pro_server_starten.py'
        input_transforms:
          prev:
            type: javascript
            expr: results.h
        lock: '!inline ersten_restore_pro_server_starten.lock'
        language: python3
      continue_on_error: false
    - id: e
      summary: Webhook verarbeiten & naechsten Restore auf demselben Server starten
      value:
        type: rawscript
        content: '!inline
          webhook_verarbeiten_&_naechsten_restore_auf_demselben_server_starten.py'
        input_transforms:
          from_init:
            type: javascript
            expr: results.a
        lock: '!inline
          webhook_verarbeiten_&_naechsten_restore_auf_demselben_server_starten.lock'
        language: python3
schema:
  $schema: https://json-schema.org/draft/2020-12/schema
  type: object
  order: []
  properties: {}
@@ -0,0 +1,10 @@
# py: 3.12
anyio==4.12.1
certifi==2026.2.25
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
mysql-connector-python==9.6.0
typing-extensions==4.15.0
wmill==1.662.0
@@ -0,0 +1,161 @@
|
||||
import wmill, mysql.connector, json, uuid, subprocess, sys, os
|
||||
from datetime import datetime
|
||||
|
||||
def main(
|
||||
trigger_type: str = "schedule",
|
||||
webhook_data: dict = {},
|
||||
datastores: list = [],
|
||||
):
|
||||
# Webhook erkennen: kind=webhook ODER job_uuid im payload
|
||||
if trigger_type == "webhook" or webhook_data.get("job_uuid"):
|
||||
return {"mode": "webhook", "data": webhook_data}
|
||||
|
||||
pbs = json.loads(wmill.get_variable("f/Backup/pbs_variable"))
|
||||
port = pbs.get("port", 8007)
|
||||
|
||||
env = os.environ.copy()
|
||||
env["PBS_PASSWORD"] = pbs["password"]
|
||||
if pbs.get("fingerprint"):
|
||||
env["PBS_FINGERPRINT"] = pbs["fingerprint"]
|
||||
|
||||
if subprocess.run(["which", "proxmox-backup-client"],
|
||||
capture_output=True).returncode != 0:
|
||||
print("Installiere proxmox-backup-client...", file=sys.stderr)
|
||||
subprocess.run(["bash", "-c", (
|
||||
"echo 'deb http://download.proxmox.com/debian/pbs bookworm pbs-no-subscription'"
|
||||
" > /etc/apt/sources.list.d/pbs-client.list && "
|
||||
"wget -qO /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg "
|
||||
" https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg && "
|
||||
"apt-get update -qq && apt-get install -y proxmox-backup-client"
|
||||
)], check=True)
|
||||
|
||||
db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
|
||||
conn = mysql.connector.connect(**db_cfg)
|
||||
cur = conn.cursor(dictionary=True)
|
||||
|
||||
cur.execute("""
|
||||
SELECT datastore, rsync_target, pbs_storage_id
|
||||
FROM Kunden.`bronze.backup.datastore.config`
|
||||
""")
|
||||
ds_config = {row["datastore"]: row for row in cur.fetchall()}
|
||||
|
||||
all_snaps = []
|
||||
for row in datastores:
|
||||
datastore = row["datastore"]
|
||||
if port != 8007:
|
||||
repo = f"{pbs['user']}@{pbs['host']}!{port}:{datastore}"
|
||||
else:
|
||||
repo = f"{pbs['user']}@{pbs['host']}:{datastore}"
|
||||
|
||||
env["PBS_REPOSITORY"] = repo
|
||||
print(f"Hole Snapshots: {datastore}...", file=sys.stderr)
|
||||
|
||||
result = subprocess.run(
|
||||
["proxmox-backup-client", "snapshots", "--output-format", "json"],
|
||||
capture_output=True, text=True, env=env,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
print(f"WARNUNG: {datastore} fehlgeschlagen:\n{result.stderr}",
|
||||
file=sys.stderr)
|
||||
continue
|
||||
|
||||
snaps = json.loads(result.stdout)
|
||||
snaps = [s for s in snaps if s.get("backup-type") in ("vm", "ct")]
|
||||
for s in snaps:
|
||||
s["_datastore"] = datastore
|
||||
all_snaps.extend(snaps)
|
||||
print(f" -> {len(snaps)} Snapshots.", file=sys.stderr)
|
||||
|
||||
if not all_snaps:
|
||||
raise Exception("Keine Snapshots gefunden!")
|
||||
|
||||
    # Keep only the newest snapshot per datastore/type/id group.
    latest: dict = {}
    for snap in all_snaps:
        key = f"{snap['_datastore']}/{snap['backup-type']}/{snap['backup-id']}"
        if key not in latest or snap["backup-time"] > latest[key]["backup-time"]:
            latest[key] = snap

    # Largest backups first; the list index becomes the queue priority below.
    sorted_snaps = sorted(latest.values(), key=lambda s: s.get("size", 0), reverse=True)
    print(f"{len(sorted_snaps)} Gruppen -> Queue.", file=sys.stderr)

    job_uuid = str(uuid.uuid4())

    # Refuse to start while another job is still marked as running.
    cur.execute("""
        SELECT job_uuid FROM Kunden.`bronze.restore.jobs`
        WHERE status = 'running'
        LIMIT 1
    """)
    existing = cur.fetchone()
    if existing:
        cur.close(); conn.close()
        raise Exception(
            f"Job bereits aktiv: {existing['job_uuid']} – "
            f"kein neuer Job gestartet."
        )

    cur.execute("SET time_zone = 'Europe/Berlin'")

    cur.execute("""
        INSERT INTO Kunden.`bronze.restore.jobs`
            (job_uuid, started_at, status, restore_server)
        VALUES (%s, NOW(), 'running', '')
    """, (job_uuid,))

    # Entries still queued from an earlier run are superseded by this job.
    cur.execute("""
        UPDATE Kunden.`bronze.backup.queue`
        SET status='obsolete' WHERE status='queued'
    """)

    for idx, snap in enumerate(sorted_snaps):
        backup_type = snap["backup-type"]
        backup_id = str(snap["backup-id"])
        datastore = snap["_datastore"]
        size_bytes = snap.get("size", 0)
        ts = snap["backup-time"]
        client_name = f"{datastore}:{backup_type}/{backup_id}"
        # Note: datetime.utcfromtimestamp() is deprecated since Python 3.12.
        backup_time_str = datetime.utcfromtimestamp(ts).strftime("%Y-%m-%dT%H:%M:%SZ")
        backup_path = f"{datastore}:{backup_type}/{backup_id}/{backup_time_str}"
        cfg = ds_config.get(datastore, {})
        rsync_target = cfg.get("rsync_target")
        pbs_storage_id = cfg.get("pbs_storage_id")

        cur.execute("""
            INSERT INTO Kunden.`bronze.backup.queue`
                (job_uuid, client_name, backup_path, backup_size_bytes,
                 encrypt_key_ref, priority, rsync_target,
                 pbs_storage_id, status, last_seen_at)
            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, 'queued', NOW())
        """, (
            job_uuid, client_name, backup_path, size_bytes,
            client_name, idx, rsync_target, pbs_storage_id,
        ))

        cur.execute("""
            INSERT INTO Kunden.`bronze.restore.plan`
                (job_uuid, client_name, vm_name)
            VALUES (%s, %s, %s)
        """, (job_uuid, client_name, f"{backup_type}/{backup_id}"))

    cur.execute("""
        UPDATE Kunden.`bronze.restore.jobs`
        SET total_backups=%s WHERE job_uuid=%s
    """, (len(sorted_snaps), job_uuid))
    conn.commit()

    cur.execute("""
        SELECT client_name, backup_path, backup_size_bytes,
               encrypt_key_ref, priority,
               rsync_target, pbs_storage_id
        FROM Kunden.`bronze.backup.queue`
        WHERE job_uuid = %s AND status = 'queued'
        ORDER BY priority ASC
    """, (job_uuid,))
    queued = cur.fetchall()
    cur.close(); conn.close()

    print(f"Queue: {len(queued)} Backups.", file=sys.stderr)
    return {
        "mode": "schedule",
        "job_uuid": job_uuid,
        "backups": queued,
    }
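The queue-building step above first collapses all snapshots to the newest one per `datastore/backup-type/backup-id` group and then orders the groups by size, largest first. A minimal, self-contained sketch of just that selection follows. The function name `latest_per_group` and the sample records are assumptions made for illustration; only the dictionary keys are taken from the script.

```python
from typing import Any


def latest_per_group(snaps: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the newest snapshot of each group, largest group first."""
    latest: dict[str, dict[str, Any]] = {}
    for snap in snaps:
        key = f"{snap['_datastore']}/{snap['backup-type']}/{snap['backup-id']}"
        # Keep a snapshot only if it is newer than the one stored for its group.
        if key not in latest or snap["backup-time"] > latest[key]["backup-time"]:
            latest[key] = snap
    # A missing "size" is treated as 0, exactly as in the script above.
    return sorted(latest.values(), key=lambda s: s.get("size", 0), reverse=True)


# Made-up sample data, not real PBS output.
sample = [
    {"_datastore": "ds1", "backup-type": "vm", "backup-id": "100", "backup-time": 10, "size": 5},
    {"_datastore": "ds1", "backup-type": "vm", "backup-id": "100", "backup-time": 20, "size": 7},
    {"_datastore": "ds1", "backup-type": "ct", "backup-id": "200", "backup-time": 15, "size": 9},
]
queue = latest_per_group(sample)
print([(s["backup-id"], s["backup-time"]) for s in queue])
```

Because the position in the returned list is later stored as `priority` and read back with `ORDER BY priority ASC`, the largest backup is handed out first.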
+17
@@ -0,0 +1,17 @@
# py: 3.12
anyio==4.12.1
bcrypt==5.0.0
certifi==2026.2.25
cffi==2.0.0
cryptography==46.0.5
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
invoke==2.2.1
mysql-connector-python==9.6.0
paramiko==4.0.0
pycparser==3.0
pynacl==1.6.2
typing-extensions==4.15.0
wmill==1.659.1
+245
@@ -0,0 +1,245 @@
|
||||
import wmill, json, paramiko, io, mysql.connector, re
|
||||
|
||||
GITEA_REPO = "http://172.17.1.251:8080/sebastian.serfling/BackupScript.git"
|
||||
|
||||
|
||||
def deploy_to_server(ssh, server, pbs, pbs_host, pbs_user, pbs_pass,
|
||||
pbs_port, ds_config, datastores, gitea_token,
|
||||
backup_server_host, job_uuid, ssh_creds, db_cfg,
|
||||
script_version):
|
||||
|
||||
hostname = server["hostname"]
|
||||
|
||||
_, out, _ = ssh.exec_command(
|
||||
"cat /opt/windmill-restore/version.txt 2>/dev/null || echo none"
|
||||
)
|
||||
current_version = out.read().decode().strip()
|
||||
needs_deploy = current_version != script_version \
|
||||
or not server.get("script_deployed")
|
||||
|
||||
if needs_deploy:
|
||||
print(f"[{hostname}] Deploye Script v{script_version}...")
|
||||
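        # exec_command() returns immediately, so wait for each preparatory
        # command to finish before git runs in the same directory.
        for prep_cmd in (
            "mkdir -p /opt/windmill-restore/logs",
            "which git || apt-get install -y git 2>/dev/null",
        ):
            _, out, _ = ssh.exec_command(prep_cmd)
            out.channel.recv_exit_status()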
|
||||
repo_url = GITEA_REPO.replace("http://", f"http://{gitea_token}@")
|
||||
|
||||
_, out, err = ssh.exec_command(
|
||||
"cd /opt/windmill-restore && "
|
||||
"if [ -d 'BackupScript/.git' ]; then "
|
||||
" cd BackupScript && git pull; "
|
||||
"else "
|
||||
" rm -rf BackupScript && "
|
||||
" git clone '" + repo_url + "' BackupScript; "
|
||||
"fi"
|
||||
)
|
||||
exit_code = out.channel.recv_exit_status()
|
||||
stderr_out = err.read().decode().strip()
|
||||
if exit_code != 0:
|
||||
raise Exception(f"[{hostname}] Git fehlgeschlagen: {stderr_out}")
|
||||
|
||||
_, out, err = ssh.exec_command(
|
||||
"cp /opt/windmill-restore/BackupScript/restore.sh "
|
||||
" /opt/windmill-restore/restore.sh && "
|
||||
"chmod +x /opt/windmill-restore/restore.sh && "
|
||||
"echo '" + script_version + "' > /opt/windmill-restore/version.txt"
|
||||
)
|
||||
exit_code = out.channel.recv_exit_status()
|
||||
stderr_out = err.read().decode().strip()
|
||||
if exit_code != 0:
|
||||
raise Exception(f"[{hostname}] Script kopieren fehlgeschlagen: {stderr_out}")
|
||||
|
||||
print(f"[{hostname}] Script v{script_version} deployed.")
|
||||
|
||||
pbs_conf = "\n".join([
|
||||
f"PBS_HOST={pbs_host}",
|
||||
f"PBS_PORT={pbs_port}",
|
||||
f"PBS_USER={pbs_user}",
|
||||
f"PBS_PASSWORD={pbs_pass}",
|
||||
]) + "\n"
|
||||
sftp = ssh.open_sftp()
|
||||
sftp.putfo(io.BytesIO(pbs_conf.encode()),
|
||||
"/opt/windmill-restore/pbs.conf")
|
||||
sftp.putfo(io.BytesIO(backup_server_host.encode()),
|
||||
"/opt/windmill-restore/backup_server_host")
|
||||
sftp.close()
|
||||
ssh.exec_command("chmod 600 /opt/windmill-restore/pbs.conf")
|
||||
print(f"[{hostname}] backup_server_host: {backup_server_host}")
|
||||
else:
|
||||
print(f"[{hostname}] Script aktuell (v{current_version}).")
|
||||
|
||||
ssh.exec_command(
|
||||
"mkdir -p /opt/windmill-restore/keys && "
|
||||
"chmod 700 /opt/windmill-restore/keys"
|
||||
)
|
||||
|
||||
_, out, _ = ssh.exec_command(
|
||||
"pvesm status 2>/dev/null | awk 'NR>1{print $1}'"
|
||||
)
|
||||
existing_storages = out.read().decode().splitlines()
|
||||
pbs_storages = []
|
||||
|
||||
for row in datastores:
|
||||
datastore = row["datastore"]
|
||||
storage_id = "pbs-" + datastore.lower() \
|
||||
.replace(" ", "-") \
|
||||
.replace("_", "-")
|
||||
ds_cfg = ds_config.get(datastore, {})
|
||||
fingerprint = ds_cfg.get("fingerprint", "") or ""
|
||||
keyfile_path = f"/opt/windmill-restore/keys/{datastore}.keyfile"
|
||||
|
||||
rsync_src = f"root@{pbs_host}:/root/Scripte/{datastore}.keyfile"
|
||||
_, out, err = ssh.exec_command(
|
||||
"if [ -s '" + keyfile_path + "' ]; then "
|
||||
" echo 'vorhanden'; "
|
||||
"else "
|
||||
" rsync -az -e 'ssh -o StrictHostKeyChecking=no' "
|
||||
" '" + rsync_src + "' '" + keyfile_path + "' "
|
||||
" && chmod 600 '" + keyfile_path + "' "
|
||||
" && echo 'geholt'; "
|
||||
"fi"
|
||||
)
|
||||
exit_code = out.channel.recv_exit_status()
|
||||
stderr_out = err.read().decode().strip()
|
||||
if exit_code != 0:
|
||||
raise Exception(
|
||||
f"[{hostname}] Keyfile fehlgeschlagen fuer '{datastore}': {stderr_out}"
|
||||
)
|
||||
|
||||
if storage_id in existing_storages:
|
||||
print(f"[{hostname}] Storage '{storage_id}' vorhanden.")
|
||||
else:
|
||||
fp_part = f"--fingerprint '{fingerprint}'" if fingerprint else ""
|
||||
cmd = (
|
||||
f"pvesm add pbs {storage_id} "
|
||||
f"--server '{pbs_host}' "
|
||||
f"--datastore '{datastore}' "
|
||||
f"--username '{pbs_user}' "
|
||||
f"--password '{pbs_pass}' "
|
||||
f"--port {pbs_port} "
|
||||
f"--encryption-key '{keyfile_path}' "
|
||||
                f"--content backup {fp_part}"
|
||||
)
|
||||
_, out, err = ssh.exec_command(cmd)
|
||||
exit_code = out.channel.recv_exit_status()
|
||||
stderr = err.read().decode().strip()
|
||||
if exit_code != 0:
|
||||
raise Exception(
|
||||
f"[{hostname}] pvesm add '{datastore}' fehlgeschlagen: {stderr}"
|
||||
)
|
||||
print(f"[{hostname}] -> '{storage_id}' registriert")
|
||||
|
||||
pbs_storages.append({"datastore": datastore, "storage_id": storage_id})
|
||||
|
||||
conn = mysql.connector.connect(**db_cfg)
|
||||
cur = conn.cursor()
|
||||
|
||||
if needs_deploy:
|
||||
cur.execute("""
|
||||
UPDATE Kunden.`bronze.restore.server`
|
||||
SET script_deployed=1, script_version=%s WHERE hostname=%s
|
||||
""", (script_version, hostname))
|
||||
|
||||
for entry in pbs_storages:
|
||||
cur.execute("""
|
||||
UPDATE Kunden.`bronze.backup.datastore.config`
|
||||
SET pbs_storage_id=%s WHERE datastore=%s
|
||||
""", (entry["storage_id"], entry["datastore"]))
|
||||
|
||||
cur.execute("""
|
||||
INSERT INTO Kunden.`bronze.restore.session`
|
||||
(job_uuid, hostname, ip, ssh_user, ssh_password)
|
||||
VALUES (%s, %s, %s, %s, %s)
|
||||
ON DUPLICATE KEY UPDATE
|
||||
ip=VALUES(ip), ssh_user=VALUES(ssh_user),
|
||||
ssh_password=VALUES(ssh_password)
|
||||
""", (
|
||||
job_uuid, hostname,
|
||||
ssh_creds["ip"], ssh_creds["user"], ssh_creds["password"],
|
||||
))
|
||||
|
||||
conn.commit(); cur.close(); conn.close()
|
||||
print(f"[{hostname}] Session-Creds gespeichert.")
|
||||
|
||||
return pbs_storages
|
||||
|
||||
|
||||
def main(prev: dict, bw_result: dict = {}, datastores: list = []):
|
||||
if prev.get("mode") == "webhook":
|
||||
return prev
|
||||
|
||||
servers = prev["target_servers"]
|
||||
server_creds = prev.get("server_creds", {})
|
||||
job_uuid = prev["job_uuid"]
|
||||
|
||||
script_version = wmill.get_variable("f/Backup/restore_version").strip()
|
||||
print(f"Script-Version aus Variable: {script_version}")
|
||||
|
||||
pbs = json.loads(wmill.get_variable("f/Backup/pbs_variable"))
|
||||
pbs_host = pbs["host"]
|
||||
pbs_user = pbs["user"]
|
||||
pbs_pass = pbs["password"]
|
||||
pbs_port = str(pbs.get("port", 8007))
|
||||
|
||||
db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
|
||||
conn = mysql.connector.connect(**db_cfg)
|
||||
cur = conn.cursor(dictionary=True)
|
||||
cur.execute("""
|
||||
SELECT datastore, rsync_target, pbs_storage_id, fingerprint
|
||||
FROM Kunden.`bronze.backup.datastore.config`
|
||||
""")
|
||||
ds_config = {row["datastore"]: row for row in cur.fetchall()}
|
||||
cur.close(); conn.close()
|
||||
|
||||
gitea_token = wmill.get_variable("f/Backup/gitea_token")
|
||||
backup_server_host = wmill.get_variable("f/Backup/backup_server_host")
|
||||
|
||||
target_servers_out = []
|
||||
|
||||
for server in servers:
|
||||
hostname = server["hostname"]
|
||||
creds = server_creds.get(hostname, {})
|
||||
|
||||
url = creds.get("url", "")
|
||||
ip_match = re.search(r'https?://([0-9.]+)', url)
|
||||
ip = ip_match.group(1) if ip_match else server.get("ip", "")
|
||||
|
||||
if not ip:
|
||||
print(f"WARNUNG: Keine IP fuer '{hostname}' – uebersprungen.")
|
||||
continue
|
||||
|
||||
ssh_creds_dict = {
|
||||
"ip": ip,
|
||||
"user": creds.get("username", "root"),
|
||||
"password": creds.get("password", ""),
|
||||
}
|
||||
|
||||
ssh = paramiko.SSHClient()
|
||||
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
|
||||
ssh.connect(ip, username=ssh_creds_dict["user"],
|
||||
password=ssh_creds_dict["password"])
|
||||
|
||||
try:
|
||||
pbs_storages = deploy_to_server(
|
||||
ssh, server, pbs, pbs_host, pbs_user, pbs_pass,
|
||||
pbs_port, ds_config, datastores, gitea_token,
|
||||
backup_server_host, job_uuid, ssh_creds_dict, db_cfg,
|
||||
script_version
|
||||
)
|
||||
finally:
|
||||
ssh.close()
|
||||
|
||||
target_servers_out.append({
|
||||
**server,
|
||||
"ip": ip,
|
||||
"ssh_creds": ssh_creds_dict,
|
||||
"pbs_storages": pbs_storages,
|
||||
})
|
||||
|
||||
if not target_servers_out:
|
||||
raise Exception("Kein Server konnte vorbereitet werden!")
|
||||
|
||||
return {
|
||||
**prev,
|
||||
"target_servers": target_servers_out,
|
||||
"script_version": script_version,
|
||||
}
|
||||
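The deploy step above assembles its remote shell commands by wrapping each value in single quotes. A password or datastore name that itself contains a single quote would break such a command. One possible way around this, sketched under assumptions, is to quote every argument with the standard library's `shlex.quote`. The helper name `build_pvesm_add` and the sample values are hypothetical; the `pvesm` options are the ones already used in the script.

```python
import shlex


def build_pvesm_add(storage_id, server, datastore, username, password,
                    port, keyfile, fingerprint=""):
    """Build a `pvesm add pbs` command line with every value shell-quoted."""
    args = [
        "pvesm", "add", "pbs", storage_id,
        "--server", server,
        "--datastore", datastore,
        "--username", username,
        "--password", password,
        "--port", str(port),
        "--encryption-key", keyfile,
        "--content", "backup",
    ]
    # The fingerprint is optional, as in the script above.
    if fingerprint:
        args += ["--fingerprint", fingerprint]
    # shlex.quote() escapes embedded quotes, spaces and shell metacharacters.
    return " ".join(shlex.quote(a) for a in args)


# Made-up values; note the single quote inside the password.
cmd = build_pvesm_add("pbs-kunde-a", "10.0.0.1", "Kunde_A", "restore@pbs",
                      "p'w", 8007, "/opt/windmill-restore/keys/Kunde_A.keyfile")
print(cmd)
```

The resulting string can be passed to `ssh.exec_command` unchanged, since the remote shell splits it back into the original arguments.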
+9
@@ -0,0 +1,9 @@
# py: 3.12
anyio==4.12.1
certifi==2026.2.25
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
typing-extensions==4.15.0
wmill==1.657.2
+82
@@ -0,0 +1,82 @@
import subprocess, sys, json, os, time, wmill


def bw_lookup(search_term, env, run):
    # Refresh the local vault cache; a failed sync is tolerated.
    run(["bw", "sync", "--session", env["BW_SESSION"]], check=False)
    items = []
    for attempt in range(3):
        result = run(
            ["bw", "list", "items", "--search", search_term,
             "--session", env["BW_SESSION"]]
        )
        items = json.loads(result.stdout)
        if items:
            break
        time.sleep(3)
    if not items:
        raise Exception(f"Kein Bitwarden-Eintrag fuer: '{search_term}'")

    # Prefer an exact, case-insensitive name match; otherwise take the first hit.
    exact = next(
        (i for i in items
         if i.get("name", "").strip().lower() == search_term.strip().lower()),
        None,
    )
    item = exact if exact else items[0]
    login = item.get("login") or {}
    url = (login.get("uris") or [{}])[0].get("uri", "")
    return {
        "username": login.get("username", ""),
        "password": login.get("password", ""),
        "url": url,
    }


def main(
    prev: dict,
    bw_url: str = "https://bitwarden.stines.de",
):
    if prev.get("mode") == "webhook":
        return prev

    servers = prev.get("target_servers", [])

    bw_creds = json.loads(wmill.get_variable("f/Backup/bitwarden_api_login"))
    env = os.environ.copy()
    env["BW_CLIENTID"] = bw_creds["bw_clientid"]
    env["BW_CLIENTSECRET"] = bw_creds["bw_clientsecret"]
    env["BW_PASSWORD"] = bw_creds["bw_masterpassword"]

    def run(cmd, check=True):
        return subprocess.run(cmd, env=env, text=True, capture_output=True, check=check)

    # Install the Bitwarden CLI if the worker does not have it yet.
    if subprocess.run(["which", "bw"], capture_output=True).returncode != 0:
        run(["wget",
             "https://github.com/bitwarden/cli/releases/download/v1.22.1/bw-linux-1.22.1.zip",
             "-O", "bw.zip"])
        run(["unzip", "bw.zip"])
        run(["chmod", "+x", "bw"])
        run(["mv", "bw", "/usr/local/bin/bw"])

    # Add the hosts entry only once instead of appending it on every run.
    hosts_entry = "172.17.1.3 bitwarden.stines.de"
    with open("/etc/hosts", "r+") as f:
        if hosts_entry not in f.read():
            f.write(hosts_entry + "\n")

    run(["bw", "config", "server", bw_url])
    run(["bw", "logout"], check=False)

    result = run(["bw", "login", "--apikey"], check=False)
    if result.returncode != 0:
        raise Exception(f"Bitwarden Login fehlgeschlagen: {result.stderr}")

    unlock = run(["bw", "unlock", bw_creds["bw_masterpassword"], "--raw"])
    bw_session = unlock.stdout.strip()
    if not bw_session:
        raise Exception("Vault konnte nicht entsperrt werden")
    env["BW_SESSION"] = bw_session

    server_creds = {}
    for server in servers:
        hostname = server["hostname"]
        print(f"Hole Creds fuer: {hostname}")
        creds = bw_lookup(hostname, env, run)
        server_creds[hostname] = creds
        print(f"  -> OK: {creds['username']}@{hostname}")

    run(["bw", "logout"], check=False)

    return {**prev, "server_creds": server_creds}
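The credential lookup above searches Bitwarden by hostname, prefers an item whose name matches exactly (ignoring case and surrounding whitespace) and otherwise falls back to the first search hit. The selection and field extraction can be sketched without the CLI. The helper names `pick_item` and `extract_login` and the sample items are assumptions for illustration; the item layout (`name`, `login.username`, `login.password`, `login.uris[].uri`) is the one the script reads.

```python
def pick_item(items, search_term):
    """Prefer an exact name match, otherwise fall back to the first hit."""
    wanted = search_term.strip().lower()
    exact = next(
        (i for i in items if i.get("name", "").strip().lower() == wanted),
        None,
    )
    return exact if exact else items[0]


def extract_login(item):
    """Read username, password and first URI, tolerating missing fields."""
    login = item.get("login") or {}
    uris = login.get("uris") or [{}]
    return {
        "username": login.get("username", ""),
        "password": login.get("password", ""),
        "url": uris[0].get("uri", ""),
    }


# Made-up search result: a fuzzy hit listed before the exact one.
items = [
    {"name": "pve-restore-01-old", "login": {"username": "old", "password": "x"}},
    {"name": "pve-restore-01",
     "login": {"username": "root", "password": "y",
               "uris": [{"uri": "https://10.0.0.5:8006"}]}},
]
creds = extract_login(pick_item(items, " PVE-Restore-01 "))
print(creds["username"], creds["url"])
```

The fallback to `items[0]` means a partially matching entry can be picked when no exact name exists, so naming vault items exactly like the server hostname keeps the lookup predictable.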
+17
@@ -0,0 +1,17 @@
# py: 3.12
anyio==4.13.0
bcrypt==5.0.0
certifi==2026.2.25
cffi==2.0.0
cryptography==46.0.7
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
invoke==3.0.3
mysql-connector-python==9.6.0
paramiko==4.0.0
pycparser==3.0
pynacl==1.6.2
typing-extensions==4.15.0
wmill==1.680.0
+404
@@ -0,0 +1,404 @@
import wmill, json, mysql.connector, paramiko, re, base64
from datetime import datetime
import httpx


def send_nextcloud_message(message: str):
    """Send a message to Nextcloud Talk."""
    try:
        nc_url = wmill.get_variable("f/Backup/nextcloud_talk_url").rstrip("/")
        nc_room = wmill.get_variable("f/Backup/nextcloud_talk_room")
        nc_user = wmill.get_variable("f/Backup/nextcloud_talk_user")
        nc_password = wmill.get_variable("f/Backup/nextcloud_talk_password")

        credentials = base64.b64encode(
            f"{nc_user}:{nc_password}".encode()
        ).decode()

        url = f"{nc_url}/ocs/v2.php/apps/spreed/api/v1/chat/{nc_room}"
        headers = {
            "Authorization": f"Basic {credentials}",
            "OCS-APIREQUEST": "true",
            "Content-Type": "application/json",
            "Accept": "application/json",
        }

        # Note: verify=False disables TLS certificate verification.
        response = httpx.post(
            url,
            headers=headers,
            json={"message": message},
            timeout=15,
            verify=False,
        )
        print(f"Nextcloud Talk: HTTP {response.status_code}")
    except Exception as e:
        # A failed notification must not abort the restore flow.
        print(f"Nextcloud Talk Fehler (nicht kritisch): {e}")


def find_next_backup_for_server(cur, job_uuid, server_hostname, max_backup_size_gb):
    # The upper size bound is optional; everything else in the query is fixed.
    params = [server_hostname, job_uuid, job_uuid]
    max_clause = ""
    if max_backup_size_gb is not None:
        max_clause = "AND (q.backup_size_bytes IS NULL OR q.backup_size_bytes <= %s)"
        params.append(max_backup_size_gb * 1024 * 1024 * 1024)

    cur.execute(f"""
        SELECT q.client_name, q.backup_path, q.backup_size_bytes,
               q.rsync_target, q.pbs_storage_id,
               r.hostname AS server_hostname,
               r.ip AS server_ip,
               r.restore_mount,
               r.restore_path AS restore_path,
               r.free_space_gb,
               r.max_backup_size_gb,
               r.min_backup_size_gb
        FROM Kunden.`bronze.backup.queue` q
        JOIN Kunden.`bronze.restore.server` r
          ON r.hostname = %s
        WHERE q.job_uuid = %s AND q.status = 'queued'
          AND r.current_job_uuid = %s
          {max_clause}
          AND (r.min_backup_size_gb IS NULL OR q.backup_size_bytes >= r.min_backup_size_gb * 1024 * 1024 * 1024)
        ORDER BY q.priority ASC
        LIMIT 1
    """, tuple(params))
    return cur.fetchone()


def release_server_and_check_done(cur, conn, job_uuid, server_hostname):
    """Release the server and check whether all jobs are finished."""
    cur.execute("""
        UPDATE Kunden.`bronze.restore.server`
        SET current_job_uuid=NULL
        WHERE hostname=%s
    """, (server_hostname,))
    conn.commit()
    print(f"Server '{server_hostname}' fertig.")

    cur.execute("""
        SELECT COUNT(*) AS cnt FROM Kunden.`bronze.restore.server`
        WHERE current_job_uuid = %s
    """, (job_uuid,))
    still_active = cur.fetchone()["cnt"]

    if still_active == 0:
        cur.execute("""
            UPDATE Kunden.`bronze.restore.jobs`
            SET status='completed', finished_at=NOW()
            WHERE job_uuid=%s
        """, (job_uuid,))
        cur.execute("""
            DELETE FROM Kunden.`bronze.restore.session`
            WHERE job_uuid=%s
        """, (job_uuid,))
        # Statistics for the completion message
        cur.execute("""
            SELECT total_backups, restored_count, failed_count,
                   TIMESTAMPDIFF(MINUTE, started_at, NOW()) AS duration_min
            FROM Kunden.`bronze.restore.jobs`
            WHERE job_uuid=%s
        """, (job_uuid,))
        job_stats = cur.fetchone()
        conn.commit()
        print(f"Job {job_uuid} vollstaendig abgeschlossen.")
        if job_stats:
            total = job_stats["total_backups"] or 0
            restored = job_stats["restored_count"] or 0
            failed = job_stats["failed_count"] or 0
            dur_min = job_stats["duration_min"] or 0
            dur_str = f"{dur_min//60}h {dur_min%60}m" if dur_min >= 60 else f"{dur_min}m"
            if failed == 0:
                msg = (
                    f"✅ **Backup-Job abgeschlossen**\n"
                    f"Alle {total} Backups erfolgreich | Dauer: {dur_str}"
                )
            else:
                msg = (
                    f"⚠️ **Backup-Job abgeschlossen mit Fehlern**\n"
                    f"✅ {restored}/{total} erfolgreich | ❌ {failed} fehlgeschlagen | Dauer: {dur_str}"
                )
            send_nextcloud_message(msg)
        return "all_done"
    else:
        conn.commit()
        print(f"Noch {still_active} Server aktiv.")
        return "server_done"


def main(from_init: dict):
|
||||
|
||||
if from_init.get("mode") == "schedule":
|
||||
print("Restores gestartet - Flow wartet auf Webhooks.")
|
||||
return {"status": "waiting_for_webhook",
|
||||
"job_uuid": from_init.get("job_uuid")}
|
||||
|
||||
data = from_init.get("data", {})
|
||||
job_uuid = data.get("job_uuid")
|
||||
client = data.get("client_name")
|
||||
status = data.get("status", "unknown")
|
||||
server_hostname = data.get("server_hostname", "")
|
||||
|
||||
if not job_uuid or not client:
|
||||
raise Exception(f"Ungueltiger Webhook-Payload: {data}")
|
||||
|
||||
print(f"Webhook: {client} -> {status} (Server: {server_hostname})")
|
||||
|
||||
client_like = f"{client}%"
|
||||
|
||||
db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
|
||||
conn = mysql.connector.connect(**db_cfg)
|
||||
cur = conn.cursor(dictionary=True)
|
||||
|
||||
cur.execute("""
|
||||
UPDATE Kunden.`bronze.restore.result` SET
|
||||
vm_name = %s,
|
||||
vm_id_original = %s,
|
||||
vm_id_restored = %s,
|
||||
restore_duration_sec = %s,
|
||||
actual_disk_used_bytes = %s,
|
||||
zip_size_bytes = %s,
|
||||
zip_duration_sec = %s,
|
||||
rsync_size_bytes = %s,
|
||||
rsync_ok = %s,
|
||||
rsync_retries = %s,
|
||||
qm_agent_ok = %s,
|
||||
status = %s,
|
||||
error_message = %s,
|
||||
webhook_received_at = %s
|
||||
WHERE job_uuid = %s AND backup_path LIKE %s
|
||||
""", (
|
||||
data.get("vm_name", ""),
|
||||
data.get("vm_id_original"),
|
||||
data.get("vm_id_restored"),
|
||||
data.get("restore_duration_sec"),
|
||||
data.get("actual_disk_used_bytes"),
|
||||
data.get("zip_size_bytes"),
|
||||
data.get("zip_duration_sec"),
|
||||
data.get("rsync_size_bytes"),
|
||||
1 if data.get("rsync_ok") else 0,
|
||||
data.get("rsync_retries", 0),
|
||||
1 if data.get("qm_agent_ok") in (True, "true", "skipped") else 0,
|
||||
"done" if status == "success" else "failed",
|
||||
data.get("error_message", ""),
|
||||
datetime.now(),
|
||||
job_uuid, client_like,
|
||||
))
|
||||
|
||||
cur.execute("""
|
||||
UPDATE Kunden.`bronze.backup.queue` SET status=%s
|
||||
WHERE job_uuid=%s AND backup_path LIKE %s
|
||||
""", ("done" if status == "success" else "failed", job_uuid, client_like))
|
||||
|
||||
field = "restored_count" if status == "success" else "failed_count"
|
||||
cur.execute(
|
||||
f"UPDATE Kunden.`bronze.restore.jobs` "
|
||||
f"SET {field}={field}+1 WHERE job_uuid=%s",
|
||||
(job_uuid,)
|
||||
)
|
||||
|
||||
if data.get("free_space_gb") is not None and server_hostname:
|
||||
cur.execute("""
|
||||
UPDATE Kunden.`bronze.restore.server`
|
||||
SET free_space_gb = %s WHERE hostname = %s
|
||||
""", (data.get("free_space_gb"), server_hostname))
|
||||
|
||||
conn.commit()
|
||||
|
||||
vm_name = data.get("vm_name") or client
|
||||
dur_sec = data.get("restore_duration_sec") or 0
|
||||
dur_str = f"{dur_sec//60}m {dur_sec%60}s" if dur_sec >= 60 else f"{dur_sec}s"
|
||||
zip_mb = (data.get("zip_size_bytes") or 0) // 1024 // 1024
|
||||
icon = "✅" if status == "success" else "❌"
|
||||
|
||||
if status != "success":
|
||||
        err = (data.get("error_message") or "")[:100]
|
||||
nc_msg = (
|
||||
f"{icon} **{vm_name}** ({client})\n"
|
||||
f"Server: {server_hostname} | Fehler: {err}"
|
||||
)
|
||||
send_nextcloud_message(nc_msg)
|
||||
|
||||
cur.execute("""
|
||||
SELECT max_backup_size_gb, min_backup_size_gb, free_space_gb
|
||||
FROM Kunden.`bronze.restore.server`
|
||||
WHERE hostname = %s
|
||||
""", (server_hostname,))
|
||||
srv_cfg = cur.fetchone()
|
||||
max_backup_size_gb = srv_cfg["max_backup_size_gb"] if srv_cfg else None
|
||||
min_backup_size_gb = srv_cfg["min_backup_size_gb"] if srv_cfg else None
|
||||
max_bytes = max_backup_size_gb * 1024 * 1024 * 1024 if max_backup_size_gb is not None else None
|
||||
min_bytes = min_backup_size_gb * 1024 * 1024 * 1024 if min_backup_size_gb is not None else None
|
||||
|
||||
nxt = find_next_backup_for_server(cur, job_uuid, server_hostname, max_backup_size_gb)
|
||||
conn.commit()
|
||||
|
||||
if not nxt:
|
||||
if max_backup_size_gb is not None and min_backup_size_gb is not None:
|
||||
cur.execute("""
|
||||
SELECT client_name, backup_path, backup_size_bytes,
|
||||
rsync_target, pbs_storage_id
|
||||
FROM Kunden.`bronze.backup.queue`
|
||||
WHERE job_uuid = %s AND status = 'queued'
|
||||
AND (backup_size_bytes IS NULL OR backup_size_bytes <= %s)
|
||||
AND backup_size_bytes >= %s
|
||||
ORDER BY priority ASC
|
||||
LIMIT 1
|
||||
""", (job_uuid, max_bytes, min_bytes))
|
||||
elif max_backup_size_gb is not None:
|
||||
cur.execute("""
|
||||
SELECT client_name, backup_path, backup_size_bytes,
|
||||
rsync_target, pbs_storage_id
|
||||
FROM Kunden.`bronze.backup.queue`
|
||||
WHERE job_uuid = %s AND status = 'queued'
|
||||
AND (backup_size_bytes IS NULL OR backup_size_bytes <= %s)
|
||||
ORDER BY priority ASC
|
||||
LIMIT 1
|
||||
""", (job_uuid, max_bytes))
|
||||
elif min_backup_size_gb is not None:
|
||||
cur.execute("""
|
||||
SELECT client_name, backup_path, backup_size_bytes,
|
||||
rsync_target, pbs_storage_id
|
||||
FROM Kunden.`bronze.backup.queue`
|
||||
WHERE job_uuid = %s AND status = 'queued'
|
||||
AND backup_size_bytes >= %s
|
||||
ORDER BY priority ASC
|
||||
LIMIT 1
|
||||
""", (job_uuid, min_bytes))
|
||||
else:
|
||||
cur.execute("""
|
||||
SELECT client_name, backup_path, backup_size_bytes,
|
||||
rsync_target, pbs_storage_id
|
||||
FROM Kunden.`bronze.backup.queue`
|
||||
WHERE job_uuid = %s AND status = 'queued'
|
||||
ORDER BY priority ASC
|
||||
LIMIT 1
|
||||
""", (job_uuid,))
|
||||
next_queued = cur.fetchone()
|
||||
|
||||
if next_queued:
|
||||
print(f"Server '{server_hostname}' nimmt naechstes passendes Backup: "
|
||||
f"{next_queued['client_name']}")
|
||||
|
||||
cur.execute("""
|
||||
SELECT hostname, ip, restore_mount, restore_path,
|
||||
free_space_gb, max_backup_size_gb, min_backup_size_gb
|
||||
FROM Kunden.`bronze.restore.server`
|
||||
WHERE hostname = %s
|
||||
""", (server_hostname,))
|
||||
srv = cur.fetchone()
|
||||
|
||||
nxt = {
|
||||
**next_queued,
|
||||
"server_hostname": server_hostname,
|
||||
"server_ip": srv["ip"] if srv else "",
|
||||
"restore_mount": srv["restore_mount"] if srv else "",
|
||||
"restore_path": srv["restore_path"] if srv else "",
|
||||
"free_space_gb": srv["free_space_gb"] if srv else 0,
|
||||
"max_backup_size_gb": srv["max_backup_size_gb"] if srv else None,
|
||||
"min_backup_size_gb": srv["min_backup_size_gb"] if srv else None,
|
||||
}
|
||||
else:
|
||||
result = release_server_and_check_done(cur, conn, job_uuid, server_hostname)
|
||||
cur.close(); conn.close()
|
||||
if result == "all_done":
|
||||
return {"status": "all_done", "job_uuid": job_uuid}
|
||||
else:
|
||||
return {"status": "server_done",
|
||||
"server": server_hostname,
|
||||
"job_uuid": job_uuid}
|
||||
|
||||
cur.close(); conn.close()
|
||||
|
||||
    # FIX: added started_at
|
||||
conn2 = mysql.connector.connect(**db_cfg)
|
||||
cur2 = conn2.cursor()
|
||||
cur2.execute("""
|
||||
INSERT INTO Kunden.`bronze.restore.result`
|
||||
(job_uuid, client_name, backup_path,
|
||||
backup_size_bytes, restore_server, status, started_at)
|
||||
VALUES (%s, %s, %s, %s, %s, 'restoring', NOW())
|
||||
""", (
|
||||
job_uuid, nxt["client_name"], nxt["backup_path"],
|
||||
nxt.get("backup_size_bytes", 0), nxt["server_hostname"],
|
||||
))
|
||||
cur2.execute("""
|
||||
UPDATE Kunden.`bronze.backup.queue` SET status='assigned'
|
||||
WHERE job_uuid=%s AND backup_path LIKE %s
|
||||
""", (job_uuid, f"{nxt['client_name']}%"))
|
||||
conn2.commit(); cur2.close(); conn2.close()
|
||||
|
||||
conn3 = mysql.connector.connect(**db_cfg)
|
||||
cur3 = conn3.cursor(dictionary=True)
|
||||
cur3.execute("""
|
||||
SELECT ip, ssh_user, ssh_password
|
||||
FROM Kunden.`bronze.restore.session`
|
||||
WHERE job_uuid = %s AND hostname = %s
|
||||
LIMIT 1
|
||||
""", (job_uuid, server_hostname))
|
||||
session = cur3.fetchone()
|
||||
cur3.close(); conn3.close()
|
||||
|
||||
if not session:
|
||||
raise Exception(
|
||||
f"Keine Session-Creds fuer '{server_hostname}'. "
|
||||
f"Job-UUID: {job_uuid}"
|
||||
)
|
||||
|
||||
webhook_url = wmill.get_variable("f/Backup/windmill_webhook_url")
|
||||
webhook_tok = wmill.get_variable("f/Backup/windmill_webhook_token")
|
||||
|
||||
ssh = paramiko.SSHClient()
|
||||
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
|
||||
ssh.connect(
|
||||
session["ip"],
|
||||
username=session["ssh_user"],
|
||||
password=session["ssh_password"]
|
||||
)
|
||||
|
||||
safe_client = nxt["client_name"].replace("/", "_").replace(":", "_")
|
||||
nxt_size = nxt.get("backup_size_bytes") or 0
|
||||
cmd = (
|
||||
f"nohup /opt/windmill-restore/restore.sh"
|
||||
f" --job-uuid '{job_uuid}'"
|
||||
f" --backup-path '{nxt['backup_path']}'"
|
||||
f" --client '{nxt['client_name']}'"
|
||||
f" --restore-mount '{nxt['restore_mount']}'"
|
||||
f" --restore-path '{nxt['restore_path']}'"
|
||||
f" --rsync-target '{nxt['rsync_target']}'"
|
||||
f" --pbs-storage '{nxt['pbs_storage_id']}'"
|
||||
f" --webhook-url '{webhook_url}'"
|
||||
f" --webhook-token '{webhook_tok}'"
|
||||
f" --server-hostname '{server_hostname}'"
|
||||
f" --backup-size '{nxt_size}'"
|
||||
f" > /opt/windmill-restore/logs/{safe_client}.log 2>&1 &"
|
||||
)
|
||||
ssh.exec_command(cmd)
|
||||
ssh.close()
|
||||
|
||||
size_gb = nxt_size / 1024 / 1024 / 1024
|
||||
print(f"Naechster Restore: {nxt['client_name']} ({size_gb:.1f} GB) auf {server_hostname}")
|
||||
return {
|
||||
"status": "next_restore_started",
|
||||
"client": nxt["client_name"],
|
||||
"server": server_hostname,
|
||||
"job_uuid": job_uuid,
|
||||
}
|
||||
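When the server-specific lookup finds nothing, the webhook step above falls back to one of four nearly identical queue queries that differ only in whether an upper and a lower size bound apply. A minimal sketch of assembling that query from optional bounds is shown below. The helper name `build_next_backup_query` is hypothetical; table and column names are copied from the queries above, and the placeholders stay `%s` so the result can be passed to `cur.execute(sql, params)`.

```python
def build_next_backup_query(job_uuid, max_bytes=None, min_bytes=None):
    """Return (sql, params) for the next queued backup within optional size bounds."""
    sql = [
        "SELECT client_name, backup_path, backup_size_bytes,",
        "       rsync_target, pbs_storage_id",
        "FROM Kunden.`bronze.backup.queue`",
        "WHERE job_uuid = %s AND status = 'queued'",
    ]
    params = [job_uuid]
    if max_bytes is not None:
        # Backups of unknown size are allowed through, as in the original query.
        sql.append("  AND (backup_size_bytes IS NULL OR backup_size_bytes <= %s)")
        params.append(max_bytes)
    if min_bytes is not None:
        sql.append("  AND backup_size_bytes >= %s")
        params.append(min_bytes)
    sql += ["ORDER BY priority ASC", "LIMIT 1"]
    return "\n".join(sql), tuple(params)


GIB = 1024 * 1024 * 1024
sql, params = build_next_backup_query("example-job", max_bytes=500 * GIB, min_bytes=10 * GIB)
print(sql)
print(params)
```

Only the structure of the statement is built from code; every value still travels as a bound parameter, so the sketch keeps the parameterised style of the original queries.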
@@ -0,0 +1,31 @@
summary: Backup Restore Report - Nextcloud Talk
description: |
  Runs daily at 08:00. Fetches the result of the last
  backup restore job from the DB and sends a summary
  via Nextcloud Talk (webhook/bot).
value:
  modules:
    - id: a
      summary: Letzten Job aus DB holen & Report zusammenbauen
      value:
        type: rawscript
        content: '!inline letzten_job_aus_db_holen_&_report_zusammenbauen.py'
        input_transforms: {}
        lock: '!inline letzten_job_aus_db_holen_&_report_zusammenbauen.lock'
        language: python3
    - id: b
      summary: Nachricht an Nextcloud Talk senden
      value:
        type: rawscript
        content: '!inline nachricht_an_nextcloud_talk_senden.py'
        input_transforms:
          report:
            type: javascript
            expr: results.a
        lock: '!inline nachricht_an_nextcloud_talk_senden.lock'
        language: python3
schema:
  $schema: https://json-schema.org/draft/2020-12/schema
  type: object
  order: []
  properties: {}
+10
@@ -0,0 +1,10 @@
# py: 3.12
anyio==4.12.1
certifi==2026.2.25
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
mysql-connector-python==9.6.0
typing-extensions==4.15.0
wmill==1.659.1
+129
@@ -0,0 +1,129 @@
import json

import mysql.connector
import wmill


def fmt_bytes(b):
    """Format a byte count using 1024-based unit steps."""
    if not b:
        return "—"
    b = int(b)
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if b < 1024:
            return f"{b:.1f} {unit}"
        b /= 1024
    return f"{b:.1f} PB"


def fmt_dur(sec):
    """Format a duration in seconds as "42s", "3m 5s" or "2h 14m"."""
    if not sec:
        return "—"
    sec = int(sec)
    if sec < 60:
        return f"{sec}s"
    if sec < 3600:
        return f"{sec//60}m {sec%60}s"
    return f"{sec//3600}h {(sec%3600)//60}m"


def main():
    db_cfg = json.loads(wmill.get_variable("f/Backup/mysql_config"))
    conn = mysql.connector.connect(**db_cfg)
    cur = conn.cursor(dictionary=True)

    # Most recent job that has finished, successfully or not
    cur.execute("""
        SELECT job_uuid, started_at, finished_at, status,
               total_backups, restored_count, failed_count
        FROM Kunden.`bronze.restore.jobs`
        WHERE status IN ('completed', 'failed')
        ORDER BY started_at DESC
        LIMIT 1
    """)
    job = cur.fetchone()

    if not job:
        cur.close()
        conn.close()
        return {"message": "⚠️ Kein abgeschlossener Backup-Job gefunden."}

    job_uuid = job["job_uuid"]

    duration = ""
    if job["started_at"] and job["finished_at"]:
        # total_seconds() covers the whole interval; .seconds would drop full days
        secs = int((job["finished_at"] - job["started_at"]).total_seconds())
        duration = fmt_dur(secs)

    cur.execute("""
        SELECT client_name, vm_name, restore_server, status,
               restore_duration_sec, zip_size_bytes,
               rsync_ok, qm_agent_ok, error_message
        FROM Kunden.`bronze.restore.result`
        WHERE job_uuid = %s
        ORDER BY status ASC, client_name ASC
    """, (job_uuid,))
    results = cur.fetchall()

    # Planned VMs that have no result row, i.e. were never run
    cur.execute("""
        SELECT p.client_name, p.vm_name
        FROM Kunden.`bronze.restore.plan` p
        LEFT JOIN Kunden.`bronze.restore.result` r
               ON r.job_uuid = p.job_uuid
              AND r.client_name = p.client_name
        WHERE p.job_uuid = %s
          AND r.id IS NULL
        ORDER BY p.client_name ASC
    """, (job_uuid,))
    not_run = cur.fetchall()

    cur.close()
    conn.close()

    total = job["total_backups"] or 0
    done = job["restored_count"] or 0
    failed = job["failed_count"] or 0
    skipped = len(not_run)
    date_str = job["started_at"].strftime("%d.%m.%Y") if job["started_at"] else "?"
    time_str = job["started_at"].strftime("%H:%M") if job["started_at"] else "?"

    # Overall icon: all good, nothing restored at all, or partially successful
    if failed == 0 and skipped == 0:
        status_icon = "✅"
    elif done == 0:
        status_icon = "❌"
    else:
        status_icon = "⚠️"

    lines = [
        f"{status_icon} **Backup Restore Report – {date_str}**",
        "",
        f"🕐 Start: {time_str} | Dauer: {duration}",
        f"📊 Gesamt: {total} | ✅ OK: {done} | ❌ Fehler: {failed} | ⏭️ Nicht gestartet: {skipped}",
        "",
    ]

    # Failed restores
    failed_results = [r for r in results if r["status"] == "failed"]
    if failed_results:
        lines.append("**❌ Fehlgeschlagen:**")
        for r in failed_results:
            name = r["vm_name"] or r["client_name"]
            err = r["error_message"] or "unbekannt"
            if len(err) > 80:
                err = err[:80] + "..."
            lines.append(f" • {name} ({r['restore_server']}): {err}")
        lines.append("")

    # Restores that were never started
    if not_run:
        lines.append("**⏭️ Nicht gestartet:**")
        for r in not_run:
            name = r["vm_name"] or r["client_name"]
            lines.append(f" • {name}")
        lines.append("")

    # Successful restores
    done_results = [r for r in results if r["status"] == "done"]
    if done_results:
        lines.append("**✅ Erfolgreich:**")
        for r in done_results:
            name = r["vm_name"] or r["client_name"]
            dauer = fmt_dur(r["restore_duration_sec"])
            zipsize = fmt_bytes(r["zip_size_bytes"])
            agent = "✓" if r["qm_agent_ok"] else "✗"
            rsync = "✓" if r["rsync_ok"] else "✗"
            lines.append(
                f" • {name} | {r['restore_server']} | "
                f"⏱ {dauer} | 📦 {zipsize} | Agent: {agent} | Rsync: {rsync}"
            )

    message = "\n".join(lines)
    print(message)
    return {"message": message, "job_uuid": job_uuid, "failed": failed, "skipped": skipped}
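The report step derives the job duration by subtracting two `datetime` values. The following is a minimal sketch of the pitfall to keep in mind there: `timedelta.seconds` holds only the sub-day remainder, while `timedelta.total_seconds()` covers the full interval. The timestamps are invented for illustration, and `format_duration` is a hypothetical stand-in that mirrors `fmt_dur`; neither is part of the commit.

```python
from datetime import datetime


def format_duration(sec: int) -> str:
    """Hypothetical helper mirroring fmt_dur: seconds -> short text."""
    if sec < 60:
        return f"{sec}s"
    if sec < 3600:
        return f"{sec // 60}m {sec % 60}s"
    return f"{sec // 3600}h {(sec % 3600) // 60}m"


# Illustrative timestamps: a run that starts at 00:11 and ends 25.5 hours later.
started_at = datetime(2025, 1, 1, 0, 11, 0)
finished_at = datetime(2025, 1, 2, 1, 41, 0)
elapsed = finished_at - started_at

wrapped = elapsed.seconds            # sub-day remainder only
full = int(elapsed.total_seconds())  # complete elapsed time

print(format_duration(wrapped))  # 1h 30m
print(format_duration(full))     # 25h 30m
```

For runs shorter than 24 hours both values agree, so the difference only shows up when a job runs past a full day.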
+9
@@ -0,0 +1,9 @@
# py: 3.12
anyio==4.12.1
certifi==2026.2.25
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
typing-extensions==4.15.0
wmill==1.657.2
+44
@@ -0,0 +1,44 @@
import base64

import httpx
import wmill


def main(report: dict):
    message = report.get("message", "Kein Bericht verfuegbar.")

    nc_url = wmill.get_variable("f/Backup/nextcloud_talk_url").rstrip("/")
    nc_room = wmill.get_variable("f/Backup/nextcloud_talk_room")
    nc_user = wmill.get_variable("f/Backup/nextcloud_talk_user")
    nc_password = wmill.get_variable("f/Backup/nextcloud_talk_password")

    # HTTP Basic auth value: base64("user:password")
    credentials = base64.b64encode(
        f"{nc_user}:{nc_password}".encode()
    ).decode()

    url = f"{nc_url}/ocs/v2.php/apps/spreed/api/v1/chat/{nc_room}"

    headers = {
        "Authorization": f"Basic {credentials}",
        "OCS-APIREQUEST": "true",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }

    payload = {"message": message}

    response = httpx.post(
        url,
        headers=headers,
        json=payload,
        timeout=30,
        verify=False,  # in case of a self-signed cert; this disables TLS certificate verification
    )

    print(f"HTTP: {response.status_code}")
    print(f"URL: {url}")
    if response.status_code not in (200, 201):
        raise Exception(
            f"Nextcloud Talk Fehler: {response.status_code} – {response.text}"
        )

    print("Nachricht gesendet ✓")
    return {"status": "sent", "http": response.status_code}
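The Talk step builds the URL, the Basic auth header and the JSON payload inline before posting. As a hedged sketch, the same request can be assembled in a pure function and inspected without any network access. `build_talk_request` is a hypothetical helper, and the host, room token and credentials are placeholders; only the endpoint path and header names are taken from the script.

```python
import base64


def build_talk_request(base_url: str, room: str, user: str, password: str, message: str):
    """Return (url, headers, payload) for a Talk chat POST without sending it."""
    # Basic auth value is base64("user:password"), as in the script.
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    url = f"{base_url.rstrip('/')}/ocs/v2.php/apps/spreed/api/v1/chat/{room}"
    headers = {
        "Authorization": f"Basic {credentials}",
        "OCS-APIREQUEST": "true",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return url, headers, {"message": message}


# Placeholder values for illustration only.
url, headers, payload = build_talk_request(
    "https://cloud.example.org/", "roomtoken", "bot", "s3cret", "hello"
)
print(url)
```

Keeping request construction separate from the HTTP call makes the URL and headers easy to check in a test, while the actual `httpx.post` stays a thin wrapper around it.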
@@ -0,0 +1,6 @@
summary: ''
display_name: Backup
extra_perms:
  sebastianserfling@stines.de: true
owners:
  - sebastianserfling@stines.de
@@ -0,0 +1,3 @@
description: ''
value: btrv2jb9
is_secret: false
@@ -0,0 +1,3 @@
description: ''
value: https://cloudstorage.stines.de
is_secret: false
@@ -0,0 +1,3 @@
description: ''
value: 1.0.28
is_secret: false
@@ -0,0 +1,734 @@
|
||||
#!/usr/bin/env bash
|
||||
# =============================================================================
# /opt/windmill-restore/restore.sh
# Windmill Backup Restore Worker
# Version: 1.0.26
#
# Supports both VM (qm) and CT (pct) backups.
# The backup type is detected automatically from the backup path (vm/ or ct/).
#
# FLOW:
#   [0]   Fetch 7z password – rsync password_7z.txt from the PBS server
#   [1]   Space check       – measure free space on the restore mount
#   [2]   Determine IDs     – original ID from the backup path, restore ID from 1000 upwards
#   [2.5] Config check      – read the config from the PBS backup, skip if the ZIP already exists
#   [3]   Restore           – qmrestore (VM) or pct restore (CT)
#   [4]   IMAGE_DIR         – derived dynamically from the PVE storage path
#   [5]   Check images      – abort if empty or missing
#   [6]   Prepare           – VM: unlock, remove cdrom/net, enable agent
#                             CT: unlock, remove net
#   [7]   Start & check     – VM: qm agent, up to 120s | CT: pct status, pct exec hostname
#   [8]   Stop              – VM: qm shutdown | CT: pct stop
#   [9]   Save config       – copy the PVE config into the image directory that gets archived
#   [10]  7z archive        – pack the images into an encrypted archive
#   [11]  Rsync             – transfer the ZIP to the backup server
#   [12]  Clean up          – destroy the VM/CT, delete the local ZIP if it was transferred
#   [13]  Webhook           – JSON → Windmill
# =============================================================================
|
||||
set -euo pipefail
|
||||
|
||||
# ── Konfigdatei laden ─────────────────────────────────────────────────────────
|
||||
CONF_FILE="/opt/windmill-restore/pbs.conf"
|
||||
[[ ! -f "$CONF_FILE" ]] && { echo "FEHLER: $CONF_FILE fehlt!" >&2; exit 1; }
|
||||
source "$CONF_FILE"
|
||||
|
||||
# ── Argument-Parser ───────────────────────────────────────────────────────────
|
||||
JOB_UUID=""
|
||||
BACKUP_PATH=""
|
||||
CLIENT_NAME=""
|
||||
RESTORE_MOUNT=""
|
||||
RESTORE_PATH=""
|
||||
RSYNC_TARGET=""
|
||||
PBS_STORAGE=""
|
||||
WEBHOOK_URL=""
|
||||
WEBHOOK_TOKEN=""
|
||||
SERVER_HOSTNAME=""
|
||||
BACKUP_SIZE_BYTES=0
|
||||
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case $1 in
|
||||
--job-uuid) JOB_UUID="$2"; shift 2 ;;
|
||||
--backup-path) BACKUP_PATH="$2"; shift 2 ;;
|
||||
--client) CLIENT_NAME="$2"; shift 2 ;;
|
||||
--restore-mount) RESTORE_MOUNT="$2"; shift 2 ;;
|
||||
--restore-path) RESTORE_PATH="$2"; shift 2 ;;
|
||||
--rsync-target) RSYNC_TARGET="$2"; shift 2 ;;
|
||||
--pbs-storage) PBS_STORAGE="$2"; shift 2 ;;
|
||||
--webhook-url) WEBHOOK_URL="$2"; shift 2 ;;
|
||||
--webhook-token) WEBHOOK_TOKEN="$2"; shift 2 ;;
|
||||
--server-hostname) SERVER_HOSTNAME="$2"; shift 2 ;;
|
||||
--backup-size) BACKUP_SIZE_BYTES="$2"; shift 2 ;;
|
||||
*) echo "Unbekannter Parameter: $1" >&2; exit 1 ;;
|
||||
esac
|
||||
done
|
||||
|
||||
for var in JOB_UUID BACKUP_PATH CLIENT_NAME \
|
||||
RESTORE_MOUNT RESTORE_PATH RSYNC_TARGET PBS_STORAGE WEBHOOK_URL; do
|
||||
[[ -z "${!var}" ]] && { echo "FEHLER: --${var//_/-} fehlt" >&2; exit 1; }
|
||||
done
|
||||
|
||||
[[ ! -d "$RESTORE_MOUNT" ]] && {
|
||||
echo "FEHLER: Restore-Mount '$RESTORE_MOUNT' existiert nicht!" >&2; exit 1
|
||||
}
|
||||
|
||||
# Fallback SERVER_HOSTNAME
|
||||
SERVER_HOSTNAME="${SERVER_HOSTNAME:-$(hostname -f 2>/dev/null || hostname)}"
|
||||
|
||||
# ── Logging ───────────────────────────────────────────────────────────────────
|
||||
LOG_DIR="/opt/windmill-restore/logs"
|
||||
mkdir -p "$LOG_DIR"
|
||||
SAFE_CLIENT="${CLIENT_NAME//\//_}"
|
||||
SAFE_CLIENT="${SAFE_CLIENT//:/_}"
|
||||
LOG_FILE="$LOG_DIR/${SAFE_CLIENT}.log"
|
||||
exec >> "$LOG_FILE" 2>&1
|
||||
|
||||
# ── Backup-Pfad zerlegen ──────────────────────────────────────────────────────
|
||||
DATASTORE=$(echo "$BACKUP_PATH" | cut -d: -f1)
|
||||
SNAPSHOT_PATH=$(echo "$BACKUP_PATH" | cut -d: -f2-)
|
||||
BACKUP_TYPE=$(echo "$SNAPSHOT_PATH" | cut -d/ -f1)
|
||||
PVE_BACKUP_REF="${PBS_STORAGE}:backup/${SNAPSHOT_PATH}"
|
||||
|
||||
# ── Set the compression level ─────────────────────────────────────────────────
# Intended scheme: mx=1 (fastest compression) by default, mx=0 (store mode, no
# compression) only for tnp-Invest-GmbH vm/108, which is very large and would
# take ~10h with mx=1.
# NOTE: the value below is hard-coded to 0, so every archive currently uses
# store mode; the per-VM exception is not implemented here.
COMPRESS_LEVEL=0
|
||||
|
||||
# ── Set the 7z thread count per host ──────────────────────────────────────────
# The mmt values listed here are the ones the case statement below actually sets.
#   STI-BAC01   → Ryzen 9 5950X  (16 cores / 32 threads) → mmt=30
#   ITD-PROX01  → Ryzen 7 3700X  ( 8 cores / 16 threads) → mmt=8
#   STI-PROX01  → Xeon E5-1650v3 ( 6 cores / 12 threads) → mmt=16
#   Fallback    → mmt=4
case "$SERVER_HOSTNAME" in
    STI-BAC01)  MMT_THREADS=30 ;;
    ITD-PROX01) MMT_THREADS=8  ;;
    STI-PROX01) MMT_THREADS=16 ;;
    *)          MMT_THREADS=4  ;;
esac
|
||||
echo "INFO: Server '$SERVER_HOSTNAME' → 7z mmt=${MMT_THREADS}"
|
||||
|
||||
# ── Messvariablen ─────────────────────────────────────────────────────────────
|
||||
LAST_DATE=$(TZ="Europe/Berlin" date +"%Y-%m-%d" -d "1 day ago")
|
||||
|
||||
# STI-BAC01: rsync_target lokal gemountet → ZIP direkt dorthin, kein Rsync
|
||||
if [[ "$SERVER_HOSTNAME" == "STI-BAC01" ]]; then
|
||||
ZIP_DIR="${RSYNC_TARGET}/${LAST_DATE}"
|
||||
SKIP_RSYNC=1
|
||||
else
|
||||
ZIP_DIR="${RESTORE_MOUNT}/zips/${LAST_DATE}"
|
||||
SKIP_RSYNC=0
|
||||
fi
|
||||
|
||||
BACKUP_SERVER_HOST=$(cat /opt/windmill-restore/backup_server_host 2>/dev/null \
|
||||
|| echo "backup-server")
|
||||
KEY_DIR="/opt/windmill-restore/keys"
|
||||
|
||||
RESTORE_START=$(date +%s)
|
||||
STATUS="success"
|
||||
ERROR_MSG=""
|
||||
VM_ID_ORIGINAL=0
|
||||
VM_ID_RESTORED=0
|
||||
VM_NAME=""
|
||||
IMAGE_DIR=""
|
||||
ACTUAL_DISK_BYTES=0
|
||||
ZIP_SIZE_BYTES=0
|
||||
ZIP_DURATION=0
|
||||
RSYNC_SIZE_BYTES=0
|
||||
RSYNC_OK="true"
|
||||
RSYNC_RETRIES=0
|
||||
QM_AGENT_OK="false"
|
||||
ZIP_FILE=""
|
||||
ZIP_PASSWORD=""
|
||||
FREE_GB=0
|
||||
|
||||
echo "============================================================"
|
||||
echo " Windmill Restore Worker"
|
||||
echo " Client: $CLIENT_NAME"
|
||||
echo " Typ: $BACKUP_TYPE"
|
||||
echo " Datastore: $DATASTORE"
|
||||
echo " Backup: $BACKUP_PATH"
|
||||
echo " PBS-Storage: $PBS_STORAGE"
|
||||
echo " Restore-Mount: $RESTORE_MOUNT"
|
||||
echo " Restore-Path: $RESTORE_PATH"
|
||||
echo " Rsync-Target: $RSYNC_TARGET"
|
||||
echo " Server: $SERVER_HOSTNAME"
|
||||
echo " Skip-Rsync: $SKIP_RSYNC"
|
||||
echo " Job-UUID: $JOB_UUID"
|
||||
echo " 7z-Level: mx=${COMPRESS_LEVEL} mmt=${MMT_THREADS}"
|
||||
echo " Start: $(date '+%Y-%m-%d %H:%M:%S')"
|
||||
echo "============================================================"
|
||||
|
||||
# ── JSON Escape Funktion ──────────────────────────────────────────────────────
|
||||
escape_json() {
|
||||
local input="$1"
|
||||
input="${input//\\/\\\\}"
|
||||
input="${input//\"/\\\"}"
|
||||
input="${input//$'\n'/\\n}"
|
||||
input="${input//$'\r'/\\r}"
|
||||
input="${input//$'\t'/\\t}"
|
||||
echo "$input"
|
||||
}
|
||||
|
||||
# ── Webhook-Funktion ──────────────────────────────────────────────────────────
|
||||
send_webhook() {
|
||||
local wh_status="$1"
|
||||
local wh_error
|
||||
wh_error=$(escape_json "${2:-}")
|
||||
local wh_vm_name
|
||||
wh_vm_name=$(escape_json "${VM_NAME:-$SAFE_CLIENT}")
|
||||
local duration=$(( $(date +%s) - RESTORE_START ))
|
||||
local payload
|
||||
payload=$(printf '{
|
||||
"job_uuid": "%s",
|
||||
"client_name": "%s",
|
||||
"status": "%s",
|
||||
"error_message": "%s",
|
||||
"server_hostname": "%s",
|
||||
"free_space_gb": %d,
|
||||
"vm_name": "%s",
|
||||
"vm_id_original": %d,
|
||||
"vm_id_restored": %d,
|
||||
"restore_duration_sec": %d,
|
||||
"actual_disk_used_bytes": %d,
|
||||
"zip_size_bytes": %d,
|
||||
"zip_duration_sec": %d,
|
||||
"rsync_size_bytes": %d,
|
||||
"rsync_ok": %s,
|
||||
"rsync_retries": %d,
|
||||
"qm_agent_ok": "%s",
|
||||
"log_file": "%s"
|
||||
}' \
|
||||
"$JOB_UUID" "$CLIENT_NAME" "$wh_status" "$wh_error" \
|
||||
"$SERVER_HOSTNAME" "$FREE_GB" "$wh_vm_name" \
|
||||
"$VM_ID_ORIGINAL" "$VM_ID_RESTORED" \
|
||||
"$duration" "$ACTUAL_DISK_BYTES" \
|
||||
"$ZIP_SIZE_BYTES" "$ZIP_DURATION" \
|
||||
"$RSYNC_SIZE_BYTES" "$RSYNC_OK" "$RSYNC_RETRIES" \
|
||||
"$QM_AGENT_OK" "$LOG_FILE")
|
||||
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> Sende Webhook..."
|
||||
echo " Payload: $payload"
|
||||
local http_response
|
||||
http_response=$(curl -s -w "\n%{http_code}" \
|
||||
-X POST "$WEBHOOK_URL" \
|
||||
-H "Content-Type: application/json" \
|
||||
${WEBHOOK_TOKEN:+-H "Authorization: Bearer ${WEBHOOK_TOKEN}"} \
|
||||
-d "$payload")
|
||||
local http_code
|
||||
http_code=$(echo "$http_response" | tail -1)
|
||||
local http_body
|
||||
http_body=$(echo "$http_response" | head -n -1)
|
||||
echo " HTTP: $http_code"
|
||||
echo " Response: $http_body"
|
||||
[[ "$http_code" =~ ^2 ]] && echo " Webhook OK." \
|
||||
|| echo " WARNUNG: HTTP $http_code"
|
||||
}
|
||||
|
||||
# ── ERR-Trap ──────────────────────────────────────────────────────────────────
|
||||
trap 'STATUS="failed"
|
||||
ERROR_LINE=$LINENO
|
||||
echo ""
|
||||
echo "FEHLER in Zeile ${ERROR_LINE} – räume auf..."
|
||||
if [[ ${VM_ID_RESTORED:-0} -gt 0 ]]; then
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct stop "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
sleep 3
|
||||
pct destroy "$VM_ID_RESTORED" --purge 1 2>/dev/null || true
|
||||
else
|
||||
qm stop "$VM_ID_RESTORED" --skiplock 1 2>/dev/null || true
|
||||
sleep 5
|
||||
qm destroy "$VM_ID_RESTORED" \
|
||||
--destroy-unreferenced-disks 1 --purge 1 2>/dev/null || true
|
||||
fi
|
||||
echo " ${BACKUP_TYPE^^} ${VM_ID_RESTORED} entfernt."
|
||||
fi
|
||||
[[ -n "${ZIP_FILE:-}" && -f "$ZIP_FILE" ]] && rm -f "$ZIP_FILE"
|
||||
[[ -n "${IMAGE_DIR:-}" && -d "$IMAGE_DIR" ]] && rm -rf "$IMAGE_DIR"
|
||||
send_webhook "failed" "Abgebrochen in Zeile ${ERROR_LINE} – $LOG_FILE"' ERR
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [0/13] 7Z-PASSWORT VOM PBS-SERVER HOLEN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [0/13] 7z-Passwort vom PBS-Server holen ($PBS_HOST)..."
|
||||
mkdir -p "$KEY_DIR"
|
||||
chmod 700 "$KEY_DIR"
|
||||
|
||||
PW_FILE_LOCAL="${KEY_DIR}/password_7z.txt"
|
||||
if [[ ! -f "$PW_FILE_LOCAL" || ! -s "$PW_FILE_LOCAL" ]]; then
|
||||
echo " Hole password_7z.txt..."
|
||||
rsync -az \
|
||||
-e "ssh -o StrictHostKeyChecking=no" \
|
||||
"root@${PBS_HOST}:/root/Scripte/password_7z.txt" \
|
||||
"$PW_FILE_LOCAL" \
|
||||
2>&1
|
||||
chmod 600 "$PW_FILE_LOCAL"
|
||||
echo " password_7z.txt gespeichert ✓"
|
||||
else
|
||||
echo " password_7z.txt bereits vorhanden."
|
||||
fi
|
||||
|
||||
ZIP_PASSWORD=$(grep -m1 "^${DATASTORE}:" "$PW_FILE_LOCAL" \
|
||||
| awk -F': ' '{print $2}' | tr -d '[:space:]')
|
||||
|
||||
[[ -z "$ZIP_PASSWORD" ]] && {
|
||||
echo "FEHLER: Kein 7z-Passwort für '$DATASTORE' in password_7z.txt" >&2; exit 1
|
||||
}
|
||||
echo " 7z-Passwort geladen ✓"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [1/13] SPACE-CHECK
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [1/13] Prüfe freien Speicherplatz auf $RESTORE_MOUNT..."
|
||||
mkdir -p "$ZIP_DIR"
|
||||
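# -P forces one output line per filesystem (even for long device names),
# -k forces 1 KiB blocks so the arithmetic below holds regardless of BLOCKSIZE.
# Note: the free space is only measured and reported; this step does not abort when it is low.
FREE_KB=$(df -Pk "$RESTORE_MOUNT" 2>/dev/null | awk 'NR==2{print $4}' || echo "0")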
|
||||
FREE_GB=$(( FREE_KB / 1024 / 1024 ))
|
||||
FREE_BYTES=$(( FREE_KB * 1024 ))
|
||||
echo " Frei: ${FREE_GB} GB"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [2/13] ID ERMITTELN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [2/13] Ermittle IDs..."
|
||||
VM_ID_ORIGINAL=$(echo "$SNAPSHOT_PATH" | grep -oP '\d+' | head -1 || echo "0")
|
||||
echo " Original-ID: $VM_ID_ORIGINAL (Typ: $BACKUP_TYPE)"
|
||||
|
||||
VM_ID_RESTORED=$(
    {
        pvesh get /nodes/localhost/qemu --output-format json 2>/dev/null || echo "[]"
        pvesh get /nodes/localhost/lxc --output-format json 2>/dev/null || echo "[]"
    } | python3 -c "
import json, sys

# Each pvesh call is expected to print one JSON array per line; merge them.
data = []
for line in sys.stdin:
    line = line.strip()
    if line:
        try:
            data.extend(json.loads(line))
        except ValueError:
            pass  # skip anything that is not valid JSON

# The first free ID in 1000..1999 becomes the restore ID.
existing = {int(v.get('vmid', 0)) for v in data}
for i in range(1000, 2000):
    if i not in existing:
        print(i)
        break
" 2>/dev/null || echo "1000"
)
|
||||
echo " Restore-ID: $VM_ID_RESTORED"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [2.5/13] CONFIG-CHECK
|
||||
# Config direkt aus PBS-Backup lesen um VM-Name zu ermitteln und zu prüfen
|
||||
# ob ZIP bereits auf dem Backup-Server existiert → Restore überspringen
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [2.5/13] Config aus PBS-Backup lesen..."
|
||||
|
||||
CONFIG_VM_NAME=""
|
||||
CONFIG_TMP="/tmp/pbs_config_${VM_ID_ORIGINAL}_$$.conf"
|
||||
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
CONF_FILE_IN_BACKUP="pct.conf"
|
||||
NAME_KEY="^hostname:"
|
||||
else
|
||||
CONF_FILE_IN_BACKUP="qemu-server.conf"
|
||||
NAME_KEY="^name:"
|
||||
fi
|
||||
|
||||
export PBS_PASSWORD
|
||||
export PBS_REPOSITORY="${PBS_USER}@${PBS_HOST}:${DATASTORE}"
|
||||
|
||||
SNAP_ID=$(echo "$SNAPSHOT_PATH" | cut -d/ -f3)
|
||||
echo " Repository: $PBS_REPOSITORY"
|
||||
echo " Snapshot: ${BACKUP_TYPE}/${VM_ID_ORIGINAL}/${SNAP_ID}"
|
||||
echo " Config: $CONF_FILE_IN_BACKUP"
|
||||
echo " Keyfile: ${KEY_DIR}/${DATASTORE}.keyfile"
|
||||
|
||||
proxmox-backup-client restore \
|
||||
--keyfile "${KEY_DIR}/${DATASTORE}.keyfile" \
|
||||
"${BACKUP_TYPE}/${VM_ID_ORIGINAL}/${SNAP_ID}" \
|
||||
"$CONF_FILE_IN_BACKUP" \
|
||||
"$CONFIG_TMP" \
|
||||
2>&1 || echo " WARNUNG: proxmox-backup-client restore fehlgeschlagen (exit $?)"
|
||||
|
||||
if [[ -f "$CONFIG_TMP" ]]; then
|
||||
CONFIG_VM_NAME=$(grep -m1 "$NAME_KEY" "$CONFIG_TMP" 2>/dev/null \
|
||||
| awk -F': ' '{print $2}' | tr -d '[:space:]' || echo "")
|
||||
rm -f "$CONFIG_TMP"
|
||||
echo " VM-Name: ${CONFIG_VM_NAME:-unbekannt}"
|
||||
else
|
||||
echo " Config nicht lesbar – überspringe ZIP-Check."
|
||||
fi
|
||||
|
||||
# Prüfen ob ZIP bereits vorhanden
|
||||
if [[ -n "$CONFIG_VM_NAME" ]]; then
|
||||
ZIP_CHECK="${RSYNC_TARGET}/${LAST_DATE}/${CONFIG_VM_NAME}-${VM_ID_ORIGINAL}.7z"
|
||||
if [[ "$SKIP_RSYNC" == "1" ]]; then
|
||||
if [[ -f "$ZIP_CHECK" ]]; then
|
||||
echo " ZIP bereits vorhanden (lokal): $ZIP_CHECK"
|
||||
VM_NAME="$CONFIG_VM_NAME"
|
||||
ZIP_SIZE_BYTES=$(stat -c%s "$ZIP_CHECK" 2>/dev/null || echo "0")
|
||||
RSYNC_OK="true"
|
||||
RSYNC_SIZE_BYTES=$ZIP_SIZE_BYTES
|
||||
QM_AGENT_OK="skipped"
|
||||
trap - ERR
|
||||
send_webhook "success" ""
|
||||
exit 0
|
||||
fi
|
||||
else
|
||||
if ssh "$BACKUP_SERVER_HOST" "test -f '$ZIP_CHECK'" 2>/dev/null; then
|
||||
echo " ZIP bereits vorhanden (remote): $ZIP_CHECK"
|
||||
VM_NAME="$CONFIG_VM_NAME"
|
||||
ZIP_SIZE_BYTES=$(ssh "$BACKUP_SERVER_HOST" \
|
||||
"stat -c%s '$ZIP_CHECK'" 2>/dev/null || echo "0")
|
||||
RSYNC_OK="true"
|
||||
RSYNC_SIZE_BYTES=$ZIP_SIZE_BYTES
|
||||
QM_AGENT_OK="skipped"
|
||||
trap - ERR
|
||||
send_webhook "success" ""
|
||||
exit 0
|
||||
fi
|
||||
fi
|
||||
echo " Kein vorhandenes ZIP – starte vollständigen Restore."
|
||||
fi
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [3/13] RESTORE
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [3/13] Restore vom PBS-Storage ($BACKUP_TYPE)..."
|
||||
echo " Backup-Ref: $PVE_BACKUP_REF"
|
||||
echo " Storage: $RESTORE_PATH"
|
||||
echo " ID: $VM_ID_RESTORED"
|
||||
|
||||
RESTORE_START_INNER=$(date +%s)
|
||||
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct restore "$VM_ID_RESTORED" "$PVE_BACKUP_REF" \
|
||||
--storage "$RESTORE_PATH" \
|
||||
--unique 1 \
|
||||
2>&1
|
||||
else
|
||||
qmrestore "$PVE_BACKUP_REF" "$VM_ID_RESTORED" \
|
||||
--storage "$RESTORE_PATH" \
|
||||
--unique 1 \
|
||||
2>&1
|
||||
fi
|
||||
|
||||
RESTORE_DURATION=$(( $(date +%s) - RESTORE_START_INNER ))
|
||||
echo " Restore abgeschlossen in ${RESTORE_DURATION}s"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [4/13] IMAGE_DIR DYNAMISCH ERMITTELN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [4/13] Ermittle Image-Verzeichnis..."
|
||||
STORAGE_BASE=$(pvesh get "/storage/${RESTORE_PATH}" --output-format json \
|
||||
2>/dev/null | python3 -c "
|
||||
import json, sys
|
||||
cfg = json.load(sys.stdin)
|
||||
print(cfg.get('path', ''))
|
||||
" 2>/dev/null || echo "")
|
||||
|
||||
if [[ -n "$STORAGE_BASE" ]]; then
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
IMAGE_DIR=""
|
||||
for candidate in \
|
||||
"${STORAGE_BASE}/images/${VM_ID_RESTORED}" \
|
||||
"${STORAGE_BASE}/private/${VM_ID_RESTORED}" \
|
||||
"${STORAGE_BASE}/rootdir/${VM_ID_RESTORED}"; do
|
||||
if [[ -d "$candidate" ]] && [[ -n "$(ls -A "$candidate" 2>/dev/null)" ]]; then
|
||||
IMAGE_DIR="$candidate"
|
||||
echo " CT-Image gefunden: $IMAGE_DIR"
|
||||
break
|
||||
else
|
||||
echo " Nicht gefunden: $candidate"
|
||||
fi
|
||||
done
|
||||
if [[ -z "$IMAGE_DIR" ]]; then
|
||||
IMAGE_DIR=$(find "$STORAGE_BASE" -maxdepth 2 -type d \
|
||||
-name "$VM_ID_RESTORED" 2>/dev/null | head -1 || echo "")
|
||||
[[ -n "$IMAGE_DIR" ]] && echo " CT-Image via find: $IMAGE_DIR"
|
||||
fi
|
||||
else
|
||||
IMAGE_DIR="${STORAGE_BASE}/images/${VM_ID_RESTORED}"
|
||||
fi
|
||||
else
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
IMAGE_DIR="/var/lib/vz/private/${VM_ID_RESTORED}"
|
||||
else
|
||||
IMAGE_DIR="/var/lib/vz/images/${VM_ID_RESTORED}"
|
||||
fi
|
||||
echo " WARNUNG: Storage-Pfad nicht ermittelt, Fallback: $IMAGE_DIR"
|
||||
fi
|
||||
|
||||
if [[ -z "$IMAGE_DIR" ]]; then
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
IMAGE_DIR="/var/lib/vz/private/${VM_ID_RESTORED}"
|
||||
else
|
||||
IMAGE_DIR="/var/lib/vz/images/${VM_ID_RESTORED}"
|
||||
fi
|
||||
echo " WARNUNG: Fallback: $IMAGE_DIR"
|
||||
fi
|
||||
echo " Image-Dir: $IMAGE_DIR"
|
||||
|
||||
ACTUAL_DISK_BYTES=$(du -sb "$IMAGE_DIR" 2>/dev/null | awk '{print $1}' || echo "0")
|
||||
echo " Image-Größe: $(( ACTUAL_DISK_BYTES / 1024 / 1024 / 1024 )) GB"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [5/13] IMAGES PRÜFEN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [5/13] Prüfe Images..."
|
||||
if [[ ! -d "$IMAGE_DIR" ]] || [[ -z "$(ls -A "$IMAGE_DIR" 2>/dev/null)" ]]; then
|
||||
ERROR_MSG="IMAGE_DIR leer oder nicht vorhanden: $IMAGE_DIR"
|
||||
echo " FEHLER: $ERROR_MSG"
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct destroy "$VM_ID_RESTORED" --purge 1 2>/dev/null || true
|
||||
else
|
||||
qm destroy "$VM_ID_RESTORED" \
|
||||
--destroy-unreferenced-disks 1 --purge 1 2>/dev/null || true
|
||||
fi
|
||||
trap - ERR
|
||||
send_webhook "failed" "$ERROR_MSG"
|
||||
exit 0
|
||||
fi
|
||||
echo " Images vorhanden ✓"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [6/13] VORBEREITEN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [6/13] Vorbereiten ($BACKUP_TYPE)..."
|
||||
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct unlock "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
pct stop "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
sleep 3
|
||||
for ((net=0; net<=10; net++)); do
|
||||
pct set "$VM_ID_RESTORED" --delete "net${net}" 2>/dev/null || true
|
||||
done
|
||||
echo " CT vorbereitet (Netzwerkkarten entfernt)."
|
||||
else
|
||||
qm unlock "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
qm stop "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
sleep 3
|
||||
qm set "$VM_ID_RESTORED" -delete cdrom 2>/dev/null || true
|
||||
qm set "$VM_ID_RESTORED" -delete ide0 2>/dev/null || true
|
||||
for ((net=0; net<=10; net++)); do
|
||||
qm set "$VM_ID_RESTORED" -delete "net${net}" 2>/dev/null || true
|
||||
done
|
||||
qm set "$VM_ID_RESTORED" --agent "enabled=1,type=virtio" 2>/dev/null || true
|
||||
echo " VM vorbereitet (Netzwerkkarten entfernt, Agent aktiviert)."
|
||||
fi
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [7/13] STARTEN & PRÜFEN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [7/13] Starte & prüfe ($BACKUP_TYPE)..."
|
||||
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct start "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
sleep 10
|
||||
if pct status "$VM_ID_RESTORED" 2>/dev/null | grep -q "running"; then
|
||||
QM_AGENT_OK="true"
|
||||
echo " CT läuft ✓"
|
||||
CT_HOSTNAME=$(pct exec "$VM_ID_RESTORED" -- hostname 2>/dev/null || echo "unbekannt")
|
||||
echo " Hostname: $CT_HOSTNAME"
|
||||
else
|
||||
QM_AGENT_OK="false"
|
||||
echo " CT nicht gestartet."
|
||||
fi
|
||||
else
|
||||
qm start "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
AGENT_WAIT=0
|
||||
AGENT_MAX=120
|
||||
AGENT_INTERVAL=10
|
||||
while [[ $AGENT_WAIT -lt $AGENT_MAX ]]; do
|
||||
sleep $AGENT_INTERVAL
|
||||
AGENT_WAIT=$(( AGENT_WAIT + AGENT_INTERVAL ))
|
||||
echo -n " [${AGENT_WAIT}s/${AGENT_MAX}s] qm-Agent... "
|
||||
if qm agent "$VM_ID_RESTORED" ping 2>/dev/null | grep -qi "pong\|ping"; then
|
||||
QM_AGENT_OK="true"
|
||||
echo "ONLINE ✓"
|
||||
hostname_info=$(qm agent "$VM_ID_RESTORED" get-host-name 2>/dev/null \
|
||||
| grep host-name | tr -d '"' || true)
|
||||
echo " Hostname: ${hostname_info:-unbekannt}"
|
||||
break
|
||||
else
|
||||
echo "nicht erreichbar."
|
||||
fi
|
||||
done
|
||||
[[ "$QM_AGENT_OK" == "false" ]] && \
|
||||
echo " qm-Agent nicht erreichbar – qm_agent_ok=false in DB."
|
||||
fi
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [8/13] STOPPEN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [8/13] Stoppe $BACKUP_TYPE..."
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
pct stop "$VM_ID_RESTORED" 2>/dev/null || true
|
||||
sleep 10
|
||||
else
|
||||
# Graceful shutdown mit 2 Minuten Timeout, danach force-stop
|
||||
qm shutdown "$VM_ID_RESTORED" --timeout 120 2>/dev/null || true
|
||||
# Prüfen ob VM noch läuft → force-stop
|
||||
if qm status "$VM_ID_RESTORED" 2>/dev/null | grep -q "running"; then
|
||||
echo " VM läuft noch nach 120s – force stop..."
|
||||
qm stop "$VM_ID_RESTORED" --skiplock 1 2>/dev/null || true
|
||||
sleep 5
|
||||
fi
|
||||
fi
|
||||
echo " Gestoppt."
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [9/13] CONFIG SICHERN
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [9/13] Config sichern..."
|
||||
|
||||
if [[ "$BACKUP_TYPE" == "ct" ]]; then
|
||||
PVE_CONF="/etc/pve/lxc/${VM_ID_RESTORED}.conf"
|
||||
CONF_FILENAME="lxc.conf"
|
||||
VM_NAME=$(grep -m1 "^hostname:" "$PVE_CONF" 2>/dev/null \
|
||||
| awk -F': ' '{print $2}' | tr -d '[:space:]' \
|
||||
|| echo "${CONFIG_VM_NAME:-$SAFE_CLIENT}")
|
||||
else
|
||||
PVE_CONF="/etc/pve/qemu-server/${VM_ID_RESTORED}.conf"
|
||||
CONF_FILENAME="qemu-server.conf"
|
||||
VM_NAME=$(grep -m1 "^name:" "$PVE_CONF" 2>/dev/null \
|
||||
| awk -F': ' '{print $2}' | tr -d '[:space:]' \
|
||||
|| echo "${CONFIG_VM_NAME:-$SAFE_CLIENT}")
|
||||
fi
|
||||
|
||||
if [[ -f "$PVE_CONF" ]]; then
|
||||
cp "$PVE_CONF" "${IMAGE_DIR}/${CONF_FILENAME}"
|
||||
echo " Config gesichert: ${IMAGE_DIR}/${CONF_FILENAME}"
|
||||
else
|
||||
echo " WARNUNG: Config nicht gefunden: $PVE_CONF"
|
||||
fi
|
||||
echo " Name: $VM_NAME"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [10/13] 7Z-ARCHIV
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [10/13] Erstelle verschlüsseltes 7z-Archiv (mx=${COMPRESS_LEVEL})..."
|
||||
|
||||
ZIP_FILE="${ZIP_DIR}/${VM_NAME}-${VM_ID_ORIGINAL}.7z"
|
||||
ZIP_START=$(date +%s)
|
||||
|
||||
7z a -t7z \
|
||||
-mmt=${MMT_THREADS} \
|
||||
-mx=${COMPRESS_LEVEL} \
|
||||
-md=16M \
|
||||
-p"${ZIP_PASSWORD}" \
|
||||
-mhe=on \
|
||||
"$ZIP_FILE" \
|
||||
"${IMAGE_DIR}/"* \
|
||||
2>&1 | tail -5
|
||||
|
||||
ZIP_DURATION=$(( $(date +%s) - ZIP_START ))
|
||||
ZIP_SIZE_BYTES=$(stat -c%s "$ZIP_FILE" 2>/dev/null || echo "0")
|
||||
echo " ZIP: $(( ZIP_SIZE_BYTES / 1024 / 1024 )) MB in ${ZIP_DURATION}s"
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
# [11/13] RSYNC ZUM BACKUP-SERVER
|
||||
# ═════════════════════════════════════════════════════════════════════════════
|
||||
echo ""
|
||||
RSYNC_TARGET_DATE="${RSYNC_TARGET}/${LAST_DATE}"
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [11/13] Rsync / Datei-Transfer..."
|
||||
|
||||
if [[ "$SKIP_RSYNC" == "1" ]]; then
|
||||
echo " Lokaler Modus: ZIP bereits in ${RSYNC_TARGET_DATE} – kein Rsync."
|
||||
RSYNC_OK="true"
|
||||
RSYNC_SIZE_BYTES=$ZIP_SIZE_BYTES
|
||||
echo " Groesse: $(( RSYNC_SIZE_BYTES / 1024 / 1024 )) MB"
|
||||
else
|
||||
MAX_RETRIES=3
|
||||
|
||||
    rsync_transfer() {
        # No -z: the 7z archive is encrypted and therefore incompressible,
        # so rsync compression would only cost CPU time.
        rsync -av --progress --timeout=300 \
            "$ZIP_FILE" \
            "${BACKUP_SERVER_HOST}:${RSYNC_TARGET_DATE}/" \
            2>&1
    }
|
||||
|
||||
ssh "$BACKUP_SERVER_HOST" "mkdir -p '${RSYNC_TARGET_DATE}'" 2>/dev/null || true
|
||||
|
||||
while [[ $RSYNC_RETRIES -lt $MAX_RETRIES ]]; do
|
||||
if rsync_transfer; then
|
||||
RSYNC_OK="true"
|
||||
RSYNC_SIZE_BYTES=$ZIP_SIZE_BYTES
|
||||
echo " Rsync OK: $(( RSYNC_SIZE_BYTES / 1024 / 1024 )) MB"
|
||||
break
|
||||
else
|
||||
RSYNC_RETRIES=$(( RSYNC_RETRIES + 1 ))
|
||||
if [[ $RSYNC_RETRIES -lt $MAX_RETRIES ]]; then
|
||||
echo " Fehlgeschlagen ($RSYNC_RETRIES/$MAX_RETRIES). Warte 60s..."
|
||||
sleep 60
|
||||
else
|
||||
RSYNC_OK="false"
|
||||
STATUS="failed"
|
||||
ERROR_MSG="Rsync fehlgeschlagen nach ${RSYNC_RETRIES} Versuchen"
|
||||
echo " FEHLER: $ERROR_MSG"
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [[ "$RSYNC_OK" == "true" ]]; then
|
||||
REMOTE_SIZE=$(ssh "$BACKUP_SERVER_HOST" \
|
||||
"stat -c%s '${RSYNC_TARGET_DATE}/$(basename "$ZIP_FILE")'" \
|
||||
2>/dev/null || echo "0")
|
||||
if [[ "$REMOTE_SIZE" != "$ZIP_SIZE_BYTES" ]]; then
|
||||
echo " WARNUNG: Remote ${REMOTE_SIZE}B != lokal ${ZIP_SIZE_BYTES}B"
|
||||
RSYNC_OK="false"
|
||||
STATUS="failed"
|
||||
ERROR_MSG="Groessenabweichung: lokal=${ZIP_SIZE_BYTES} remote=${REMOTE_SIZE}"
|
||||
else
|
||||
echo " Groessenprüfung OK: ${REMOTE_SIZE} Bytes."
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
# ═════════════════════════════════════════════════════════════════════════════
# [12/13] CLEANUP
# ═════════════════════════════════════════════════════════════════════════════
echo ""
echo "$(date '+%Y-%m-%d %H:%M:%S') ==> [12/13] Cleaning up..."
# Remove the restored guest: containers via pct, VMs via qm.
if [[ "$BACKUP_TYPE" == "ct" ]]; then
    pct destroy "$VM_ID_RESTORED" --purge 1 \
        2>/dev/null || echo " CT $VM_ID_RESTORED no longer exists."
else
    qm destroy "$VM_ID_RESTORED" \
        --destroy-unreferenced-disks 1 \
        --purge 1 \
        2>/dev/null || echo " VM $VM_ID_RESTORED no longer exists."
fi
if [[ "$SKIP_RSYNC" == "1" ]]; then
    echo " ${BACKUP_TYPE^^} ${VM_ID_RESTORED} removed. ZIP stays at the target location."
else
    rm -f "$ZIP_FILE"
    echo " ${BACKUP_TYPE^^} ${VM_ID_RESTORED} removed, ZIP deleted."
fi

# ── Summary & webhook ─────────────────────────────────────────────────────────
TOTAL=$(( $(date +%s) - RESTORE_START ))
echo ""
echo "============================================================"
echo " Status: $STATUS"
echo " Type: $BACKUP_TYPE"
echo " Total duration: ${TOTAL}s"
echo " Name: ${VM_NAME:-$SAFE_CLIENT}"
echo " Image dir: $IMAGE_DIR"
echo " qm-Agent/CT: $QM_AGENT_OK"
echo " Rsync: $RSYNC_OK (attempts: $RSYNC_RETRIES)"
echo " ZIP: $(( ZIP_SIZE_BYTES / 1024 / 1024 )) MB"
echo " 7z level: mx=${COMPRESS_LEVEL} mmt=${MMT_THREADS}"
[[ -n "$ERROR_MSG" ]] && echo " Error: $ERROR_MSG"
echo "============================================================"

trap - ERR
send_webhook "$STATUS" "$ERROR_MSG"
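
Note on step [11/13]: the retry loop there mixes the retry mechanics with status bookkeeping. One possible way to separate the two is a small reusable helper. The following is only a sketch under assumptions: `retry_with_delay` and its argument order are hypothetical and do not exist in restore.sh, and the caller would still have to set `RSYNC_OK`, `STATUS` and `ERROR_MSG` itself, as the original loop does.

```bash
# Hypothetical helper (not part of restore.sh): run a command until it
# succeeds, at most <max_attempts> times, sleeping <delay> seconds between
# attempts.
# Usage: retry_with_delay <max_attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 once all attempts have failed.
retry_with_delay() {
    local max_attempts=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        echo " Failed (${attempt}/${max_attempts}). Waiting ${delay}s..."
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
    return 0
}
```

Under these assumptions, the transfer in step [11/13] could be called as `retry_with_delay "$MAX_RETRIES" 60 rsync_transfer`. Because the command runs in the `until` condition, its failures would not fire an `ERR` trap.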
@@ -0,0 +1,25 @@
# yaml-language-server: $schema=wmill.schema.json
defaultTs: bun

includes:
  - "f/**"

excludes: []

skipVariables: false
skipResources: false
skipResourceTypes: false
skipSecrets: true
skipScripts: false
skipFlows: false
skipApps: false
skipFolders: false
skipWorkspaceDependencies: false

nonDottedPaths: true

codebases: []

gitBranches:
  main:
    overrides: {}