Add dry-run dataset existence checks with option to create missing datasets

During dry runs, call pool.dataset.query on the destination to verify that
every share path would have a backing ZFS dataset. Missing paths are collected
in Summary.missing_datasets and surfaced as a WARNING block in the report.

In interactive mode the user is prompted to create any auto-creatable
(/mnt/…) datasets before the live migration proceeds. Non-interactive
--dry-run mode prints the same warning in the summary report.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 09:37:44 -05:00
parent ab1781dbe2
commit ddb3fbd3ff
2 changed files with 184 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,55 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Single-file Python CLI tool (`truenas_migrate.py`) that reads SMB shares, NFS shares, and SMB global config from a TrueNAS debug archive (`.tar`/`.tgz`) and re-creates them on a destination TrueNAS system via its JSON-RPC 2.0 WebSocket API (TrueNAS 25.04+).
## Running the Tool
```bash
# Install the only dependency
pip install websockets
# Inspect the archive
python truenas_migrate.py --debug-tar debug.tgz --list-archive
# Dry run (no changes made)
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx" --dry-run
# Live migration
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx"
# Migrate a subset (smb, nfs, smb-config)
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx" --migrate smb
```
## Architecture
The script is structured as a linear pipeline with no external config files or modules:
1. **Archive parser** (`parse_archive`, `_find_data`) — Opens the `.tgz` using `tarfile`, tries a ranked list of known paths (`_CANDIDATES`) for each data type, then falls back to keyword heuristic scanning (`_KEYWORDS`). Handles both `ixdiagnose` (SCALE) and `freenas-debug` (CORE) archive layouts, as well as date-stamped top-level directories.
2. **Payload builders** (`_smb_share_payload`, `_nfs_share_payload`, `_smb_config_payload`) — Strip read-only/server-generated fields (`id`, `locked`, `server_sid`) before sending to the destination API.
3. **`TrueNASClient`** — Minimal async JSON-RPC 2.0 WebSocket client. Authenticates with `auth.login_with_api_key` and drains server-push notifications so that replies can be matched to requests by `id`. Used as an async context manager.
4. **Migration routines** (`migrate_smb_shares`, `migrate_nfs_shares`, `migrate_smb_config`) — Each queries existing shares on the destination first, then skips conflicts (SMB: by name case-insensitively; NFS: by path exactly), then creates or dry-runs each share.
5. **`Summary` dataclass** — Accumulates counts and errors, renders a box-drawing table at the end. Exit code 2 if any errors occurred.
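The payload-builder step above (item 2) boils down to a dict filter. A minimal sketch, assuming the field names listed in step 2 — `strip_readonly` and `READ_ONLY_FIELDS` are illustrative names, not the script's actual helpers:

```python
# Illustrative sketch of the payload-builder pattern: copy a share
# record read from the archive and drop server-generated fields before
# sending it to the destination's create call. The real builders
# (_smb_share_payload etc.) may strip additional fields.
READ_ONLY_FIELDS = {"id", "locked", "server_sid"}

def strip_readonly(share: dict) -> dict:
    """Return a create-payload with read-only fields removed."""
    return {k: v for k, v in share.items() if k not in READ_ONLY_FIELDS}
```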
## TrueNAS API Reference
When working with TrueNAS API methods, payloads, or field names, consult the official documentation:
- **API Docs**: https://api.truenas.com/v25.10/
- Always verify method signatures, required fields, and valid enum values against the docs before modifying API calls or payload builders.
- The API version in use is TrueNAS 25.10 (SCALE). Legacy CORE field names may differ.
## Key Design Constraints
- **Never overwrites or deletes** existing shares on the destination — conflict policy is skip-only.
- SSL verification is off by default (self-signed certs are common on TrueNAS).
- The WebSocket API endpoint is `wss://<host>:<port>/api/current` (TrueNAS 25.04+).
- The `call()` method has a 60-second per-message timeout and skips server-initiated notifications (messages without an `id`).


@@ -95,6 +95,10 @@ class Summary:
cfg_applied: bool = False
errors: list[str] = field(default_factory=list)
# Populated during dry-run dataset safety checks
paths_to_create: list[str] = field(default_factory=list)
missing_datasets: list[str] = field(default_factory=list)
def report(self) -> str:
w = 52
hr = "─" * w
@@ -122,6 +126,17 @@ class Summary:
lines.append(f"\n {len(self.errors)} error(s):")
for e in self.errors:
lines.append(f"{e}")
if self.missing_datasets:
lines.append(
f"\n WARNING: {len(self.missing_datasets)} share path(s) have no "
"matching dataset on the destination:"
)
for p in self.missing_datasets:
lines.append(f"{p}")
lines.append(
" These paths must exist before shares can be created.\n"
" Use interactive mode or answer 'y' at the dataset prompt to create them."
)
lines.append("")
return "\n".join(lines)
@@ -779,6 +794,83 @@ class TrueNASClient:
return msg.get("result")
# ─────────────────────────────────────────────────────────────────────────────
# Dataset safety checks
# ─────────────────────────────────────────────────────────────────────────────
async def check_dataset_paths(
client: TrueNASClient,
paths: list[str],
) -> list[str]:
"""
Return the subset of *paths* that have no matching ZFS dataset on the
destination (i.e. no dataset whose mountpoint equals that path).
Returns an empty list when the dataset query itself fails (with a warning).
"""
if not paths:
return []
unique = sorted({p.rstrip("/") for p in paths if p})
log.info("Checking %d share path(s) against destination datasets …", len(unique))
try:
datasets = await client.call("pool.dataset.query") or []
except RuntimeError as exc:
log.warning("Could not query datasets (skipping check): %s", exc)
return []
mountpoints = {
d.get("mountpoint", "").rstrip("/")
for d in datasets
if d.get("mountpoint")
}
missing = [p for p in unique if p not in mountpoints]
if missing:
for p in missing:
log.warning(" MISSING dataset for path: %s", p)
else:
log.info(" All share paths exist as datasets.")
return missing
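The core of `check_dataset_paths` is a set difference over normalized mountpoints. As a self-contained sketch with a hypothetical helper name (no client connection needed):

```python
def missing_paths(share_paths: list[str], mountpoints: list[str]) -> list[str]:
    """Sketch of the check above: strip trailing slashes on both sides,
    then return the share paths with no matching dataset mountpoint."""
    wanted = sorted({p.rstrip("/") for p in share_paths if p})
    have = {m.rstrip("/") for m in mountpoints if m}
    return [p for p in wanted if p not in have]
```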
async def create_dataset(client: TrueNASClient, path: str) -> bool:
"""
Create a ZFS dataset whose mountpoint will be *path*.
*path* must be an absolute /mnt/… path (e.g. /mnt/tank/data).
The dataset name is derived by stripping the leading /mnt/ prefix.
Returns True on success, False on failure.
"""
if not path.startswith("/mnt/"):
log.error("Cannot auto-create dataset for non-/mnt/ path: %s", path)
return False
name = path[5:].rstrip("/") # strip "/mnt/"
log.info("Creating dataset %r", name)
try:
await client.call("pool.dataset.create", [{"name": name}])
log.info(" Created: %s", name)
return True
except RuntimeError as exc:
log.error(" Failed to create dataset %r: %s", name, exc)
return False
async def _create_missing_datasets(
host: str,
port: int,
api_key: str,
paths: list[str],
verify_ssl: bool = False,
) -> None:
"""Open a fresh connection and create ZFS datasets for *paths*."""
async with TrueNASClient(
host=host, port=port, api_key=api_key, verify_ssl=verify_ssl,
) as client:
for path in paths:
await create_dataset(client, path)
# ─────────────────────────────────────────────────────────────────────────────
# Migration routines
# ─────────────────────────────────────────────────────────────────────────────
@@ -822,6 +914,8 @@ async def migrate_smb_shares(
log.info(" [DRY RUN] would create SMB share %r → %s",
name, payload.get("path"))
summary.smb_created += 1
if payload.get("path"):
summary.paths_to_create.append(payload["path"])
continue
try:
@@ -872,6 +966,8 @@ async def migrate_nfs_shares(
if dry_run:
log.info(" [DRY RUN] would create NFS export for %r", path)
summary.nfs_created += 1
if path:
summary.paths_to_create.append(path)
continue
try:
@@ -955,6 +1051,13 @@ async def run(
await migrate_smb_config(
client, archive["smb_config"], args.dry_run, summary)
# During dry runs, verify that every path we would create a share for
# actually exists as a ZFS dataset on the destination system.
if args.dry_run and summary.paths_to_create:
summary.missing_datasets = await check_dataset_paths(
client, summary.paths_to_create,
)
return summary
@@ -1074,6 +1177,32 @@ def interactive_mode() -> None:
)
print(dry_summary.report())
# Offer to create missing datasets before the live run
if dry_summary.missing_datasets:
non_mnt = [p for p in dry_summary.missing_datasets if not p.startswith("/mnt/")]
creatable = [p for p in dry_summary.missing_datasets if p.startswith("/mnt/")]
if non_mnt:
print(f" NOTE: {len(non_mnt)} path(s) cannot be auto-created "
"(not under /mnt/):")
for p in non_mnt:
print(f"{p}")
print()
if creatable:
print(f" {len(creatable)} dataset(s) can be created automatically:")
for p in creatable:
print(f"{p}")
print()
if _confirm(f"Create these {len(creatable)} dataset(s) on {host} now?"):
asyncio.run(_create_missing_datasets(
host=host,
port=port,
api_key=api_key,
paths=creatable,
))
print()
if not _confirm(f"Apply these changes to {host}?"):
print("Aborted — no changes made.")
sys.exit(0)