Add dry-run dataset existence checks with option to create missing datasets

During dry runs, call pool.dataset.query on the destination to verify that
every share path would have a backing ZFS dataset. Missing paths are collected
in Summary.missing_datasets and surfaced as a WARNING block in the report.

In interactive mode the user is prompted to create any auto-creatable
(/mnt/…) datasets before the live migration proceeds. Non-interactive
--dry-run mode prints the same warning in the summary report.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 09:37:44 -05:00
parent ab1781dbe2
commit ddb3fbd3ff
2 changed files with 184 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,55 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Single-file Python CLI tool (`truenas_migrate.py`) that reads SMB shares, NFS shares, and SMB global config from a TrueNAS debug archive (`.tar`/`.tgz`) and re-creates them on a destination TrueNAS system via its JSON-RPC 2.0 WebSocket API (TrueNAS 25.04+).
## Running the Tool
```bash
# Install the only dependency
pip install websockets
# Inspect the archive
python truenas_migrate.py --debug-tar debug.tgz --list-archive
# Dry run (no changes made)
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx" --dry-run
# Live migration
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx"
# Migrate a subset (smb, nfs, smb-config)
python truenas_migrate.py --debug-tar debug.tgz --dest 192.168.1.50 --api-key "1-xxx" --migrate smb
```
## Architecture
The script is structured as a linear pipeline with no external config files or modules:
1. **Archive parser** (`parse_archive`, `_find_data`) — Opens the `.tgz` using `tarfile`, tries a ranked list of known paths (`_CANDIDATES`) for each data type, then falls back to keyword heuristic scanning (`_KEYWORDS`). Handles both `ixdiagnose` (SCALE) and `freenas-debug` (CORE) archive layouts, as well as date-stamped top-level directories.
2. **Payload builders** (`_smb_share_payload`, `_nfs_share_payload`, `_smb_config_payload`) — Strip read-only/server-generated fields (`id`, `locked`, `server_sid`) before sending to the destination API.
3. **`TrueNASClient`** — Minimal async JSON-RPC 2.0 WebSocket client. Authenticates with `auth.login_with_api_key` and drains server-push notifications so that replies can be matched to requests by `id`. Used as an async context manager.
4. **Migration routines** (`migrate_smb_shares`, `migrate_nfs_shares`, `migrate_smb_config`) — Each queries existing shares on the destination first, then skips conflicts (SMB: by name case-insensitively; NFS: by path exactly), then creates or dry-runs each share.
5. **`Summary` dataclass** — Accumulates counts and errors, renders a box-drawing table at the end. Exit code 2 if any errors occurred.
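The payload-builder step above (item 2) boils down to a dict filter. A minimal sketch, assuming the field names listed in step 2 — `strip_readonly` and `READ_ONLY_FIELDS` are illustrative names, not the script's actual helpers:

```python
# Illustrative sketch of the payload-builder pattern: copy a share
# record read from the archive and drop server-generated fields before
# sending it to the destination's create call. The real builders
# (_smb_share_payload etc.) may strip additional fields.
READ_ONLY_FIELDS = {"id", "locked", "server_sid"}

def strip_readonly(share: dict) -> dict:
    """Return a create-payload with read-only fields removed."""
    return {k: v for k, v in share.items() if k not in READ_ONLY_FIELDS}
```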
## TrueNAS API Reference
When working with TrueNAS API methods, payloads, or field names, consult the official documentation:
- **API Docs**: https://api.truenas.com/v25.10/
- Always verify method signatures, required fields, and valid enum values against the docs before modifying API calls or payload builders.
- The API version in use is TrueNAS 25.10 (SCALE). Legacy CORE field names may differ.
## Key Design Constraints
- **Never overwrites or deletes** existing shares on the destination — conflict policy is skip-only.
- SSL verification is off by default (self-signed certs are common on TrueNAS).
- The WebSocket API endpoint is `wss://<host>:<port>/api/current` (TrueNAS 25.04+).
- The `call()` method has a 60-second per-message timeout and skips server-initiated notifications (messages without an `id`).


@@ -95,6 +95,10 @@ class Summary:
cfg_applied: bool = False
errors: list[str] = field(default_factory=list)
# Populated during dry-run dataset safety checks
paths_to_create: list[str] = field(default_factory=list)
missing_datasets: list[str] = field(default_factory=list)
def report(self) -> str:
w = 52
hr = "─" * w
@@ -122,6 +126,17 @@ class Summary:
lines.append(f"\n {len(self.errors)} error(s):")
for e in self.errors:
lines.append(f"{e}")
if self.missing_datasets:
lines.append(
f"\n WARNING: {len(self.missing_datasets)} share path(s) have no "
"matching dataset on the destination:"
)
for p in self.missing_datasets:
lines.append(f"{p}")
lines.append(
" These paths must exist before shares can be created.\n"
" Use interactive mode or answer 'y' at the dataset prompt to create them."
)
lines.append("")
return "\n".join(lines)
@@ -779,6 +794,83 @@ class TrueNASClient:
return msg.get("result")
# ─────────────────────────────────────────────────────────────────────────────
# Dataset safety checks
# ─────────────────────────────────────────────────────────────────────────────
async def check_dataset_paths(
client: TrueNASClient,
paths: list[str],
) -> list[str]:
"""
Return the subset of *paths* that have no matching ZFS dataset on the
destination (i.e. no dataset whose mountpoint equals that path).
Returns an empty list when the dataset query itself fails (with a warning).
"""
if not paths:
return []
unique = sorted({p.rstrip("/") for p in paths if p})
log.info("Checking %d share path(s) against destination datasets …", len(unique))
try:
datasets = await client.call("pool.dataset.query") or []
except RuntimeError as exc:
log.warning("Could not query datasets (skipping check): %s", exc)
return []
mountpoints = {
d.get("mountpoint", "").rstrip("/")
for d in datasets
if d.get("mountpoint")
}
missing = [p for p in unique if p not in mountpoints]
if missing:
for p in missing:
log.warning(" MISSING dataset for path: %s", p)
else:
log.info(" All share paths exist as datasets.")
return missing
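The core of `check_dataset_paths` is a set difference over normalized mountpoints. As a self-contained sketch with a hypothetical helper name (no client connection needed):

```python
def missing_paths(share_paths: list[str], mountpoints: list[str]) -> list[str]:
    """Sketch of the check above: strip trailing slashes on both sides,
    then return the share paths with no matching dataset mountpoint."""
    wanted = sorted({p.rstrip("/") for p in share_paths if p})
    have = {m.rstrip("/") for m in mountpoints if m}
    return [p for p in wanted if p not in have]
```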
async def create_dataset(client: TrueNASClient, path: str) -> bool:
"""
Create a ZFS dataset whose mountpoint will be *path*.
*path* must be an absolute /mnt/… path (e.g. /mnt/tank/data).
The dataset name is derived by stripping the leading /mnt/ prefix.
Returns True on success, False on failure.
"""
if not path.startswith("/mnt/"):
log.error("Cannot auto-create dataset for non-/mnt/ path: %s", path)
return False
name = path[5:].rstrip("/") # strip "/mnt/"
log.info("Creating dataset %r", name)
try:
await client.call("pool.dataset.create", [{"name": name}])
log.info(" Created: %s", name)
return True
except RuntimeError as exc:
log.error(" Failed to create dataset %r: %s", name, exc)
return False
async def _create_missing_datasets(
host: str,
port: int,
api_key: str,
paths: list[str],
verify_ssl: bool = False,
) -> None:
"""Open a fresh connection and create ZFS datasets for *paths*."""
async with TrueNASClient(
host=host, port=port, api_key=api_key, verify_ssl=verify_ssl,
) as client:
for path in paths:
await create_dataset(client, path)
# ─────────────────────────────────────────────────────────────────────────────
# Migration routines
# ─────────────────────────────────────────────────────────────────────────────
@@ -822,6 +914,8 @@ async def migrate_smb_shares(
log.info(" [DRY RUN] would create SMB share %r → %s",
name, payload.get("path"))
summary.smb_created += 1
if payload.get("path"):
summary.paths_to_create.append(payload["path"])
continue
try:
@@ -872,6 +966,8 @@ async def migrate_nfs_shares(
if dry_run:
log.info(" [DRY RUN] would create NFS export for %r", path)
summary.nfs_created += 1
if path:
summary.paths_to_create.append(path)
continue
try:
@@ -955,6 +1051,13 @@ async def run(
await migrate_smb_config(
client, archive["smb_config"], args.dry_run, summary)
# During dry runs, verify that every path we would create a share for
# actually exists as a ZFS dataset on the destination system.
if args.dry_run and summary.paths_to_create:
summary.missing_datasets = await check_dataset_paths(
client, summary.paths_to_create,
)
return summary
@@ -1074,6 +1177,32 @@ def interactive_mode() -> None:
)
print(dry_summary.report())
# Offer to create missing datasets before the live run
if dry_summary.missing_datasets:
non_mnt = [p for p in dry_summary.missing_datasets if not p.startswith("/mnt/")]
creatable = [p for p in dry_summary.missing_datasets if p.startswith("/mnt/")]
if non_mnt:
print(f" NOTE: {len(non_mnt)} path(s) cannot be auto-created "
"(not under /mnt/):")
for p in non_mnt:
print(f"{p}")
print()
if creatable:
print(f" {len(creatable)} dataset(s) can be created automatically:")
for p in creatable:
print(f"{p}")
print()
if _confirm(f"Create these {len(creatable)} dataset(s) on {host} now?"):
asyncio.run(_create_missing_datasets(
host=host,
port=port,
api_key=api_key,
paths=creatable,
))
print()
if not _confirm(f"Apply these changes to {host}?"):
print("Aborted — no changes made.")
sys.exit(0)