Skip to content

Design Decisions

Safety-First Release Strategy

  • Public PyPI package is a non-functional educational stub.
  • Functional source remains in the repo, with strict safety gates:
  • Requires PE_PACKER_ALLOW_PACKING=1 and --force.
  • Dry-run prints entropy/size and exits 0 without writing outputs.
  • CI enforces these gates via python/tests/test_safety.py.

Tech Stack

  • Rust core with PyO3 bindings (abi3) for stable wheels and performance.
  • Python layer for CLI/UX, data orchestration, and training workflows.
  • maturin for mixed Rust/Python builds; uv for dependency management.

CLI UX (Typer)

  • Subcommands: pack, generate-training-data, analyze-dataset, validate.
  • Clear error messages, explicit options, and verbose logging mode.
  • Gating implemented at the CLI layer for least surprise.

Dataset Generation Philosophy

  • Focus on educational, controlled variants.
  • Record rich metadata for ML evaluation and reproducibility.
  • Provide analyze-dataset to examine distributions and coverage.

Policy Alignment

  • SECURITY.md defines purpose, ethics, and contact.
  • LICENSE-APPENDIX.md clarifies educational use and liability.
  • CI blocks unsafe publish flows; the stub workflow checks for native content.