Zip Compression Module

Methods for zipping files and folders.

@author: Antony Vamvakeros

Example usage:

# (1) one zip per tif (recursive)
zips = zip_each_file_with_extension("D:data", "tif", recursive=True)

# (2) one zip containing all png + tif under folder
z = zip_all_files_with_extension("D:data", "png,tif", "all_images", recursive=True, keep_paths=True)

# (3) zip an explicit list of files (store basenames; rename if collisions)
z = zip_file_list(
    ["D:a1.tif", "D:b1.tif", "D:b2.tif"],
    "D:outselected.zip",
    keep_paths=False,
    on_collision="rename",
)

nDTomo.methods.zip.seven_zip_file(file_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 5, out_path: str | Path | None = None) → Path

Create a .7z archive from a single file using 7-Zip.

Parameters:
  • file_path – Path to the input file.

  • seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.

  • level – 7-Zip compression level (0..9). Default is a balanced value (5).

  • out_path – Optional output archive path. If omitted, creates <file_path>.7z alongside the input file.

Returns:

Path to the created .7z archive.

Return type:

pathlib.Path

Raises:
  • FileNotFoundError – If file_path does not exist or is not a file, or if 7-Zip cannot be resolved.

  • RuntimeError – If the underlying 7-Zip process fails.
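As a rough illustration of the parameters above, the sketch below assembles a 7-Zip command line for a single file without running it. The helper name build_7z_file_command is hypothetical and is not part of nDTomo. It also assumes that the default archive name is the full file name with .7z appended. The switches a, -t7z and -mx are standard 7-Zip command-line options.

```python
from pathlib import Path


def build_7z_file_command(file_path, seven_zip="7z", level=5, out_path=None):
    """Return (command, archive_path) for archiving a single file as .7z."""
    file_path = Path(file_path)
    if not 0 <= level <= 9:
        raise ValueError("level must be in 0..9")
    # Assumption: the default archive name appends ".7z" to the full file name.
    if out_path is not None:
        archive = Path(out_path)
    else:
        archive = file_path.with_name(file_path.name + ".7z")
    # "a" adds to an archive, -t7z selects the 7z format, -mx sets the level.
    command = [str(seven_zip), "a", "-t7z", f"-mx={level}", str(archive), str(file_path)]
    return command, archive


command, archive = build_7z_file_command("scan.tif", level=5)
print(command)
```

A caller could pass such a command to subprocess.run(command, check=True) and translate a non-zero exit code into RuntimeError, which would match the error behaviour documented above.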

nDTomo.methods.zip.seven_zip_folder(folder_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 5, out_dir: str | Path | None = None, include_folder: bool = True) → Path

Create a .7z archive from a directory using 7-Zip.

Parameters:
  • folder_path – Directory to archive (contents are included recursively).

  • seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.

  • level – 7-Zip compression level (0..9). Default is a balanced value (5).

  • out_dir – Optional output directory. If omitted, writes <folder_path>.7z next to the source folder.

  • include_folder – Controls archive layout:
      - True: archive root contains the folder itself
      - False: archive root contains the folder contents
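The effect of include_folder on member names can be sketched with pure path arithmetic. The helper member_name below is hypothetical and only models the two layouts described above. It does not call 7-Zip and is not the nDTomo implementation.

```python
from pathlib import PurePosixPath


def member_name(folder, file, include_folder=True):
    """Return the archive member name for `file` located under `folder`."""
    folder = PurePosixPath(folder)
    file = PurePosixPath(file)
    # True: names are relative to the folder's parent, so they start with the folder name.
    # False: names are relative to the folder itself.
    root = folder.parent if include_folder else folder
    return file.relative_to(root).as_posix()


with_folder = member_name("/data/scan01", "/data/scan01/raw/a.tif", include_folder=True)
contents_only = member_name("/data/scan01", "/data/scan01/raw/a.tif", include_folder=False)
print(with_folder)    # scan01/raw/a.tif
print(contents_only)  # raw/a.tif
```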

Returns:

Path to the created .7z archive.

Return type:

pathlib.Path

Raises:

nDTomo.methods.zip.seven_zip_folder_split(folder_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 1, volume: str | int = '10g', include_folder: bool = True, out_dir: str | Path | None = None) → List[Path]

Create a split-volume .7z archive from a directory using 7-Zip.

Output parts are named <name>.7z.001, <name>.7z.002, … and can be reassembled/extracted by 7-Zip when all parts are present.

Parameters:
  • folder_path – Directory to archive (contents are included recursively).

  • seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.

  • level – 7-Zip compression level (0..9). 1 is fast and is often suitable for already-compressed scientific data formats.

  • volume – Target part size passed to 7-Zip via -v. Examples: "10g", "500m", "100k", "123b", or an integer byte count.

  • include_folder – Controls archive layout:
      - True: archive root contains the folder itself
      - False: archive root contains the folder contents

  • out_dir – Optional output directory. If omitted, parts are written next to the source folder.
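One plausible way to turn the volume argument into the -v switch is sketched below. The helper volume_flag is hypothetical, and the validation rules (positive sizes, suffix limited to b, k, m or g) are assumptions made for illustration rather than documented nDTomo behaviour.

```python
import re


def volume_flag(volume):
    """Turn a part size such as "10g" or 1048576 into a 7-Zip -v switch."""
    if isinstance(volume, int) and not isinstance(volume, bool):
        if volume <= 0:
            raise ValueError("volume must be positive")
        return f"-v{volume}b"  # plain integers are treated as a byte count
    text = str(volume).strip().lower()
    # 7-Zip accepts a number followed by b, k, m or g.
    if not re.fullmatch(r"[1-9]\d*[bkmg]", text):
        raise ValueError(f"unsupported volume size: {volume!r}")
    return f"-v{text}"


print(volume_flag("10g"))  # -v10g
print(volume_flag(123))    # -v123b
```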

Returns:

Paths to the created part files (sorted by filename).

Return type:

list of pathlib.Path

Raises:

Notes

Part files are detected using the pattern <archive>.7z.<NNN...> in the output directory.
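The part-file pattern described in the note can be approximated with a regular expression. The function find_parts below is a hypothetical stand-in written for illustration. It assumes that part suffixes consist of at least three digits.

```python
import re
import tempfile
from pathlib import Path


def find_parts(out_dir, archive_name):
    """Return sorted paths matching <archive_name>.7z.<digits> in out_dir."""
    pattern = re.compile(re.escape(archive_name) + r"\.7z\.\d{3,}")
    parts = [p for p in Path(out_dir).iterdir() if p.is_file() and pattern.fullmatch(p.name)]
    return sorted(parts, key=lambda p: p.name)


with tempfile.TemporaryDirectory() as tmp:
    # Two real parts plus two files that must not be picked up.
    for name in ("scan.7z.002", "scan.7z.001", "scan.7z.tmp", "other.7z.001"):
        (Path(tmp) / name).write_bytes(b"")
    part_names = [p.name for p in find_parts(tmp, "scan")]

print(part_names)  # ['scan.7z.001', 'scan.7z.002']
```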

nDTomo.methods.zip.zip_all_files_with_extension(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, fast: bool = True, keep_paths: bool = True, on_collision: str = 'raise') → Path

Convenience wrapper for creating a single ZIP from all matched files.

This is a thin wrapper around zip_by_extension and shares the same semantics and collision handling.

Parameters:
  • folder – See zip_by_extension.

  • ext – See zip_by_extension.

  • zip_basename – See zip_by_extension.

  • recursive – See zip_by_extension.

  • fast – See zip_by_extension.

  • keep_paths – See zip_by_extension.

  • on_collision – See zip_by_extension.

Returns:

Path to the created ZIP archive.

Return type:

pathlib.Path

nDTomo.methods.zip.zip_by_extension(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, fast: bool = True, keep_paths: bool = True, on_collision: str = 'raise') → Path

Create a ZIP archive containing all files with the requested extension(s).

Parameters:
  • folder – Directory to scan.

  • ext – Extension specification(s) to include (see _normalize_exts), e.g. "tif", ".tif", "tif,png". Matching is case-insensitive.

  • zip_basename – Output ZIP name without the .zip suffix. The archive is created as <folder>/<zip_basename>.zip.

  • recursive – If True, include files from all subdirectories under folder.

  • fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).

  • keep_paths – If True (default), archive members store paths relative to folder (preserves folder structure). If False, only basenames are stored, which can cause name collisions.

  • on_collision – Collision policy when two input files map to the same archive member name (most commonly when keep_paths=False):
      - "raise": raise FileExistsError
      - "rename": append __<n> before the suffix
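The two collision policies can be illustrated with a small name resolver. The helper resolve_name is hypothetical. In particular, starting the counter at 1 and comparing names case-sensitively are assumptions, since the documentation above only states that __<n> is appended before the suffix.

```python
from pathlib import PurePosixPath


def resolve_name(name, used, on_collision="raise"):
    """Return a member name not already in `used`, and record it there."""
    if on_collision not in ("raise", "rename"):
        raise ValueError(f"invalid on_collision: {on_collision!r}")
    candidate = name
    if candidate in used:
        if on_collision == "raise":
            raise FileExistsError(f"duplicate archive member: {name}")
        path = PurePosixPath(name)
        n = 1  # assumption: numbering starts at 1
        while candidate in used:
            # Insert __<n> between the stem and the suffix, e.g. a.tif -> a__1.tif.
            candidate = path.with_name(f"{path.stem}__{n}{path.suffix}").as_posix()
            n += 1
    used.add(candidate)
    return candidate


used = set()
names = [resolve_name(n, used, "rename") for n in ("a.tif", "a.tif", "a.tif", "b.tif")]
print(names)  # ['a.tif', 'a__1.tif', 'a__2.tif', 'b.tif']
```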

Returns:

Path to the created ZIP archive.

Return type:

pathlib.Path

Raises:

Notes

Files are added in deterministic order (case-insensitive path sort) to make archive contents reproducible.
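Extension matching and the deterministic ordering mentioned in the note might look roughly like the sketch below. Both helpers are hypothetical. normalize_exts is only a guess at the behaviour of the private _normalize_exts helper, which is not documented here, and find_files is not an nDTomo function.

```python
import tempfile
from pathlib import Path


def normalize_exts(ext):
    """Turn "tif", ".TIF" or "tif,png" into a set like {".tif", ".png"}."""
    parts = [e.strip().lower() for e in ext.split(",")]
    return {e if e.startswith(".") else "." + e for e in parts if e}


def find_files(folder, ext, recursive=False):
    """Return matching files under folder in case-insensitive path order."""
    folder = Path(folder)
    wanted = normalize_exts(ext)
    candidates = folder.rglob("*") if recursive else folder.iterdir()
    matches = [p for p in candidates if p.is_file() and p.suffix.lower() in wanted]
    return sorted(matches, key=lambda p: p.relative_to(folder).as_posix().lower())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for rel in ("b.TIF", "A.tif", "notes.txt", "sub/c.png"):
        (root / rel).write_bytes(b"")
    found = [p.relative_to(root).as_posix() for p in find_files(root, "tif,.png", recursive=True)]

print(found)  # ['A.tif', 'b.TIF', 'sub/c.png']
```

Sorting on the lower-cased relative path makes the order independent of how the file system happens to enumerate directory entries.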

nDTomo.methods.zip.zip_each_file_with_extension(folder: str | Path, ext: str, *, recursive: bool = False, fast: bool = True, out_dir: str | Path | None = None) → List[Path]

Create one ZIP archive per matched file.

Each output ZIP contains exactly one member (the input file), stored using the file basename.

Parameters:
  • folder – Directory to scan.

  • ext – Extension specification(s) to include (see _normalize_exts).

  • recursive – If True, include files from all subdirectories under folder.

  • fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).

  • out_dir – Optional output directory. If omitted, each ZIP is created next to its corresponding input file.

Returns:

Paths to the created ZIP archives (sorted by input file path).

Return type:

list of pathlib.Path

Raises:
  • NotADirectoryError – If folder is not a directory.
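The per-file behaviour and the fast switch can be sketched with the standard zipfile module. The helper zip_single_file is hypothetical. Naming the archive <stem>.zip is an assumption, because the output naming scheme is not stated above.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_single_file(file, out_dir=None, fast=True):
    """Write one ZIP holding `file` under its basename and return the ZIP path."""
    file = Path(file)
    target_dir = Path(out_dir) if out_dir is not None else file.parent
    target_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the archive is named after the input file with a .zip suffix.
    zip_path = target_dir / (file.stem + ".zip")
    if fast:
        options = {"compression": zipfile.ZIP_DEFLATED, "compresslevel": 1}
    else:
        options = {"compression": zipfile.ZIP_STORED}
    with zipfile.ZipFile(zip_path, "w", **options) as zf:
        # arcname=file.name stores only the basename, not the full path.
        zf.write(file, arcname=file.name)
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "frame_0001.tif"
    src.write_bytes(b"\x00" * 4096)
    created = zip_single_file(src)
    with zipfile.ZipFile(created) as zf:
        members = zf.namelist()
        method = zf.getinfo("frame_0001.tif").compress_type

print(created.name, members)
```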

nDTomo.methods.zip.zip_file_list(files: Sequence[str | Path], out_zip: str | Path, *, base_dir: str | Path | None = None, fast: bool = True, keep_paths: bool = False, on_collision: str = 'raise') → Path

Create a ZIP archive from an explicit list of files.

Parameters:
  • files – Sequence of file paths (absolute or relative). All paths must resolve to existing files.

  • out_zip – Output ZIP path. Parent directories are created if required.

  • base_dir – Base directory used to compute archive member paths when keep_paths=True. Must be provided in that case, and all files must be within base_dir.

  • fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).

  • keep_paths – If True, store paths relative to base_dir (preserves structure). If False (default), store basenames only.

  • on_collision – Collision policy when multiple input files map to the same archive member name (primarily when keep_paths=False):
      - "raise": raise FileExistsError
      - "rename": append __<n> before the suffix

Returns:

Path to the created ZIP archive.

Return type:

pathlib.Path

Raises:
  • FileNotFoundError – If any input file does not exist.

  • ValueError – If keep_paths=True and base_dir is not provided, or if on_collision is invalid.

  • FileExistsError – If a name collision occurs and on_collision="raise".

Notes

Inputs are archived in deterministic order (case-insensitive path sort). Archive member paths use forward slashes to be platform-independent.
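The relationship between keep_paths, base_dir and the forward-slash member names can be sketched as follows. The helper archive_name is hypothetical and only mirrors the rules documented above. It is not the nDTomo implementation.

```python
from pathlib import Path


def archive_name(file, base_dir=None, keep_paths=False):
    """Return the member name for `file`, using forward slashes."""
    file = Path(file).resolve()
    if not keep_paths:
        return file.name  # basename only
    if base_dir is None:
        raise ValueError("base_dir is required when keep_paths=True")
    # relative_to raises ValueError if file is not inside base_dir.
    return file.relative_to(Path(base_dir).resolve()).as_posix()


flat = archive_name("data/run1/a.tif")
nested = archive_name("data/run1/a.tif", base_dir="data", keep_paths=True)
print(flat)    # a.tif
print(nested)  # run1/a.tif
```

Using as_posix() yields forward slashes on every platform, which is what makes the member names portable between Windows and POSIX systems.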

nDTomo.methods.zip.zip_with_7z(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, level: int = 1, seven_zip: str | Path | None = None, keep_paths: bool = True) → Path

Create a ZIP archive using 7-Zip (Deflate, multi-threaded).

This is typically faster than Python’s zipfile for large file sets.

Parameters:
  • folder – Directory to scan.

  • ext – Extension specification(s) to include (see _normalize_exts).

  • zip_basename – Output ZIP name without the .zip suffix. The archive is created as <folder>/<zip_basename>.zip.

  • recursive – If True, include files from all subdirectories under folder.

  • level – 7-Zip compression level (0..9). 1 is fast; higher values trade speed for smaller archives.

  • seven_zip – Optional explicit path to the 7-Zip executable. If not provided, the executable is resolved via _resolve_7z.

  • keep_paths – If True (default), store paths relative to folder in the archive.

Returns:

Path to the created ZIP archive.

Return type:

pathlib.Path

Raises:
  • ValueError – If recursive=True and keep_paths=False (7-Zip cannot reliably locate subfolder files when only basenames are provided).

  • FileNotFoundError – If folder does not exist or 7-Zip cannot be resolved.

  • NotADirectoryError – If folder is not a directory.

  • RuntimeError – If the underlying 7-Zip process fails.

Notes

A temporary listfile is used to avoid Windows command-length limits.
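The listfile approach from the note can be sketched as below. The helper build_zip_command is hypothetical and only prepares the command without running 7-Zip. It assumes that the listfile contains paths relative to folder and that the command would later be executed with folder as the working directory. The a, -tzip and -mx switches and the @listfile syntax are standard 7-Zip command-line features.

```python
import tempfile
from pathlib import Path


def build_zip_command(folder, files, zip_basename, seven_zip="7z", level=1):
    """Write a listfile and return (command, listfile_path) for a 7-Zip ZIP run."""
    folder = Path(folder)
    # One relative path per line, written as UTF-8.
    lines = [Path(f).relative_to(folder).as_posix() for f in files]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as fh:
        fh.write("\n".join(lines) + "\n")
        listfile = Path(fh.name)
    archive = folder / f"{zip_basename}.zip"
    # -tzip selects ZIP output, -mx sets the level, @<file> points 7-Zip at the listfile.
    command = [str(seven_zip), "a", "-tzip", f"-mx={level}", str(archive), f"@{listfile}"]
    return command, listfile


folder = Path("data")
command, listfile = build_zip_command(
    folder, [folder / "a.tif", folder / "sub" / "b.tif"], "all_tif"
)
listed = listfile.read_text(encoding="utf-8").splitlines()
listfile.unlink()  # the caller removes the temporary listfile after use
print(command[:4], listed)
```

Because the file names travel through the listfile, the command line stays short no matter how many files are matched.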