Zip Compression Module
Methods for zipping files and folders.
@author: Antony Vamvakeros
Example usage:
# (1) one zip per tif (recursive)
zips = zip_each_file_with_extension("D:data", "tif", recursive=True)
# (2) one zip containing all png + tif under folder
z = zip_all_files_with_extension("D:data", "png,tif", "all_images", recursive=True, keep_paths=True)
# (3) zip an explicit list of files (store basenames; rename if collisions)
z = zip_file_list(
    ["D:a1.tif", "D:b1.tif", "D:b2.tif"],
    "D:outselected.zip",
    keep_paths=False,
    on_collision="rename",
)
- nDTomo.methods.zip.seven_zip_file(file_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 5, out_path: str | Path | None = None) → Path
Create a .7z archive from a single file using 7-Zip.
- Parameters:
file_path – Path to the input file.
seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.
level – 7-Zip compression level (0..9). Default is a balanced value (5).
out_path – Optional output archive path. If omitted, creates <file_path>.7z alongside the input file.
- Returns:
Path to the created .7z archive.
- Return type:
pathlib.Path
- Raises:
FileNotFoundError – If file_path does not exist or is not a file, or if 7-Zip cannot be resolved.
RuntimeError – If the underlying 7-Zip process fails.
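As a rough illustration of how the parameters above could map onto a 7-Zip invocation, the sketch below assembles a command line for a single file without running it. The helper name `build_7z_file_command`, the default executable name `7z`, and the range check on `level` are assumptions made for this example, not part of nDTomo; `a`, `-t7z` and `-mx` are standard 7-Zip options.

```python
from pathlib import Path
from typing import List, Optional, Union

PathLike = Union[str, Path]


def build_7z_file_command(
    file_path: PathLike,
    *,
    seven_zip: PathLike = "7z",  # assumed executable name; nDTomo resolves it itself
    level: int = 5,
    out_path: Optional[PathLike] = None,
) -> List[str]:
    """Hypothetical helper: return the argv list for archiving one file."""
    if not 0 <= level <= 9:
        # Assumption of this sketch: out-of-range levels are rejected early.
        raise ValueError("level must be in the range 0..9")
    src = Path(file_path)
    # Default output is "<file_path>.7z", i.e. ".7z" appended to the full name.
    dst = Path(out_path) if out_path is not None else src.with_name(src.name + ".7z")
    return [str(seven_zip), "a", "-t7z", f"-mx={level}", str(dst), str(src)]


cmd = build_7z_file_command("scan_001.h5", level=5)
print(cmd)
```

The command is returned as a list rather than a single string so that it can be handed to `subprocess.run` without shell quoting issues.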
- nDTomo.methods.zip.seven_zip_folder(folder_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 5, out_dir: str | Path | None = None, include_folder: bool = True) → Path
Create a .7z archive from a directory using 7-Zip.
- Parameters:
folder_path – Directory to archive (contents are included recursively).
seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.
level – 7-Zip compression level (0..9). Default is a balanced value (5).
out_dir – Optional output directory. If omitted, writes <folder_path>.7z next to the source folder.
include_folder – Controls archive layout:
  - True: archive root contains the folder itself
  - False: archive root contains the folder contents
- Returns:
Path to the created .7z archive.
- Return type:
pathlib.Path
- Raises:
NotADirectoryError – If folder_path is not a directory.
FileNotFoundError – If 7-Zip cannot be resolved.
RuntimeError – If the underlying 7-Zip process fails.
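The two include_folder layouts are easiest to see as the member names each one would produce. The sketch below only computes those names from a temporary directory; it does not call 7-Zip, and `archive_member_names` is a hypothetical helper rather than an nDTomo function.

```python
import tempfile
from pathlib import Path
from typing import List


def archive_member_names(folder: Path, include_folder: bool) -> List[str]:
    """Hypothetical helper: list member names for each archive layout."""
    # include_folder=True  -> names relative to the parent (folder sits at the root)
    # include_folder=False -> names relative to the folder (contents sit at the root)
    root = folder.parent if include_folder else folder
    files = (p for p in folder.rglob("*") if p.is_file())
    return sorted(p.relative_to(root).as_posix() for p in files)


with tempfile.TemporaryDirectory() as tmp:
    scan = Path(tmp) / "scan"
    (scan / "raw").mkdir(parents=True)
    (scan / "log.txt").write_text("ok")
    (scan / "raw" / "frame.tif").write_bytes(b"\x00")

    with_folder = archive_member_names(scan, include_folder=True)
    contents_only = archive_member_names(scan, include_folder=False)

print(with_folder)    # ['scan/log.txt', 'scan/raw/frame.tif']
print(contents_only)  # ['log.txt', 'raw/frame.tif']
```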
- nDTomo.methods.zip.seven_zip_folder_split(folder_path: str | Path, *, seven_zip: str | Path | None = None, level: int = 1, volume: str | int = '10g', include_folder: bool = True, out_dir: str | Path | None = None) → List[Path]
Create a split-volume .7z archive from a directory using 7-Zip.
Output parts are named <name>.7z.001, <name>.7z.002, … and can be reassembled/extracted by 7-Zip when all parts are present.
- Parameters:
folder_path – Directory to archive (contents are included recursively).
seven_zip – Optional explicit path to the 7-Zip executable. If not provided, resolved via _resolve_7z.
level – 7-Zip compression level (0..9). 1 is fast and is often suitable for already-compressed scientific data formats.
volume – Target part size passed to 7-Zip via -v. Examples: "10g", "500m", "100k", "123b", or an integer byte count.
include_folder – Controls archive layout:
  - True: archive root contains the folder itself
  - False: archive root contains the folder contents
out_dir – Optional output directory. If omitted, parts are written next to the source folder.
- Returns:
Paths to the created part files (sorted by filename).
- Return type:
list of pathlib.Path
- Raises:
NotADirectoryError – If folder_path is not a directory.
FileNotFoundError – If 7-Zip cannot be resolved.
RuntimeError – If the underlying 7-Zip process fails, or if no part files are detected after the command completes.
ValueError, TypeError – If volume is invalid.
Notes
Part files are detected using the pattern <archive>.7z.<NNN...> in the output directory.
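The volume argument and the part-file pattern can be sketched with two small helpers. Both names (`volume_switch`, `find_parts`) are hypothetical, and the exact validation rules are assumptions: this sketch requires a unit suffix on strings and treats integers as byte counts, which may be stricter or looser than what nDTomo does.

```python
import re
from typing import List, Union


def volume_switch(volume: Union[str, int]) -> str:
    """Hypothetical helper: turn a part size into a 7-Zip ``-v`` switch."""
    if isinstance(volume, bool) or not isinstance(volume, (str, int)):
        raise TypeError("volume must be a str or an int")
    if isinstance(volume, int):
        if volume <= 0:
            raise ValueError("volume must be positive")
        return f"-v{volume}b"  # integer input is interpreted as bytes
    text = volume.strip().lower()
    # Assumed grammar: positive integer followed by one of b, k, m, g.
    if not re.fullmatch(r"[1-9]\d*[bkmg]", text):
        raise ValueError(f"invalid volume: {volume!r}")
    return f"-v{text}"


def find_parts(names: List[str], archive_name: str) -> List[str]:
    """Hypothetical helper: keep names matching '<archive>.7z.<NNN...>'."""
    pattern = re.compile(re.escape(archive_name) + r"\.\d{3,}")
    return sorted(n for n in names if pattern.fullmatch(n))


print(volume_switch("10g"), volume_switch(123))
listing = ["scan.7z.002", "scan.7z.001", "scan.7z", "scan.7z.tmp", "other.7z.001"]
parts = find_parts(listing, "scan.7z")
print(parts)
```

Sorting by filename works here because the numeric suffix is zero-padded.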
- nDTomo.methods.zip.zip_all_files_with_extension(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, fast: bool = True, keep_paths: bool = True, on_collision: str = 'raise') → Path
Convenience wrapper for creating a single ZIP from all matched files.
This is a thin wrapper around zip_by_extension and shares the same semantics and collision handling.
- Parameters:
folder – See zip_by_extension.
ext – See zip_by_extension.
zip_basename – See zip_by_extension.
recursive – See zip_by_extension.
fast – See zip_by_extension.
keep_paths – See zip_by_extension.
on_collision – See zip_by_extension.
- Returns:
Path to the created ZIP archive.
- Return type:
pathlib.Path
- nDTomo.methods.zip.zip_by_extension(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, fast: bool = True, keep_paths: bool = True, on_collision: str = 'raise') → Path
Create a ZIP archive containing all files with the requested extension(s).
- Parameters:
folder – Directory to scan.
ext – Extension specification(s) to include (see _normalize_exts), e.g. "tif", ".tif", "tif,png". Matching is case-insensitive.
zip_basename – Output ZIP name without the .zip suffix. The archive is created as <folder>/<zip_basename>.zip.
recursive – If True, include files from all subdirectories under folder.
fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).
keep_paths – If True (default), archive members store paths relative to folder (preserves folder structure). If False, only basenames are stored, which can cause name collisions.
on_collision – Collision policy when two input files map to the same archive member name (most commonly when keep_paths=False):
  - "raise": raise FileExistsError
  - "rename": append __<n> before the suffix
- Returns:
Path to the created ZIP archive.
- Return type:
pathlib.Path
- Raises:
FileNotFoundError – If folder does not exist.
NotADirectoryError – If folder is not a directory.
FileExistsError – If a name collision occurs and on_collision="raise".
ValueError – If on_collision is not one of {"raise", "rename"}.
Notes
Files are added in deterministic order (case-insensitive path sort) to make archive contents reproducible.
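The extension matching and deterministic ordering described above can be approximated with the standard library alone. In the sketch below, `normalize_exts` is a hypothetical stand-in for the private `_normalize_exts` (whose exact behaviour is not documented here), and `find_files` is likewise an illustrative name.

```python
import tempfile
from pathlib import Path
from typing import List, Set


def normalize_exts(ext: str) -> Set[str]:
    """Hypothetical stand-in for _normalize_exts: 'tif,.PNG' -> {'.tif', '.png'}."""
    parts = (p.strip().lower() for p in ext.split(","))
    exts = {p if p.startswith(".") else "." + p for p in parts if p}
    if not exts:
        raise ValueError("no extension given")
    return exts


def find_files(folder: Path, ext: str, recursive: bool = False) -> List[Path]:
    """Collect matching files in a case-insensitive, deterministic order."""
    wanted = normalize_exts(ext)
    candidates = folder.rglob("*") if recursive else folder.glob("*")
    hits = [p for p in candidates if p.is_file() and p.suffix.lower() in wanted]
    # Sort on the lower-cased relative path so the order does not depend on the OS.
    return sorted(hits, key=lambda p: p.relative_to(folder).as_posix().lower())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("b.TIF", "A.tif", "notes.txt", "sub/c.png"):
        (root / name).write_bytes(b"")
    top_level = [p.name for p in find_files(root, "tif")]
    everything = [
        p.relative_to(root).as_posix()
        for p in find_files(root, ".tif,png", recursive=True)
    ]

print(top_level)
print(everything)
```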
- nDTomo.methods.zip.zip_each_file_with_extension(folder: str | Path, ext: str, *, recursive: bool = False, fast: bool = True, out_dir: str | Path | None = None) → List[Path]
Create one ZIP archive per matched file.
Each output ZIP contains exactly one member (the input file), stored using the file basename.
- Parameters:
folder – Directory to scan.
ext – Extension specification(s) to include (see _normalize_exts).
recursive – If True, include files from all subdirectories under folder.
fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).
out_dir – Optional output directory. If omitted, each ZIP is created next to its corresponding input file.
- Returns:
Paths to the created ZIP archives (sorted by input file path).
- Return type:
list of pathlib.Path
- Raises:
NotADirectoryError – If folder is not a directory.
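A minimal sketch of the one-archive-per-file behaviour, using only Python's zipfile module, might look like the following. `zip_single_file` is a hypothetical helper, and the output naming (`<stem>.zip`) is an assumption, since the naming scheme is not stated above.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Optional


def zip_single_file(src: Path, *, fast: bool = True, out_dir: Optional[Path] = None) -> Path:
    """Hypothetical helper: write a ZIP holding only `src`, stored by basename."""
    target_dir = out_dir if out_dir is not None else src.parent
    target_dir.mkdir(parents=True, exist_ok=True)
    dst = target_dir / (src.stem + ".zip")  # assumed naming scheme
    # fast=True -> DEFLATE at level 1; fast=False -> STORED (no compression).
    compression = zipfile.ZIP_DEFLATED if fast else zipfile.ZIP_STORED
    level = 1 if fast else None
    with zipfile.ZipFile(dst, "w", compression=compression, compresslevel=level) as zf:
        zf.write(src, arcname=src.name)
    return dst


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "frame_0001.tif"
    src.write_bytes(b"\x00" * 4096)
    out = zip_single_file(src)
    out_name = out.name
    with zipfile.ZipFile(out) as zf:
        members = zf.namelist()
        method = zf.infolist()[0].compress_type

print(out_name, members, method == zipfile.ZIP_DEFLATED)
```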
- nDTomo.methods.zip.zip_file_list(files: Sequence[str | Path], out_zip: str | Path, *, base_dir: str | Path | None = None, fast: bool = True, keep_paths: bool = False, on_collision: str = 'raise') → Path
Create a ZIP archive from an explicit list of files.
- Parameters:
files – Sequence of file paths (absolute or relative). All paths must resolve to existing files.
out_zip – Output ZIP path. Parent directories are created if required.
base_dir – Base directory used to compute archive member paths when keep_paths=True. Must be provided in that case, and all files must be within base_dir.
fast – If True (default), uses DEFLATE with compresslevel=1 (fast, lossless). If False, uses STORED (no compression).
keep_paths – If True, store paths relative to base_dir (preserves structure). If False (default), store basenames only.
on_collision – Collision policy when multiple input files map to the same archive member name (primarily when keep_paths=False):
  - "raise": raise FileExistsError
  - "rename": append __<n> before the suffix
- Returns:
Path to the created ZIP archive.
- Return type:
pathlib.Path
- Raises:
FileNotFoundError – If any input file does not exist.
ValueError – If keep_paths=True and base_dir is not provided, or if on_collision is invalid.
FileExistsError – If a name collision occurs and on_collision="raise".
Notes
Inputs are archived in deterministic order (case-insensitive path sort). Archive member paths use forward slashes to be platform-independent.
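The interaction between keep_paths, base_dir and on_collision can be illustrated without writing any archive. Both helpers below are hypothetical, and starting the rename counter at 1 is an assumption; the documentation only states that `__<n>` is appended before the suffix.

```python
from pathlib import Path, PurePosixPath
from typing import Optional, Set


def member_name(path: Path, base_dir: Optional[Path], keep_paths: bool) -> str:
    """Archive member name: relative to base_dir (forward slashes) or basename only."""
    if not keep_paths:
        return path.name
    if base_dir is None:
        raise ValueError("base_dir is required when keep_paths=True")
    # relative_to raises ValueError if path is not inside base_dir.
    return path.relative_to(base_dir).as_posix()


def resolve_collision(name: str, used: Set[str], on_collision: str = "raise") -> str:
    """Return a unique member name, appending '__<n>' before the suffix if allowed."""
    if on_collision not in {"raise", "rename"}:
        raise ValueError("on_collision must be 'raise' or 'rename'")
    if name not in used:
        used.add(name)
        return name
    if on_collision == "raise":
        raise FileExistsError(f"duplicate archive member: {name}")
    p = PurePosixPath(name)
    n = 1  # assumed starting index
    while True:
        candidate = str(p.with_name(f"{p.stem}__{n}{p.suffix}"))
        if candidate not in used:
            used.add(candidate)
            return candidate
        n += 1


files = [Path("a/1.tif"), Path("b/1.tif"), Path("b/2.tif")]
used: Set[str] = set()
names = [
    resolve_collision(member_name(f, None, keep_paths=False), used, "rename")
    for f in files
]
print(names)
```

With keep_paths=True and a common base_dir, the same three inputs map to distinct member names, so no renaming is needed.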
- nDTomo.methods.zip.zip_with_7z(folder: str | Path, ext: str, zip_basename: str, *, recursive: bool = False, level: int = 1, seven_zip: str | Path | None = None, keep_paths: bool = True) → Path
Create a ZIP archive using 7-Zip (Deflate, multi-threaded).
This is typically faster than Python’s zipfile for large file sets.
- Parameters:
folder – Directory to scan.
ext – Extension specification(s) to include (see _normalize_exts).
zip_basename – Output ZIP name without the .zip suffix. The archive is created as <folder>/<zip_basename>.zip.
recursive – If True, include files from all subdirectories under folder.
level – 7-Zip compression level (0..9). 1 is fast; higher values trade speed for smaller archives.
seven_zip – Optional explicit path to the 7-Zip executable. If not provided, the executable is resolved via _resolve_7z.
keep_paths – If True (default), store paths relative to folder in the archive.
- Returns:
Path to the created ZIP archive.
- Return type:
pathlib.Path
- Raises:
ValueError – If recursive=True and keep_paths=False (7-Zip cannot reliably locate subfolder files when only basenames are provided).
FileNotFoundError – If folder does not exist or 7-Zip cannot be resolved.
NotADirectoryError – If folder is not a directory.
RuntimeError – If the underlying 7-Zip process fails.
Notes
A temporary listfile is used to avoid Windows command-length limits.
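The listfile approach mentioned in the note can be sketched as follows: member paths are written one per line to a temporary text file, and 7-Zip is pointed at it with the `@listfile` syntax. The helper name and the default executable name `7z` are assumptions, and the command is only built here, not executed; the actual switches nDTomo passes may differ.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Sequence, Tuple


def build_zip_command_with_listfile(
    out_zip: Path,
    members: Sequence[str],
    *,
    seven_zip: str = "7z",  # assumed executable name
    level: int = 1,
) -> Tuple[List[str], Path]:
    """Hypothetical helper: write members to a temp listfile, return (argv, listfile)."""
    fd, name = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write("\n".join(members) + "\n")
    listfile = Path(name)
    # '@<file>' makes 7-Zip read the names from the file, so the command line
    # stays short no matter how many members are archived.
    argv = [
        seven_zip, "a", "-tzip", f"-mx={level}", "-mm=Deflate", "-mmt=on",
        str(out_zip), f"@{listfile}",
    ]
    return argv, listfile


argv, listfile = build_zip_command_with_listfile(Path("images.zip"), ["a.tif", "sub/b.tif"])
try:
    listed = listfile.read_text(encoding="utf-8").splitlines()
finally:
    listfile.unlink()  # the caller is responsible for removing the listfile

print(argv[:7], listed)
```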