Contributor guide¶
Development environment¶
If you’re reading this, you’re probably interested in contributing to py7zr. Thank you very much! The purpose of this guide is to get you to the point where you can make improvements to py7zr and share them with the rest of the team.
Setup Python¶
py7zr is written in the Python programming language. Python can be installed in various ways on various platforms. You need a Python environment that supports the pip command. Using venv or virtualenv is recommended for development.
The test suite runs against Python 3.9, 3.10, 3.11, 3.12, 3.13, and pypy3. If you want to run all the tests against these versions and variants locally, you need to install each of them. You can also run the tests in the CI environment on GitHub Actions.
Get Early Feedback¶
If you are contributing, do not feel the need to sit on your contribution until it is perfectly polished and complete. It helps everyone involved for you to seek feedback as early as you possibly can. Submitting an early, unfinished version of your contribution for feedback in no way prejudices your chances of getting that contribution accepted, and can save you from putting a lot of work into a contribution that is not suitable for the project.
Code Contributions¶
Steps submitting code¶
When contributing code, you’ll want to follow this checklist:
Fork the repository on GitHub.
Run the tox tests to confirm they all pass on your system. If they don’t, you’ll need to investigate why they fail. If you’re unable to diagnose this yourself, raise it as a bug report.
Write tests that demonstrate your bug or feature. Ensure that they fail.
Make your change.
Run the entire test suite again using tox, confirming that all tests pass including the ones you just added.
Send a GitHub Pull Request to the main repository’s master branch. GitHub Pull Requests are the expected method of code collaboration on this project.
Code review¶
Contributions will not be merged until they have been code reviewed. The team has a limited number of reviewers, so reviews from other contributors are also welcome. You should implement review feedback unless you strongly object to it.
Code style¶
py7zr uses the PEP8 code style. In addition to standard PEP8, we have an extended guideline:
Line length should not exceed 125 characters.
The project also enforces static type checking with MyPy.
Profiling¶
CPU and memory profiling¶
Run-time memory errors and leaks are among the most difficult errors to locate and the most important to correct. Memory profiling is used to detect memory leaks and unwanted memory usage.
Improving performance is also difficult work. CPU profiling helps us understand where the hot spots are in a program's execution.
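As a rough illustration of CPU profiling, the sketch below uses the standard library's cProfile and pstats modules. The `busy_work` function and the `profile_call` helper are hypothetical stand-ins introduced only for this example; they are not part of py7zr or of its tox configuration.

```python
import cProfile
import io
import pstats


def busy_work(n: int) -> int:
    """Hypothetical stand-in for the code under investigation."""
    return sum(i * i for i in range(n))


def profile_call(func, *args) -> str:
    """Run func under cProfile and return a text report of the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    out = io.StringIO()
    # Sort by cumulative time so that the hot spots appear first.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile_call(busy_work, 100_000)
print(report)
```

The entries with the largest cumulative time are the first candidates to look at when optimizing.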
mprofile¶
mprofile is a memory profiling tool for Python. The py7zr project has a test configuration for memory profiling.
env PYTEST_ADDOPTS=--run-slow tox -e mprof
This example runs all the test cases, including those that take a long time to run.
After running the tests, you can find a chart, memory-profile.png, in the project root, along with the raw data as mprofile_yyyyMMddhhmmss.dat.
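For a quick check of a single function outside the tox environment, the standard library's tracemalloc module can report peak traced memory. This is a different technique from the mprofile-based test configuration above, shown only as a minimal sketch; `allocate_blocks` and `measure_peak` are hypothetical names.

```python
import tracemalloc


def allocate_blocks(count: int, size: int) -> list:
    """Hypothetical workload that keeps `count` byte buffers alive."""
    return [bytearray(size) for _ in range(count)]


def measure_peak(func, *args):
    """Call func once and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


blocks, peak = measure_peak(allocate_blocks, 100, 10_000)
print(f"peak traced memory: {peak} bytes")
```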
Class and module design¶
py7zr's classes are organized into several submodules according to their roles.
The main class is py7zr.SevenZipFile, which provides the API for library users. The main internal classes are in the submodule py7zr.archiveinfo, whose class structure mirrors the structure of the .7z file format.
Another important submodule is py7zr.compressor, which holds all the compression and encryption proxy classes. These wrap the corresponding libraries and convert their various interfaces into the common ISevenZipCompressor() and ISevenZipDecompressor() interfaces.
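The proxy idea can be sketched with the standard library alone. The classes below are illustrative stand-ins, not the actual py7zr classes: `CompressorInterface` and `DecompressorInterface` only mimic the shape of a common interface with compress(), flush() and decompress(), and the raw-deflate adapters are an assumption chosen because zlib is always available.

```python
import zlib
from abc import ABC, abstractmethod


class CompressorInterface(ABC):
    """Illustrative common interface for compressors (hypothetical name)."""

    @abstractmethod
    def compress(self, data: bytes) -> bytes: ...

    @abstractmethod
    def flush(self) -> bytes: ...


class DecompressorInterface(ABC):
    """Illustrative common interface for decompressors (hypothetical name)."""

    @abstractmethod
    def decompress(self, data: bytes) -> bytes: ...


class RawDeflateCompressor(CompressorInterface):
    """Adapts zlib's compressobj (raw deflate, no header) to the interface."""

    def __init__(self, level: int = 6) -> None:
        self._obj = zlib.compressobj(level, zlib.DEFLATED, -15)

    def compress(self, data: bytes) -> bytes:
        return self._obj.compress(data)

    def flush(self) -> bytes:
        return self._obj.flush()


class RawDeflateDecompressor(DecompressorInterface):
    """Adapts zlib's decompressobj to the interface."""

    def __init__(self) -> None:
        self._obj = zlib.decompressobj(-15)

    def decompress(self, data: bytes) -> bytes:
        return self._obj.decompress(data)


def pack(compressor: CompressorInterface, chunks) -> bytes:
    """Feed chunks through any compressor that follows the interface."""
    body = b"".join(compressor.compress(chunk) for chunk in chunks)
    return body + compressor.flush()


payload = [b"hello " * 100, b"world " * 100]
packed = pack(RawDeflateCompressor(), payload)
restored = RawDeflateDecompressor().decompress(packed)
```

Because `pack` only relies on the interface, another adapter (for example one wrapping a different library) could be passed in without changing the caller.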
All UI-related classes and functions are separated from the core modules. The cli submodule is the place for command line functions and pretty printing.
[Figure: package dependency diagram. py7zr depends on py7zr.exceptions, py7zr.properties and py7zr.py7zr. py7zr.py7zr depends on archiveinfo, callbacks, compressor, exceptions, helpers and properties. py7zr.archiveinfo depends on compressor, exceptions, helpers and properties. py7zr.compressor depends on exceptions, helpers and properties. py7zr.cli depends on callbacks, compressor, helpers, properties and py7zr.py7zr. py7zr.__main__ and py7zr.win32compat are shown without edges.]
Here is the whole class diagram. The following sections describe it part by part.
[Figure: whole class diagram. Classes shown: AESCompressor, AESDecompressor, ArchiveCallback, ArchiveFile, ArchiveFileList, ArchiveInfo, Buffer, Callback, CompressionMethod, CompressorChain, CopyCompressor, CopyDecompressor, DecompressorChain, DeflateCompressor, DeflateDecompressor, ExtractCallback, FileInfo, FilesInfo, Folder, Header, HeaderStreamsInfo, ISevenZipCompressor, ISevenZipDecompressor, MemIO, NullIO, PackInfo, SevenZipCompressor, SevenZipDecompressor, SevenZipFile, SignatureHeader, StreamsInfo, SubstreamsInfo, SupportedMethods, UnpackInfo, Worker, ZstdCompressor and ZstdDecompressor. SevenZipFile holds an ArchiveFileList (files), a Header (header), a SignatureHeader (sig_header) and a Worker (worker). ArchiveCallback and ExtractCallback derive from Callback. AESCompressor and AESDecompressor each hold a Buffer (buf).]
Header classes¶
Header-related classes are in the py7zr.archiveinfo submodule.
[Figure: header class diagram. SevenZipFile holds a SignatureHeader (sig_header) and a Header (header). Header holds StreamsInfo objects (main_streams, additional_streams) and has a files_info attribute of type FilesInfo. StreamsInfo holds a PackInfo (packinfo), an UnpackInfo (unpackinfo) and a SubstreamsInfo (substreamsinfo). UnpackInfo holds a list of Folder objects (folders). HeaderStreamsInfo derives from StreamsInfo and holds a PackInfo and an UnpackInfo.]
Compressor classes¶
Compression-related classes are in the py7zr.compressor submodule.
[Figure: compressor class diagram. AESCompressor, CopyCompressor, DeflateCompressor and ZstdCompressor derive from ISevenZipCompressor, which declares compress(data) and flush(). AESDecompressor, CopyDecompressor, DeflateDecompressor and ZstdDecompressor derive from ISevenZipDecompressor, which declares decompress(data). SevenZipCompressor holds a CompressorChain and SevenZipDecompressor holds a DecompressorChain (both as cchain). Folder holds a SevenZipCompressor (compressor) and a SevenZipDecompressor (decompressor). CompressionMethod and SupportedMethods are also shown.]
IO Abstraction classes¶
There are two IO abstraction classes, MemIO and NullIO, which support the in-memory API and the check method.
[Figure: IO abstraction class diagram. MemIO provides close(), flush(), mkdir(parents, exist_ok), open(mode), read(length), seek(position) and write(data). NullIO provides close(), flush(), mkdir(), open(mode), read(length) and write(data).]
Callback classes¶
Here is the callback interface class. The ExtractCallback class is a concrete class used in the CLI.
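To make the callback pattern concrete, here is a minimal sketch built on the standard library only. `ProgressCallback` is a hypothetical stand-in that reuses the method names of the Callback class; the argument types, the `RecordingCallback` class, the `simulate_extraction` driver and the order of calls are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class ProgressCallback(ABC):
    """Hypothetical stand-in for a callback interface; not imported from py7zr."""

    @abstractmethod
    def report_start_preparation(self) -> None: ...

    @abstractmethod
    def report_start(self, processing_file_path: str, processing_bytes: int) -> None: ...

    @abstractmethod
    def report_end(self, processing_file_path: str, wrote_bytes: int) -> None: ...

    @abstractmethod
    def report_warning(self, message: str) -> None: ...

    @abstractmethod
    def report_postprocess(self) -> None: ...


class RecordingCallback(ProgressCallback):
    """Concrete callback that records every event it receives."""

    def __init__(self) -> None:
        self.events: List[Tuple] = []

    def report_start_preparation(self) -> None:
        self.events.append(("prepare",))

    def report_start(self, processing_file_path: str, processing_bytes: int) -> None:
        self.events.append(("start", processing_file_path, processing_bytes))

    def report_end(self, processing_file_path: str, wrote_bytes: int) -> None:
        self.events.append(("end", processing_file_path, wrote_bytes))

    def report_warning(self, message: str) -> None:
        self.events.append(("warning", message))

    def report_postprocess(self) -> None:
        self.events.append(("postprocess",))


def simulate_extraction(members, callback: ProgressCallback) -> None:
    """Hypothetical driver that reports progress for (name, size) pairs."""
    callback.report_start_preparation()
    for name, size in members:
        callback.report_start(name, size)
        if size == 0:
            callback.report_warning(f"{name} is empty")
        callback.report_end(name, size)
    callback.report_postprocess()


recorder = RecordingCallback()
simulate_extraction([("a.txt", 10), ("b.txt", 0)], recorder)
```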
[Figure: callback class diagram. ExtractCallback derives from Callback, which declares report_start_preparation(), report_start(processing_file_path, processing_bytes), report_end(processing_file_path, wrote_bytes), report_postprocess() and report_warning(message).]
Classes details¶
Here is the detailed interface documentation for implementers.
ArchiveFile Objects¶
Read 7zip format archives.
- class py7zr.py7zr.ArchiveFile(id: int, file_info: dict[str, Any])[source]¶
Represents the metadata of each file inside an archive. It holds file properties: filename, permissions, and type (directory, link, or normal file).
Instances of the ArchiveFile class are returned by iterating files_list of SevenZipFile objects. Each object stores information about a single member of the 7z archive. Most users use extractall().
The class also holds an archive parameter indicating where the file exists in the archive file's folder (container).
- property archivable: bool¶
File has a Windows archive flag.
- property compressed: int | None¶
Compressed size
- property crc32: int | None¶
CRC of archived file (optional)
- property emptystream: bool¶
True if file is empty (0-byte file), otherwise False.
- file_properties() dict[str, Any] [source]¶
Return file properties as a dict. The following keys are included: ‘readonly’, ‘is_directory’, ‘posix_mode’, ‘archivable’, ‘emptystream’, ‘filename’, ‘creationtime’, ‘lastaccesstime’, ‘lastwritetime’, ‘attributes’
- property filename: str¶
Return the filename of the archive file.
- property is_directory: bool¶
True if file is a directory, otherwise False.
- property is_junction: bool¶
True if file is a junction/reparse point on Windows, otherwise False.
- property is_socket: bool¶
True if file is a socket, otherwise False.
- property is_symlink: bool¶
True if file is a symbolic link, otherwise False.
- property lastwritetime: ArchiveTimestamp | None¶
Return last written timestamp of a file.
- property posix_mode: int | None¶
POSIX mode when the member has a Unix extension property, or None. Returns a file stat mode that can be set by os.chmod().
- property readonly: bool¶
True if file is readonly, otherwise False.
- property st_fmt: int | None¶
- Returns:
Return the portion of the file mode that describes the file type
- class py7zr.py7zr.ArchiveInfo(filename: str, stat: stat_result, header_size: int, method_names: list[str], solid: bool, blocks: int, uncompressed: list[int])[source]¶
Hold archive information
- class py7zr.py7zr.FileInfo(filename, compressed, uncompressed, archivable, is_directory, creationtime, crc32)[source]¶
Hold archived file information.
- class py7zr.py7zr.SevenZipFile(file: BinaryIO | str | Path, mode: str = 'r', *, filters: list[dict[str, int]] | None = None, dereference=False, password: str | None = None, header_encryption: bool = False, blocksize: int | None = None, mp: bool = False)[source]¶
The SevenZipFile Class provides an interface to 7z archives.
- close()[source]¶
Flush all the data into the archive and close it. On close, py7zr starts reading the targets and writing the actual archive file.
- extractall(path: Any | None = None, *, callback: ExtractCallback | None = None, factory: WriterFactory | None = None) None [source]¶
Extract all members from the archive to the current working directory and set owner, modification time and permissions on directories afterward. path specifies a different directory to extract to.
- getinfo(name: str) FileInfo [source]¶
Return a FileInfo object with information about the archive member name. Calling getinfo() for a name not currently contained in the archive will raise a KeyError.
- class py7zr.py7zr.Worker(files, src_start: int, header, mp=False)[source]¶
Extract worker class to invoke handler.
- archive(fp: BinaryIO, files, folder, deref=False)[source]¶
Run archive task for specified 7zip folder.
- decompress(fp: BinaryIO, folder, fq: IO[Any], size: int, compressed_size: int | None, src_end: int, q: Queue | None = None) int [source]¶
decompressor wrapper called from extract method.
- Parameters:
fp – archive source file pointer
folder – Folder object that has a decompressor object.
fq – output file pathlib.Path
size – uncompressed size of target file.
compressed_size – compressed size of target file.
src_end – end position of the folder
q – the queue for the reporter
:returns None
- extract(fp: BinaryIO, path: Path | None, parallel: bool, skip_notarget=True, q=None) None [source]¶
Extract worker method to handle a 7zip folder and decompress each file.
- py7zr.py7zr.is_7zfile(file: BinaryIO | str | Path) bool [source]¶
Quickly see if a file is a 7Z file by checking the magic number. The file argument may be a filename or file-like object too.
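The magic-number check can be illustrated without py7zr. A 7z archive starts with the six signature bytes `37 7A BC AF 27 1C`; the helper below, `looks_like_7z`, is a hypothetical name and only a sketch of the idea, not the library's implementation.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union

# Signature bytes found at the start of a 7z archive.
SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"


def looks_like_7z(file: Union[BinaryIO, str, Path]) -> bool:
    """Return True when the data starts with the 7z signature."""
    if isinstance(file, (str, Path)):
        with open(file, "rb") as fp:
            return fp.read(len(SEVEN_ZIP_MAGIC)) == SEVEN_ZIP_MAGIC
    # File-like object: peek at the header, then restore the position.
    position = file.tell()
    try:
        return file.read(len(SEVEN_ZIP_MAGIC)) == SEVEN_ZIP_MAGIC
    finally:
        file.seek(position)


archive_like = io.BytesIO(SEVEN_ZIP_MAGIC + b"\x00\x04rest of header")
plain_text = io.BytesIO(b"just some text")
print(looks_like_7z(archive_like), looks_like_7z(plain_text))
```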
archiveinfo module¶
- class py7zr.archiveinfo.Bond(incoder, outcoder)[source]¶
Represent bindings between two methods. bonds[i] = (incoder, outstream) means methods[i].stream[outstream] output data go to method[incoder].stream[0]
- class py7zr.archiveinfo.Folder[source]¶
A “Folder” represents a stream of compressed data.
coders: list of coder
num_coders: length of coders
coder: hash list; keys of coders: method, numinstreams, numoutstreams, properties
unpacksizes: uncompressed sizes of outstreams
- class py7zr.archiveinfo.SignatureHeader[source]¶
The SignatureHeader class holds information about the signature header of an archive.
- class py7zr.archiveinfo.WriteWithCrc(fp: BinaryIO)[source]¶
Thin wrapper for file object to calculate crc32 when write called.
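The idea of a CRC-tracking writer can be sketched as follows. `CrcTrackingWriter` and its `digest` attribute are hypothetical names for this example; the sketch relies only on zlib.crc32, which accepts a running value as its second argument.

```python
import io
import zlib
from typing import BinaryIO


class CrcTrackingWriter:
    """Wrap a binary file object and keep a running CRC32 of written data."""

    def __init__(self, fp: BinaryIO) -> None:
        self._fp = fp
        self.digest = 0  # running CRC32 of everything written so far

    def write(self, data: bytes) -> int:
        # Update the checksum first, then pass the data through unchanged.
        self.digest = zlib.crc32(data, self.digest)
        return self._fp.write(data)


buffer = io.BytesIO()
writer = CrcTrackingWriter(buffer)
writer.write(b"hello ")
writer.write(b"world")
print(f"crc32 = {writer.digest:#010x}")
```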
- py7zr.archiveinfo.read_real_uint64(file: BinaryIO) tuple[int, bytes] [source]¶
read 8 bytes, return unpacked value as a little endian unsigned long long, and raw data.
- py7zr.archiveinfo.read_uint32(file: BinaryIO) tuple[int, bytes] [source]¶
read 4 bytes, return unpacked value as a little endian unsigned long, and raw data.
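Fixed-width little-endian reads like these map directly onto the struct module. The sketch below uses hypothetical helper names (`read_u32_le`, `read_u64_le`) and returns both the unpacked value and the raw bytes, mirroring the return shape described above.

```python
import io
import struct
from typing import BinaryIO, Tuple


def _read_exact(file: BinaryIO, size: int) -> bytes:
    """Read exactly `size` bytes or raise EOFError."""
    raw = file.read(size)
    if len(raw) != size:
        raise EOFError(f"expected {size} bytes, got {len(raw)}")
    return raw


def read_u32_le(file: BinaryIO) -> Tuple[int, bytes]:
    """Read 4 bytes as a little-endian unsigned 32-bit integer."""
    raw = _read_exact(file, 4)
    return struct.unpack("<L", raw)[0], raw


def read_u64_le(file: BinaryIO) -> Tuple[int, bytes]:
    """Read 8 bytes as a little-endian unsigned 64-bit integer."""
    raw = _read_exact(file, 8)
    return struct.unpack("<Q", raw)[0], raw


stream = io.BytesIO(b"\x01\x00\x00\x00" + b"\xff" * 8)
first_value, first_raw = read_u32_le(stream)
second_value, second_raw = read_u64_le(stream)
```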
- py7zr.archiveinfo.read_uint64(file: BinaryIO) int [source]¶
Read a UINT64; the encoding is defined in write_uint64().
- py7zr.archiveinfo.write_real_uint64(file: BinaryIO | WriteWithCrc, value: int)[source]¶
write 8 bytes, as an unsigned long long.
- py7zr.archiveinfo.write_uint32(file: BinaryIO | WriteWithCrc, value)[source]¶
write uint32 value in 4 bytes.
- py7zr.archiveinfo.write_uint64(file: BinaryIO | WriteWithCrc, value: int)[source]¶
UINT64 means real UINT64 encoded with the following scheme:
Size of encoding sequence depends on the first byte:
First_Byte (binary)   Extra_Bytes   Value
0xxxxxxx                            : ( xxxxxxx )
10xxxxxx              BYTE y[1]     : ( xxxxxx << (8 * 1)) + y
110xxxxx              BYTE y[2]     : ( xxxxx << (8 * 2)) + y
…
1111110x              BYTE y[6]     : ( x << (8 * 6)) + y
11111110              BYTE y[7]     : y
11111111              BYTE y[8]     : y
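The scheme can be followed step by step in a small sketch. The functions `encode_7z_uint64` and `decode_7z_uint64` are hypothetical names written from the table above; they assume that the extra bytes y are stored in little-endian order and are not the library's own implementation.

```python
def encode_7z_uint64(value: int) -> bytes:
    """Encode value using the variable-length scheme described above."""
    if not 0 <= value < 1 << 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    for extra in range(8):
        # With `extra` trailing bytes, the first byte keeps 7 - extra value bits.
        if value < 1 << (7 * extra + 7):
            marker = (0xFF << (8 - extra)) & 0xFF  # `extra` leading one bits
            first = marker | (value >> (8 * extra))
            low = value & ((1 << (8 * extra)) - 1)
            return bytes([first]) + low.to_bytes(extra, "little")
    # First byte 11111111: the value follows as 8 plain bytes.
    return b"\xff" + value.to_bytes(8, "little")


def decode_7z_uint64(data: bytes) -> int:
    """Decode one value from the start of data."""
    first = data[0]
    extra, mask = 0, 0x80
    # Count the leading one bits of the first byte.
    while extra < 8 and first & mask:
        extra += 1
        mask >>= 1
    low = int.from_bytes(data[1:1 + extra], "little")
    if extra == 8:
        return low
    high = first & (mask - 1)
    return (high << (8 * extra)) | low


for sample in (0, 0x7F, 0x80, 0x3FFF, 0x4000, 2**56 - 1, 2**56, 2**64 - 1):
    assert decode_7z_uint64(encode_7z_uint64(sample)) == sample
```

Small values therefore take a single byte, and the encoded size grows by one byte for roughly every seven additional bits.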
- py7zr.archiveinfo.write_utf16(file: BinaryIO | WriteWithCrc, val: str)[source]¶
write a utf-16 string to file
compressor module¶
- class py7zr.compressor.AESCompressor(password: str, blocksize: int | None = None)[source]¶
AES compression (encryption) class. It accepts a pre-processing filter, which may be an LZMA compression.
- class py7zr.compressor.AESDecompressor(aes_properties: bytes, password: str, blocksize: int | None = None)[source]¶
Decrypt data
- class py7zr.compressor.BCJEncoder[source]¶
- class py7zr.compressor.BcjArmEncoder[source]¶
- class py7zr.compressor.BcjArmtEncoder[source]¶
- class py7zr.compressor.BcjPpcEncoder[source]¶
- class py7zr.compressor.BcjSparcEncoder[source]¶
- class py7zr.compressor.BrotliCompressor(level)[source]¶
- class py7zr.compressor.CopyCompressor[source]¶
- class py7zr.compressor.Deflate64Compressor[source]¶
- class py7zr.compressor.ISevenZipCompressor[source]¶
- class py7zr.compressor.LZMA1Compressor(filters)[source]¶
- class py7zr.compressor.MethodsType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
- class py7zr.compressor.PpmdCompressor(properties: bytes)[source]¶
Compress with PPMd compression algorithm
- class py7zr.compressor.PpmdDecompressor(properties: bytes, blocksize: int | None = None)[source]¶
Decompress PPMd compressed data
- class py7zr.compressor.SevenZipCompressor(filters=None, password=None, blocksize: int | None = None)[source]¶
Main compressor object, configured for each 7zip folder.
- class py7zr.compressor.SevenZipDecompressor(coders: list[dict[str, Any]], packsize: int, unpacksizes: list[int], crc: int | None, password: str | None = None, blocksize: int | None = None)[source]¶
Main decompressor object, which is properly configured and bound to each 7zip folder, because a 7zip folder can have a custom compression method.
- class py7zr.compressor.ZstdCompressor(level: int)[source]¶
helpers module¶
- py7zr.helpers.calculate_crc32(data: bytes, value: int = 0, blocksize: int = 1048576) int [source]¶
Calculate CRC32 of strings with arbitrary lengths.
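Block-wise CRC calculation can be sketched with zlib.crc32, which accepts a running value. The function name `crc32_in_blocks` is hypothetical; the parameter names follow the signature above, but the body is an assumption about how such a helper might work.

```python
import zlib


def crc32_in_blocks(data: bytes, value: int = 0, blocksize: int = 1024 * 1024) -> int:
    """Compute CRC32 over data one block at a time."""
    view = memoryview(data)  # slicing a memoryview avoids copying each block
    for offset in range(0, len(view), blocksize):
        value = zlib.crc32(view[offset:offset + blocksize], value)
    return value & 0xFFFFFFFF


sample = bytes(range(256)) * 100
checksum = crc32_in_blocks(sample, blocksize=1000)
print(f"{checksum:#010x}")
```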
- py7zr.helpers.calculate_key(password: bytes, cycles: int, salt: bytes, digest: str) bytes ¶
Calculate 7zip AES encryption key. Concat values in order to reduce number of calls of Hash.update().
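The effect of concatenating values before hashing can be shown with hashlib. The sketch assumes the commonly described 7-Zip scheme, in which SHA-256 is fed salt, password and an 8-byte little-endian counter for 2**cycles rounds; special cases are ignored, and both function names are hypothetical. The two variants produce the same digest, but the batched one calls update() far less often.

```python
import hashlib


def derive_key_naive(password: bytes, cycles: int, salt: bytes) -> bytes:
    """Three update() calls per round."""
    digest = hashlib.sha256()
    for counter in range(1 << cycles):
        digest.update(salt)
        digest.update(password)
        digest.update(counter.to_bytes(8, "little"))
    return digest.digest()


def derive_key_batched(password: bytes, cycles: int, salt: bytes, batch: int = 64) -> bytes:
    """One update() call per batch of rounds, fed with concatenated input."""
    digest = hashlib.sha256()
    rounds = 1 << cycles
    prefix = salt + password
    for start in range(0, rounds, batch):
        stop = min(start + batch, rounds)
        chunk = b"".join(prefix + c.to_bytes(8, "little") for c in range(start, stop))
        digest.update(chunk)
    return digest.digest()


# Small cycle count so that the example runs instantly.
password_bytes = "secret".encode("utf-16le")
key_a = derive_key_naive(password_bytes, 6, b"salt")
key_b = derive_key_batched(password_bytes, 6, b"salt", batch=16)
```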
- py7zr.helpers.canonical_path(target: Path) Path [source]¶
Return a canonical path of target argument.
- py7zr.helpers.check_archive_path(arcname: str) bool [source]¶
Check that the arcname argument is valid for an archive. It should not be absolute; if it is, the function returns False. It should not be a malicious path traversal attack path either. Otherwise, returns True.
- py7zr.helpers.filetime_to_dt(ft)[source]¶
Convert Windows NTFS file time into python datetime object.
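A Windows FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC, so the conversion is a short calculation. The function name `filetime_to_datetime` is hypothetical, and the sketch drops precision below one microsecond because datetime cannot represent it.

```python
from datetime import datetime, timedelta, timezone

# Start of the FILETIME epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert 100-nanosecond ticks since 1601-01-01 UTC to an aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# 116444736000000000 ticks separate the FILETIME epoch from the Unix epoch.
unix_epoch = filetime_to_datetime(116444736000000000)
print(unix_epoch.isoformat())
```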
- py7zr.helpers.get_sanitized_output_path(fname: str, path: Path | None) Path [source]¶
Check whether f.filename has invalid directory traversals. When the condition is not satisfied, raise Bad7zFile.
- py7zr.helpers.is_path_valid(target: Path, parent: Path) bool [source]¶
Check if the target path is valid against the parent path. It returns False when the target path contains ‘..’ and points outside the parent path. Otherwise, returns True.
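A traversal check of this kind can be sketched with pathlib. `stays_inside` is a hypothetical helper, not the library function: it resolves the target against the parent and tests whether the result is still located under the parent.

```python
from pathlib import Path


def stays_inside(target: Path, parent: Path) -> bool:
    """Return True when target, resolved against parent, stays under parent."""
    base = parent.resolve()
    # Joining with an absolute target replaces base, which is then rejected below.
    candidate = (base / target).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        return False
    return True


output_dir = Path("extract-here")
print(stays_inside(Path("docs/readme.txt"), output_dir))
print(stays_inside(Path("../outside.txt"), output_dir))
```

Resolving both paths first means that sequences such as `a/../../x` are judged by where they finally point, not by how they are spelled.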
- py7zr.helpers.is_relative_to(my: Path, *other) bool [source]¶
Return True when path is relative to other path, otherwise False.
- py7zr.helpers.islink(path: str | Path) bool [source]¶
Cross-platform islink implementation. Support Windows NT symbolic links and reparse points.
- py7zr.helpers.readlink(path: str | Path, *, dir_fd=None) str | Path [source]¶
Cross-platform compatible implementation of os.readlink and Path.readlink(). Supports Windows NT symbolic links and reparse points. When called with the path argument as a path-like str, it returns the result as a str. When called with a Path object, it also returns a Path object. When called with the path argument as bytes, it returns the result as bytes.