Advanced usage¶
Stream 7z content¶
The py7zr provide a way to stream archive content with extract
and extractall
methods.
The methods accept an object which implement py7zr.io.WriterFactory
interface as factory
arugment.
Your custom class which implements WriterFactory
interface should have create
method.
The create
method should return your custom class object which implements Py7zIO
interface.
The Py7zIO
interface is as similar as Python Standard BinaryIO
but it also have size
method.
When the py7zr extract the archive contents, it calls the factory object with filename and ask creating the io object.
The py7zr write the io object with write
method, which is as similar as an object returned by the standard open
method.
You can process write
in your custom class. The method is called on-the-fly when the py7zr move to extract the target
archive file.
Note: Because the py7zr may run in multi-threaded, your custom class should be thread-safe.
The py7zr provide the way to stream content without buffering all the archive content in a memory.
Example to extract into network storage¶
Here is a pseudo code to demonstrate a way to extract contents into cloud storage. To simplify things in an example code, all the authentication, bucket operation, all the mandatory headers are omitted.
def extract_stream():
factory = StreamIOFactory()
with py7zr.SevenZipFile("target.7z") as archive:
archive.extractall(factory=factory)
class StreamIO(py7zr.io.Py7zIO):
"""Example network storage writer."""
def __init__(self, fname: str):
self.fname = fname
self.length = 0
def write(self, data: [bytes, bytearray]):
self.length += len(data)
# the py7zr will call write multiple time, so you need to append data
requests.put("https://your.custom.network.storage.example.com/append/to/file/command/", data=data)
def read(self, size: Optional[int] = None) -> bytes:
return b''
def seek(self, offset: int, whence: int = 0) -> int:
return offset
def flush(self) -> None:
pass
def size(self) -> int:
return self.length
class StreamIOFactory(py7zr.io.WriterFactory):
"""Factory class to return StreamWriter object."""
def __init__(self):
self.products = {}
def create(self, filename: str) -> Py7zIO:
product = StreamIO(filename)
self.products[filename] = product
return product