Open An Azure StorageStreamDownloader Without Saving It As A File
I need to download a PDF from a blob container in Azure as a download stream (StorageStreamDownloader) and open it in both pdfplumber and pdfminer. I developed all the requirements
Solution 1:
download_blob() downloads the blob into a StorageStreamDownloader object. That class has a download_to_stream() method, which writes the blob's contents into a stream you supply, such as an in-memory BytesIO.
from io import BytesIO
from azure.storage.blob import BlobServiceClient
import PyPDF2
filename = "test.pdf"
container_name = "test"
blob_service_client = BlobServiceClient.from_connection_string("connection string")
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
# download_blob() returns a StorageStreamDownloader
streamdownloader = blob_client.download_blob()
# Write the blob into an in-memory buffer instead of a file on disk
stream = BytesIO()
streamdownloader.download_to_stream(stream)
stream.seek(0)  # rewind so the reader starts at the beginning
# PdfFileReader and numPages were removed in PyPDF2 3.0; older releases use those names
fileReader = PyPDF2.PdfReader(stream)
print(len(fileReader.pages))
This prints the number of pages in the PDF.
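One detail worth checking with this approach: after data is written into a BytesIO, the stream position sits at the end of the data. Whether a given PDF library rewinds the stream for you is not something to rely on, so an explicit seek(0) is a cheap safeguard. A minimal sketch using only the standard library, with placeholder bytes standing in for the downloaded PDF (an assumption for illustration):

```python
from io import BytesIO

# Placeholder payload; in practice this is what download_to_stream() wrote.
payload = b"%PDF-1.4 example bytes"

stream = BytesIO()
stream.write(payload)        # position is now at the end of the buffer

at_end = stream.read()       # nothing left to read from here
stream.seek(0)               # rewind to the beginning
from_start = stream.read()   # the full payload

print(len(at_end), len(from_start))
```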
Solution 2:
It seems download_to_stream() is now deprecated; readinto() should be used instead.
from azure.storage.blob import BlobClient
conn_string = ''
container_name = ''
blob_name = ''
blob_obj = BlobClient.from_connection_string(
conn_str=conn_string, container_name=container_name,
blob_name=blob_name
)
with open(blob_name, 'wb') as f:
    b = blob_obj.download_blob()
    b.readinto(f)
This will create a file in the working directory containing the downloaded data. To keep the data in memory instead, pass a BytesIO object to readinto().
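Since the question asks to avoid writing a file, the same readinto() call can target an in-memory buffer. The sketch below assumes readinto() accepts any writable binary stream and returns the number of bytes read. `FakeDownloader` and `download_to_memory` are hypothetical names used so the example runs without an Azure account; with a real blob, `blob_obj.download_blob()` would take the place of `FakeDownloader(...)`.

```python
from io import BytesIO


class FakeDownloader:
    """Hypothetical stand-in for StorageStreamDownloader (not part of the Azure SDK)."""

    def __init__(self, data: bytes):
        self._data = data

    def readinto(self, stream) -> int:
        # Mimics the SDK method: write the whole blob into the given stream.
        stream.write(self._data)
        return len(self._data)


def download_to_memory(downloader) -> BytesIO:
    """Copy a downloader's contents into a rewound in-memory buffer."""
    buffer = BytesIO()
    downloader.readinto(buffer)
    buffer.seek(0)  # rewind so a PDF library reads from the start
    return buffer


pdf_buffer = download_to_memory(FakeDownloader(b"%PDF-1.4 placeholder"))
print(pdf_buffer.read(8))
```

The returned buffer is a file-like object. pdfplumber.open() and pdfminer's high-level extract_text() are documented as accepting file-like objects as well as paths, though that is worth confirming against the versions you have installed.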
Solution 3:
Simply call readall() on the result of download_blob(); it returns the blob's data as bytes.
from azure.storage.blob import BlobClient
conn_string = ''
container_name = ''
blob_name = ''
blob_obj = BlobClient.from_connection_string(conn_string, container_name, blob_name)
# readall() returns the whole blob as bytes, so no file needs to be opened
b = blob_obj.download_blob().readall()
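Because readall() returns plain bytes, one download can feed more than one parser. The sketch below wraps the bytes in a separate BytesIO per consumer so that each has its own read position. `open_for_each` and the placeholder payload are hypothetical; in practice the bytes would come from `blob_obj.download_blob().readall()`.

```python
from io import BytesIO


def open_for_each(data: bytes, consumers: int) -> list:
    """Return one independent in-memory stream per consumer."""
    return [BytesIO(data) for _ in range(consumers)]


# Placeholder for the bytes returned by readall()
pdf_bytes = b"%PDF-1.4 placeholder"

# One stream intended for pdfplumber, one for pdfminer
plumber_stream, miner_stream = open_for_each(pdf_bytes, 2)

plumber_stream.read()          # consuming one stream...
header = miner_stream.read(5)  # ...does not move the other
print(header)
```

Holding the bytes once and creating cheap views over them avoids a second download when both libraries need to read the same document.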