Is There A Way To Stream Data Directly From Python Request To Minio Bucket
Solution 1:
The MinIO documentation for put_object includes examples of adding a new object to the object storage server, but those examples only show how to upload a file.
This is the definition of the put_object function:
put_object(bucket_name, object_name, data, length, content_type='application/octet-stream', metadata=None, progress=None, part_size=5*1024*1024)
We are interested in the data parameter. Its documentation states:
Any python object implementing io.RawIOBase.
RawIOBase is the base class for raw binary I/O. It also defines the read method.
If we use the built-in dir() function to list the valid attributes of r.content, we can check whether read is among them:
'read' in dir(r.content)
-> returns False
That is why you get AttributeError: 'bytes' object has no attribute 'read'. The type of r.content is bytes, and bytes has no read method.
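You can wrap r.content in io.BytesIO, an in-memory binary stream that provides read. (Strictly speaking, io.BytesIO inherits from io.BufferedIOBase rather than io.RawIOBase, but it offers the read method that the bytes object lacks.) To get the size of the object in bytes, use io.BytesIO(r.content).getbuffer().nbytes.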
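The difference between the two types can be checked without a network request. The following is a minimal sketch in which a literal bytes value (an assumption for illustration) stands in for r.content:

```python
import io

# Stand-in for r.content; any bytes value behaves the same way.
payload = b"example image bytes"

# Wrap the bytes in an in-memory binary stream.
stream = io.BytesIO(payload)

bytes_has_read = hasattr(payload, "read")    # bytes is not file-like
stream_has_read = hasattr(stream, "read")    # BytesIO provides read()

# Size in bytes, suitable for the length argument of put_object.
size = stream.getbuffer().nbytes

print(bytes_has_read, stream_has_read, size == len(payload))
```

For a plain bytes value, len(payload) gives the same number as getbuffer().nbytes.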
So to send the raw bytes to your bucket, wrap the bytes object in io.BytesIO:
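import io
import requests

# r.content reads the whole response body into memory as bytes
r = requests.get(url_to_download, stream=True)
raw_img = io.BytesIO(r.content)
raw_img_size = raw_img.getbuffer().nbytes
# Minio_client is assumed to be an already configured MinIO client
Minio_client.put_object("bucket_name", "stream_test.tiff", raw_img, raw_img_size)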
NOTE: The documentation examples read binary data from a file and get its size from the st_size attribute of the stat_result returned by os.stat(). st_size is the equivalent of io.BytesIO(r.content).getbuffer().nbytes.
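That equivalence can be illustrated with a temporary file. This is only a sketch; the payload is made up for the example:

```python
import io
import os
import tempfile

# Made-up payload standing in for downloaded data.
payload = b"\x00" * 2048

# Write the payload to a temporary file and ask the OS for its size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name
try:
    size_on_disk = os.stat(tmp_path).st_size
finally:
    os.remove(tmp_path)

# Size of the same data held in an in-memory stream.
size_in_memory = io.BytesIO(payload).getbuffer().nbytes

print(size_on_disk == size_in_memory)
```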
Solution 2:
You can stream your file directly into a minio bucket like this:
import requests
from pathlib import Path
from urllib.parse import urlparse
from django.conf import settings
from django.core.files.storage import default_storage
client = default_storage.client
bucket_name = settings.MINIO_STORAGE_MEDIA_BUCKET_NAME
with requests.get(url_to_download, stream=True) as r:
    # derive the object name from the final URL of the response
    object_name = Path(urlparse(r.url).path).name
    content_length = int(r.headers["Content-Length"])
    # r.raw is a file-like object, so the body is read while it is uploaded
    result = client.put_object(bucket_name, object_name, r.raw, content_length)
Or you can use a Django file field directly:
with requests.get(url_to_download, stream=True) as r:
    # patch the stream to make django-minio-storage believe
    # it's about to read from a legit file
    r.raw.seek = lambda x: 0
    r.raw.size = int(r.headers["Content-Length"])
    model = MyModel()
    model.file.save(object_name, r.raw, save=True)
The RawIOBase hint from Dinko Pehar was really helpful, thanks a lot. However, you have to use response.raw rather than response.content: accessing response.content downloads the whole file into memory immediately, which is really inconvenient when storing a large video, for example.
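One caveat for the streaming approach: it reads the Content-Length header, which a server may omit (for example with chunked transfer encoding). The sketch below shows one possible way to handle that. The helper name put_object_size_kwargs and the 10 MiB part size are hypothetical, and it assumes a recent minio-py release in which length=-1 together with an explicit part_size means "size unknown"; check the documentation for the version you have installed.

```python
from typing import Dict, Mapping

# Assumption: 10 MiB parts (multipart uploads require parts of at least 5 MiB).
DEFAULT_PART_SIZE = 10 * 1024 * 1024


def put_object_size_kwargs(headers: Mapping[str, str]) -> Dict[str, int]:
    """Hypothetical helper: build the size arguments for put_object."""
    value = headers.get("Content-Length")
    if value is not None and value.isdigit():
        # Size is known: pass it as the length.
        return {"length": int(value)}
    # Size is unknown: pass -1 and an explicit part size.
    return {"length": -1, "part_size": DEFAULT_PART_SIZE}


# Usage sketch (not executed here; needs a reachable server and bucket):
# with requests.get(url_to_download, stream=True) as r:
#     kwargs = put_object_size_kwargs(r.headers)
#     client.put_object(bucket_name, object_name, r.raw, **kwargs)
```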