Makes sense to me, please go ahead and add that as well (basically the same thing on _AzureBlobServiceStorageDriver.upload_object, plus an additional variable on the AzureContainerConfigurations class).
Could you PR a tested draft? We will be able to take it from there.
AgitatedDove14 that looks good, i'd like to request an addition to control the other max_connections i see on that file, as i also noticed that uploads are sometimes slow, and i see here max_connections=2
in clearml.conf we could have:
azure.storage {
    max_connections = 10
    # containers: [
    #     {
    #         account_name: "clearml"
    #         account_key: "secret"
    #         # container_name:
    #     }
    # ]
}
Then in AzureContainerConfigurations:
`
from os import getenv

# AzureContainerConfig is the existing per-container config class in clearml


class AzureContainerConfigurations(object):
    def __init__(self, container_configs=None, max_connections=None):
        super(AzureContainerConfigurations, self).__init__()
        self._container_configs = container_configs or []
        self.max_connections = max_connections

    @classmethod
    def from_config(cls, configuration):
        default_account = getenv("AZURE_STORAGE_ACCOUNT")
        default_key = getenv("AZURE_STORAGE_KEY")

        default_container_configs = []
        if default_account and default_key:
            default_container_configs.append(AzureContainerConfig(
                account_name=default_account, account_key=default_key
            ))

        # check for a missing configuration before reading from it,
        # otherwise configuration.get() raises AttributeError on None
        if configuration is None:
            return cls(default_container_configs, 10)

        max_connections = configuration.get("max_connections", 10)
        containers = configuration.get("containers", list())
        container_configs = [AzureContainerConfig(**entry) for entry in containers] + default_container_configs
        return cls(container_configs, max_connections)
`
And finally:
in _AzureBlobServiceStorageDriver.download_object(...):
_ = container.blob_service.get_blob_to_path(
    container.name,
    obj.blob_name,
    local_path,
    max_connections=container.max_connections or 10,
    progress_callback=callback_func,
)
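For anyone drafting the PR, here is a minimal standalone sketch of the fallback logic in the proposal. The helper name `resolve_max_connections` and the constant `DEFAULT_MAX_CONNECTIONS` are hypothetical (they are not part of clearml), and the config section is assumed to behave like a plain dict, or to be None when missing:

```python
# Hypothetical helper, not part of clearml: illustrates the fallback only.
DEFAULT_MAX_CONNECTIONS = 10


def resolve_max_connections(configuration, default=DEFAULT_MAX_CONNECTIONS):
    """Return the configured max_connections, falling back to the default."""
    # A missing config section must be handled before calling .get() on it
    if configuration is None:
        return default
    value = configuration.get("max_connections")
    # Treat a missing, None, or zero value as "not set", the same way
    # `container.max_connections or 10` does in the download call
    return int(value) if value else default
```

Something like this keeps the default in one place, so the upload and download paths cannot drift apart.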
ShakyJellyfish91 wdyt?
well that's much faster, from 1mb/s to 60mb/s 🙂
hm, maybe, i will try to override this to see what happens, thanks!
Hi ShakyJellyfish91
It seems clearml is using a single connection, which takes a long time to download
Hmm, I found this one:
https://github.com/allegroai/clearml/blob/1cb5dbb276026644ae20fef63d58256cdc887818/clearml/storage/helper.py#L1763
Does max_connections=10 mean 10 concurrent connections?
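My understanding of the legacy Azure blob SDK is that a max_connections value of 2 or more lets it fetch the blob in chunks using up to that many parallel threads, but treat that as an assumption to check against the SDK docs. The general idea can be sketched without Azure at all. Everything below (`download_in_chunks`, `read_range`, the in-memory blob) is a hypothetical stand-in, not clearml or Azure code:

```python
from concurrent.futures import ThreadPoolExecutor


def download_in_chunks(read_range, total_size, chunk_size, max_connections):
    """Fetch bytes [0, total_size) in chunk_size pieces with a thread pool."""
    ranges = [
        (start, min(start + chunk_size, total_size))
        for start in range(0, total_size, chunk_size)
    ]
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # map() yields results in input order, so the chunks join back correctly
        chunks = pool.map(lambda r: read_range(r[0], r[1]), ranges)
        return b"".join(chunks)


# An in-memory "blob" stands in for the remote object
blob = bytes(range(256)) * 40
result = download_in_chunks(
    lambda start, end: blob[start:end], len(blob), 1024, max_connections=10
)
```

With a real remote store each `read_range` call would be a separate ranged request, which is where the extra connections help on high-latency links.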