ah, I just ran:
helper = StorageHelper(url=s3_uri, base_url=s3_uri)
context.log.info(f"helper: {helper}")
and I got:
ValueError: Missing key and secret for S3 storage access ()
  File "/opt/venv/lib/python3.7/site-packages/dagster/core/execution/plan/utils.py", line 44, in solid_execution_error_boundary
    yield
  File "/opt/venv/lib/python3.7/site-packages/dagster/utils/__init__.py", line 383, in iterate_with_context
    next_output = next(iterator)
  File "/opt/venv/lib/python3.7/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "orchestrate/pipelines/adoption_risk_score.py", line 111, in main
    model, model_name, days_in_period = identify_current_model(context.solid_config["target_tags"], context)
  File "orchestrate/pipelines/adoption_risk_score.py", line 60, in identify_current_model
    helper = StorageHelper(url=s3_uri, base_url=s3_uri)
  File "/opt/venv/lib/python3.7/site-packages/clearml/storage/helper.py", line 352, in __init__
    "Missing key and secret for S3 storage access (%s)" % base_url
I wonder if there’s an exception being caught by the final try/except block (below) but because I’m running it in dagster I’m not getting the logger output like I would normally:
` # Don't canonize URL since we already did it
try:
    instance = cls(base_url=base_url, url=url, logger=logger, canonize_url=False, **kwargs)
except (StorageError, UsageError) as ex:
    cls._get_logger().error(str(ex))
    return None
except Exception as ex:
    cls._get_logger().error("Failed creating storage object {} Reason: {}".format(
        base_url or url, ex))
    return None
cls._helpers[instance_key] = instance
return instance `
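To illustrate why the caller ends up with None rather than an exception, here is a minimal, self-contained sketch of the same catch, log, return-None pattern. `DemoHelper`, `ListHandler` and the logger name `demo.storage` are made up for this illustration and are not ClearML names. Attaching a handler like this is only one possible way to see swallowed errors when the usual console output is missing.

```python
import logging


class DemoHelper:
    """Hypothetical stand-in for a storage helper (not the ClearML class)."""

    _log = logging.getLogger("demo.storage")

    def __init__(self, url, key=None, secret=None):
        # Mimic a constructor that refuses to work without credentials.
        if not (key and secret):
            raise ValueError("Missing key and secret for S3 storage access (%s)" % url)
        self.base_url = url

    @classmethod
    def get(cls, url, **kwargs):
        # Same shape as the quoted snippet: log the error, then return None.
        try:
            return cls(url, **kwargs)
        except Exception as ex:
            cls._log.error("Failed creating storage object %s Reason: %s", url, ex)
            return None


class ListHandler(logging.Handler):
    """Collects log messages so they can be inspected or re-emitted elsewhere."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logging.getLogger("demo.storage").addHandler(handler)

helper = DemoHelper.get("s3://bucket/key")  # no credentials passed
if helper is None:
    # Without this guard, helper.base_url would raise AttributeError.
    print("helper is None; logged:", handler.messages)
```

In a pipeline, the collected messages could be forwarded to whatever logger the orchestrator does display, so the real reason for the None is not lost.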
I have already successfully printed model.name though I didn’t try the other ones.
s3://<bucket>/<foo>/local/<env>/<project-name>/v0-0-1/2022-05-12-30-9-rocketclassifier.7b7c02c4dac946518bf6955e83128bc2/models/2022-05-12-30-9-rocketclassifier.pkl.gz
I had to modify the code a bit because I’m running this in a dagster pipeline but that print line put out: helper None
in case it’s helpful, the url is an S3 URI
So the thing is, regardless of the link you should end up with: helper <clearml.storage.helper.StorageHelper object at 0x....>
But the code that failed seemed to return None, which makes me suspect the url itself is somehow broken.
Any chance you have a space before the "s3://" ?
BTW : what's the clearml version you are using ?
looks like I accidentally placed my clearml.conf file in a non-standard place so I had to set the CLEARML_CONFIG_FILE environment variable. Thanks for your help AgitatedDove14 !!
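For anyone hitting the same thing, a quick way to see which config file would be picked up might look like the sketch below. It assumes the default location is `~/clearml.conf` when `CLEARML_CONFIG_FILE` is not set, which is worth verifying against the ClearML docs for your version. `resolve_clearml_conf` is a hypothetical helper, not part of the ClearML SDK.

```python
import os
from pathlib import Path


def resolve_clearml_conf(env=None):
    """Return the config path expected under the assumptions stated above."""
    env = os.environ if env is None else env
    override = env.get("CLEARML_CONFIG_FILE")
    if override:
        return Path(override).expanduser()
    # Assumed default location; adjust if your setup differs.
    return Path("~/clearml.conf").expanduser()


conf_path = resolve_clearml_conf()
print(conf_path, "exists" if conf_path.is_file() else "is missing")
```

Running this inside the container (or as the first step of the pipeline) shows whether the file the SDK is expected to read is actually there.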
S3 access would return a different error...
Can you do:
` from clearml.storage.helper import StorageHelper
helper = StorageHelper.get("s3://<bucket>/<foo>/local/<env>/<project-name>/v0-0-1/2022-05-12-30-9-rocketclassifier.7b7c02c4dac946518bf6955e83128bc2/models/2022-05-12-30-9-rocketclassifier.pkl.gz")
print("helper", helper) `
also, I can run this same code (same model) in a separate script (not in a docker container) with the AWS_PROFILE env variable set and load the model properly with joblib
ValueError: Missing key and secret for S3 storage access
Yes, that makes sense. I think we should make sure we do not suppress this warning; it is too important.
Bottom line: missing configuration section in your clearml.conf
I just checked for white space at the beginning or end of the url but there are none
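A whitespace check of this kind can be sketched with the standard library alone. `inspect_url` is a hypothetical helper and the bucket and key below are placeholders, not the real URL from this thread.

```python
from urllib.parse import urlparse


def inspect_url(url):
    """Return basic facts about a URL string for debugging."""
    parsed = urlparse(url.strip())
    return {
        "repr": repr(url),  # repr() makes hidden whitespace visible
        "has_outer_whitespace": url != url.strip(),
        "scheme": parsed.scheme,
        "bucket": parsed.netloc,
    }


print(inspect_url("s3://my-bucket/models/model.pkl.gz"))
print(inspect_url(" s3://my-bucket/models/model.pkl.gz"))
```

If `scheme` is not `s3` or `has_outer_whitespace` is true, the URL itself is the likely culprit; otherwise the problem is more likely in configuration or credentials.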
It seems to fail when trying to download the model:
local_download = StorageManager.get_local_copy(uri, extract_archive=False)
  File "/opt/venv/lib/python3.7/site-packages/clearml/storage/manager.py", line 47, in get_local_copy
    cached_file = cache.get_local_copy(remote_url=remote_url, force_download=force_download)
  File "/opt/venv/lib/python3.7/site-packages/clearml/storage/cache.py", line 55, in get_local_copy
    if helper.base_url == "file://":
And based on the error I suspect the URL is incorrect (i.e. it failed to find a driver for it)
What's the exact URL (mask any private content) ?
AttributeError: 'NoneType' object has no attribute 'base_url'
can you print the model object ?
(I think the error is a bit cryptic, but generally it might be the model is missing an actual URL link?)
print(model.id, model.name, model.url)
For that reason I suspect there is a silent error trying to grab boto3 credentials.
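One rough way to narrow this down is to list which of the credential-related environment variables that boto3 reads are set inside the process that fails. This sketch only looks at the environment; it does not read `~/.aws/credentials` or clearml.conf, and it does not prove the credentials are valid. `aws_env_summary` is a hypothetical helper name.

```python
import os

# Environment variables that boto3 consults for credentials and profiles.
AWS_ENV_VARS = (
    "AWS_PROFILE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_SHARED_CREDENTIALS_FILE",
)


def aws_env_summary(env=None):
    """Report which AWS-related variables are set, without revealing values."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in AWS_ENV_VARS}


for name, is_set in aws_env_summary().items():
    print(f"{name}: {'set' if is_set else 'not set'}")
```

Comparing the output from the standalone script with the output from inside the container should show whether the two environments differ.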