
also, I can run this same code (same model) in a separate script (not in a docker container) with the AWS_PROFILE env variable set, and load the model properly with joblib
In case it's helpful, the URL is an S3 URI.
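For reference, this is roughly the working path as a minimal sketch: it assumes AWS_PROFILE is set in the environment and uses placeholder bucket/key names rather than the real ones.

```python
import tempfile

import boto3
import joblib

# The default session resolves credentials from AWS_PROFILE
s3 = boto3.resource("s3")

# "my-bucket" and the key below are placeholders
with tempfile.NamedTemporaryFile(suffix=".pkl.gz") as tmp:
    s3.Object("my-bucket", "models/model.pkl.gz").download_fileobj(tmp)
    tmp.flush()
    model = joblib.load(tmp.name)  # joblib detects the gzip compression itself
```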
For that reason I suspect there is a silent error trying to grab boto3 credentials.
Yes, I did finally resolve this. If memory serves, I needed to double-check that the clearml.conf file was in the proper location. It wasn't where I thought I put it, and I had to set all the proper environment variables in the container.
I think I can set AWS_PROFILE. I have theoretically placed the clearml.conf in the correct place in the container: $HOME/clearml.conf. But I get an error that's actually not very helpful. I still think the issue is getting boto3 credentials, but I'm not sure I can prove it at this point. I posted the error in the community channel: https://clearml.slack.com/archives/CTK20V944/p1653507863528969
Looks like I accidentally placed my clearml.conf file in a non-standard place, so I had to set the CLEARML_CONFIG_FILE environment variable. Thanks for your help AgitatedDove14 !!
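For anyone who lands here with the same problem, a minimal sketch of that fix, assuming a hypothetical /app/config/clearml.conf path; the variable just needs to be set before the ClearML SDK reads its configuration:

```python
import os

# Placeholder path: point ClearML at the non-standard config location
os.environ["CLEARML_CONFIG_FILE"] = "/app/config/clearml.conf"

# Import clearml only after the variable is set
from clearml import Task
```

In a container it is usually cleaner to set this in the image or task definition (e.g. an ENV line in the Dockerfile) than in code.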
s3://<bucket>/<foo>/local/<env>/<project-name>/v0-0-1/2022-05-12-30-9-rocketclassifier.7b7c02c4dac946518bf6955e83128bc2/models/2022-05-12-30-9-rocketclassifier.pkl.gz
well, I can get this exact object from my machine, from the same script through boto3, so I’m not sure where the disconnect is AgitatedDove14 SuccessfulKoala55
SuccessfulKoala55 and AgitatedDove14 So setting my environment variable worked for my local runs, but it doesn’t work when running it from a container (in AWS ECS, for example)
I have already successfully printed model.name, though I didn't try the other ones.
So, a little more clarity: I get nearly this same error when downloading the same object through boto3, if I try to get at it from an S3 resource like so:
```python
import boto3

# No profile specified, so this falls back to the default credential chain
bad_s3_resource = boto3.resource("s3")
obj = bad_s3_resource.Object(bucket, key)
obj.get()["Body"]
```
which raises:
```
botocore.exceptions.NoCredentialsError: Unable to locate credentials
```
Only when I set the boto3 session with the appropriate profile does it work:
```python
sess = boto3.session.Session(profile_name="foo-profile")
s3_resource = sess.resource("s3")
...
```
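One way to confirm the silent-credentials theory inside the container is to ask boto3 directly what the default chain resolves to; this sketch uses only standard boto3 calls:

```python
import boto3

# No profile argument: this is the same default chain the failing call used
session = boto3.session.Session()
print("profile:", session.profile_name)
print("credentials:", session.get_credentials())  # None means NoCredentialsError later
```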
Ah, I just ran:
```python
helper = StorageHelper(url=s3_uri, base_url=s3_uri)
context.log.info(f"helper: {helper}")
```
and I got:
```
ValueError: Missing key and secret for S3 storage access (
)
```
File "/opt/venv/lib/python3.7/site-packages/dagster/core/execution/plan/utils.py", line 44, in solid_execution_error_boundary
yield
File "/opt/venv/lib/python3.7/site-packages/dagster/utils/init.py", line 383, in iterate_with_context
next_output = next(iterator)
File "/opt/venv/lib/python3.7...
I just checked for whitespace at the beginning and end of the URL, but there is none.
I had to modify the code a bit because I'm running this in a dagster pipeline, but that print line put out: helper None
I wonder if there's an exception being caught by the final try/except block (below), but because I'm running it in dagster I'm not getting the logger output like I would normally:
```python
# Don't canonize URL since we already did it
try:
    instance = cls(base_url=base_url, url=url, logger=logger, canonize_url=False, **kwargs)
except (StorageError, UsageError) as ex:
    cls._get_logger().error(str(ex))
    return None
except Exception as ex:
    cls._get_logger().error("Failed creating stora...
```
I suppose in the end I'm going to want to log the inference values back to Snowflake as well … haven't gotten to that part yet.
Don't know why I didn't think of it earlier, but I set my env variable AWS_PROFILE=<foo-profile> and it copied it successfully.
Thanks! I think you were right on all counts there. That was my work around.
That's how I generate my raw input data: from a Snowflake query. Then I do all the feature encoding/building, etc.
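For context, the raw-data step looks roughly like this; a sketch assuming the snowflake-connector-python package (with its pandas extra) and placeholder connection details:

```python
import snowflake.connector

# All connection parameters are placeholders
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="my_warehouse",
    database="my_database",
)

# Pull the raw rows into a DataFrame for feature encoding/building
df = conn.cursor().execute("SELECT * FROM raw_input_table").fetch_pandas_all()
```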
I'm happy to use AWS's ALB, I was just even less sure about how to set that up at the time. I assume I need to set up a Target Group Attachment to the clearml-webserver IP address, correct?
That's very helpful, thanks! BTW, when I enter that command I get: No resources found in clearml namespace.
Similarly, if I run:
```
kubectl get --namespace default -o jsonpath="{.spec.ports[0].nodePort}" services clearml
```
I get:
```
Error from server (NotFound): services "clearml" not found
```
Sounds good, I'll give that a shot real quick.