Ok. Can I check that only the main script was stored in the task, and not the dependent packages?
I guess the more correct way is to upload it to some repo that the remote task can still pull from?
I'm not very sure tbh. Just want to see if this is useful....
I got an SSL error a few days back and I solved it by adding the cert to /etc/ssl/certs and running update-ca-certificates.
Add this too:
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
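If you need to set it from Python instead of the shell, a minimal sketch (assuming the standard Debian/Ubuntu bundle path; adjust for your system):

import os

# Point requests (and libraries built on it, e.g. the ClearML SDK) at the CA bundle.
# Must be set before the first HTTPS request is made.
os.environ["REQUESTS_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"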
Note that verify might not work under sdk.aws.s3.verify, but it does under sdk.aws.s3.credentials. Pls see the attached image.
Example:
aws {
    s3 {
        credentials: [
            {
                ...
                verify: "<path to your cert>"
            }
        ]
    }
}
Yea. Added an issue. We can follow up from there. Really hope that clearml-serving can work, it is a nice project.
https://clear.ml/docs/latest/docs/integrations/storage/
Try adding the <path to your cert> for s3.credentials.verify.
SDK meaning I run the agent using clearml-agent daemon ....
Alternatively, I understand I can also run the agent using docker run allegroai/clearml-agent:latest.
But I cannot figure out how to add the --restart, --queue, --gpus flags to the container.
Do you want to share your clearml.conf here?
JuicyFox94 and SuccessfulKoala55 Thanks a lot. Indeed it was caused by dirty cookies.
ClearML 1.1.1. Yes, I have boto3 installed too.
Seems like it was broken for numpy version 1.24.1.
Tried with numpy 1.23.5 and it works.
Just to add, when I run the pipeline locally it works as well.
May I know which env variable to set the cert in?
I figured out that it may be possible to do this:
experiment_task = Task.current_task()
OutputModel(experiment_task).update_weights('model.pt')
to attach it to the ClearML experiment task.
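For reference, a self-contained sketch of the same idea (the file name model.pt is a placeholder, not from the original thread):

from clearml import OutputModel, Task

# Grab the task this code is currently running under
experiment_task = Task.current_task()

# Register the weights file as an output model on that task;
# optionally pass upload_uri=... to control where the weights are uploaded
output_model = OutputModel(task=experiment_task)
output_model.update_weights(weights_filename="model.pt")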
Hi CostlyOstrich36 I ran this task locally at first. That attempt was successful.
When I use this task in a pipeline (run remotely), it cannot find the external package. This seems logical, but I'm not sure how to resolve it.
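One possible workaround (my assumption, not confirmed in the thread) is to declare the dependency explicitly so the remote run installs it; "some_package" here is a hypothetical pip-installable name:

from clearml import Task

# Force the package into the task's "installed packages" so the agent
# installs it on the remote machine. Must be called before Task.init().
Task.add_requirements("some_package")

task = Task.init(project_name="examples", task_name="pipeline step")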
SuccessfulKoala55 Nope. I didn't even get to enter my name. I suspect there is some mistake in mapping the data folder.
Was using the template in https://github.com/allegroai/clearml-helm-charts to deploy.
Hello CostlyOstrich36 I am facing an issue now. Basically, I installed all the necessary python packages in my docker image. But somehow, the clearml-agent does not seem to be able to detect these global packages. I don't see them in the "installed packages". Any advice?
Thanks AgitatedDove14 and TimelyMouse69. The intention was to have some traceability between the two setups. I think the best way is to enforce some naming convention (for project and name) so we can know how they are related? Any better suggestions?
For example, I build my docker image from an image on Docker Hub. In this image, I installed the torch and cupy packages. But when I run my experiment in this image, the packages are not found.
Yes, I ran the experiment inside.
Nice. That should work. Thanks
CostlyOstrich36 I mean the dataset object in ClearML as well as the data that is tied to this object.
The intent is to bring it over to another ClearML setup and keep some form of traceability.
Nice. It is actually dataset.id.
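For anyone landing here, a minimal sketch of fetching a dataset by that id (the id string is a placeholder):

from clearml import Dataset

# Fetch the dataset object by its unique id (the dataset.id mentioned above)
dataset = Dataset.get(dataset_id="<dataset-id>")

# Download (or reuse a cached copy of) the underlying files
local_path = dataset.get_local_copy()
print(local_path)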
@<1526734383564722176:profile|BoredBat47> Just to check: do you need to do update-ca-certificates or the equivalent?
I was browsing the clearml-agent GitHub and saw this. Isn't this for spinning up clearml-agent in a docker container and having it run like a daemon?
Not exactly sure yet, but I would think a user tag for deployed makes sense, as it should be a deliberate user action. And an additional system state is required too, since a deployed state should have some prerequisite system state.
I would also like to ask if clearml has different states for a task, model, or even different task types? Right now I don't see differences, is this a deliberate design?
Hi ExasperatedCrab78 I managed to get it. It was due to the IP address set in examples.env.
Hi Bart, yes. Running with inference container.
Hi @<1523701070390366208:profile|CostlyOstrich36> , basically (sketch of the flow after this list):
- I uploaded a dataset using clearml Datasets. The output_uri points to my s3, thus the dataset is stored in s3. My s3 is set up with http only.
- When I retrieve the dataset for training, using Dataset.get(), I encountered an ssl cert error, as the url used to retrieve the data was https://<s3url>/... instead of s3://<s3url>/..., which is http. This is weird, as the dataset url is without https.
- I am not too sure why and I susp...
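To make the flow concrete, a minimal sketch of the upload/retrieval described above (bucket, project, and dataset names are placeholders; the http-only endpoint and credentials would normally come from clearml.conf):

from clearml import Dataset

# Upload: create a dataset and push the files to the s3 bucket
dataset = Dataset.create(dataset_project="my_project", dataset_name="my_dataset")
dataset.add_files(path="data/")
dataset.upload(output_url="s3://my-bucket/datasets")  # http-only endpoint configured in clearml.conf
dataset.finalize()

# Retrieval: this is where the ssl cert error surfaced
train_dataset = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")
local_path = train_dataset.get_local_copy()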