Hmmmm this looks like what you're looking for:
https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller#stop-1
Tell me if this helps 🙂
Hi @<1523703397830627328:profile|CrookedMonkey33> , you can also set the credentials with an env variable. Would that work?
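For reference, a sketch of the env-variable route (the values are placeholders — replace them with your own credentials):

```shell
# Placeholder credentials – swap in your own workspace values
export CLEARML_API_HOST="https://api.clear.ml"
export CLEARML_API_ACCESS_KEY="<YOUR_ACCESS_KEY>"
export CLEARML_API_SECRET_KEY="<YOUR_SECRET_KEY>"
```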
Hi @<1719162252994547712:profile|FloppyLeopard12> , not sure what you're trying to do, can you please elaborate?
YummyLion54 , let me take a look 🙂
DepressedFish57 , I'm not sure there is something for this. There is a cleanup service that kills old failed/aborted tasks, but to clear caches you'd have to handle it manually 🙂
Do you explicitly specify the region?
Also, please try specifying the buckets directly in your ~/clearml.conf
The relevant section is sdk.aws.s3.credentials
It would look something like this:
{
    host: "<BUCKET_A_HOST>"
    key: "<KEY>"
    secret: "<SECRET>"
    region: "<REGION>"
}
Build a Docker container that when launched executes a specific experiment, or a clone (copy) of that experiment.
From the docs
TrickyRaccoon92 , Hi!
Yes, I believe this is the intended behavior: when uploading automatically, many artifacts can be uploaded during a single run, whereas when uploading manually you create the object yourself.
If you killed all processes directly, there can't be any workers on that machine, which means these two workers are running somewhere else...
DeliciousSeal67 , something along these lines:
task.upload_artifact('<ARTIFACT_NAME>', artifact_object=os.path.join('<FOLDER>', '<FILE_NAME>'))
So in your case it would be along the lines of:
task.upload_artifact('trained_model', 'model_folder/best_mode.pt')
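For completeness, a minimal end-to-end sketch (this assumes a configured ClearML environment; the project name, task name and file path are placeholders):

```python
import os
from clearml import Task

# Create (or attach to) a task – project/task names are hypothetical
task = Task.init(project_name='examples', task_name='artifact demo')

# Upload a single file as an artifact; the path is a placeholder
task.upload_artifact(
    'trained_model',
    artifact_object=os.path.join('model_folder', 'best_model.pt'),
)
```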
Oh, you want the same machine to execute the two tasks/steps?
What about if you specify the repo user/pass in clearml.conf?
I think it removes the user/pass so it wouldn't be shown in the logs
Hi @<1717350310768283648:profile|SplendidFlamingo62> , are you using a self hosted server or the community?
Sounds like an issue with your deployment. Did your Devops deploy this? How was it deployed?
Hi @<1547028031053238272:profile|MassiveGoldfish6> , what versions of clearml & pytorch-lightning? Does this happen with the example as well? Are you on a self-deployed or the community server?
Hi @<1610445887681597440:profile|WittyBadger59> , how are you reporting the plots?
I would suggest taking a look here and running all the different examples to see the reporting capabilities:
None
Hi, I think this is what you're looking for - None
Can you please add the ~/clearml.conf for the agent? Also, are you trying to run everything on the same machine or different ones?
Hi @<1523701062857396224:profile|AttractiveShrimp45> , I'm afraid not. But you can always export these tables and plots into a report and add your custom data into the ClearML report as well
Hi @<1526734383564722176:profile|BoredBat47> , it should be very easy and I've done it multiple times. For the quickest fix you can use api.files_server in clearml.conf
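If it helps, that override would look roughly like this in ~/clearml.conf (the URL is a placeholder for your files server):

```
api {
    files_server: "https://files.example.com:8081"
}
```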
Can you please open a GitHub issue so we can follow up on this?
Hi @<1572395190897872896:profile|ShortWhale75> , this capability exists as part of the HyperDatasets feature which is present in the Scale/Enterprise licenses.
Yes, however I think you might be able to expose this via an env variable on the Task object itself
BoredPigeon26 , do you run them manually or with the agent? If you run manually, then I'm afraid it doesn't show the config currently. However, if you run with the agent, it will also print out the entire config (excluding the secrets, of course) at the start of the run, and it will be shown in the console output in the UI 🙂
Or do you mean the different parameters you've changed about in the task itself?
Is it a self deployed server or the Community?
In the task hyperparameters section you have a section called Hydra. In that section there should be a configuration called _allow_omegaconf_edit_. What is it set to?
Hi @<1572395190897872896:profile|ShortWhale75> , that is not the correct way to use workers & queues.
First of all, Task.init will mark your task as running, so this error makes sense.
The idea is that you first run the code locally on your machine; once everything is logged (packages, repo, uncommitted changes & configurations), you can clone the task and then enqueue it for the agent.
Programmatically, you would want to fetch an existing task in the system, clone it and then enqueue the n...
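A minimal sketch of that flow (the task ID and queue name are placeholders; this assumes a configured ClearML environment):

```python
from clearml import Task

# Fetch an existing task by its ID (placeholder ID)
template = Task.get_task(task_id='<TEMPLATE_TASK_ID>')

# Clone it and enqueue the clone for an agent listening on 'default'
cloned = Task.clone(source_task=template, name='cloned run')
Task.enqueue(cloned, queue_name='default')
```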
Hi @<1564060263047499776:profile|ThoughtfulCentipede62> , you manage it with the docker arguments you can provide (you can inject env vars this way) or with a setup shell script.
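As a rough sketch of the docker-arguments route (the image, variable name and value are placeholders; the container settings are applied before the task is enqueued):

```python
from clearml import Task

task = Task.get_task(task_id='<TASK_ID>')

# Inject an env var into the container the agent will start for this task
task.set_base_docker(
    docker_image='python:3.10',
    docker_arguments='-e MY_VAR=my_value',
)
```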
None
Hi OutrageousSheep60 , how did you add external links to a dataset? Can you provide a snippet?