ArrogantBlackbird16, is file.py the file that contains the Task.init call?
Not sure I’m getting the flow. If you just want to create a template task in the system, then clone and enqueue it, you can use task.execute_remotely(queue_name="my_queue", clone=True). Can this solve the issue?
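A minimal sketch of that flow (project, task and queue names here are placeholders, not from your setup):

```python
def enqueue_template_clone(queue_name="my_queue"):
    # Placeholder project/task names -- adjust to your setup.
    from clearml import Task

    task = Task.init(project_name="examples", task_name="template task")
    # clone=True leaves this task as a reusable template: a clone of it is
    # enqueued on `queue_name`, and the local process stops executing the
    # rest of the task body.
    task.execute_remotely(queue_name=queue_name, clone=True)
```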
So which data is being deleted? Which folder is the “artifact folder”?
Hi @<1523707056526200832:profile|ScaryKoala63> .
Try using task.upload_artifact
for manually uploading artifacts, like in here; you can also configure the upload destination.
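A small sketch of a manual upload (the `save_results` wrapper and the artifact name "results" are just illustrative):

```python
def save_results(task, results):
    # `task` is a clearml Task instance (e.g. from Task.init(...) or
    # Task.current_task()). upload_artifact accepts dicts, file paths,
    # numpy arrays, pandas dataframes, etc.
    task.upload_artifact(name="results", artifact_object=results)
```

The upload destination can be controlled with the `output_uri` argument of Task.init (or the `sdk.development.default_output_uri` setting in your configuration file).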
Hi SquareFish25 , what about AWS_DEFAULT_REGION? Did you add it too? If not, can you try with it?
What version of ClearML are you using?
Hi SmarmySeaurchin8 ,
I suspect the same, can you share an example of the path? I want to try and reproduce it on my side
Can you try with the latest? pip install clearml==1.7.3rc1
Hi ItchyHippopotamus18 ,
It seems the request does not reach the Trains File Server (port 8081, on the same machine running the Trains Server). Can you reach it?
Hi SubstantialElk6 ,
Which server do you use? http://app.community.clear.ml ?
Which clearml and clearml-agent versions are you using?
Hi ArrogantBlackbird16 ,
How do you generate and run your tasks? Do you use the same flow as in the https://clear.ml/docs/latest/docs/fundamentals/agents_and_queues#agent-and-queue-workflow ? Some other automation?
Hi JollyChimpanzee19 ,
From the UI you can select 2 or more tasks and click the Compare
button; from code you can use task.get_last_scalar_metrics()
to get the last results of a task as a dictionary. What do you think?
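For reference, a sketch of working with the returned dictionary — it is nested as {title: {series: {"last": ..., "min": ..., "max": ...}}}; the metric names and values below are made up:

```python
def last_values(metrics):
    # Flatten {title: {series: {"last": ..., "min": ..., "max": ...}}}
    # into {"title/series": last value}.
    return {
        "%s/%s" % (title, series): values.get("last")
        for title, series_map in metrics.items()
        for series, values in series_map.items()
    }

# Illustrative shape of what task.get_last_scalar_metrics() returns:
example = {"loss": {"train": {"last": 0.12, "min": 0.1, "max": 2.3}}}
```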
I will check the AWS token. Just to verify: did you import the StorageManager after the os.environ
calls?
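The ordering in question looks like this (credentials are placeholders; the clearml import is guarded so the snippet stands alone):

```python
import os

# Placeholder credentials -- set these before anything in clearml reads them.
os.environ["AWS_ACCESS_KEY_ID"] = "my-key-id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "my-secret"
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"

# Import StorageManager only after the environment is populated,
# so the credentials above are visible to it.
try:
    from clearml import StorageManager  # noqa: F401
except ImportError:
    StorageManager = None  # clearml not installed in this environment
```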
try pip install clearml==0.17.6rc1
Correct 🙂 polling_interval_time_min is the autoscaler's interval for checking for tasks in the queue.
After that, I wanted to create the steps from scratch, because I have many steps and I hope to avoid manual editing in the GUI (commits and other things). I created these tasks:
You can add Task.init(project_name=<your project name>, task_name=<your task name>)
to the template task instead of the Task.create
call, and it will capture all the inputs for you.
After that, add task.set_base_docker("docker command")
and it will configure the docker for the task.
Once you finish configuring ...
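Putting the two calls together, a sketch of the template task setup (project/task names and the docker image are placeholders):

```python
def make_template_task():
    from clearml import Task

    # Placeholder names -- adjust to your project.
    task = Task.init(project_name="my project", task_name="my template")
    # Configure the docker image/command the agent will use for this task.
    task.set_base_docker("nvidia/cuda:11.0-runtime-ubuntu20.04")
    return task
```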
Hi DeliciousKoala34 ,
dataset on the share is deleted
The dataset task, or the data in the folder? Neither should be deleted by get_local_copy
activation
Hi HealthyStarfish45
If you are running the task via docker, we don't auto-detect the image and docker command, but you have more than one way to set those:
- You can set the docker manually, like you suggested.
- You can configure the docker image + commands in your ~/trains.conf
https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L130 (on the machine running the agent).
- You can start the agent with the image you want to run with.
- You can change the base docker image...
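For reference, the relevant part of trains.conf looks roughly like this (the image name is just the example default from the linked file):

```
agent {
    default_docker: {
        # Default docker image the agent uses when a task does not set one
        image: "nvidia/cuda:10.1-runtime-ubuntu18.04"
    }
}
```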
In the self-hosted version we do not have user permissions, so every user sees all the data.
which ClearML agent version are you running?
BattyLion34 when you run the script locally, you have the script ( ResNetFineTune.py
) so it runs without any issue. But when running with the agent, the agent clones the repo, creates an env and runs the script. Now, the issue is that the script can't be found in the cloned repo, because it exists only on your local machine, not in the original git repo.
Thanks ImpressionableAlligator9 and MagnificentWorm7 for reporting this, I will double-check it
Hi WearyLeopard29 ,
try:
```
from clearml import Dataset

ds = Dataset.get(dataset_id="your dataset task id")
ds.tags = ["my tag"]
```
can you check the agent’s logs? maybe we can find something there
ThickDove42 you mean setting the docker init script?
Can you try upgrading to the latest? pip install clearml-agent==0.17.2
ThickDove42 you can get the version with
clearml-agent --version
Hi WackyRabbit7
If you are running the code as a script (without a repo), you can view the full script file in the EXECUTION tab under UNCOMMITTED CHANGES. Are you running it as a standalone script or as part of a repo?
I suspect that
I will try to generate a new token for myself and reproduce it with it
The Hyperparameter Optimizer can give you such table, but I’m not sure this is what you are looking for ( https://allegro.ai/clearml/docs/docs/examples/frameworks/pytorch/notebooks/image/hyperparameter_search.html and https://medium.com/pytorch/accelerate-your-hyperparameter-optimization-with-pytorchs-ecosystem-tools-bc17001b9a49 )