Hi @<1614069770586427392:profile|FlutteringFrog26>
Since you have the Task ID, you can do:
task = Task.get_task("task id here")
Then, to get the models:
models = task.models["output"]
models is both a list and a dict. If you want the last one, use last_model = models[-1]; if you know the best model's name, use model = models["best model"] (notice the model name must be exactly the one you see in the UI). Once you have the model object you can get a copy with `model.get_lo...
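A toy sketch of the "list and a dict" access pattern mentioned above. `NamedList` is a hypothetical stand-in written only for illustration; it is not ClearML's class, and plain dicts stand in for model objects. It just shows what indexing by position versus by name looks like:

```python
class NamedList(list):
    """Hypothetical stand-in: a list whose items can also be looked up by name."""

    def __getitem__(self, key):
        # String keys are treated as names, anything else as a normal list index.
        if isinstance(key, str):
            for item in self:
                if item["name"] == key:
                    return item
            raise KeyError(key)
        return super().__getitem__(key)


# Plain dicts stand in for model objects here (assumption for the demo).
models = NamedList([{"name": "epoch 1"}, {"name": "best model"}, {"name": "epoch 3"}])

last_model = models[-1]            # positional access, like a list
best_model = models["best model"]  # lookup by the exact name, like a dict
```

A name that does not match exactly raises `KeyError` in this sketch, which is why the exact UI name matters.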
Is this a bug, or an issue with clearml not working correctly with hydra?
It might be a bug?! Hydra is fully supported, i.e. logging the state and allowing you to change the Arguments from the UI.
Is this example working as expected?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
If you're referring to the run executed by the agent, it ends after this message because my script does not get the right args and so does not know what to...
GreasyPenguin14, I think the default is reporting on failed tasks only. Could that be?
Hi MortifiedDove27
I think you can resize the plot area in the UI (try to drag the horizontal separator)
Hi @<1523701523954012160:profile|ShallowCormorant89>
This means the system did not detect any "iteration" reporting (think scalars) and it needs a time-series axis for the monitoring, so it just uses seconds from start
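The fallback described above can be pictured roughly like this. This is only an illustration of the idea, not ClearML's actual code; `choose_x_axis` and its arguments are hypothetical names:

```python
def choose_x_axis(reported_iterations, seconds_from_start):
    """Return (axis_name, value): iterations if any were reported, else elapsed seconds."""
    if reported_iterations:
        # Some scalar was reported with an iteration number, so use that axis.
        return "iteration", max(reported_iterations)
    # Nothing was reported per iteration: fall back to wall-clock time.
    return "seconds from start", seconds_from_start


no_scalars = choose_x_axis([], 12.5)
with_scalars = choose_x_axis([1, 2, 3], 12.5)
```

So reporting at least one scalar with an iteration number should be enough to switch the monitoring axis back to iterations.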
I can't see any reason it should not work 😀
What do you have here in your docker compose:
None
Hi SparklingElephant70
Anyone know how to solve?
I tried git push before,
Can you send the entire log? Could it be that the requested commit ID does not exist on the remote git (for example, a force push deleted it)?
Hi MuddySquid7
You can only add reports (scalars, plots, etc.), though not to a published Task.
If you want to add an artifact, this should work:
prev_task = Task.get_task(task_id='112233')
prev_task.mark_started(force=True)
prev_task.reload()
prev_task.upload_artifact(..., wait_on_upload=True)
prev_task.mark_stopped(force=True)
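A minimal sketch of the same sequence wrapped in a helper, so the Task is marked stopped again even if the upload fails. `upload_to_finished_task` is a hypothetical name, and the `wait_on_upload` keyword is used on the assumption that it matches your installed clearml version:

```python
def upload_to_finished_task(task, name, artifact_object):
    """Reopen a finished task, upload one artifact, then always close it again."""
    task.mark_started(force=True)
    try:
        task.reload()
        task.upload_artifact(name, artifact_object=artifact_object, wait_on_upload=True)
    finally:
        # Runs even if the upload raises, so the task is not left "running".
        task.mark_stopped(force=True)


# Usage (needs clearml installed and a configured server):
#   from clearml import Task
#   prev_task = Task.get_task(task_id='112233')
#   upload_to_finished_task(prev_task, 'my artifact', {'key': 'value'})
```

The helper only relies on the four methods from the message above, so any object exposing them works.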
Hi @<1523701504827985920:profile|SubstantialElk6>
I would split the first stage into two: the first one passing data to the others, the second acting as "monitoring". Wdyt?
GrumpyPenguin23 could you help and point us to an overview/getting-started video?
Hi @<1697419082875277312:profile|OutrageousReindeer5>
Is NetApp S3 protocol enabled or are you referring to NFS mounts?
Thank you @<1523701949617147904:profile|PricklyRaven28> !!!
Let me see if we can reproduce and how to solve it
Is "my_package" a local package?
What is the output of:
pip freeze | grep my_package
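If grep is not available (e.g. on Windows), a rough Python equivalent of that check is sketched below. It uses the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and "my_package" is just the name from the question:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints None when the environment has no installed distribution with that name.
print(installed_version("my_package"))
```

Unlike `pip freeze`, this only reports the version, not where the package was installed from.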
Hi ExasperatedCrocodile76
It seems like it is using the conda package manager. Were you using conda when you ran the code manually?
ERROR: This cross-compiler package contains no program /home/ivan/miniconda3/envs/clearML/bin/x86_64-conda_cos6-linux-gnu-gfortran
Why is it trying to install from source code?
BTW: can you test with the latest agent RC? (pip install clearml-agent==1.4.0rc4)
Hi RotundSquirrel78
How did you end up with this command line?
/home/sigalr/.clearml/venvs-builds/3.8/code/unet_sindiff_1_level_2_resblk --dataset humanml --device 0 --arch unet --channel_mult 1 --num_res_blocks 2 --use_scale_shift_norm --use_checkpoint --num_steps 300000
The arguments passed are odd (there should be none, they are passed inside the execution) and I suspect this is the issue.
FYI:
ssh -R 8080:localhost:8080 -R 8008:localhost:8008 -R 8081:localhost:8081 replace_with_username@ubuntu_ip_here
solved the issue 🙂
Sure:
task = Task.init(..., auto_connect_arg_parser={'arg_not_to_log': False})
This will cause all argparse arguments to be logged automatically (and later be editable), with the exception of the argument arg_not_to_log.
Notice that if you have --arg-something, to exclude it add 'arg_something': False to the dict.
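A small sketch of why the key is written with underscores: argparse itself stores `--arg-something` under the name `arg_something`. The flags and the `exclusion_dict` helper are hypothetical, made up for the demo:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--arg-something", default="secret")
parser.add_argument("--lr", type=float, default=0.1)

# argparse converts dashes to underscores when naming the parsed attributes.
args = parser.parse_args([])
dest_names = sorted(vars(args))


def exclusion_dict(flags):
    """Map flags such as '--arg-something' to {'arg_something': False}."""
    return {flag.lstrip("-").replace("-", "_"): False for flag in flags}


exclude = exclusion_dict(["--arg-something"])
# exclude can then be passed as Task.init(..., auto_connect_arg_parser=exclude)
```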
Hi SourSwallow36
What do you mean by "log each experiment separately"? How would you differentiate between them?
So we basically have two options. One is that when you call Dataset.get_local_copy(), we register it on the Task automatically. The other is more explicit, with something like:
ds = Dataset.get(...)
folder = ds.get_local_copy()
task.connect(ds, name='train')
...
ds_val = Dataset.get(...)
folder = ds_val.get_local_copy()
task.connect(ds_val, name='validate')
wdyt?
Our remote machine is Windows 10
JumpyDragonfly13 seems like the Windows 10 + docker is the issue (that would explain the OCI error)
Is this relevant?
https://github.com/microsoft/WSL/issues/5100
Hi JumpyDragonfly13
Let's assume we have two machines, one we call remote, one we call laptop (at least for this discussion)
On the remote machine we need to run the following (notice we must have docker preinstalled on the remote machine; it can also work without docker, let me know if this is the case for you):
clearml-agent daemon --queue interactive --create-queue --docker
On the laptop we run:
clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
What clearml-session will do is crea...
Hi CluelessElephant89
When you edit the args (General section) in the UI, you are editing the args for "remote execution"
(i.e. when executed by the agent, the args dict will get the values from the UI, as opposed to "manual execution", where the UI gets the values from the code)
In order to simulate the "remote execution" inside your development environment
Try:
` from clearml import Task
# simulate remote execution of a specific Task instance
Task.debug_simulate_remote_task(task_id='R...
This points to the wrong cu117 / driver - could that be?
Basic setup:
One glue service per "job template" (e.g. k8s resources, such as a CPU requirement or a GPU requirement).
One queue per glue service, e.g. a cpu_machine queue and a 1xGPU queue.
wdyt?
Correct 🙂
I'm assuming the Task object is not your Current task, but a different one?