
GreasyPenguin14 I think this is what you are looking for: Task.get_project_id('project_name')
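A minimal sketch of that lookup, assuming the clearml package is installed and configured against a reachable server. The helper name lookup_project_id is hypothetical; the import is kept inside the function so the snippet can be defined without clearml being present.

```python
def lookup_project_id(project_name):
    """Return the ID of the project with this name (hypothetical helper)."""
    # Imported lazily: requires clearml to be installed and configured.
    from clearml import Task

    # Class method mentioned in the answer above.
    return Task.get_project_id(project_name)
```

Calling lookup_project_id('project_name') would then return the ID string for that project.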
clearml_agent: ERROR: Can not run task without repository or literal script in script.diff
This is odd ...
OutrageousSheep60 when you launch clearml-session it tells you the session ID (which is also a Task ID). Can you look for it in the UI and check whether there is something in the repo/uncommitted-changes section?
Hmmm maybe
I thought that was expected behavior from poetry side actually
I think this is the expected behavior, hence bug?!
DefeatedOstrich93 can you verify lightning actually only stored once ?
VirtuousFish83 I remember an issue on GitHub with something similar. What's the clearml-server version you are using?
what do you mean? the same env for all components ? if they are using/importing exactly the same packages, and using the same container, then yes it could
Unfortunately not, the queues tab shows only the number of tasks in the queue, but not the resources used
Oh, yes, that makes sense to add, I like that 🙂
(the main question is what data is there in the backend DBs, let me know what I can get)
PompousParrot44 unfortunately not yet 😞
But the gist is :
MongoDB stores experiment data (i.e. execution parameters, git ref etc.)
ElasticSearch stores results (i.e. metrics, console logs, debug image links etc.)
Does that help?
Oh, you can definitely use the REST API, but in this specific case I'm not sure there is something better.
(BTW: Look for APIClient, it is a Pythonic interface for the REST API)
ConvolutedSealion94 if you run the following in bash:
cd ~/work/repo/code/
git status
what are you getting?
What about the epochs though? Is there a recommended number of epochs when you train on that new batch?
I'm assuming you are also using the "old" images?
The main factor here is the ratio between the previously used data and the newly added data. You might also want to resample (i.e. train more on) the new data vs. the old data. Make sense?
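As a rough illustration of the resampling idea, here is a minimal sketch that builds one epoch's worth of samples with a chosen share of new data. The function name, the new_fraction parameter and the toy datasets are all hypothetical, and sampling with replacement is just one possible choice.

```python
import random


def mix_old_and_new(old_items, new_items, new_fraction=0.5, n_samples=1000, seed=0):
    """Draw n_samples items so that new_fraction of them come from new_items."""
    rng = random.Random(seed)
    n_new = round(n_samples * new_fraction)
    n_old = n_samples - n_new
    # Sampling with replacement lets a small new batch be seen more often
    # than its raw size would allow.
    sample = rng.choices(new_items, k=n_new) + rng.choices(old_items, k=n_old)
    rng.shuffle(sample)
    return sample


# Toy data: many "old" images, few "new" ones.
old = [("old", i) for i in range(10_000)]
new = [("new", i) for i in range(500)]

epoch = mix_old_and_new(old, new, new_fraction=0.3, n_samples=2000)
n_new = sum(1 for source, _ in epoch if source == "new")
print(f"{n_new} of {len(epoch)} samples come from the new batch")
```

Here the new batch is about 5% of the raw data but makes up 30% of what the model sees, which is the kind of ratio you would tune.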
VirtuousFish83 is the exit(1) called from the main process or a subprocess? Are you running it with an agent?
DepressedChimpanzee34 I cannot find cfg.py here
https://github.com/allegroai/clearml/tree/master/examples/frameworks/hydra/config_files
(or anywhere else)
Hi UnsightlyBeetle11
Is it possible to report the model's architecture (PyTorch model) automatically on ClearML, as we do it via Netron or other neural network visualisation tools?
You mean like the actual network layout? Unfortunately, there is currently no option to do that. You can, however, manually store a plot/image that represents it.
BTW: I think that at the beginning Netron was somehow integrated, but it was rarely used and supporting it was not trivial, so it was phased out. You can ho...
Hi WickedBee96
How can I do that?
clearml-task
https://clear.ml/docs/latest/docs/apps/clearml_task#what-is-clearml-task-for
I know this way to run it in the agent only by enqueue the draft after running it on my local machine so is there another way?
Or maybe you are looking for task.execute_remotely
https://clear.ml/docs/latest/docs/references/sdk/task#execute_remotely
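For the execute_remotely route, a minimal sketch could look like the following. It assumes clearml is installed and configured and that an agent is listening on a queue; the queue name "default" and the project/task names are placeholders, not something from this thread.

```python
def main(queue_name="default"):
    # Imported lazily: requires clearml to be installed and configured.
    from clearml import Task

    task = Task.init(project_name="examples", task_name="remote run")

    # When run locally, this enqueues the task and exits the local process;
    # when an agent executes the task, the call is skipped and the script
    # continues past this point.
    task.execute_remotely(queue_name=queue_name, exit_process=True)

    print("training code goes here, executed by the agent")
```

You would call main() at the bottom of your script, so there is no need to enqueue a draft manually from the UI.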
Not really 😞
Everyone can do everything, the idea is sharability and accessibility.
I do know that in the paid tier they have full access control, roles, SSO etc., but unfortunately it's way too complicated for the open-source.
Basically what I'm saying is trust your fellow colleagues 🙂
replace the base-docker-image and it should work fine 🙂
ScantChimpanzee51 what's the use case for the full path without specific artifact?
Yes they are supposed to be routed there by pytorch dist
(and the TB logs are on the master only anyhow)
Would be cool to let it get untracked as well, especially if we want to as an option
How would you decide what should be tracked?
I think non-master processes trying to log something, but have no Logger instance because have no Task instance.
Hmm, is your code calling Logger.current_logger() directly?
Logs in master process include all training history or I need to concatenate logs from different nodes somehow?
So the main problem is that you need to pass the TASK ID that the master node creates to the second node, so it can report to the same Task.
I know that the enterprise version of ClearML support...
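One way to sketch the "pass the Task ID to the second node" idea is shown below. It assumes your launcher propagates an environment variable from the master to the other nodes; the variable name MASTER_CLEARML_TASK_ID and both helper names are hypothetical. Whether reporting from a second node into that Task works cleanly depends on your setup, so treat this as a starting point only.

```python
import os

# Hypothetical variable name; your launcher must export it on every node.
TASK_ID_ENV = "MASTER_CLEARML_TASK_ID"


def publish_task_id(task_id, env=os.environ):
    """On the master node: record the Task ID so the launcher can pass it on."""
    env[TASK_ID_ENV] = task_id


def attach_to_master_task(env=os.environ):
    """On a worker node: fetch the master's Task, or None if no ID was passed."""
    task_id = env.get(TASK_ID_ENV)
    if not task_id:
        return None
    # Imported lazily: requires clearml to be installed and configured.
    from clearml import Task

    return Task.get_task(task_id=task_id)
```

On the master you would call publish_task_id(task.id) right after Task.init(), and on the second node use the returned Task object's logger for reporting.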
Hi EagerOtter28
The agent knows how to do the http->ssh conversion on the fly. In your clearml.conf (on the agent's machine) set force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/42606d9247afbbd510dc93eeee966ddf34bb0312/docs/clearml.conf#L25
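To make the http->ssh conversion concrete, here is a small stand-alone sketch of the kind of URL rewrite involved. This is not the agent's actual code; the function name https_to_ssh is hypothetical and the regex covers only the simple host/path case.

```python
import re


def https_to_ssh(url):
    """Rewrite an http(s) git URL into scp-style ssh form; leave others as-is."""
    # Optional "user@" part is dropped; host and repository path are captured.
    match = re.match(r"^https?://(?:[^@/]+@)?([^/:]+)/(.+)$", url)
    if not match:
        return url
    host, path = match.groups()
    return f"git@{host}:{path}"


print(https_to_ssh("https://github.com/allegroai/clearml-agent.git"))
```

With the setting enabled, the agent takes care of this for you, so the repository URL stored on the Task can stay in its http form.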
Yes 🙂
BTW: do you guys do remote machine development (i.e. Jupyter / vscode-server) ?
Sounds good to me 🙂
Woot woot! 🤩
Hi AverageBee39
Which clearml-server and clearml package versions are you using?
(It looks like some capability is missing from the server, i.e. it needs an upgrade?!)