SubstantialElk6 In the "Execution" tab, scroll down to the "Installed Packages" section. What do you have there?
Makes sense
we need to figure out what would be the easiest way to have an "opt-in" for the demo server that will still make it a breeze to quickly test code integration ...
Any suggestions are welcomed 🙂
CleanPigeon16 Coming very soon, we are adding a few features to the pipeline, this one will also be included :)
DeliciousBluewhale87 not in the open source version, for some reason it is not passed 😞
Could you explain the use case ?
if it ain't broke, don't fix it
😄
Up to you, just a few features & nicer UI.
BTW: everything is backwards compatible, there is no need to change anything; all the previous trains/trains-agent packages will work without changing anything 🙂
(This even includes the configuration file, so you can keep the current ~/trains.conf and work with whatever combination you like of trains/clearml on the same machine)
DefeatedOstrich93 can you verify lightning actually only stored it once?
Copy paste it here 🙂
WackyRabbit7 How do I reproduce it ?
I think task.init flag would be great!
👍
Hmmm, what's your trains version ?
Yes, hopefully they have a different exception type so we could differentiate ... :) I'll check
so at the end of an experiment, this results in an object being saved under a given name, regardless of whether it was dynamic or not?
Yes, at the end the name of the artifact is what it will be stored under (obviously if you reuse the name you basically overwrite the artifact)
I'm really for adding an interface, but I was not able to locate a simple integration option with basically anything. Wdyt?
and the agent default runtime mode is docker correct?
Actually the default is venv mode; to run in docker mode, add --docker to the command line
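For reference, a rough sketch of the two modes (the queue name and docker image here are just placeholder examples):

```shell
# venv mode (the default)
clearml-agent daemon --queue default

# docker mode: add --docker, optionally with a default image to use
clearml-agent daemon --queue default --docker nvidia/cuda:11.3.1-runtime-ubuntu20.04
```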
So I could install all my system dependencies in my own docker image?
Correct, inside the docker it will inherit all the preinstalled packages, but it will also install any missing ones (based on the Task requirements, i.e. the "installed packages" section)
Also, what is the purpose of the aws block in the clearml.c...
Would an implementation of this kind be interesting for you, or do you suggest forking?
You mean adding a config map storing a default trains.conf for the agent?
It can be a different agent.
If inside a docker, then: clearml-agent execute --id <task_id here> --docker
If you need a venv, do: clearml-agent execute --id <task_id here>
You can run that on any machine and it will respin and continue your Task
(obviously your code needs to be aware of that and be able to pull its own last model checkpoint from the Task artifacts / models)
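As a sketch of what "pulling its own last checkpoint" could look like in your code (the task ID is a placeholder, and the exact model bookkeeping depends on your setup):

```python
from clearml import Task

# hypothetical ID of the Task being respun / continued
task = Task.get_task(task_id="<task_id here>")

# grab the most recently registered output model, if any, and download a local copy
output_models = task.models.get("output", [])
if output_models:
    checkpoint_path = output_models[-1].get_local_copy()
    # ... load the checkpoint from checkpoint_path and continue training
```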
Is this what you are after?
The agents are docker containers, how do I modify the startup script so it creates a queue?
Hmm actually not sure about that, might not be part of the helm chart.
So maybe the easiest is:
from clearml.backend_api.session.client import APIClient

c = APIClient()
c.queues.create(name="new_queue")
VictoriousPenguin97 basically spin down serverA (this should flush all DBs), then copy /opt/clearml to the new server and spin it up with docker-compose. As long as the new server is on the same address as the previous one, everything should work out of the box
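Roughly something like this (the compose file path and `new-server` host name are placeholders, assuming the default /opt/clearml layout):

```shell
# on serverA: spin everything down so the DBs are flushed to disk
docker-compose -f /opt/clearml/docker-compose.yml down

# copy the data folder over to the new server
rsync -avz /opt/clearml/ new-server:/opt/clearml/

# on the new server (same address as the old one): spin it back up
docker-compose -f /opt/clearml/docker-compose.yml up -d
```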
Thanks for the details TroubledJellyfish71 !
So the agent should have automatically resolved this line: torch == 1.11.0+cu113
into the correct torch version (based on the cuda version installed, or cpu version if no cuda is installed)
Can you send the Task log (console) as executed by the agent (and failed)?
(you can DM it to me, so it's not public)
Yes, though the main caveat is the data is not really immutable 😞
or even different task types
Yes there are:
https://clear.ml/docs/latest/docs/fundamentals/task#task-types
https://github.com/allegroai/clearml/blob/b3176a223b192fdedb78713dbe34ea60ccbf6dfa/clearml/backend_interface/task/task.py#L81
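For example, you set the type when creating the Task (the project/task names here are placeholders):

```python
from clearml import Task

# task_type takes the Task.TaskTypes enum (training, testing, inference, etc.)
task = Task.init(
    project_name="examples",
    task_name="evaluation run",
    task_type=Task.TaskTypes.testing,
)
```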
Right now I don't see any differences, is this a deliberate design?
You mean on how to use them? I.e. best practice ?
https://clear.ml/docs/latest/docs/fundamentals/task#task-states
could be nice to have a direct "task comparison" link in the UI somewhere,
you mean like a "cart" for comparison ? or just to "save the state" so you can move between projects ?
CrookedWalrus33 from the log it seems the code is trying to use "kwcoco", but it is not listed under "Installed packages", nor is there any attempt to install it. Can you confirm?
Yes MuddySquid7, it automatically detects it (regardless of whether you upload the DF as an artifact).
How are you saving the dataframe ?
(it will auto log any joblib.save call, is that it?)
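For comparison, this is what the explicit route looks like (project/task/artifact names are placeholders):

```python
import pandas as pd
from clearml import Task

task = Task.init(project_name="examples", task_name="df logging")

df = pd.DataFrame({"a": [1, 2, 3]})

# explicit upload: the DataFrame is stored as an artifact under the given name
task.upload_artifact(name="my_df", artifact_object=df)
```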
Hi FunnyTurkey96
what's the clearml server you are using ?
Hi ExcitedFish86
In Pytorch-Lightning I use DDP
I think a fix for pytorch multi-node / process distribution was committed in 1.0.4rc1, could you verify it solves the issue? (rc1 should fix this specific issue)
BTW: no problem working with clearml-server < 1
Hi FiercePenguin76
So currently the idea is that you have full control over per-user credentials (i.e. stored locally). Agents (depending on how they are deployed) can have shared credentials (with AWS the easiest is to push them via the OS environment)
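e.g. pushing the standard AWS variables into the agent's environment before starting it (all values and the queue name are placeholders):

```shell
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"

clearml-agent daemon --queue default
```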
How do you currently report images, with the Logger or Tensorboard or Matplotlib ?
Oh I see, try the following to get a list of all pipelines, then with the selected pipeline you can locate the component:
pipeline_project = Task.current_task().project
pipelines_ids = Task.query_tasks(
    task_filter=dict(
        project=[pipeline_project],
        type=["controller"],
        system_tags=["pipeline"],
        order_by=["-last_change"],
        search_hidden=True,
    )
)
# take the second to the last updated one (because t...
Hi @<1631826770770530304:profile|GracefulHamster67>
if you want your current task:
task = Task.current_task()
if you need the pipeline Task from within a pipeline component:
pipeline = Task.get_task(Task.current_task().parent)
where are you trying to get the pipelines from? I'm not sure I understand the use case?