Also, I just wanted to say thanks for the tool! I'm managing a small data science practice and it's going to be really nice to have a view of all of the experiments we've got and know our GPU utilization, all without having to give every data scientist access to each box where the workflows are run. Incredibly stoked.
BTW: StickyMonkey98 if you feel like writing a few examples I think it will be easy to push into the docs, so that at least we improve iteratively...
Hi IrritableGiraffe81
Yes it deploys all ClearML (including web).
ClearML-serving unfortunately is a bit more complicated to spin up, as it needs actual compute nodes.
That said we are working on making it a lot easier
Any updates on trigger and schedule docs?
I think examples are already pushed, docs still in progress.
BTW: pipeline v2 examples are also out:
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
https://github.com/allegroai/clearml/blob/master/examples/pipeline/full_custom_pipeline.py
Hi LazyTurkey38
What do you mean by the git repo not being recognized? When execute_remotely exits the local run, you should see on the task a reference to the git repo with the exact commit ID you have pulled locally. Do you see it under the Execution tab?
Hi PompousParrot44
You can check the cleanup service example.
It sleeps for 24 hours then spins up and does its thing.
You can always launch these service tasks on the services queue. Its purpose is to run those services on the trains-server as additional CPU services. They will also be registered as service nodes, so you have visibility into which service is running.
To clone a task and wait for its completion, use the TrainsJob: https://github.com/allegroai/trains/blob/65a4a...
Hi @<1523701168822292480:profile|ExuberantBat52>
I am trying to execute a pipeline remotely,
How are you creating your pipeline? And are you referring to an issue with the pipeline logic, or is it a component that needs that repo installed?
quick video of the search not working
Thank you! This is very helpful, passing along to the front-end guys
and ctrl-f (of the browser) doesn't work as the lines below are not loaded (even when you scroll it will remove the other lines that are not visible, so you can't ctrl-f them)
Yeah, that's because they are added lazily
But I do not have anything linked correctly since I rely on conda installing cuda/cudnn for me
From the log it installed: cudatoolkit==11.1.1
based on the CUDA it found on the host machine: agent.cuda_version = 110
But for some reason it installed the pytorch from the conda "pytorch" repo without the cuda support.
@<1671689437261598720:profile|FranticWhale40> this one: None
What's the error you are getting?
Maybe the configuration file changed?
The logic is: if the name and project are the same, there are no artifacts/models, and the Task was last created under 72 hours ago, reuse the Task.
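The reuse rule described above can be written as a single predicate. This is only a sketch of the rule as stated in this answer, not ClearML's actual implementation; `PreviousTask`, `should_reuse` and `REUSE_WINDOW` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The 72-hour window quoted in the answer above.
REUSE_WINDOW = timedelta(hours=72)


@dataclass
class PreviousTask:
    # Hypothetical record of an earlier run; this is not a ClearML class.
    name: str
    project: str
    created: datetime
    has_artifacts: bool = False
    has_models: bool = False


def should_reuse(prev, name, project, now):
    """Return True when the earlier task would be reused under the stated rule."""
    same_identity = prev.name == name and prev.project == project
    is_empty = not (prev.has_artifacts or prev.has_models)
    is_recent = now - prev.created < REUSE_WINDOW
    return same_identity and is_empty and is_recent
```

Under this reading, a run that stored an artifact or a model is never overwritten, which is presumably the point of the rule.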
Train Data Params/a = {}
Train Data Params/b = ...
Then maybe we could "hack" it so that if you edit it in the UI like so:
Train Data Params/a = {'new': 'value'}
Train Data Params/b = ...
You end up with
param = {'a': {'new': 'value'}, 'b' : ... }
What do you think?
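To make the proposed "hack" concrete, here is a small sketch of how values edited in the UI could be folded back into one dictionary. It assumes the UI hands back every value as a string keyed by "section/name"; `rebuild_param` is a hypothetical helper, not part of the ClearML API.

```python
import ast


def rebuild_param(ui_values, section):
    """Collect the "section/name" entries of one section into a single dict."""
    prefix = section + "/"
    param = {}
    for key, raw in ui_values.items():
        if not key.startswith(prefix):
            continue  # entry belongs to another section
        name = key[len(prefix):]
        try:
            # Turn "{'new': 'value'}" or "3" back into Python objects.
            param[name] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal is kept as it was.
            param[name] = raw
    return param
```

For example, passing `{"Train Data Params/a": "{'new': 'value'}"}` with the section `"Train Data Params"` yields `{'a': {'new': 'value'}}`, the shape suggested in the message above.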
JumpyPig73 I think fire was just added:
https://github.com/allegroai/clearml/pull/550
You can test with the latest RC:
pip install clearml==1.2.0rc1
Regarding not finding the Hydra-core package, what do you have listed under Execution: "Installed Packages"? (It should have auto-detected that you are importing hydra and listed it there.)
Is there any way to see datasets uploaded to ClearML Data without downloading them using ClearML Data?
Hi VexedCat68
Currently when you create datasets with clearml-data it has to repackage your files, i.e. upload them. That said we have received numerous requests on "registering data", and we are looking into it.
Here are the main technical hurdles we are facing, and I would love to get your perspective:
If the data is not available locally, we cannot calculate the hash of the conten...
Let me try to add some color to this analysis process.
Basically clearml will try to statically analyze the code (i.e. look for import/from packages)
Then it will list them in a pip requirements.txt format under installed packages.
When running inside conda environment, it will check which packages were installed via "conda install" (instead of pip install) and mark them internally. This process ensures that when the clearml-agent is running with conda package manager, it "knows" whic...
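As a rough illustration of the static analysis step described above, the standard-library `ast` module can list the top-level packages a script imports. This is a simplified sketch and not the code clearml actually runs; `imported_packages` is a hypothetical helper, and it resolves neither versions nor the conda/pip distinction.

```python
import ast


def imported_packages(source):
    """Return the sorted top-level package names imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" -> "os"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import ("from . import x"): local code, skip it.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return sorted(found)
```

Note that an import name is not always the pip package name (importing hydra versus the hydra-core package mentioned earlier), so a real implementation needs an extra mapping step before it can write requirements.txt style entries.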
Since this fix is all about synchronizing different processes, we wanted to be extra careful with the release. That said I think that what we have now should be quite stable. Plan is to have the RC available right after the weekend.
You can do that programmatically: clone the pipeline Task (a pipeline is also a Task) and change the Args section of that Task. wdyt?
Example:
None
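A minimal sketch of the clone-and-edit flow suggested above, using `Task.get_task`, `Task.clone`, `set_parameter` and `Task.enqueue` from the clearml SDK. The function name `rerun_pipeline`, the "(rerun)" suffix and the default "services" queue are assumptions made for illustration, and the sketch has not been run against a live server.

```python
def rerun_pipeline(pipeline_task_id, new_args, queue_name="services"):
    """Clone a pipeline Task, override entries in its Args section, and enqueue it."""
    # Imported inside the function so the module loads even without clearml configured.
    from clearml import Task

    template = Task.get_task(task_id=pipeline_task_id)
    cloned = Task.clone(source_task=template, name=template.name + " (rerun)")
    for key, value in new_args.items():
        # Pipeline arguments live under the "Args" section of the Task.
        cloned.set_parameter("Args/" + key, value)
    # Assumption: the pipeline controller is picked up from this queue.
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned
```

Calling `rerun_pipeline("<pipeline task id>", {"epochs": 5})` would then launch a copy of the pipeline with only that argument changed, where `epochs` stands in for whatever argument names your pipeline defines.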
Thanks EnviousStarfish54 we are working on moving them there!
BTW, in the meantime, please feel free to open a GitHub issue under trains, at least until they are moved (hopefully end of Sept).
Where did you add the Task.init call?
I think the ClearmlLogger is kind of deprecated ...
Basically all you need is Task.init at the beginning; the default tensorboard logger will be caught by clearml
Or can I enable agent in this kind of local mode?
You just built a local agent
Hi @<1523715429694967808:profile|ThickCrow29>
I am using the PipelineController with abort_on_failure set to False.
Is this a pipeline from code or from Tasks?
What is the clearml version?
Lastly, if a component fails and another component depends on its output, how would it run? If it is not dependent, why is it a child component?
GiganticTurtle0 what's the Dataset Task status?
DepressedFox45
you can just copy/add this section
https://github.com/allegroai/clearml-agent/blob/e43f31eb80f9399da01dc5432cdacdf81c1bd084/docs/clearml.conf#L15
FiercePenguin76
So running the Task.init from the jupyter-lab works, but running the Task.init from the VSCode notebook does not work?
Okay, let me check it, but I suspect the issue is running over SSH. To overcome these issues with PyCharm we have a specific plugin that passes the git info to the remote machine. Let me check what we can do here.
FiercePenguin76 BTW, you can do the following to add / update packages on the remote session:
clearml-session --packages "newpackge>x.y" "jupyterlab>6"
these are being repeated as well for a single task (this is training a t5_model with transformers):
Seems like someone is storing lots of files with torch.save that ClearML automatically logs.
You can disable the autolog:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})