First try the current setup using pip, and if it fails, use poetry if poetry.lock exists.
I guess the order here is not clear to me (the agent does the opposite). Why would you start with pip if you are using poetry?
Local changes are applied before installing requirements, right?
correct
why not let the user start with an empty comparison page and add them from the "Add Experiment" button as well?
Apologies, I was not clear. Yes I'm with you, this is a great idea 🙂
It's just that to access that comparison page, you have to make a comparison first.
Makes total sense to me 🙂
I meant even just a link to a blank comparison and one can then add the experiments from that view
Just making sure you are aware, once you are in comparison you can always add Tasks (any Task):
Notice you can press on "Add experiments", then select any experiment (including from all projects, via the filters)
Notice you need to remove all filters (right-side red X on the filter icon)
Finally managed; you keep saying "all projects" but you meant the "All Experiments" project instead. That's a good start. Thanks!
Yes, my apologies you are correct: "all experiments"
Is Task.current_task() creating a task?
Hmm it should not, it should return a Task instance if one was already created.
That said, I remember there was a bug (not sure if it was in a released version or an RC) that caused it to create a new Task if there isn't an existing one. Could that be the case?
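For context, a minimal sketch of the expected behavior (project/task names here are placeholders):

from clearml import Task

# current_task() should only return a Task already created in this process
task = Task.current_task()
if task is None:
    # nothing was initialized yet, so create one explicitly
    task = Task.init(project_name="examples", task_name="demo")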
from clearml import TaskTypes
That will only work if you are using the latest from GitHub; I guess the example code was modified before a stable release ...
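For reference, a minimal usage sketch (assuming a clearml version recent enough to expose TaskTypes at the package root; names are placeholders):

from clearml import Task, TaskTypes

# task_type accepts a TaskTypes enum value, e.g. training / inference
task = Task.init(
    project_name="examples",
    task_name="demo",
    task_type=TaskTypes.training,
)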
I believe that happens natively thanks to pyhocon? No idea why it fails on Mac
That's the only explanation ...
But the weird thing is, it did not work on my Linux box?!
Sounds good, let's work on it after the weekend 🙂
JitteryCoyote63 I think that without specifically adding torch to the requirements, the agent will not be able to automatically resolve the correct CUDA/torch version. Basically you should add torch to the requirements.txt file and provide it to Task.create, or use Task.force_requirements_env_freeze
Hi JitteryCoyote63
So that I could simply do
task._update_requirements(".[train]")
but when I do this, the clearml agent (latest version) does not try to grab the matching CUDA version, it only takes the CPU version. Is it a known bug?
The easiest way to go about it is to add:
Task.add_requirements("torch", "==1.11.0")
task = Task.init(...)
Then it will auto detect your custom package, and will always add the torch version. The main issue with relying on the package...
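For completeness, a sketch of the Task.force_requirements_env_freeze alternative mentioned above; note it must be called before Task.init (names are placeholders):

from clearml import Task

# freeze the full local pip environment instead of relying on auto-detection
Task.force_requirements_env_freeze()
task = Task.init(project_name="examples", task_name="train")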
Hi RipeGoose2
You can also report_table them? What do you think?
https://github.com/allegroai/clearml/blob/master/examples/reporting/pandas_reporting.py
https://github.com/allegroai/clearml/blob/9ff52a8699266fec1cca486b239efa5ff1f681bc/clearml/logger.py#L277
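For reference, a minimal report_table sketch along those lines (project/task names are placeholders):

import pandas as pd
from clearml import Task

task = Task.init(project_name="examples", task_name="table reporting")
df = pd.DataFrame({"metric": ["loss", "accuracy"], "value": [0.12, 0.93]})
# logs the DataFrame as a table in the experiment's plots
task.get_logger().report_table(title="results", series="summary", iteration=0, table_plot=df)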
Hi @<1523701181375844352:profile|ExasperatedCrocodile76>
The docker containers should get the host IP, not the internal docker IP. What am I missing?
Hi @<1704304350400090112:profile|UpsetOctopus60>
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_kubernetes_helm
Just use the helm charts. It's the easiest
Maybe by combining the two, with an unload gRPC API, we could have that ability moved to the "preprocessing" logic. wdyt?
Hi HelpfulHare30
I mean situations when training is long and its parts can be parallelized in some way, like in Spark or Dask
Yes, that makes sense: the function we are parallelizing is usually bottlenecked on both data & CPU, and both frameworks try to split & stream the data.
ClearML does not do data split & stream, but what you can do is launch multiple Tasks from a single "controller" and collect the results. I think that one of the main differences is that a ClearML Task is ...
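A hedged sketch of that controller pattern (the template task, queue name, and "Args/shard" parameter are assumptions):

from clearml import Task

# clone a pre-existing "template" task and enqueue several copies
template = Task.get_task(project_name="examples", task_name="train")
children = []
for i in range(4):
    child = Task.clone(source_task=template, name=f"train shard {i}")
    child.set_parameter("Args/shard", i)  # hypothetical parameter
    Task.enqueue(child, queue_name="default")
    children.append(child)

# collect the results once each child completes
for child in children:
    child.wait_for_status()  # blocks until the task finishes
    print(child.get_last_scalar_metrics())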
but is there any other way to get env vars / any value or secret from the host to the docker of a task?
If this is docker, the -e/--env argument would do the same: -e VAR=somevalue
but this would still be part of the clearml.conf, right?
You can pass it per Task; you can also configure the agent to always pass this env, add it here:
https://github.com/allegroai/clearml-agent/blob/5a080798cb4292e198948fbe16cba70136cb6bdf/docs/clearml.conf#L137
I like this approach more, but it still requires the environment variables to be resolved inside the clearml.conf
Yes 🙂 maybe this is a feature request?
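For the per-Task route, a minimal sketch using set_base_docker (image and variable names are illustrative):

from clearml import Task

task = Task.init(project_name="examples", task_name="env demo")
# extra arguments the agent passes to `docker run` for this Task only
task.set_base_docker(
    docker_image="python:3.9",
    docker_arguments="-e MY_VAR=somevalue",
)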
I mean to use a function decorated with PipelineDecorator.pipeline inside another pipeline decorated in the same way.
Ohh... so would it make sense to add "helper_functions" so that a function will be available in the step's context?
Or maybe we need a new way to support a "standalone" decorator?! Currently, to actually "launch" the function step, you have to call it from the "pipeline" main logic function, but, at least in theory, one could do without the Pipeline itself...
I think it was just pushed, including nested calls. You have to use the new argument for the decorator, helper_functions:
https://github.com/allegroai/clearml/blob/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/clearml/automation/controller.py#L2392
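A minimal sketch of the helper_functions argument (names and values are illustrative):

from clearml.automation.controller import PipelineDecorator

def double(x):
    # plain helper we want available inside the step's standalone context
    return x * 2

@PipelineDecorator.component(return_values=["y"], helper_functions=[double])
def step(x):
    return double(x)

@PipelineDecorator.pipeline(name="demo pipeline", project="examples", version="0.1")
def main(x=1):
    return step(x)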
Just to get the full picture, are we expecting to see the newly created step (aka eager execution) on the original pipeline (i.e. as part of the DAG visualization)?
In the main pipeline I want to work with the secondary pipeline and other functions decorated with PipelineDecorator. Does ClearML allow this? I have not been able to get it to work.
Usually when we think about nested pipelines, the nested pipeline is just another Task you are running in the DAG (where the target queue is the services queue).
When you say nested pipelines with decorators, what exactly do you have in mind?
Thanks SmallDeer34 !
Would you like us to? How about a footnote/acknowledgement?
How about a reference / footnote?
@misc{clearml,
  title = {ClearML - Your entire MLOps stack in one open-source tool},
  year = {2019},
  note = {Software available from },
  url = {},
  author = {allegro.ai},
}
BTW: get_tasks has a project_name argument, I would just use it 🙂
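Something along these lines (names are placeholders):

from clearml import Task

# filter server-side by project instead of fetching everything
tasks = Task.get_tasks(project_name="examples", task_name="train")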
Hi, Is there a way to stop a clearml-agent from within an experiment?
It is possible but only in the paid tier (it needs backend support for that) 🙂
My use case is: a spot instance marked for termination after 2 mins by AWS
Basically what you are saying is you want the instance to spin down after the job is completed, correct?
Basic setup:
- glue service per "job template" (e.g. k8s resources, for example CPU requirement, or GPU requirement)
- queue per glue service, e.g. a cpu_machine queue and a 1xGPU queue
wdyt?
Can you send the console output of this entire session please?
Wait @<1523701066867150848:profile|JitteryCoyote63>
If you reset the Task you would have lost the artifacts anyhow, how is that different?