It seems like there is no way to define that a Task requires docker support from an agent, right?
Correct, basically the idea is you either have workers working in venv mode or docker.
If you have a mixture of the two, then you can have the venv agents pulling from one queue (say default_venv) and the docker mode agents pulling from a different queue (say default_docker). This way you always know what you are getting when you enqueue your Task
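For example, a rough sketch (the project/task names and the "default_docker" queue are just placeholders for whatever queue your docker-mode agents are listening on):
from clearml import Task

# clone an existing Task and push it into the queue that only docker-mode agents serve
template = Task.get_task(project_name="examples", task_name="my training task")
cloned = Task.clone(source_task=template)
Task.enqueue(cloned, queue_name="default_docker")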
Hmm ElegantKangaroo44, low memory, that might explain the behavior
BTW: 1 == stop request, 3 == Task Aborted/Failed
Which makes sense if it crashed on low memory...
You can make reports on experiments with interactive graphs
Yes, I can totally see how this is a selling point. The closest is the Project Overview (full markdown capabilities, with the ability to embed links to specific experiments). You can also add a "leader metric", so you can track the project performance/progress over time.
I have to admit that creating a better reporting tool is always pushed down in priority as I think this is a good selling point to management but the actual ...
RC should be out later today (I hope), this will already be there, I'll ping here when it is out
Would I be able to add customized columns like I am able to in task.connect ? Same question applies for parallel coordinates and all kinds of comparisons
No to both 🙂
GiganticTurtle0 in the PipelineDecorator.component, did you pass helper_functions=[] with references to all the sub-component functions?
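Something along these lines (just a sketch, the function names are made up):
from clearml.automation.controller import PipelineDecorator

def normalize(values):
    # standalone helper used inside the component below
    return [v / max(values) for v in values]

# helper_functions packs the helpers into the component's standalone script,
# since every component runs as its own Task (possibly on a remote agent)
@PipelineDecorator.component(return_values=["scaled"], helper_functions=[normalize])
def preprocess(values):
    return normalize(values)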
Hi CheerfulGorilla72 ,
Sure there are:
https://github.com/allegroai/clearml/tree/master/examples/frameworks/pytorch-lightning
AbruptWorm50 can you send the full image? (the X axis is missing from the graph)
WhimsicalLion91
What would you say is the use case for running an experiment with iterations?
That could be loss value per iteration, or accuracy per epoch (iteration is just a name for the x-axis in a sense, so this is equivalent to a time series)
Make sense?
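For example, a minimal sketch of reporting a scalar per iteration (names/values are placeholders):
from clearml import Task, Logger

task = Task.init(project_name="examples", task_name="scalar reporting")
for i in range(100):
    loss = 1.0 / (i + 1)  # stand-in for a real loss value
    # "iteration" is simply the x-axis of the time series
    Logger.current_logger().report_scalar(title="loss", series="train", value=loss, iteration=i)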
You can already sort and filter experiments based on any hyper parameter or metric that the experiment reports, so there is no need for any custom query language. Also, any filtered/sorted table can be shared exactly as it is, so you can create leaderboards and share specific filters. You can also use the search bar in order to filter based on experiment name / comment. Tags will be added soon as well 🙂
Example of custom columns is here (the screen grab is a bit old, now there is als...
Hi SteadyFox10
Short answer no 🙂
Long answer, full permissions are available in the paid tier, alongside a few more advanced features.
Fortunately in this specific use case, the community service allows you to share a single (or multiple) experiments with a read-only link. Would that work ?
GrittyHawk31
what are you getting when you are running:
docker ps
and what are you getting with:
netstat -natp | grep LISTEN
So clearml server already contains an authentication layer (JWT Token), and you do have a full user management on top:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_config#web-login-authentication
Basically what I'm saying is, if you add HTTPS on top of the communication and only open the 3 ports, you should be good to go. Now if you really need SSO (AD included) for user login etc., unfortunately this is not part of the open source, but I know they have it in the scale/ent...
Hi CloudySwallow27
how can I just "define" it on my local PC, but not actually run it.
You can use the clearml-task CLI
https://clear.ml/docs/latest/docs/apps/clearml_task#how-does-clearml-task-work
Or you can add the following line in your code; it will cause the execution to stop and continue on a remote machine (basically creating the Task and pushing it into an execution queue, or just aborting it):
task = Task.init(...)
task.execute_remotely()
https://clear.ml/do...
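A slightly fuller sketch (the queue name is just an example):
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")
# everything above runs locally (enough to register the Task);
# this call stops the local process and enqueues the Task for an agent
task.execute_remotely(queue_name="default", exit_process=True)

# from here on, the code only runs once an agent pulls the Task from the queue
print("running on the agent")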
Hi BoredSquirrel45
as of today, my required packages aren't being recognized in cloned
Are you saying you are editing the code directly in the cloned Task, then enqueueing the Task, and the agent does not "auto recognize" the packages?
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength 🙂
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?
Which one of those? the 3d ball dots or the 3d face mesh?
I'm running agent inside docker.
So this means venv mode...
Unfortunately, right now I can not attach the logs, I will attach them a little later.
No worries, feel free to DM them if you feel this is too much to post here
I want in my CI tests to reproduce a run in an agent
you mean to run it on the CI machine ?
because the env changes and some things break in agents and not locally
That should not happen, no? Maybe there is a bug that needs fixing on clearml-agent ?
MiniatureCrocodile39 from the screenshot I imagine you are running inside a docker, which means that when you restart the docker, the configuration file is lost.
Could that be the case ?
Hi FierceHamster54
I would take a look at the decorator example here
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
Think of every function as a stand-alone task running on a different machine. The controller itself is the logic that creates the jobs and passes data, and the clearml agent / autoscaler does the actual orchestration
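Roughly like this (a toy sketch of the same idea):
from clearml.automation.controller import PipelineDecorator

# each component becomes its own Task, potentially executed on a different machine
@PipelineDecorator.component(return_values=["data"])
def load_data():
    return list(range(10))

@PipelineDecorator.component(return_values=["total"])
def sum_data(data):
    return sum(data)

# the pipeline function is only the controller logic that wires components together
@PipelineDecorator.pipeline(name="toy pipeline", project="examples", version="0.0.1")
def run_pipeline():
    print(sum_data(load_data()))

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # debug everything in the local process first
    run_pipeline()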
Hi ObedientDolphin41
I keep bumping against the ModuleNotFoundError: No module named exception.
Import the package inside the component function (the one you decorated); it will make sure it is listed in the requirements section automatically.
You can also set it manually by passing it as the "packages" argument on the decorator function:
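For example (just a sketch, the package names are placeholders):
from clearml.automation.controller import PipelineDecorator

# "packages" explicitly pins this component's requirements
# instead of relying on auto-detection of the imports inside the function
@PipelineDecorator.component(packages=["pandas", "scikit-learn"])
def featurize(csv_path):
    import pandas as pd  # importing inside the function also lets clearml auto-detect it
    return pd.read_csv(csv_path).describe().to_dict()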
Hi SkinnyPanda43
This issue was fixed with clearml-agent 1.5.1, can you verify?
How can I make a task that does a helm install or kubectl create deployment.yaml?
The Task that it launches should have your code that actually does the helm deployments and other things. Think of the Task as a way to launch a script that does something; that script can then just interact with the cluster. The queue itself (i.e. clearml-agent) will not directly deploy helm charts, it will only deploy jobs (i.e. pods)
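A minimal sketch of such a script (it assumes helm/kubectl on the agent machine are already configured against your cluster; the release/chart/file names are placeholders):
import subprocess
from clearml import Task

# the Task is just the script the agent runs; the script shells out to helm/kubectl
task = Task.init(project_name="devops", task_name="helm deploy")

subprocess.run(["helm", "upgrade", "--install", "my-release", "./my-chart"], check=True)
subprocess.run(["kubectl", "apply", "-f", "deployment.yaml"], check=True)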
Ohh sorry you will also need to fix the
def _patched_task_function
The parameter order is important as the partial call relies on it.
My bad, no need for that 🙂
If it cannot find the Task ID, I'm guessing it is trying to connect to the demo server and not your server (i.e. the configuration is missing)