BTW: how are you using them? should we have a direct interface to those ?
So far my local and remote gitlab repositories are synchronized. I suspect that the
Failed applying git diff, see diff above
error is caused by the cached repository from which clearml tries to run the process. I've cleaned the cache, but it hasn't helped.
Hmm can you test with empty "uncommitted changes" ?
Just making sure: when you say it still doesn't work, you are not trying to run the Task with the git diff that includes the binary data, right?
Yes, but where I can fi...
Hi UptightMouse31
First, thank you 🙂
And to your question:
variable in the project is the kpi,
You mean like add it to the experiment table and get a kind of leaderboard ?
SillyPuppy19 yes you are correct. Actually, I can promise you the callback will be called from a different thread (basically the monitoring thread), so it's on the user to make sure the callback can handle it.
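For example, a generic sketch (no specific ClearML signature implied) of a callback that guards its shared state, since it may be invoked from the monitoring thread:

import threading

# the callback may fire from the monitoring thread, so protect anything it shares
_lock = threading.Lock()
_stats = {"calls": 0}

def my_callback(*args, **kwargs):
    with _lock:
        _stats["calls"] += 1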
How about we move this discussion to GitHub?
Are you hosting your own server? Is it on http://app.clear.ml ?
Or maybe you could bundle some parameters that belong to PipelineDecorator.component into a high-level configuration variable (something like PipelineDecorator.global_config (?))
So in the PipelineController we have a per step callback and generic callbacks (i.e. for all the steps), is this what you are referring to ?
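For example, a rough sketch (project and task names here are placeholders):

from clearml import PipelineController

def pre_step(pipeline, node, params):
    # called right before the step is launched; returning False skips the step
    print(f"about to launch {node.name} with {params}")
    return True

def post_step(pipeline, node):
    # called after the step completes
    print(f"{node.name} finished")

pipe = PipelineController(name="example pipeline", project="examples", version="0.1")
pipe.add_step(
    name="step_one",
    base_task_project="examples",
    base_task_name="step one task",
    pre_execute_callback=pre_step,
    post_execute_callback=post_step,
)
pipe.start()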
Well, I can see the difference here. Using the new pipelines generation the user has the flexibility to play with the returned values of each step.
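Something like this, as a minimal sketch (names are made up):

from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["data"])
def step_one():
    return list(range(10))

@PipelineDecorator.component(return_values=["total"])
def step_two(data):
    return sum(data)

@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="0.1")
def pipeline_logic():
    data = step_one()       # the returned value of one step...
    total = step_two(data)  # ...feeds straight into the next
    print(total)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # keep everything in the local process for the sketch
    pipeline_logic()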
Yep 🙂
We...
Notice that if you are using TB, everything you report to the TB will appear as well 🙂
I mean the python package, not the trains-server version
Let me check, which helm chart are you referring to ?
when you clone the Task, it might be before it is done syncing git / packages.
Also, since you are using 0.16, you have to have a section name (Args, General, etc.)
How will task b use the parameters ? (argparser / connect dict?)
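For reference, a minimal sketch of the two options (project/task names are placeholders) and the section each lands in:

import argparse
from clearml import Task

task = Task.init(project_name="examples", task_name="task b")

# option 1: argparse arguments show up under the "Args" section
parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()

# option 2: a connected dict shows up under the "General" section
config = {"lr": 0.001, "batch_size": 32}
config = task.connect(config)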
SmilingFrog76 this is not a weird mechanism at all, this is a proper HPC scheduler 🙂 trains-agent is not actually aware of other nodes, it is responsible for launching a Task on its own hardware (with whatever configuration it was set up with). What can be done is to use the trains-agent inside a 3rd party scheduler and have the scheduler allocate the node and trains-agent spin the experiment. There is a k8s example here: basically pulling jobs from the trains-server queue and pushing ...
And how did you connect your example.yaml?
DisgustedDove53 , TrickySheep9
I'm all for it!
I can think of two options here: (1) use the k8s glue + apply template with ports mode, see the discussion here: https://clearml.slack.com/archives/CTK20V944/p1628091020175100
(2) create an interface (queue) to launch an arbitrary job on the k8s cluster, with the full pod definition on the Task. This will allow clearml-session to set up everything from the get-go.
How would you interface with the k8s operator, and what exactly will it do?
(BTW: the reas...
Just curious, if
is a value I can set, where is it used?
It is used when creating a dataset from inside the cluster (i.e. when launching using the clearml k8s glue),
it will have No effect on what users have on their local machines
i.e. they can always point to a diff server.
That said, when users create their initial clearml.conf and copy paste the info from the web UI, this value (or it might be another one, I'll double check later) will set the initial configuration the c...
Thank you for saying so! 🙂
Hi @<1546303293918023680:profile|MiniatureRobin9>
I'm not sure I understand the difference between a worker and an agent.
hmm we should probably make that clearer 🙂
agent = the clearml-agent instance running on the machine
worker = the system term representing the running instance of the agent
You can have one machine with multiple agents (i.e. multiple workers) running on it.
Does that make sense ?
Hi ZanyPig66
I used tensorboard as clearml claims to auto-capture tensorboard outputs, but it was a no go.
The auto TB logging should work out of the box, where is it failing ?
Also, you call task = Task.current_task() here. Why aren't you using Task.init in the original script?
The idea is that you run your code on your machine (where the environment works), ClearML auto detects code + python packages + args etc.
Then you clone it in the UI and launch it on a remote machine.
What am I missing ...
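For reference, a minimal sketch of that flow (assuming a TensorBoard SummaryWriter, here PyTorch's):

from clearml import Task
from torch.utils.tensorboard import SummaryWriter

# Task.init (rather than Task.current_task) is what enables the automatic capture
task = Task.init(project_name="examples", task_name="tb auto logging")

writer = SummaryWriter(log_dir="./runs")
for step in range(10):
    writer.add_scalar("loss", 1.0 / (step + 1), step)  # picked up automatically by ClearML
writer.close()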
So would this pseudo code solve the issue
def pipeline_creator():
    pipeline_a_id = os.system("python3 create_pipeline_a.py")
    print(f"pipeline_a_id={pipeline_a_id}")
something like that?
(obviously the question is how you would get the return value of the new pipeline ID, but I'm getting ahead of myself)
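One way around that return value question, as a sketch (assuming create_pipeline_a.py prints pipeline_a_id=<id> to stdout):

import subprocess

def pipeline_creator():
    # os.system only returns the exit code; subprocess.run can capture stdout
    result = subprocess.run(
        ["python3", "create_pipeline_a.py"],
        capture_output=True, text=True, check=True,
    )
    for line in result.stdout.splitlines():
        if line.startswith("pipeline_a_id="):
            return line.split("=", 1)[1]
    return None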
Do you have a specific numpy version you are installing ? Why is it trying to build the wheel from source?
WackyRabbit7 this is funny, it is not ClearML providing this offering
some generic company grabbed the open source and put it there, which they should not 🙂
The agent cannot use another user (it literally has no way of getting credentials). I suspect this is all a by-product of the actual mount point.
Hi @<1523706645840924672:profile|VirtuousFish83>
Hmm so generally I think the answer is no... I mean you can download all scalars and re-report them with a different title/series, but I think you will not be able to delete a specific set, and the only way would be to reset the entire Task.
I'm curious what's the scenario here? is it like a typo you want to fix?
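If you do go the re-report route, a rough sketch (the task id is a placeholder, and I'm assuming get_reported_scalars() returns {title: {series: {"x": [...], "y": [...]}}}):

from clearml import Task

source = Task.get_task(task_id="<source_task_id>")
scalars = source.get_reported_scalars()

target = Task.init(project_name="examples", task_name="re-reported scalars")
logger = target.get_logger()

for title, series_dict in scalars.items():
    for series, points in series_dict.items():
        for x, y in zip(points["x"], points["y"]):
            logger.report_scalar(title="fixed " + title, series=series, value=y, iteration=int(x))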
Hi @<1547028116780617728:profile|TimelyRabbit96>
Notice that if running with docker compose you can pass an argument to the clearml triton container and use shared memory. You can do the same with the helm chart.