BTW: the new documentation should contain a full search over the docstring
why doesn't this happen on my other experiments?
same 100+ reports ?
(My new theory is that calling Task.reload() will fix it, and it might be called internally for the other experiments, like when reporting models/artifacts)
Could that be the case ?
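If you want to test that theory, a minimal sketch (assuming you still have the Task object around) would be something like:
from clearml import Task
task = Task.current_task()  # or Task.get_task(task_id="...")
task.reload()  # refresh the local copy of the Task from the backend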
So the TB issue was that reported images were not logged.
We are now talking about the caching, which is actually a UI thing. Which clearml-server version are you using?
And where are the images stored (the default files server or is it S3/GS etc.) ?
yes you are correct, I would expect the same.
Can you try manually importing pt, and maybe also moving the Task.init before darts?
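Something along these lines (assuming "pt" stands for PyTorch here; project/task names are just placeholders):
from clearml import Task
task = Task.init(project_name="examples", task_name="darts test")  # init before any framework import
import torch  # manual import so the framework binding kicks in
import darts  # only then import darts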
BTW: Full RestAPI reference here
https://allegro.ai/clearml/docs/rst/references/clearml_api_ref/index.html
OutrageousGiraffe8 this sounds like a bug, how can we reproduce it?
Maybe add another layer here?
https://github.com/allegroai/clearml/blob/a47f127679ebf5912690f7c3e60791a2daa5c984/examples/frameworks/tensorflow/tensorflow_mnist.py#L40
okay let's PR this fix ?
Hi @<1570583227918192640:profile|FloppySwallow46>
Hey, I have a question: can you monitor the time for one pipeline?
you mean to see the start / end time of the pipeline?
Click on the details link on the right hand side and you will have all the details on the pipeline task, including running time
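If you need the same information programmatically, a rough sketch (assuming you have the pipeline Task ID; the start/end fields come from the backend task data) would be:
from clearml import Task
pipeline_task = Task.get_task(task_id="<pipeline task id>")
# start / end timestamps as reported by the backend for a finished task
print(pipeline_task.data.started, pipeline_task.data.completed)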
ColossalDeer61 btw, it turns out the docker-compose services were misconfigured on GitHub 🙂 I suggest you get the latest copy of it:
curl -o docker-compose.yml
GaudyPig83
I think there is some mismatch between the code creating the pipeline and the actual Task. Could that somehow be the case? "relaunch_on_instance_failure" is somehow a missing argument.
Can you try to launch the entire Pipeline with the latest RC?
pip3 install clearml==1.7.3rc0
What is the Model url?
print(model.url)
GiganticTurtle0 I think I located the issue:
it seems the change is in "config" (and for some reason it stores the entire dict) but the split values are not changed.
Is this it?
Hi SubstantialElk6 I believe you just need to use clearml 1.0.5, and make sure you are passing the correct OS environment to the agent
Hmm, let me check, there is a chance the level is dropped when manually reporting (it might be reserved for internal critical reports). Regardless, I can't see any reason we couldn't allow controlling it.
is it also possible to somehow propagate ssh keys to the agent pod? Not sure how to approach that
I would use the k8s secret manager to do that (there is a way to mount secrets files into pod, SSH is relatively standard to do)
trains-agent doesn't run the clone, it is pip...
basically calling "pip install git+https://..."
Not sure you can pass extra arguments
Also, this is not a setup problem, otherwise it would have been failing consistently... this actually looks like a network issue.
The only thing I can think of is retrying the install if we get a network error (not sure what the exit code of pip is in that case though, maybe 9?).
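Just to illustrate the idea (this is not how the agent actually does it; the package name is a placeholder and the retry condition is an assumption, since the actual pip exit codes may differ):
import subprocess
import sys
import time
# naive retry: re-run pip install on failure, assuming the failure might be transient
for attempt in range(3):
    if subprocess.run([sys.executable, "-m", "pip", "install", "some-package"]).returncode == 0:
        break
    time.sleep(5)  # back off before retrying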
I think EmbarrassedSpider34 is correct.
When you pass the requirements to clearml-task, the agent will actually do the installation, depending on how it was configured (conda / pip).
That said, maybe it is worth adding support for providing the env.yml in the CLI?
(Notice that adding specific channels needs to be configured on the agent, they are not stored per Task)
AlertCamel57 wdyt?
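For reference, a pip-style requirements file can already be passed on the CLI, something like (project / script names are placeholders):
clearml-task --project examples --name remote-run --script train.py --requirements requirements.txt --queue default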
Hi @<1587615463670550528:profile|DepravedDolphin12>
Is there anyway to get the id of the pipeline using pipeline name?
In the UI, the top right "details" panel should have the Pipeline ID
Is this what you are looking for ?
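If you need it programmatically, a rough sketch using the generic Task lookup (pipelines are stored as Tasks; project / name values are placeholders) would be:
from clearml import Task
tasks = Task.get_tasks(project_name="my project", task_name="my pipeline")
if tasks:
    print(tasks[0].id)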
GiddyTurkey39
as others will also be running the same scripts from their own local development machine
Which would mean trains will update the installed packages, no? This is why I was inquiring about the requirements.txt file,
My apologies, of course this is supported 🙂
If you have no "installed packages" (i.e. the field is empty in the UI), the trains-agent will revert to installing the requirements.txt from the git repo itself, then it...
Thanks GreasyPenguin66
How about:
!curl
BTW, no need to rebuild the docker, next time you can always do:
!apt update && apt install -y <package here>
🙂
Or maybe you could bundle some parameters that belong to PipelineDecorator.component into a high-level configuration variable (something like PipelineDecorator.global_config?)
So in the PipelineController we have a per step callback and generic callbacks (i.e. for all the steps), is this what you are referring to ?
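For example, the per-step callbacks look roughly like this (step / base task names are placeholders):
from clearml import PipelineController

def pre_cb(pipeline, node, param_override):
    # called just before the step Task is launched
    print("about to run", node.name)

def post_cb(pipeline, node):
    # called right after the step Task completes
    print("finished", node.name)

pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
pipe.add_step(
    name="step_one",
    base_task_project="examples",
    base_task_name="step one base task",
    pre_execute_callback=pre_cb,
    post_execute_callback=post_cb,
)
# pipe.start()  # launch when ready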
Well, I can see the difference here. Using the new pipelines generation the user has the flexibility to play with the returned values of each step.
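e.g. with the decorator syntax, something like this (function / project names are just an illustration):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["data"])
def make_data():
    return [1, 2, 3]

@PipelineDecorator.component(return_values=["total"])
def sum_data(data):
    return sum(data)

@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="1.0.0")
def run_pipeline():
    data = make_data()
    total = sum_data(data)  # the returned value of one step feeds the next
    print(total)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # debug locally instead of enqueuing the steps
    run_pipeline()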
Yep 🙂
We...
By default the PyTorch Lightning Trainer will output everything to TB, which we automatically store. But verify that TB is installed.
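i.e. the usual wiring is just (model / data omitted, names are placeholders):
from clearml import Task
import pytorch_lightning as pl

task = Task.init(project_name="examples", task_name="lightning run")
# Lightning's default logger is TensorBoard; ClearML picks up whatever is written to TB
trainer = pl.Trainer(max_epochs=1)
# trainer.fit(model, datamodule=dm)  # your model / data here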
A true mystery 🙂
That said, I hardly think it is directly related to the trains-agent
...
Do you have any more insights on when / how it happens ?
Hmm, could you try to upload to your files server (not S3)?
Maybe some credentials error ?
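For example, if these are debug images / artifacts, one way to force them to the default files server (the URL is a placeholder, take it from your clearml.conf) is:
from clearml import Task, Logger

task = Task.init(project_name="examples", task_name="upload test")
Logger.current_logger().set_default_upload_destination("https://files.<your-clearml-server>")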