
I see, so in theory you could call add_step with a pipeline parameter (i.e. pipe.add_parameter etc.)
But currently the implementation is such that if you are starting the pipeline from the UI
(i.e. rerunning it with a different argument), the pipeline DAG is deserialized from the Pipeline Task (the idea being that one could control the entire DAG externally without changing the code).
I think a good idea would be to actually allow the pipeline class to have an argument saying always create from code...
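For illustration, a minimal sketch of what that pattern looks like today with the SDK (project and task names here are just placeholders, and a "preprocess task" Task is assumed to already exist):
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0")
# a pipeline-level parameter that can be changed from the UI when re-running the pipeline
pipe.add_parameter(name="dataset_url", default="s3://bucket/data.csv")
# the step's own hyper-parameter is overridden with the pipeline parameter value
pipe.add_step(
    name="preprocess",
    base_task_project="examples",
    base_task_name="preprocess task",
    parameter_override={"General/dataset_url": "${pipeline.dataset_url}"},
)
pipe.start()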
Hmm, might be, check if your files server is running and configured properly
make sure you follow all the steps:
https://clear.ml/docs/latest/docs/deploying_clearml/upgrade_server_linux_mac
(basically make sure you get the latest docker-compose.yml and then pull it):
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml
docker-compose -f /opt/clearml/docker-compose.yml pull
docker-compose -f /opt/clearml/docker-compose.yml up -d
ElegantKangaroo44 I think TrainsCheckpoint
would probably be the easiest solution. I mean it will not be a must, but another option to deepen the integration, and allow us more flexibility.
HelplessCrocodile8 I just tried it, everything seems to work (ubuntu 20.04) 😞
What's the OS you are using? Python version? Is it conda?
Hi GloriousPenguin2 , Sorry this is a bit confusing. Let me expand:
When converting into a plotly object (the default), you cannot really control the dimensions of the plot in the UI programmatically; you can however drag the separator and expand width / height. If you pass the argument report_image=True to report_matplotlib_figure, it will create a static image from the matplotlib figure (as rendered locally) and use that as the figure. This way you get exactly WYSIWYG, but the...
Oh that is odd... let me check something
So the thing is clearml
automatically detects the last iteration of the previous run; my assumption is that you also add it yourself, hence the double shift.
SourOx12 could that be it?
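If it is, a possible workaround (just a sketch, assuming you already pass the absolute iteration yourself when reporting) is to zero out the automatically detected offset when continuing the run:
from clearml import Task

task = Task.init(project_name="examples", task_name="training", continue_last_task=True)
# disable the auto-detected iteration offset so reported iterations are not shifted twice
task.set_initial_iteration(0)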
IrritableJellyfish76 point taken, suggestions on improving the interface ?
DefeatedCrab47 yes that is correct. I actually meant if you see it on the tensorboard's UI 🙂
Anyhow, if it is there, you should find it under the Task's Results > Debug Samples
JitteryCoyote63 with pleasure 🙂
BTW: the Ignite TrainsLogger will be fixed soon (I think it's on a branch already by SuccessfulKoala55 ) to fix the bug ElegantKangaroo44 found. should be RC next week
Hi DeliciousKoala34
This means the pycharm plugin was not able to run git on your local machine.
What's your OS?
could it be that if you open cmd / shell "git" is not in the path ?
BTW: latest PyCharm plugin with 2022 support was just released:
https://github.com/allegroai/clearml-pycharm-plugin/releases/tag/1.1.0
See here:
https://download.pytorch.org/whl/torch_stable.html
cu110/* has no torch 1.3.1, only 1.7.0
Is there a way to do this using ssh keys?
the .ssh folder of the host machine should be automatically mounted; you can force SSH by setting force_git_ssh_protocol: true in clearml.conf
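For reference, a sketch of the relevant clearml.conf entry on the agent machine (assuming the default configuration file location):
agent {
    # force cloning over SSH even when the repository URL is https
    force_git_ssh_protocol: true
}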
It is still not working for me. Are you using Linux, windows or macos?
It should work on Linux, Mac and Windows. What are you using?
Hi SourSwallow36
- The same docker image is used for all three jobs, just because it is easier to manage and faster to download. The full code is available on the trains-server GitHub. If you want to spin the containers manually, check the docker-compose.yml on the main repo, it has all the commands there
- Fork the trains-server, commit the changes and don't forget to PR them ;)
- Elasticsearch is a database; we use it to log all the experiments' outputs: console logs, metrics, etc. This...
seems it was fixed 🙂
MagnificentWorm7 thank you for noticing ! 🙏
Usually in the /tmp folder under a temp filename (it is generated automatically when spun up)
In the case of the services, this will be inside the docker container itself
If possible, can we have a "only one experiment can be given a single tag"
You mean "moving a tag" automatically (i.e. if someone else had the same tag it is removed from it)?
MiniatureHawk42 could you re-test with the latest from github?
pip install -U git+
Hmm can you test with the latest RC?
pip install clearml==0.17.6rc1
DeliciousSeal67 the agent will use the "installed packages" section in order to install packages for the code. If you clear the entire section (you can do that in the UI or programmatically) then it will revert to requirements.txt
Make sense ?
CheerfulGorilla72 my guess is the Slack token does not have credentials for the private channel, could that be ?
BTW:
Just making sure, 74 was not supposed to be the last checkpoint (in other words it is not stuck on leaving the training process, but actually in the middle)
IrritableJellyfish76 hmm, maybe we should add an extra argument partial_name_matching=False
to maintain backwards compatibility?
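For context, a sketch of the current behavior that flag would change (project and task names are placeholders):
from clearml import Task

# task_name is currently matched as a pattern (i.e. partial match);
# the suggested partial_name_matching=False does not exist yet and would make this an exact match
tasks = Task.get_tasks(project_name="examples", task_name="my_experiment")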
I am using importlib and this is probably why everything's weird.
Yes, that would explain a lot 🙂
No worries, glad to hear it worked out
However, the pipeline experiment is not visible in the project experiment list.
I mean press on the "full details" in the pipeline page
Hi @<1684010629741940736:profile|NonsensicalSparrow35>
however for the remote file it always creates the name with the following pattern:
{filename_prefix}checkpoint{n}.pt
..
Is this the main issue?
Notice that the model name (i.e. the entry on the Task itself) is not directly connected with the stored file name on the target file server (or S3)
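For illustration, a sketch of how the two names relate (bucket, paths and names are placeholders):
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="checkpoint demo", output_uri="s3://my-bucket/models")
# the model entry name shown on the Task is independent of the uploaded file name
model = OutputModel(task=task, name="best-model")
# the uploaded file keeps the local file name unless target_filename is passed explicitly
model.update_weights(weights_filename="prefix_checkpoint_3.pt", target_filename="best.pt")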