Hi @<1661904968040321024:profile|SpotlessOwl43> , you can achieve this using the REST API of ClearML - None
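For example, something along these lines (a rough sketch, not from the linked docs; the server URL, credentials and query are placeholders, assuming the standard auth.login + Bearer token flow):
```python
# Sketch: authenticate against the ClearML REST API and list tasks.
import requests

API_SERVER = "https://api.clear.ml"   # or your self-hosted api server, e.g. http://localhost:8008
ACCESS_KEY = "YOUR_ACCESS_KEY"        # credentials from the ClearML UI
SECRET_KEY = "YOUR_SECRET_KEY"

# 1. Exchange the key pair for a short-lived token
resp = requests.post(f"{API_SERVER}/auth.login", auth=(ACCESS_KEY, SECRET_KEY))
resp.raise_for_status()
token = resp.json()["data"]["token"]

# 2. Call an endpoint with the token, e.g. list the most recently updated tasks
resp = requests.post(
    f"{API_SERVER}/tasks.get_all",
    headers={"Authorization": f"Bearer {token}"},
    json={"page_size": 10, "order_by": ["-last_update"]},
)
resp.raise_for_status()
for task in resp.json()["data"]["tasks"]:
    print(task["id"], task.get("name"))
```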
Hi @<1572395190897872896:profile|ShortWhale75> , is it possible you're using a very old version of the clearml package?
It would work from your machine as well, but the machine needs to be turned on... like an EC2 instance that is running.
Check the Docker container logs to see if there are any errors in them when you try to view the worker stats
@<1541954607595393024:profile|BattyCrocodile47> , that is indeed the suggested method - although make sure that the server is down while doing this
Hi @<1547028031053238272:profile|MassiveGoldfish6> , I think this is what you're looking for - None
RattyLouse61 , I think you can save the conda env yml file as an artifact; this way it would also be accessible by other tasks 🙂
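Something like this, roughly (project, task and file names are just placeholders):
```python
# Sketch: export the conda environment to a yml file, attach it to the task
# as an artifact, and fetch it back from another task/script.
import subprocess
from clearml import Task

task = Task.init(project_name="examples", task_name="env as artifact")

# dump the current conda environment to a file and upload it
subprocess.run("conda env export > environment.yml", shell=True, check=True)
task.upload_artifact(name="conda_env", artifact_object="environment.yml")

# later, from a different task/script, pull the same file back:
source = Task.get_task(project_name="examples", task_name="env as artifact")
local_yml = source.artifacts["conda_env"].get_local_copy()
print(local_yml)
```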
@<1734020162731905024:profile|RattyBluewhale45> , can you try upgrading to the latest version of the server? 1.16.2 should have a fix for this issue
Please do. You can download the entire log from the UI 🙂
Hi @<1559711623147425792:profile|PlainPelican41> , you can re-run an existing pipeline using different parameters from the UI. Otherwise, you need to create new pipelines with new code 🙂
Yes, you'll need to connect them via code
Where is the error?
I meant writing a new pipeline controller that will incorporate the previous pipelines as steps. What is the error that you're getting? Can you provide a snippet?
It looks like you are running on the community server. Can you right-click each relevant experiment in the experiments table, click 'Share', and send the links here?
@<1691620877822595072:profile|FlutteringMouse14> , what version of the agent are you using?
You can try, but I don't think so.
@<1523701132025663488:profile|SlimyElephant79> , it looks like you are right. I think it might be a bug. Could you open a GitHub issue to follow up on this?
As a workaround, you can programmatically set Task.init(output_uri=True); this will make all the experiment outputs be uploaded to whatever is defined as the files_server in clearml.conf.
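For example (minimal sketch; project/task names are placeholders):
```python
# Sketch: with output_uri=True, outputs (models, artifacts) are uploaded to
# whatever files_server points to in clearml.conf instead of staying as local paths.
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="upload outputs to files_server",
    output_uri=True,  # or an explicit destination, e.g. "s3://my-bucket/models"
)
```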
Hi @<1523704157695905792:profile|VivaciousBadger56> , you can configure Task.init(..., output_uri=True) and this will save the models to the ClearML file server
Maybe SuccessfulKoala55 might have more insight on setting up the K8s integration 🙂
So when you run it standalone it works fine? How are you creating the pipeline?
Doesn't work for me either. I guess the guys are already looking into it
Hi @<1523708602928336896:profile|HungryArcticwolf62> , can you share an isolated code snippet that reproduces this? What version of the agent are you using now?
Also, what do you mean by skipped? What happens to the pipeline?
RotundSquirrel78 , do you have an estimate of how much RAM the machine running the ClearML server has? Is it dedicated to ClearML only, or are there other processes running?
Hi @<1714813627506102272:profile|CheekyDolphin49> , how are you setting the parameter in the HPO?
Hi FierceHamster54 , can you please elaborate on the process with a more specific example?
Hi ExuberantParrot61 , that's a good question. This is a bit hacky, but what if you try to catch the task with Task.current_task() from inside the step and change the output_uri attribute there?
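Roughly something like this (untested sketch; the step name and bucket are hypothetical):
```python
# Sketch: grab the running task from inside a pipeline step and redirect
# where its outputs are uploaded.
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["model_path"])
def train_step():
    task = Task.current_task()                   # the task created for this step
    task.output_uri = "s3://my-bucket/models"    # redirect output uploads
    # ... training code that saves a model goes here ...
    return "model.pkl"
```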
Does the autoscaler try to spin up new instances?
I'd suggest running Task.init first and then exposing the dataset name using argparse afterwards
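i.e. something along these lines (the argument name is just an example):
```python
# Sketch: call Task.init() before parsing arguments so ClearML can hook argparse,
# then read the dataset name from the parsed args.
import argparse
from clearml import Task

task = Task.init(project_name="examples", task_name="dataset via argparse")

parser = argparse.ArgumentParser()
parser.add_argument("--dataset-name", default="my_dataset")
args = parser.parse_args()

print(f"Using dataset: {args.dataset_name}")
```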