What do you mean by public to private mongo? @<1734020208089108480:profile|WickedHare16>
That's the controller. I would guess that if you fetch the controller you can get its ID as well
Hi @<1706478691208400896:profile|WearyPelican78> , you can set up a GCP autoscaler for this - None
Hi @<1580367711848894464:profile|ApprehensiveRaven81> , I'm not sure what you mean. Can you please elaborate?
Can you elaborate on how you did that?
Hi @<1719524641879363584:profile|ThankfulClams64> , how are you reporting debug samples?
@<1707203455203938304:profile|FoolishRobin23> , the agent in the docker compose is a services agent and it's not for running GPU jobs. I'd suggest running the clearml-agent with the GPU manually.
Hi @<1625303791509180416:profile|ExasperatedGoldfish33> , I would suggest trying pipelines from decorators. This way you can have very easy access to the code.
None
MuddySquid7 , Yes! Reproduced like a charm. We're looking into it 🙂
Then indeed it looks like a network/provider issue
On prem is also K8s? Question is if you run the code unrelated to ClearML on EKS, do you still get the same issue?
SubstantialElk6 , can you view the dataset in the UI? Can you please provide a screenshot so I can note it down for you?
It seems like a networking issue on your side. ClearML isn't blocking anything. It's most likely unrelated to connection speed; more likely DNS or something similar.
What if you connect using your phone hotspot or another provider?
MuddySquid7 , I couldn't reproduce case 4.
In all cases it didn't detect sklearn.
Did you put anything inside `__init__.py`?
Can you please zip up the folder from scenario 4 and post it here?
Hi @<1523701977094033408:profile|FriendlyElk26> , let's say you have a table, which you report. How would you suggest comparing between two tables?
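To make the question concrete, here's a sketch of the kind of diff one might want between two reported tables (plain pandas, illustrative names; in ClearML the tables themselves would be reported with `Logger.report_table`):

```python
import pandas as pd

# Two tables as two experiments might report them
t1 = pd.DataFrame({"class": ["cat", "dog"], "f1": [0.91, 0.84]})
t2 = pd.DataFrame({"class": ["cat", "dog"], "f1": [0.93, 0.84]})

# Cell-wise diff: only cells that actually differ are kept
diff = t1.compare(t2)
```

The open question is what the UI equivalent of `compare()` should look like, which is why the comparison semantics matter.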
But you can see in the log that it manages to connect for a bit and then it's interrupted
In the HPO application I see the following explanation:
'Maximum iterations per experiment after which it will be stopped. Iterations are based on the experiments' own reporting (for example, if experiments report every epoch, then iterations=epochs)'
Task has to be in draft mode, hover over the section and you will see an edit button or just double click the field you want to edit
What is your use case though? I think the point of local/remote is that you can debug locally
Hi @<1752139558343938048:profile|ScatteredLizard17> , the two instance types supported by the ClearML autoscaler are on-demand and spot instances; nothing to do with reserved ones
Hi @<1545216070686609408:profile|EnthusiasticCow4> , in the PRO plan you are limited to a certain max amount of parallel application instances. If you kill some running applications, your HPO application will start running
@<1722786138415960064:profile|BitterPuppy92> , we are more than happy to accept pull requests into our free open source 🙂
Hi @<1722786138415960064:profile|BitterPuppy92> , I believe pre-defining queues via the helm chart is an Enterprise/Scale license feature only and not available in the open source
Hi @<1664079296102141952:profile|DangerousStarfish38> , can you add a log of the execution?
When you generate new credentials in the GUI, it comes up with a section to copy and paste into either `clearml-init` or `~/clearml.conf`. I want the files server displayed here to be a GCP address
Regarding this - I think you should open a GitHub feature request, since there is currently no way to do this via the UI
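As a client-side workaround, the default output location can be pointed at GCS in `~/clearml.conf` (a sketch; the bucket name is a placeholder, and this changes where new artifacts/models are stored rather than what the credentials dialog displays):

```
sdk {
    development {
        # Store new artifacts and models in GCS instead of the default files server
        default_output_uri: "gs://my-bucket/clearml"
    }
}
```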
Hi @<1570583227918192640:profile|FloppySwallow46> , can you please share the autoscaler configuration?
From what I understand, by default ES has a low disk watermark set at 95% of the disk capacity. Once reached, the shard is transitioned to read-only mode. Since you have a large disk of 1.8TB, the remaining 85GB is below the 5%.
Basically you need to set the following env vars in elasticsearch service in the docker compose:
```
- cluster.routing.allocation.disk.watermark.low=10gb
- cluster.routing.allocation.disk.watermark.high=10gb
- cluster.routing.allocation.disk.wate...
```
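For context, a sketch of how the elasticsearch service in the docker compose could look with all three standard disk watermark settings (the third variable above is truncated; `flood_stage` is my assumption for it, and the values are examples):

```yaml
services:
  elasticsearch:
    environment:
      # Absolute free-space thresholds instead of the default percentages;
      # all three must use the same style (byte values or percentages)
      - cluster.routing.allocation.disk.watermark.low=10gb
      - cluster.routing.allocation.disk.watermark.high=10gb
      - cluster.routing.allocation.disk.watermark.flood_stage=10gb
```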
Hi @<1582542029752111104:profile|GorgeousWoodpecker69> , can you elaborate please on the exact steps you took?
Hi @<1797800418953138176:profile|ScrawnyCrocodile51> , are you self hosting the server?