Oh, I think that is for very small data. I don't think it works for me.
I know. And thanks for the very fast help 🙏 😀
Can I see visualizations of, for example, categorical columns as bar graphs?
Perfect, thanks. I will take a look at that.
It is an autoscaler for GCP. I think there are unnecessary configs left over from when it was used on AWS.
@<1523701070390366208:profile|CostlyOstrich36>
clearml==1.14.1
That is the version.
@<1523701070390366208:profile|CostlyOstrich36> I have been exploring. The problem seems to occur when the Docker container uses the cached dir.
Using cached repository in "/root/.clearml/vcs-cache/****.git.0081a6bc4d7afe6adde369e6aeab9406/****.git"
When it is inside that directory and tries to fetch, it asks for credentials. When it clones, it doesn't.
cloning: git@github.com:****/****.git
Using user/pass credentials - replacing ssh url 'git@github.com:****/****.git' with https ...
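For anyone reading along, here is a rough sketch of the kind of rewrite that log line describes: turning an SSH-style remote into an HTTPS one so user/pass credentials can be used. This is only an illustration, not the agent's actual code; the function name `ssh_to_https` and the example org/repo are made up.

```python
import re

# Matches scp-style SSH remotes such as git@github.com:org/repo.git
_SSH_URL = re.compile(r"^git@(?P<host>[^:]+):(?P<path>.+)$")


def ssh_to_https(url):
    """Return the HTTPS form of an scp-style SSH git URL (hypothetical helper).

    URLs that do not look like git@host:path are returned unchanged.
    """
    match = _SSH_URL.match(url)
    if match is None:
        return url
    return "https://{}/{}".format(match.group("host"), match.group("path"))


# Placeholder repository name, not the one from the log above.
print(ssh_to_https("git@github.com:example-org/example-repo.git"))
```

If the cached copy still has the SSH remote configured, a fetch inside it would not see this rewritten URL, which would be consistent with the clone working and the fetch asking for credentials. That is a guess, not something confirmed in the logs.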
configurations:
  extra_clearml_conf: 'sdk.aws.s3.region="us-west-2"
    agent.extra_docker_arguments=["--shm-size=90g"]
    agent.extra_docker_shell_script=["git config --global credential.helper cache --timeout=604800",]'
  extra_trains_conf: ''
  extra_vm_bash_script: ''
  queues:
    gcp-v100:
    - - gcp-v100
      - 4
    gcp-l4:
    - - gcp-l4
      - 4
    gcp-cpu:
    - - gcp-cpu
      - 4
  resource_configurations:
    gcp-v100:
      ...
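To make the queues part easier to read: as far as I understand it, each queue name maps to a list of [resource name, max instances] pairs. A minimal sketch under that assumption, with the section copied by hand into a plain dict (the helper name `max_instances_per_queue` is made up):

```python
# Hand-copied version of the "queues" section above, assuming each entry
# is a [resource_name, max_instances] pair.
queues = {
    "gcp-v100": [["gcp-v100", 4]],
    "gcp-l4": [["gcp-l4", 4]],
    "gcp-cpu": [["gcp-cpu", 4]],
}


def max_instances_per_queue(queues):
    """Sum the instance limits of all resources attached to each queue."""
    return {
        name: sum(limit for _resource, limit in resources)
        for name, resources in queues.items()
    }


limits = max_instances_per_queue(queues)
print(limits)
```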
They are different tasks. I start a new task, but it can sometimes be the same commit.
Okay, thanks. But about the overriding: I tried committing, and once the changes are committed it works. So I think that means the configuration is not overridden elsewhere.
Another closely related question: do uncommitted changes work for submodules as well? I mean, when there is a directory from a different repository cloned as a submodule.
Sorry, it just worked now. I think it was a slow internet connection issue. It just went away today.
CostlyOstrich36
It doesn't work when I insert the credentials individually either. I am using EC2 for the ClearML server.
Thanks, it did
I deployed it myself. It worked fine before I changed to Ubuntu 24.04 yesterday. We have been using ClearML this way for years.
I tried it on another laptop, and I also tried it on the server machine where the ClearML server is deployed. I get the same error. @<1523701070390366208:profile|CostlyOstrich36>
It works on other machines. Can I run clearml-init in a virtual environment? I installed clearml in a virtual environment.
I see. So, is it the same thing when the network is slow and when there is a mistake in the URL?
import yaml
from clearml.automation.auto_scaler import AutoScaler, ScalerConfig
from gcp_driver import GCPDriver
with open('gcp_autoscaler.yaml') as f:
    conf = yaml.load(f, Loader=yaml.SafeLoader)
driver = GCPDriver.from_config(conf)
conf = ScalerConfig.from_config(conf)
autoscaler = AutoScaler(conf, driver)
autoscaler.start()
That is the Python code.
Thanks CostlyOstrich36
But I am not using the report_media() function. The debug samples (confusion matrices) are saved from TensorBoard.
They are actually from tracked files. In fact, I do get the uncommitted changes under the Execution tab.
@<1523705004920147968:profile|CloudySwallow27>
But, it uses the committed changes instead of these values.
CloudySwallow27
The DevOps team changed the URL, and I had to go through some steps to find out what the problem was.