There is a pinned github thread on https://github.com/allegroai/clearml/issues/81 , seems to be the right place?
hoo that's cool! I could place torch==1.3.1 there
is there a command / file for that?
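For reference, pinning the version in a requirements file would look something like this (a sketch; whether the agent reads a requirements.txt or the packages recorded on the task depends on the setup):

```
# requirements.txt (hypothetical location; the agent may instead use
# the package list recorded on the task itself)
torch==1.3.1
```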
Yes I agree, but I get a strange error when using dataloaders:
RuntimeError: [enforce fail at context_gpu.cu:323] error == cudaSuccess. 3 vs 0. Error at: /pytorch/caffe2/core/context_gpu.cu:323: initialization error
only when I use num_workers > 0
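Not from this thread, but a common workaround for CUDA initialization errors in worker processes is the "spawn" start method, which gives each worker a fresh interpreter instead of forking the parent's (possibly CUDA-initialized) process. A minimal stdlib sketch of the idea (the worker function and values are made up for illustration):

```python
import multiprocessing as mp

def double(x):
    # runs in a worker process
    return x * 2

if __name__ == "__main__":
    # "spawn" starts each worker with a fresh interpreter instead of fork(),
    # so workers do not inherit a half-initialized CUDA context from the parent
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        results = pool.map(double, [1, 2, 3])
    print(results)  # [2, 4, 6]
```

With PyTorch, the equivalent switch is usually made via torch.multiprocessing.set_start_method("spawn") before creating the DataLoader.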
So it is there already, but commented out, any reason why?
SuccessfulKoala55 I tried to setup in a different machine the clearml-agent and now I get a different error message in the logs:
Warning: could not locate requested Python version 3.6, reverting to version 3.6
clearml_agent: ERROR: Python executable with version '3.6' defined in configuration file, key 'agent.default_python', not found in path, tried: ('python3.6', 'python3', 'python')
We would be super happy to have the possibility of documenting experiments (new tab in experiments UI) with a markdown editor!
Is it safe to turn off replication while a reindex operation is happening? The reindexing is rather slow and I am wondering if turning off replication will speed up the process
Hi SuccessfulKoala55 , will I be able to update all references to the old s3 bucket using this command?
the deep learning AMI from nvidia (Ubuntu 18.04)
Ok, but that means this cleanup code should live somewhere other than inside the task itself, right? Otherwise it won't be executed, since the task will be killed
I cannot share the file itself, but here are some potential helpful points:
- Multiple lines are empty
- One line is empty but has spaces (6 to be exact)
- The last line of the file is empty
Can I simply set agent.python_binary = path/to/conda/python3.6 ?
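For context, the agent section of clearml.conf would then look roughly like this (a sketch; the conda path below is made up):

```
agent {
    # point the agent at a specific interpreter (hypothetical path)
    python_binary: "/opt/conda/envs/py36/bin/python3.6"
}
```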
Maybe the agent could be adapted to have a max_batch_size parameter?
On clearml or clearml-server?
Nevertheless there might still be some value in that, because it would reduce the startup time by removing the agent's initial setup and the data download to the instance - though not by as much as I described initially, if stopped instances are bound to the same capacity limitations as newly launched instances
Hi, yes, you can use trains_agent.backend_api.session.client.APIClient.queues.get_all()
Isn't it overkill to run a whole ubuntu 18.04 just to run a dead simple controller task?
Sorry both of you, my problem was actually lying somewhere else (both buckets are in the same region) - thanks for your time!
I assume you're using a self-hosted server?
Yes
to pass secrets to each experiment
I will let the team answer you on that one 🙂
It worked like a charm! Awesome, thanks AgitatedDove14!