Hmm I see, add this for example
extra_docker_shell_script: ["rm ~/.bashrc", "echo removed bashrc"]
Hmm, I think the issue is here (the docker command mount): '-v', '/tmp/.clearml_agent.de0n48pm.cfg:/root/clearml.conf'
You can just spin another agent on the same machine 🙂
FiercePenguin76
So running the Task.init from the jupyter-lab works, but running the Task.init from the VSCode notebook does not work?
This is odd, what is the parameter?
I assume it needs sorting, and one time this is an Integer and the next time it is a String, so the server cannot sort based on it. Could that be?
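To illustrate the guess with plain Python (this is only an analogy, not the server's actual sorting code, and the values are made up): mixed integer/string values cannot be ordered, while values logged with one consistent type can.

```python
# Hypothetical values of one parameter as reported by three different runs
mixed = [10, "9", 2]

try:
    sorted(mixed)
    mixed_sortable = True
except TypeError:
    # Python refuses to compare int and str, similar in spirit to the suspected server issue
    mixed_sortable = False

# Casting to a single type before reporting keeps the values sortable
consistent = sorted(int(v) for v in mixed)
print(mixed_sortable, consistent)
```

So if the parameter is always reported with the same type, the problem should not show up (assuming the guess about sorting is correct).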
Hi @<1541954607595393024:profile|BattyCrocodile47>
see here: None
Try with app.clearml.mlops-club.org
and the rest of them
Hi CooperativeFox72
I think the upload reporting (files over 5mb) was added post 0.17 version, hence the log.
The default for upload chunk reporting is 5MB, but it is not configurable. Maybe we should add it to the clearml.conf? wdyt?
Yes my bad 😞
Let's try again:
` docker run -it --gpus "device=1" -e CLEARML_WORKER_ID=Gandalf:gpu1 -e CLEARML_DOCKER_IMAGE=nvidia/cuda:11.4.0-devel-ubuntu18.04 -v /home/dwhitena/.git-credentials:/root/.git-credentials -v /home/dwhitena/.gitconfig:/root/.gitconfig -v /tmp/.clearml_agent.7rjdh80a.cfg:/root/clearml.conf -v /tmp/clearml_agent.ssh.ppsd9sze:/root/.ssh -v /home/dwhitena/.clearml/apt-cache.1:/var/cache/apt/archives -v /home/dwhitena/.clearml/pip-cache:/root/.cache/pip ...
Okay, I'll make sure we always quote "
, since it seems to work either way.
We will release an RC soon, with this fix.
Sounds good?
Hi GrievingTurkey78
I'm assuming similar to https://github.com/pallets/click/ ?
Auto connect and store/override all the parameters?
Hmm, you mean how long it takes for the server to time out on a registered worker? I'm not sure this is easily configured
Hi ShakyJellyfish91
It seems clearml is using a single connection, which takes a long time to download
Hmm, I found this one:
https://github.com/allegroai/clearml/blob/1cb5dbb276026644ae20fef63d58256cdc887818/clearml/storage/helper.py#L1763
Does max_connections=10 mean 10 concurrent connections?
Hi @<1657918706052763648:profile|SillyRobin38>
In the preprocess.py files, we will have so many similar lines which is not good.
Actually clearml-serving also supports directories, i.e. you can package an entire module as part of the preprocess, which would be easier for your code
Another option is to package your code in a python package and have that installed on the container (there is a special env var that allows you to add those to the serving container)
...
Train Data Params/a = {}
Train Data Params/b = ...
Then maybe we could "hack" it so that if you edit it in the UI like so:
Train Data Params/a = {'new': 'value'}
Train Data Params/b = ...
You end up with:
param = {'a': {'new': 'value'}, 'b' : ... }
What do you think?
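A rough sketch of that "hack" in plain Python, just to make the idea concrete. The helper names `flatten` and `unflatten` are made up for illustration and are not part of clearml; the sketch assumes the UI stores every value as text.

```python
import ast

def flatten(section, param):
    """Turn {'a': {...}} into {'<section>/a': "<repr of value>"} (values stored as text)."""
    return {f"{section}/{key}": repr(value) for key, value in param.items()}

def unflatten(section, flat):
    """Rebuild the dict, parsing each edited text value back into a Python literal."""
    prefix = section + "/"
    return {
        key[len(prefix):]: ast.literal_eval(text)
        for key, text in flat.items()
        if key.startswith(prefix)
    }

flat = flatten("Train Data Params", {"a": {}, "b": 3})
# Simulate editing the value in the UI
flat["Train Data Params/a"] = "{'new': 'value'}"
param = unflatten("Train Data Params", flat)
print(param)
```

Using `ast.literal_eval` keeps the parsing limited to plain literals (dicts, lists, numbers, strings), so an edited value cannot execute arbitrary code.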
Yey @ https://app.slack.com/team/U01CJ43KX2N this one does not work!
Give me a minute I'll
Hi RoughHedgehog31
I'm assuming your git diff is just too big to be stored as is (probably some binary files)
it should not really have any effect on the execution, it just means the clearml-agent will not be able to reproduce the uncommitted changes.
Make sense ?
WittyOwl57 what about vm.max_map_count?
echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac (5)
Oh that makes sense.
So now you can just get the models as dict as well (basically clearml allows you to access them both as a list, so it is easy to get the last created, and as dict so you can match the filenames)
This one will get the list of models:
print(task.models["output"].keys())
Now you can just pick the best one:
model = task.models["output"]["epoch13-..."]
my_model_file = model.get_local_copy()
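If the filenames follow a pattern, the matching can be automated. A minimal sketch, assuming (hypothetically) that the model names start with "epoch<N>" and that "best" simply means the highest epoch; `latest_epoch_name` is an illustrative helper, not a clearml function.

```python
import re

def latest_epoch_name(names):
    """Return the name with the highest 'epoch<N>' prefix, or None if nothing matches."""
    best_name, best_epoch = None, -1
    for name in names:
        match = re.match(r"epoch(\d+)", name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name

# Made-up names standing in for task.models["output"].keys()
example_names = ["epoch2-a", "epoch13-b", "last"]
print(latest_epoch_name(example_names))
```

With a real task you would pass `task.models["output"].keys()` and then use the returned name as the dict key.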
but I'm pretty confident it was the size of the machine that caused it (as I mentioned it was a 1 cpu 1.5gb ram machine)
I have the feeling you are right 🙂
GiganticTurtle0 is it just --stop that throws this error ?
btw: if you add --queue default to the command line I assume it will work. The thing is, without --queue it will look for any queue with the "default" tag on it, and since there are none, we get the error.
regardless that should not happen with --stop
I will make sure we fix it
Just so we do not forget, can you please open an issue on clearml-agent github ?
${PWD} works!
This will be resolved on every call to Task.init (so I would recommend against it), how about "$HOME/" ?
Sure 🙂
BTW: clearml-agent will mount your host .ssh into the docker to /root/.ssh by default.
So no need to do that manually
Or did you mean I can couple a short "mini config" with the package and redirect clearml to use this local one (instead of the one at ~/clearml.conf)?
Actually yes, you can set a "fixed" config and point to it with an ENV variable, then set up just the access/secret per user.
wdyt?
(I was also pointing to the fact you do not have to use clearml-init you can create a simple partial config template and let user just fill in the missing "key"/"secret")
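Something along these lines, as a sketch only. It assumes the `CLEARML_CONFIG_FILE`, `CLEARML_API_ACCESS_KEY` and `CLEARML_API_SECRET_KEY` environment variables; the helper name, the config path and the key values are placeholders I made up.

```python
import os

def clearml_env(config_path, access_key, secret_key):
    """Build the environment entries: one shared config file plus per-user credentials."""
    return {
        "CLEARML_CONFIG_FILE": config_path,    # the "fixed" config shipped with the package
        "CLEARML_API_ACCESS_KEY": access_key,  # per-user value
        "CLEARML_API_SECRET_KEY": secret_key,  # per-user value
    }

# Placeholder path and credentials, for illustration only
env = clearml_env("/opt/my_package/clearml.conf", "<access_key>", "<secret_key>")
# Apply before importing clearml so the settings are picked up
os.environ.update(env)
```

This way the package owns the shared settings and each user only supplies the two credential values.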
You will have to build your own docker image based on that docker file, and then update the docker compose
, it's just a custom module.
Is this your own module ? Is this a local folder we import from ?
Hi SmugLizard25, I was able to test, and it seems that the style is being ignored by the FE 😞
I passed to FE guys to make sure it is fixed in the next version.
Notice this is just for tables, anything else works as expected (i.e. styling any other type of plot)