I am using https://hub.docker.com/layers/nvidia/cuda/11.8.0-base-ubuntu22.04/images/sha256-88b85c6edd089acdf0cb7f3be020a1e812b009bafaf92c1715ab6677bd997ef1?context=explore
which has Python 3.10.6, if I remember correctly.
Would it help you diagnose this problem if I ran conda env create --file=environment.yml and checked whether it works?
Actually, my current approach looks like this:
carla-server-task : Launch carla server instance on a random port, set the port as param and then block the task/process, so I can kill carla when this task is aborted. This task keeps running the whole time.
start-carla-task : Launch a carla-server-task and wait for the port parameter to be set. Set the launched carla-server-task task-id and the port as param. Set task completed.
main-task : Run experiment when all start-carla-task are...
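A minimal sketch of the process-handling part of carla-server-task, using only the Python standard library. The helper names (find_free_port, launch_server, stop_server) are hypothetical, and the actual CARLA launch command is not shown in this thread, so it is passed in as a plain argument list. Publishing the port as a task parameter is left out on purpose.

```python
import socket
import subprocess
from typing import Sequence


def find_free_port() -> int:
    """Ask the OS for a TCP port that is unused right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def launch_server(command: Sequence[str]) -> subprocess.Popen:
    """Start the server as a child process and return its handle."""
    return subprocess.Popen(list(command))


def stop_server(proc: subprocess.Popen, timeout: float = 10.0) -> None:
    """Terminate the child process, escalating to kill if it does not exit."""
    if proc.poll() is not None:
        return  # already exited
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()


def run_blocking(command: Sequence[str]) -> int:
    """Block until the server exits; always clean up, e.g. when aborted."""
    proc = launch_server(command)
    try:
        return proc.wait()
    finally:
        stop_server(proc)
```

The port could still be taken by another process between find_free_port returning and the server binding to it, so a retry around the launch may be needed. run_blocking relies on try/finally, which covers KeyboardInterrupt and normal exceptions; whether an aborted task is stopped in a way that triggers this cleanup is an assumption I have not verified.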
name: core
channels:
- pytorch
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1
- _openmp_mutex=4.5
- blas=1.0
- bzip2=1.0.8
- ca-certificates=2020.12.5
- certifi=2020.12.5
- cudatoolkit=11.1.1
- ffmpeg=4.3
- freetype=2.10.4
- gmp=6.2.1
- gnutls=3.6.13
- jpeg=9b
- lame=3.100
- lcms2=2.11
- ld_impl_linux-64=2.33.1
- libedit=3.1.20191231
- libffi=3.3
- libgcc-ng=9.3.0
- libiconv=1.16
- libpng=1.6.37
- libstdcxx-ng=9.3.0
- libtiff...
However, to use conda as the package manager, I need a Docker image that provides conda.
In my case I use the conda freeze option and do not even have CUDA installed on the agents.
I created a GitHub issue because the problem with the slow deletion still exists: https://github.com/allegroai/clearml/issues/586#issue-1142916619
Unfortunately, I do not know that. It must have been before October 2021 at least. I know I asked here how to use the preinstalled version and AgitatedDove14 helped me get it to work. But I cannot find the old thread 😕
I installed it as described on pytorch.org: pip3 install --pre torch torchvision torchaudio --index-url None
I usually also experience no problems with restarting the clearml-server. It seems like it has to do with the OOM (or whatever issue I have).
This seems like a bug, though. Or is something like this to be expected? There shouldn't be files that are not shown in the WebUI, right?
If I understood correctly, if I call print(os.environ["MUJOCO_GL"]) after the ClearML Task is created, the variable should be set?
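A small sketch of how this could be checked without the script crashing when the variable is missing. The helper name read_env is hypothetical, and the Task creation itself is not shown; the call would go right after it.

```python
import os


def read_env(name: str, default: str = "<unset>") -> str:
    """Return the value of an environment variable, or a marker if it is missing."""
    return os.environ.get(name, default)


# Assumed placement: right after the ClearML Task has been created.
print("MUJOCO_GL =", read_env("MUJOCO_GL"))
```

Using os.environ.get instead of os.environ[...] avoids a KeyError, so the output makes it clear whether the variable was set at all.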