Hi @<1523701323046850560:profile|OutrageousSheep60> , the only thing you need to do in order to speed things up is start with a docker image that has most of the packages you need preinstalled - once you have that, when using this docker image, the agent can create a venv that inherits the system packages installed, thus avoiding re-download and installation of packages. Creating the venv itself is very fast.
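The venv-inheritance mechanism described above can be sketched as follows (package names and paths are illustrative, not the agent's actual internals):

```shell
# Inside a container whose base image already has the heavy packages
# preinstalled, a venv created with --system-site-packages inherits them,
# so creating the venv is cheap and nothing is re-downloaded.
python3 -m venv --system-site-packages /tmp/task_venv

# Packages installed into the system interpreter are visible from the venv:
/tmp/task_venv/bin/python -c "import sys; print(sys.prefix)"
```

This is the standard CPython behavior the agent relies on: only packages missing from the base image need to be installed into the venv.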
How can we run a task in a docker container without having to configure it manually?
Assumptions with ClearML Agent:
To run an experiment using ClearML Agent in a Docker container, with a cached Python environment (to avoid repeated installations), the following configuration is required:
- ClearML's environment caching feature currently supports only the `venv` environment, not the default Python image environment, which limits its usefulness.
- For runtime environment customization:
  - The `-e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/local/bin/python` flag skips virtual environment creation, and is only relevant if a valid Python environment path inside the Docker container is provided.
  - The `-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1` flag prevents ClearML from creating a new Python environment.
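Assuming the agent runs in docker mode, these flags are passed as extra docker arguments after the image name. A sketch (queue name and image tag are placeholders):

```shell
# Launch the agent in docker mode; everything after the image name is
# forwarded to `docker run`, so the -e flags land inside the container.
clearml-agent daemon --queue default \
    --docker my-registry/ml-base:latest \
    -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/local/bin/python \
    -e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
```

With both flags set, the agent uses the container's own interpreter directly instead of building an environment per task.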
Unfortunately, we were not able to create a simplified workflow for dynamically creating a valid Python environment within a Docker container without manual intervention. Our Data Scientist would need to manually create the Python environment, add the Python path to the ClearML task runtime environment, push the Docker image to a registry accessible to the ClearML Agent, and run the task using the new Docker image with the provided Python path.
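The manual workflow above can be sketched as follows (registry, tags, package list, and script names are all hypothetical; `clearml-task` usage is one possible way to attach the image to a task):

```shell
# 1. Build an image containing a ready-to-use Python environment.
cat > Dockerfile <<'EOF'
FROM python:3.10
RUN pip install --no-cache-dir clearml numpy pandas
EOF
docker build -t my-registry/task-env:latest .

# 2. Push it to a registry the ClearML Agent can pull from.
docker push my-registry/task-env:latest

# 3. Run the task with the new image, pointing ClearML at the
#    in-container Python so no new environment is created.
clearml-task --project demo --name my-task --script train.py \
    --docker my-registry/task-env:latest \
    --docker_args "-e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/local/bin/python"
```

Every change to the dependency set requires rebuilding and re-pushing the image, which is the manual intervention the text describes.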
- Is there an alternative approach that would simplify the steps mentioned above?
- Are there any other improvements that could enhance this workflow?