okay cool, I'm currently trying to migrate our stack to run from the git repository and to use ClearML Datasets. I am still having an issue with relative imports in Python: we were previously modifying PYTHONPATH
in the container, but now I need to modify it manually on the host. I saw there is some documentation about that here, but I'm not sure I understand it correctly, since it does not look like it is getting picked up by the task
Hi @<1644147961996775424:profile|HurtStarfish47> , to handle this, you'll need to run your code from inside your cloned git repository folder, and the ClearML SDK will auto-detect it and log it on the task. When the ClearML Agent runs your code inside a container, it will clone the repository and install any required dependencies.
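For example, a minimal sketch (project/task names here are just placeholders) - running something like this from inside the cloned repository folder is enough for the SDK to pick up the repo, branch, and commit:
```python
# run this from inside your cloned git repository folder
from clearml import Task

# Task.init() auto-detects the git repo, branch, commit and uncommitted changes
task = Task.init(project_name="examples", task_name="my-training-run")

# ... rest of your training / processing code ...
```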
AWS credentials can be configured in the agent's clearml.conf
file (and it will pass them on to the task).
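Something along these lines in the agent's clearml.conf should do it (the key/secret/region values are placeholders, of course):
```
sdk {
    aws {
        s3 {
            # default credentials used for S3 access by tasks this agent runs
            key: "<YOUR_ACCESS_KEY>"
            secret: "<YOUR_SECRET_KEY>"
            region: "us-east-1"
        }
    }
}
```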
Any data that comes from outside the code can either be fetched dynamically by your code (i.e. downloaded from somewhere), or you can simply make sure it already exists inside the docker container - of course, you can simply use ClearML Datasets to store the dataset, and then get a local copy of it from your code.
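Roughly like this (the dataset name/project are placeholders):
```python
from clearml import Dataset

# fetch a read-only local copy of the dataset (cached between runs)
dataset_path = Dataset.get(
    dataset_name="my_dataset",
    dataset_project="my_project",
).get_local_copy()
```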
As for pypi packages from private git repos, you can either add a direct reference to the repo in a requirements.txt file in your repo (a link complete with credentials), or have them preinstalled in the main system Python inside the docker container.
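For example, a direct reference line in requirements.txt could look something like this (the package, org, repo, and token are all placeholders):
```
# requirements.txt - direct reference to a private git repo, pip-installable
my-package @ git+https://<username>:<personal-access-token>@github.com/<org>/<private-repo>.git@<tag-or-commit>
```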
That documentation is just related to logging. In general, for a proper remote execution setup, relative imports from patched Python paths are never stable and are not recommended - I really think using a well-structured git repository (including submodules, if required) is the way to go
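Just to illustrate what I mean by well-structured (a hypothetical layout, not your actual repo): keep the code importable as a package from the repository root, so absolute imports resolve without any PYTHONPATH patching:
```python
# train.py at the repository root of a hypothetical layout:
#   my_repo/
#     my_package/__init__.py
#     my_package/data.py
#     train.py
from my_package.data import load_data  # absolute import, no PYTHONPATH needed

if __name__ == "__main__":
    data = load_data()
```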