For anyone following along, my lesson was configuring the clearml-agent daemon with the --docker flag to instruct it to spawn tasks in containers (and using the docker arg passed through to my Pipeline component).
What’s interesting to me (as a ClearML newbie) is it’s clearly compiling that wheel using my host machine (MacOS).
Hmm kind of, and kind of not.
If you take a look at the Tasks created (regardless of how they are created: pipeline, manually, etc.), you have a list of python packages required by the code, as they are detected at runtime (i.e. when the code was first executed, on the development machine). When creating a Pipeline controller (runner), the pipeline Tasks are just lists, and package versions are listed based on the machine running the initial pipeline (in your case Mac). The reason is so that we at least have a version of the packages (if they exist) that will be working for you. Yes, you are correct, there should not be a connection between the runner machine and the remote machine. That said, we do want to be able to specify the required packages, and usually python packages are available on most OS distros. If we were not auto-detecting them, you would have had to specify them manually, which you can also do, and it will override the packages it detected. Does that make sense?
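To make the "versions come from the machine that ran the code first" point concrete, here is a minimal sketch using only the standard library. It is not ClearML's detection code; `pin_requirements` is a hypothetical helper that simply records whatever versions happen to be installed locally, which is why a list produced on a Mac reflects the Mac.

```python
from importlib import metadata


def pin_requirements(distributions):
    """Return 'name==version' pins for the distributions installed locally.

    Hypothetical helper for illustration only; not ClearML's detection logic.
    """
    pins = []
    for name in distributions:
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed on this machine, so there is no version to record.
            continue
    return pins


if __name__ == "__main__":
    # The versions printed here are those of the machine running this script.
    print(pin_requirements(["numpy", "Pillow", "clearml"]))
```

Running the same script on the development machine and inside the container image would show whether the two environments agree on versions.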
Just threw a new file into the gist above
Not sure what I'm seeing there, but it definitely does not include the error.
If it helps, you can DM me the full log (btw: all pass/secrets are automatically masked from the log, but I would double check just in case 😉)
Just threw a new file into the gist above
It doesn’t look like it even gets to the point where it installs from the numpy wheel (because it errors out installing Pillow elsewhere).
What’s interesting to me (as a ClearML newbie) is it’s clearly compiling that wheel using my host machine (MacOS).
I would have expected there to be separation between the “pipeline runner”, if you will, and the task. I would expect the pipeline runner to only need a dependency on ClearML and for the task to be spawned as a container with numpy installed (Linux in this case)
Hmm, maybe a different numpy version? (numpy==1.22.1, maybe the Task needs a diff version?) Can you post the Task log?
Pretty standard global install
https://gist.github.com/stevenhoelscher/0d345e26630e7d16ab76802871c39bd5
Could it be these packages (i.e. numpy etc) are not installed as system packages in the docker (i.e. inside a venv, inside the docker) ?
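One way to answer that question is to run a small check with the image's default interpreter. The sketch below uses only the standard library; `in_virtualenv` and `locate` are hypothetical helper names, and `numpy` is just the package under discussion in this thread.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter is running inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def locate(module_name):
    """Return the path a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
    print("numpy resolves to:", locate("numpy"))
```

If `locate("numpy")` returns `None` for the interpreter the agent uses, while another interpreter in the same image does find it, the package probably lives in a venv rather than in the system site-packages.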
Right, my only complaint is it appears to be using cached wheels and building them (for packages like numpy, scipy, etc) even though numpy is available in the Python runtime env
Even if you had any packages, I'm pretty sure there is nothing for you to worry about, it will just list them, and if they are preinstalled, the preinstalled will be used
If this is the case, there is nothing you need to change, just provide the docker image (no need to pass packages)
Thanks, my pipeline script only takes a dependency on clearml as well as an internal library (local Python module installed into the Docker image) that provides the _train_and_evaluate function as seen above
Funny enough I’m running into a new issue now.
Sorry, my bad, I should have known 😉 Yes, it probably should be packages=["clearml==1.1.6"]
BTW: do you have any imports inside the pipeline function itself ? if you do not, then no need to pass "packages" at all, it will just add clearml
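As a rough sketch of the two options discussed here (pass an explicit `packages` list, or omit it and let the imports be detected), the snippet below builds the keyword arguments for `PipelineDecorator.component`. The helper names `component_options` and `as_component` are hypothetical, the image name in the usage note is a placeholder, and `clearml` is imported lazily so the first helper runs without it installed.

```python
def component_options(docker_image, packages=None):
    """Keyword arguments for PipelineDecorator.component (hypothetical helper)."""
    options = {"docker": docker_image}
    if packages is not None:
        # An explicit list replaces the auto-detected packages;
        # leaving the key out keeps auto-detection.
        options["packages"] = list(packages)
    return options


def as_component(func, docker_image, packages=None):
    """Wrap func as a pipeline component; requires clearml to be installed."""
    # Imported lazily so component_options() can be used without clearml.
    from clearml.automation.controller import PipelineDecorator

    options = component_options(docker_image, packages)
    return PipelineDecorator.component(**options)(func)
```

For example, `component_options("python:3.9", ["clearml==1.1.6"])` yields the `docker` and `packages` arguments suggested above, while `component_options("python:3.9")` leaves `packages` unset.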
Thank you! I adjusted my pipeline logic so that the component used packages=[]
Funny enough, I’m running into a new issue now. Does this mean I need to configure the Agent’s runtime environment so it has the necessary dependencies to execute the Pipeline script?
```
# Agent Logs
Starting Task Execution:
Traceback (most recent call last):
  File "/Users/developer/.clearml/venvs-builds/3/code/train_and_evaluate.py", line 1, in <module>
    from clearml import Task, TaskTypes
ModuleNotFoundError: No module named 'clearml'

$ head -10 ~/.clearml/venvs-builds/3/code/train_and_evaluate.py
from clearml import Task, TaskTypes
from clearml.automation.controller import PipelineDecorator

def train_and_evaluate():
    _train_and_evaluate()

if __name__ == '__main__':
    task = Task.init()
```
Hi WickedStarfish97
As a result, I don’t want the Agent to parse what imports are being used / install dependencies whatsoever
Nothing to worry about here: even if the agent detects the python packages, they are installed on top of the preexisting packages inside the docker. That said, if you want to override it, you can also pass packages=[]