I've also followed https://clearml.slack.com/archives/CTK20V944/p1628333126247800 but it did not help
Hi UnevenDolphin73
Took a long time to figure out that there was a specific Python version with a specific virtualenv that was old ...
NICE!
Then the task requested to use Python 3.7, and that old virtualenv version was broken.
Yes, if the Task is using a specific python version it will first try to find this one (i.e. which python3.7) then use it to create the new venv
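To illustrate the lookup described above, here is a minimal sketch in Python. It is not the clearml-agent implementation: the function names are hypothetical, and falling back to the running interpreter when the requested version is missing is an assumption made for the example.

```python
import shutil
import sys
from typing import Optional


def find_python(version: str) -> Optional[str]:
    """Return the path of `python<version>` on PATH (like `which python3.7`), or None."""
    return shutil.which(f"python{version}")


def resolve_interpreter(requested: Optional[str]) -> str:
    """Prefer the requested version; otherwise fall back to the current interpreter.

    The fallback is an assumption for this sketch, not documented agent behaviour.
    """
    found = find_python(requested) if requested else None
    return found or sys.executable


if __name__ == "__main__":
    # Prints the interpreter that would be used to create the venv for a Python 3.7 task.
    print(resolve_interpreter("3.7"))
```

Printing the resolved path before creating the venv makes it easier to see which interpreter was actually picked up.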
As a result -> Could the agent maybe also output the virtualenv version used with setting up the environment for the first time?
You mean into the log? (I think you have the whole log from the get-go, no?)
EDIT:
I also tried setting agent.python_binary: "/usr/bin/python3.8" but it still uses Python 2.7?
I think that in docker mode this argument is ignored, as the assumption is that it would break inside different containers.
It's enough that the task requirements will contain this latest version
Same result 😞 This is frustrating, wtf happened 🤯
This is also specifically the services queue worker I'm trying to debug 🤔
So where should I install the latest clearml version? On the client that's running a task, or on the worker machine?
I also tried setting agent.python_binary: "/usr/bin/python3.8" but it still uses Python 2.7?
I'm using 1.1.6 (upgraded from 1.1.6rc0) - should I try 1.1.7rc0 or smth?
Still failing with 1.2.0rc3 😞 AgitatedDove14 any thoughts on your end?
Will try!
Curious - is there a temporary changelog for 1.2.0? 😁 Always fun to poke at the upcoming features
EDIT: Wait, should the clearml RC be installed outside the venv for the agent as well?
UnevenDolphin73 are you using the latest clearml RC?
I also tried switching to dockerized mode now, getting the same issue 🤔
I'll have yet another look at both the latest agent RC and at the docker-compose, thanks!
There was no "default" services agent btw, just the queue, I had to launch an agent myself (not sure if it's relevant)
was consistent, whereas for some reason this old virtualenv decided to use python2.7 otherwise
Yes, this sounds like a virtualenv bug. I think it will not hurt to do both (obviously we have the information).
Thank you!!! 😍
If it's the services queue worker, then it is running on the server. However, we didn't change anything in its configuration
I know, that should indeed be the default behaviour, but at least from my tests the use of --python ... was consistent, whereas for some reason this old virtualenv decided to use python2.7 otherwise 🤨
Okay this was a deep dive into clearml-agent code 😁
Took a long time to figure out that there was a specific Python version with a specific virtualenv that was old (Python 3.6.9 and Python 3.8 had latest virtualenv, but Python 3.7.5 had an old virtualenv).
Then the task requested to use Python 3.7, and that old virtualenv version was broken.
As a result -> Could the agent maybe also output the virtualenv version used with setting up the environment for the first time?
Did you make sure the clearml-agent was not installed in a venv?
One way to circumvent this btw would be to also add/use the --python flag for virtualenv
If it's the default services agent, you'll need to update the docker-compose for that
No changelog for now, we try to keep the commits clear and ordered 🙂
Wait, should the clearml RC be installed outside the venv for the agent as well?
No need, the agent is totally independent
https://github.com/allegroai/clearml-agent/pull/98 AgitatedDove14 😁
AgitatedDove14
I'll make a PR for it now, but the long story is that you have the full log, but the virtualenv version is not logged anywhere (the usual output from virtualenv just says which Python version is used, etc).
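A rough sketch of how the virtualenv version could be surfaced for a given interpreter, by running `<python> -m virtualenv --version` and capturing the output. The function name and the interpreter list are hypothetical, and this is not necessarily what the PR does.

```python
import subprocess
from typing import Optional


def virtualenv_version(python_bin: str) -> Optional[str]:
    """Return the output of `<python_bin> -m virtualenv --version`, or None on failure."""
    try:
        result = subprocess.run(
            [python_bin, "-m", "virtualenv", "--version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Interpreter not found, not executable, or the call hung.
        return None
    if result.returncode != 0:
        # virtualenv is most likely not installed for this interpreter.
        return None
    # Use stdout, falling back to stderr in case a release prints the version there.
    return (result.stdout or result.stderr).strip() or None


if __name__ == "__main__":
    # Hypothetical interpreter names; adjust to whatever exists on the worker.
    for name in ("python3.6", "python3.7", "python3.8"):
        print(name, "->", virtualenv_version(name))
```

Running this on the worker shows at a glance whether one interpreter carries an older virtualenv than the others.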
Did you try using the latest agent RC? It's 1.2.0rc3
One way to circumvent this btw would be to also add/use the --python flag for virtualenv
Notice that when creating the venv, the command that is used is basically pythonx.y -m virtualenv ...
By definition this will create a new venv based on the Python that executes virtualenv.
With all that said, there might be a bug in virtualenv, and in some cases it does not adhere to this restriction.
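For reference, a small sketch of the two command shapes discussed here: the plain `pythonx.y -m virtualenv <dir>` form, and the same command with an explicit `--python` flag as suggested in the thread. The helper name and the paths are hypothetical; the sketch only builds the argument list and does not reflect how clearml-agent assembles its command.

```python
from typing import List


def build_venv_command(python_bin: str, venv_dir: str, pin_python: bool = True) -> List[str]:
    """Build a `python -m virtualenv` command, optionally pinning the target interpreter."""
    cmd = [python_bin, "-m", "virtualenv"]
    if pin_python:
        # Explicitly tell virtualenv which interpreter the new venv should use,
        # rather than relying on it defaulting to the interpreter that runs it.
        cmd += ["--python", python_bin]
    cmd.append(venv_dir)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_venv_command("/usr/bin/python3.7", "/tmp/task-venv")))
```

The resulting list can be passed to `subprocess.run` as is; pinning is redundant when virtualenv behaves correctly, but it guards against the old-virtualenv case described above.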
Yes; I tried running it both outside venv and inside a venv. No idea why it uses 2.7?
latest clearml is https://pypi.org/project/clearml/1.2.0rc0/
Any follow up thoughts SuccessfulKoala55 or CostlyOstrich36 ?