I don't know whether I'm doing something wrong. Locally it works, but when executed via queue I get:
```
  File "run_task.py", line 14, in <module>
    main()
  File "run_task.py", line 9, in main
    printme = importlib.import_module("some_package.file_to_import").printme
  File "/home/tim/.clearml/venvs-builds/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd...
```
I just realized that I forgot again that I am using importlib, and this is probably why everything's weird. I tried to reproduce the error with a smaller project and was not able to get the error again. Sorry for having wasted your time! 😐
So the environment variables are not set by the clearml-agent, but by clearml itself
Hi Jake, thank you very much for the suggestion. I will try that!
I see, I just checked the logs and it shows:
```
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f246f0d6c18>: Failed to establish a new connection: [Errno 111] Connection refused
[2022-04-29 08:45:55,018] [9] [WARNING] [elasticsearch] POST [status:N/A request:0.000s]
```
Unfortunately, there are no logs in /usr/share/elasticsearch/logs to see what elastic was up to.
I don't think so. It is related to the issue with the clearml-server I posted in the other thread. Essentially the clearml-server hangs, then I restart it with `docker-compose down && docker-compose up -d`, and the experiments sometimes show as running, but on the clearml-agents I see that actually nothing is running, or they show as aborted.
I know that usually clearml-agents do not abort on server restart and just continue.
One thing I want to add: maybe you should disable deletion of artifacts if the file-server deletion fails. It doesn't make sense that we can no longer track existing files if something goes wrong.
I am pretty sure there is a flag in the clearml.conf where you can specify which python binary to use.
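If I remember right, it is something like this in clearml.conf (a sketch from memory; I believe the key is agent.python_binary, and the path is just an example):
```
agent {
    # full path of the python interpreter the agent uses
    # to build the task's execution environment
    python_binary: "/usr/bin/python3.7"
}
```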
In the first run the package only existed because it is preinstalled in the docker image. Afaik, in the second run it is also preinstalled, but pip will first try to resolve it and then check whether it already exists. But I am not too sure about this.
The debug samples? or the artifacts/models?
Both.
Yes, change the Task's output destination in the UI (or programmatically)
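For reference, programmatically that would be something along these lines (a minimal sketch; the project/task names and the bucket are placeholders):
```python
from clearml import Task

# output_uri overrides the default files_server destination
# for artifacts and models logged by this task
task = Task.init(
    project_name="examples",              # placeholder
    task_name="output destination",       # placeholder
    output_uri="s3://my-bucket/clearml",  # hypothetical bucket
)
```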
This has no effect. I am not able to change the files_server, e.g. I cannot change from None to None.
If my files_server is None, it will always look there no matter what I set as output destination.
Oh, I did not see the answer. Thank you very much. I was just wondering whether sync/async could lead to higher runtimes when doing a lot of remote logging compared to local logging.
WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
I see the problem, it should be 1.4 for the latest, right?
Let me check what I did wrong while upgrading.
I just tried the environment setup steps that clearml-agent is doing locally, but with my environment.yml instead of the one that clearml generates.
clearml will register conda packages that cannot be installed if clearml-agent is configured to use pip. So although it is nice that a complete package list is tracked, it makes it cumbersome to rerun the experiment.
You suggested this fix earlier, but I am not sure why it didn't work then.
AnxiousSeal95 This bug seems to be affecting me. I just tried forcing clearml-agent to install `clearml-agent==1.4.1` in the docker and now it works.
Btw: clearml-agent uses `pip install clearml-agent -U` to install clearml-agent in the docker. However, instead of using the newest clearml-agent it should use the version that the host machine is using to run clearml-agent, in my opinion.
One last question, then I have everything solved: is it possible to pass clearml the files to analyze manually? For example, my setup consists of a `run_this.py` and a `this_should_be_run_A.py` and `this_should_be_run_B.py`. I can then programmatically choose which file to import with importlib. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
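To make that concrete, this is roughly the shape of the setup, plus the only workaround I know of: forcing packages onto the requirements list by hand with `Task.add_requirements` before `Task.init` (a sketch; the package name and the `main()` entry point are assumptions):
```python
# run_this.py -- picks the module to run at runtime via importlib
import importlib

from clearml import Task

# clearml's static analysis cannot follow dynamic imports, so the
# requirements of this_should_be_run_A/B go undetected. Forcing them
# manually is a workaround; this must be called before Task.init:
Task.add_requirements("torch")  # hypothetical package used by the dynamic modules

task = Task.init(project_name="examples", task_name="dynamic import")  # placeholders

module_name = "this_should_be_run_A"  # chosen programmatically in the real setup
module = importlib.import_module(module_name)
module.main()  # assuming each module exposes a main() entry point
```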
A colleague fixed my server and I can confirm that the fix works!
However, to use conda as package manager I need a docker image that provides conda.
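One way to wire that up would be via clearml.conf (a sketch; `continuumio/miniconda3` is just one public image that ships conda):
```
agent {
    # build task environments with conda instead of pip
    package_manager {
        type: conda
    }
    # in docker mode, the default image must already provide conda
    default_docker {
        image: "continuumio/miniconda3"
    }
}
```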