And the files that I see on GitHub are the default configuration of the server, even if I do not have these files in my installation, right?
Thank you, perfect! I did not try yet, but will do now.
I have my development machine where I develop for multiple projects. I want to configure clearml differently per project, similar to .vscode, .git, or .idea at the project level. Something like this sketch is what I have in mind.
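A minimal sketch of one way to do that, using the `CLEARML_CONFIG_FILE` environment variable ClearML reads; the `.clearml/clearml.conf` path and the project/task names are just my examples:

```python
# Sketch: point ClearML at a project-local config file via the
# CLEARML_CONFIG_FILE environment variable. Set it before importing clearml
# so the project-local config is picked up instead of ~/clearml.conf.
import os

os.environ["CLEARML_CONFIG_FILE"] = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), ".clearml", "clearml.conf"
)

from clearml import Task  # noqa: E402  (import after setting the env var)

task = Task.init(project_name="my-project", task_name="example")
```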
I am currently on the move, but it was something like "upstream server not found" in /etc/nginx/nginx.conf, on line 88 if I remember correctly.
When I go into the GUI there are no artifacts displayed.
In the beginning my config file was not empty 😕
Don't know whether I'm doing something wrong. Locally it works, but when executed via the queue I get:
```
  File "run_task.py", line 14, in <module>
    main()
  File "run_task.py", line 9, in main
    printme = importlib.import_module("some_package.file_to_import").printme
  File "/home/tim/.clearml/venvs-builds/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd...
```
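For reference, run_task.py boils down to this dynamic-import pattern (a sketch reconstructed from the traceback above; `some_package.file_to_import` and `printme` are the names from my project, line numbers will differ):

```python
# run_task.py (reconstructed): the module and attribute are resolved at
# runtime, so static import analysis of the script will not see the
# some_package dependency.
import importlib


def main():
    # Resolve some_package.file_to_import at runtime and grab printme from it
    printme = importlib.import_module("some_package.file_to_import").printme
    printme()


if __name__ == "__main__":
    main()
```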
I just realized that I forgot again that I am using importlib, and this is probably why everything is weird. I tried to reproduce the error with a smaller project and was not able to get the error again. Sorry for having wasted your time! 😐
So the environment variables are not set by the clearml-agent, but by clearml itself
Hi Jake, thank you very much for the suggestion. I will try that!
I see, I just checked the logs and it shows:

```
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f246f0d6c18>: Failed to establish a new connection: [Errno 111] Connection refused
[2022-04-29 08:45:55,018] [9] [WARNING] [elasticsearch] POST [status:N/A request:0.000s]
```

Unfortunately, there are no logs in /usr/share/elasticsearch/logs to see what elastic was up to.
I don't think so. It is related to the issue with the clearml-server that I posted in the other thread. Essentially the clearml-server hangs, then I restart it with `docker-compose down && docker-compose up -d`, and the experiments sometimes show as running, but on the clearml-agents I can see that actually nothing is running, or they show as aborted.
I know that usually clearml-agents do not abort on server restart and just continue.
One thing I want to add: maybe you should disable deleting the artifact entry when the file-server deletion fails. It doesn't make sense that we lose the ability to track files that still exist just because something went wrong.
I am pretty sure there is a flag in the clearml.conf where you can specify which python binary to use.
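If I remember correctly it is something like this in clearml.conf (the `agent.python_binary` key name is from memory, and the interpreter path is just an example):

```
agent {
    # From memory: tell the agent which python interpreter to use
    # when building venvs (path is just an example)
    python_binary: "/usr/bin/python3.10"
}
```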
In the first run the package only existed because it is preinstalled in the docker image. AFAIK, in the second run it is also preinstalled, but pip will first try to resolve it and then see whether it already exists. But I am not too sure about this.
The debug samples? or the artifacts/models?
Both.
Yes, change the Task's output destination in the UI (or programmatically)
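Programmatically, that would be something like this (`output_uri` is the relevant `Task.init` parameter; the project/task names and the bucket path are placeholders):

```python
from clearml import Task

# Redirect artifact/model uploads for this task to an explicit destination
# instead of the default files_server (the bucket path is a placeholder).
task = Task.init(
    project_name="my-project",
    task_name="example",
    output_uri="s3://my-bucket/clearml-artifacts",
)
```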
This has no effect. I am not able to change the files_server, e.g. I cannot change from None to None
If my files_server is None, it will always look there no matter what I set as the output destination.
Oh, I did not see the answer. Thank you very much. I was just wondering whether sync/async could lead to higher runtimes when doing a lot of remote logging compared to local logging.
WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
I see the problem; it should be 1.4 for the latest, right?
Let me check what I did wrong while upgrading.
Btw: It is weird that the fileservers are directly exposed, so no authentication through the webserver is needed. Is this something that is different in the paid version, or why is it like this in the open-source version?
I just tried the environment setup steps that clearml-agent does, locally, but with my environment.yml instead of the one that clearml generates.
With clearml==1.4.1 it works, but with the current version it aborts. Here is a log with the latest clearml