I don't know whether I'm doing something wrong. Locally it works, but when executed via the queue I get:
` File "run_task.py", line 14, in <module>
main()
File "run_task.py", line 9, in main
printme = importlib.import_module("some_package.file_to_import").printme
File "/home/tim/.clearml/venvs-builds/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd...
Yes, I did not change this part of the config.
AgitatedDove14 fyi I think this is the issue I have: https://stackoverflow.com/a/65526944/3038183
AnxiousSeal95 Thanks a lot. Seems to be working fine for me. I see the clearml-agent version that pip installs in the docker is now fixed to the host version 🙂 PyTorch Nightly is also installed correctly now!
Related to this: How does the local cache/agent cache work? Are the sdk.storage.cache parameters for the agent? When are datasets deleted from the cache? When are datasets deleted if I run local execution?
WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
I see the problem, it should be 1.4 for the latest, right?
Let me check what I did wrong while upgrading.
No no, I was just wondering how much effort it is to create something like ClearML. And your answer gives me a rough estimate 🙂
Quick question: Where again does clearml place the venv? I wanna take a look into it after the task has failed
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- cudatoolkit==11.1.1
- pytorch==1.8.0
This gives the CPU version.
Ah, now I see. This sounds like a good solution.
Locally it works fine.
To answer my own question: In the WebUI where one inputs the credentials, use https for the host instead of the auto-added http.
It is weird though. The task is submitted by the original user and then run on the agent. The task however is still registered by the original user, since it is created by the original user.
Wouldn't it make more sense to inherit the user from the task rather than from the agent?
AppetizingMouse58 Thank you very much. I forgot the volume mapping.
So can I just add the config to the async_delete container and mirror the directory structure from GitHub?
volumes:
- /opt/clearml/config:/opt/clearml/config
- /opt/clearml/logs:/var/log/clearml
The package is just a subdirectory, by the way. So it should not be in the installed packages anyway, right?
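For reference, here is a minimal sketch of how a package that lives in a plain subdirectory can be imported dynamically, assuming the failure comes from the script's directory not being on `sys.path` when the task runs remotely (that cause is a guess, not something confirmed in this thread). The helper name `import_local` is made up for illustration.

```python
import importlib
import sys
from pathlib import Path


def import_local(module_name, base_dir):
    """Import a dotted module name from a package located under base_dir.

    Hypothetical helper: it puts base_dir at the front of sys.path
    (if it is not there already) and then imports the module.
    """
    base = str(Path(base_dir).resolve())
    if base not in sys.path:
        # The directory is left on sys.path so later submodule imports resolve too.
        sys.path.insert(0, base)
    # Make sure directories created after interpreter start are picked up.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

In a script like `run_task.py` this could be called as `import_local("some_package.file_to_import", Path(__file__).parent).printme`, so the lookup no longer depends on the working directory the agent happens to use.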
Perfect, just what I always wanted. Looking forward to the MinIO version. Thank you :)
==> 2021-03-11 13:54:59 <==
# cmd: /home/tim/miniconda3/condabin/conda create --yes --mkdir --prefix /home/tim/.clearml/venvs-builds/3.8 python=3.8
# conda version: 4.9.2
+defaults/linux-64::_libgcc_mutex-0.1-main
+defaults/linux-64::ca-certificates-2021.1.19-h06a4308_1
+defaults/linux-64::certifi-2020.12.5-py38h06a4308_0
+defaults/linux-64::ld_impl_linux-64-2.33.1-h53a641e_7
+defaults/linux-64::libedit-3.1.20191231-h14c3975_1
+defaults/linux-64::libffi-3.3-he6710b0_2
+defaults/linux-64...
Thanks, I will look into it. For me the weird thing is that saving works and only deletion fails somehow.
Yes, that works fine. Just the http vs https was the problem. The UI will automatically change s3://<minio-address>:<port> to http://<minio-address>:<port> in http://myclearmlserver.org/settings/webapp-configuration . However, what I need is https://<minio-address>:<port>
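To make the difference concrete, here is a small sketch of the rewrite described above: turning an `s3://<minio-address>:<port>` address into a web URL with an explicit scheme. This is not ClearML's code, only an illustration using the standard library; the function name `s3_to_web_url` and the example host are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_to_web_url(s3_url, scheme="https"):
    """Rewrite s3://host:port/path to <scheme>://host:port/path.

    Hypothetical helper: only the scheme is swapped; host, port, and
    path are kept exactly as given.
    """
    parts = urlsplit(s3_url)
    if parts.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got: {s3_url}")
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))
```

Calling it with `scheme="http"` reproduces what the UI fills in automatically, while the default `https` gives the form that is needed when MinIO is served over TLS.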
Based on https://github.com/lanpa/tensorboardX/blob/34d1616c035faaa0f3f7c9d19cb8bb4425f19939/tensorboardX/summary.py#L355 I would guess that it is already encoded before being added to the tensorboard summary.
And the files that I see on GitHub are the default configuration of the server, even if I do not have these files in my installation, right?
You can add and remove clearml-agents to/from the clearml-server anytime.
Maybe if you have time you can take a look at the log I posted in the beginning. I think I have the same extra_index_url and the nightly flag activated 😕
Thank you very much. I tested it on a different machine now and it works as intended. So there must be something misconfigured on this one machine.