I see. Thanks for explaining!
Thu Mar 11 17:52:45 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      | ...
Type "help", "copyright", "credits" or "license" for more information.
>>> from clearml_agent.helper.gpu.gpustat import get_driver_cuda_version
>>> get_driver_cuda_version()
'110'
I still don't get why it's a problem to just update the requirements at any time... 😕
Okay. It works now. I don't know what went wrong before. Probably a user error 😅
Yea, is there a guarantee that the clearml-agent will not crash because it did not clean the cache in time?
At least when you use Docker containers, the agent will reuse the existing Python environment.
Hi SuccessfulKoala55
I meant that in the WebUI, deletion should only be allowed for artifacts for which deletion actually works.
For example, I now have a lot of lingering artifacts that exist on the fileservers, but not on the clearml-api-server (I think).
Another example: I delete a task via the WebUI. The ClearML server tries to delete the task and the artifacts belonging to it. However, it will show that the task has been successfully deleted even though some artifacts have not been. Now there is no way...
Just multiple users who do not share their repositories. So sharing with the agent is also not possible.
Any idea why deletion of artifacts on my second fileserver does not work?
` fileserver_datasets:
  networks:
    - backend
    - frontend
  command:
    - fileserver
  container_name: clearml-fileserver-datasets
  image: allegroai/clearml:latest
  restart: unless-stopped
  volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/data/fileserver-datasets:/mnt/fileserver
    - /opt/clearml/config:/opt/clearml/config
  ports:
    - "8082:8081" `
ClearML successfu...
Setting the api.files_server: s3://myhost:9000/clearml in clearml.conf works!
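For reference, a minimal clearml.conf sketch of that setup, assuming a MinIO-style S3 endpoint at myhost:9000; the credential values below are placeholders:
` api {
    # route artifact and model uploads to the S3 endpoint instead of the default fileserver
    files_server: "s3://myhost:9000/clearml"
}
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # placeholder credentials for the non-AWS endpoint
                    host: "myhost:9000"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    secure: false
                }
            ]
        }
    }
} `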
` apiserver:
  command:
    - apiserver
  container_name: clearml-apiserver
  image: allegroai/clearml:latest
  restart: unless-stopped
  volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/config:/opt/clearml/config
    - /opt/clearml/data/fileserver:/mnt/fileserver
  depends_on:
    - redis
    - mongo
    - elasticsearch
    - fileserver
    - fileserver_datasets
  environment:
    CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
    CLEARML_...
Yes, I am also talking about agents on different machines. I had two agents on the server machine, which also seem to have been killed. The ones on different machines kept working until 1 or 2 minutes after the clearml-server restarted.
Ah, thanks a lot. So for example the CleanUp Service ( https://github.com/allegroai/clearml/blob/master/examples/services/cleanup/cleanup_service.py ) should have no trouble deleting the artifacts.
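For illustration, a stripped-down sketch of that cleanup idea using the SDK; the project name and status filter are made up here, and the linked service itself works somewhat differently (it uses the APIClient and a time-based filter):
` from clearml import Task

# Fetch tasks from a project and delete them together with their artifacts/models.
for task in Task.get_tasks(project_name="examples"):
    if task.get_status() in ("completed", "failed"):
        # delete_artifacts_and_models=True also removes the files stored on the fileserver/S3
        task.delete(delete_artifacts_and_models=True, raise_on_error=False) `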
Ah, okay, that's weird 🙂 Thank you for answering!
[2021-05-07 10:52:00,282] [9] [WARNING] [elasticsearch] POST
` [status:N/A request:60.058s]
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib64/python3.6/http/client.py", lin...
Is this really working for you guys? I have no clue what's wrong. Seems so unlikely that my code works with artifacts, datasets, but not logging...
AgitatedDove14 Yes, you understood correctly. But Task.create is used by Task.init, something like this, right?
` def init(project_name, task_name):
    if not Task.exists_already(project_name, task_name):
        task = Task.create(...)
    else:
        task = load_existing_task()
    return task `
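For comparison, a hedged sketch of how such a "get or create" pattern could be written with the public SDK, using Task.get_tasks for the existence check (note that the name lookup matches by pattern, so exact-match filtering is left to the caller):
` from clearml import Task

def get_or_create(project_name, task_name):
    # Task.get_tasks matches task_name as a pattern, so keep only exact matches
    existing = [t for t in Task.get_tasks(project_name=project_name, task_name=task_name)
                if t.name == task_name]
    if existing:
        return existing[0]
    # no match: register a new draft task
    return Task.create(project_name=project_name, task_name=task_name) `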
@<1576381444509405184:profile|ManiacalLizard2> Maybe you are using the enterprise version with the vault? I suppose the enterprise version runs differently, but I don't have experience with it.
For the open-source version, each clearml-agent uses its own clearml.conf.
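So, for example, each agent machine can keep its own credentials in its local clearml.conf; the values below are placeholders:
` agent {
    # per-machine git credentials so this agent can clone that user's private repositories
    git_user: "some-user"
    git_pass: "PERSONAL_ACCESS_TOKEN"
} `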