I see. It's showing since the experiment started.
We are running workers on bare metal and clearml-server on Kubernetes. I was trying to find out what the min and max values are for the above metrics.
What do you mean by how much is reserved? Are you running with an agent?
Yeah, exactly. The Scalars tab has those, but I need to add tracking to the alert: if GPU utilization/GPU memory is not in use while the experiment is in progress, then alert. Can I also get GPU usage over a time frame via the API?
SuccessfulKoala55 Thanks. Last one. How do I use this task_urls?
Same error for the tasks.get_all() endpoint.
How do I know what the possible options for status are? Same for the other parameters.
I don't see those in the documentation.
https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all
oh! yeah. That worked out. Thanks a lot.
I found system_tags and all the metrics including CPU, but I can't find any field that mentions the reported GPU scalars or GPU utilization.
Thanks for the reply. If gpu_0_mem_usage is the % of GPU memory in use, what is gpu_0_utilization?
Is gpu_0_utilization also in %, then?
It would be great to have the possible values for the given parameters mentioned here: https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all
Any clue how I can figure those out?
I need to use this image in Kubernetes.
Phew. Makes sense. I am testing it by updating FROM in the Dockerfile.
Fingers crossed.
Still failing, but I know the root cause now.
RUN apk update && apk upgrade --no-cache
RUN /usr/local/bin/python3 -m pip install --upgrade pip
RUN /usr/local/bin/python3 -m pip install -r requirements.txt
AgitatedDove14 I am upgrading pip before this. 😕
😕 I will be using docker_image python:3.9-alpine
I am running a basic Python script. I need clearml in order to use its API.
How can this kind of issue even come from Python, when one endpoint gives a response and the other does not?
I see. I am getting the error as HTML output.
<noscript>Please enable JavaScript to continue using this application.</noscript>
Exactly. I am trying to create an alert for tasks that have GPU/CPU allocated but have not been utilizing it for x period of time.
So, if a task is there, a GPU will be allocated to it. I will need to check whether the task is using it or just sitting idle.
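To make the alert rule above concrete, here is a minimal sketch of the idle check on its own, with no ClearML calls. The task dictionaries, the `gpu_utilization` key, the 5% threshold, and the minimum sample count are all hypothetical; the samples would have to come from whatever GPU scalars you manage to pull for each task.

```python
from typing import Dict, List, Sequence


def is_gpu_idle(utilization_pct: Sequence[float],
                threshold_pct: float = 5.0,
                min_samples: int = 3) -> bool:
    """Return True when every sample in the window is below the threshold.

    Too few samples counts as "not idle" so a freshly started task
    does not trigger an alert.
    """
    if len(utilization_pct) < min_samples:
        return False
    return all(u < threshold_pct for u in utilization_pct)


def tasks_to_alert(tasks: List[Dict]) -> List[str]:
    """Return IDs of tasks that are in progress but whose GPU looks idle.

    Each task dict is assumed (hypothetically) to carry:
      "id", "status", and "gpu_utilization" (recent samples in percent).
    """
    return [
        t["id"]
        for t in tasks
        if t["status"] == "in_progress" and is_gpu_idle(t["gpu_utilization"])
    ]


sample_tasks = [
    {"id": "a", "status": "in_progress", "gpu_utilization": [0.0, 1.0, 2.0]},
    {"id": "b", "status": "in_progress", "gpu_utilization": [0.0, 80.0, 2.0]},
    {"id": "c", "status": "completed", "gpu_utilization": [0.0, 0.0, 0.0]},
]
idle_ids = tasks_to_alert(sample_tasks)
```

Requiring every sample in the window to be low, rather than the average, avoids alerting on tasks that use the GPU in short bursts.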
E.g. "To query tasks that are both Running" --> do you mean status=["in_progress"]? How do I figure out the other possible values I can use with the status parameter?
Another one:
Filter only tasks that started, say, 10 minutes ago. Is there a parameter for that as well?
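Until a server-side filter is confirmed, the time condition can be applied client-side. This is a minimal sketch that reads "started 10 minutes ago" as "started at least 10 minutes ago"; the task dictionaries and the `started` key holding a timezone-aware `datetime` are assumptions, not the documented response shape.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def running_at_least(tasks: List[Dict],
                     min_age: timedelta,
                     now: Optional[datetime] = None) -> List[Dict]:
    """Keep tasks that are in progress and started at least `min_age` ago.

    Each task dict is assumed (hypothetically) to have a "status" string
    and a "started" timezone-aware datetime (or None if not started).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - min_age
    return [
        t for t in tasks
        if t.get("status") == "in_progress"
        and t.get("started") is not None
        and t["started"] <= cutoff
    ]


# Fixed "now" so the example is deterministic.
now = datetime(2022, 10, 20, 12, 0, tzinfo=timezone.utc)
sample = [
    {"id": "old", "status": "in_progress", "started": now - timedelta(minutes=30)},
    {"id": "new", "status": "in_progress", "started": now - timedelta(minutes=2)},
    {"id": "done", "status": "completed", "started": now - timedelta(hours=3)},
]
old_enough = running_at_least(sample, timedelta(minutes=10), now=now)
```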
Thanks for the reply.
So, all requests should be POST, even when I'm only trying to collect data?
For https://clear.ml/docs/latest/docs/references/api/workers#post-workersget_activity_report
when I try this:
`
import requests, json
from requests.auth import HTTPBasicAuth
request_body = {
"from_date": "1665867715",
"interval": "3600",
"to_date": "1666213315"
}
url = web_server+"/workers.get_activity_report"
response = requests.get(url, auth=HTTPBasicAuth(access_key, secret_key)...
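For comparison, here is a minimal sketch of the same call built as a POST with a JSON body, using only the standard library and stopping short of sending it. Several things here are assumptions to check against your setup: that the URL should be the API server address rather than the web UI address (`http://localhost:8008` is a placeholder), that the timestamps and interval should be numbers rather than strings, and that Basic credentials are accepted on this endpoint directly. `build_activity_report_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request


def build_activity_report_request(api_server: str,
                                  access_key: str,
                                  secret_key: str,
                                  from_date: int,
                                  to_date: int,
                                  interval: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for workers.get_activity_report."""
    body = json.dumps({
        "from_date": from_date,   # seconds since epoch (assumed numeric)
        "to_date": to_date,       # seconds since epoch (assumed numeric)
        "interval": interval,     # bucket size in seconds (assumed numeric)
    }).encode("utf-8")
    credentials = f"{access_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=api_server.rstrip("/") + "/workers.get_activity_report",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )


req = build_activity_report_request(
    "http://localhost:8008",      # placeholder API server address
    "ACCESS_KEY", "SECRET_KEY",   # placeholder credentials
    1665867715, 1666213315, 3600,
)
# To actually send it: urllib.request.urlopen(req)
```

An HTML page asking to enable JavaScript in the response usually means the request reached a web front end instead of an API endpoint, which is why the sketch takes the API server address as its own parameter.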
I found a lot of questions in this group's past chat, including some from you, related to the k8s glue with clearml.
Do you mean it recently became part of the enterprise version?
AgitatedDove14
` from clearml.backend_api.session.client import APIClient
# Create an instance of APIClient
client = APIClient()
users = client.users.get_all() `
I get
Traceback (most recent call last):
  File "get_all_users.py", line 13, in <module>
    users = client.users.get_all()
AttributeError: 'APIClient' object has no attribute 'users'
Although, user = Task._get_default_session().send_request("users", "get_all", json={"id": [user_id]})
did the work.
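Once a users.get_all response is in hand, turning it into an ID-to-name lookup is a small step. This is a minimal sketch that works on an already decoded JSON payload; the `data` / `users` nesting and the `id` / `name` keys are assumptions about the response shape, and the sample payload is made up for illustration.

```python
from typing import Dict


def user_names_by_id(payload: Dict) -> Dict[str, str]:
    """Map user ID to user name from a decoded users.get_all payload.

    Assumes (hypothetically) a shape like:
      {"data": {"users": [{"id": "...", "name": "..."}, ...]}}
    Entries without an "id" are skipped.
    """
    users = payload.get("data", {}).get("users", [])
    return {u["id"]: u.get("name", "") for u in users if "id" in u}


# Made-up payload for illustration only.
sample_payload = {
    "data": {
        "users": [
            {"id": "6cfef1d32", "name": "Example User"},
            {"name": "missing id"},
        ]
    }
}
names = user_names_by_id(sample_payload)
```

With the session call above, the payload would typically come from decoding the response body as JSON before passing it in.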
Thanks SuccessfulKoala55
Also, https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all gives me the user ID:
<tasks.Task: {
"id": "xxx",
"name": "Interactive Session",
"user": "6cfef1d32",
"company": "xxxx",
"type": "application",
"status": "in_progress",
.....
}
I can't find an API endpoint to get the user name from a user ID like "user": "6cfef1d32" above.
` # which python
/Users/anuj.tyagi/clearml_api/venv/bin/python
(venv) LMWPRW6F3:clearml_api root# pip freeze | grep clearml
clearml==1.7.2
Traceback (most recent call last):
File "get_all_task.py", line 8, in <module>
print (client.tasks.get_all())
File "/Users/anuj.tyagi/clearml_api/venv/lib/python3.8/site-packages/clearml/backend_api/session/client/client.py", line 422, in get
result=self.session.send(request_cls(*args, **kwargs)),
File "/Users/anuj.tyagi/clearml_api/venv/lib...