We are running the workers on bare metal and clearml-server on Kubernetes. I was trying to find out what the min and max values mean for the metrics above.
What do you mean by how much is reserved ? Are you running with an agent?
CostlyOstrich36 May I know what the request parameters are for getting task URLs with tasks.task_urls()?
😕 I will be using docker_image python:3.9-alpine
I am running a basic Python script. I need clearml in order to use its API.
oh! yeah. That worked out. Thanks a lot.
How do I know what the possible options for status are? Same question for the other parameters.
I don't see those in documentation.
https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all
My goal is to detect when a task does not use its allocated resources (e.g. GPU) for some period of time.
I am still trying to understand clearml api response.
Do you have any clue how I can get it from client.tasks.get_all(status=["in_progress"])?
If a task has a GPU allocated but is not using it, would it also be in in_progress status? I want to collect those tasks.
I see the task runtime info. I guess it shows current utilization rather than allocation, but I'm not sure.
"runtime": {
"progress": "0",...
so I think it depends on whether python:3.9-alpine uses x86,
which I believe it probably does
same error for tasks.get_all() endpoint
Yeah, exactly. The Scalars tab has those, but I need to track them for alerting: if GPU utilization / GPU memory is not in use while the experiment is in progress, raise an alert. Can I also get GPU usage over a time frame via the API?
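A minimal sketch of the alert check described above, assuming the task's last-metrics data has the nested shape quoted elsewhere in this thread (hash keys, then entries with `metric`, `variant`, `value`). The function names and the 1.0 threshold are hypothetical, and `value` is assumed to be the most recently reported reading. It only inspects a dict, so fetching the tasks from the server is left out.

```python
from typing import Any, Dict

Metrics = Dict[str, Dict[str, Dict[str, Any]]]


def gpu_variants(last_metrics: Metrics) -> Dict[str, Dict[str, Any]]:
    """Flatten the nested mapping to {variant_name: entry} for ':monitor:gpu'."""
    found: Dict[str, Dict[str, Any]] = {}
    for variants in last_metrics.values():
        for entry in variants.values():
            if entry.get("metric") == ":monitor:gpu":
                found[entry["variant"]] = entry
    return found


def looks_idle(last_metrics: Metrics, threshold: float = 1.0) -> bool:
    """True when every reported gpu_*_utilization value is below the threshold.

    Returns False when no utilization variant is reported at all, because
    there is nothing to judge in that case.
    """
    utilization = [
        entry["value"]
        for name, entry in gpu_variants(last_metrics).items()
        if name.endswith("_utilization")
    ]
    return bool(utilization) and all(v < threshold for v in utilization)


# Sample copied from the structure quoted in this thread (truncated fields omitted).
sample: Metrics = {
    "5451af93e0bf68a4ab09f654b222ccae": {
        "1b790a3da2e8d6cd939cf271694fe81b": {
            "metric": ":monitor:gpu",
            "variant": "gpu_0_utilization",
            "value": 0.0,
            "min_value": 0.0,
            "max_value": 3.542,
        },
        "409d4e6ad9b69b3224fceeac6e265ddc": {
            "metric": ":monitor:gpu",
            "variant": "gpu_0_mem_used_gb",
            "value": 0.0,
        },
    }
}

print(looks_idle(sample))
```

A single low reading is weak evidence; an alert would probably want several consecutive idle checks before firing.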
I need to use this image in kubernetes
AgitatedDove14 I found that the issue is with pycryptodome 😕
The error started coming from here. Maybe it is a specific version of it. Digging more.
` #13 101.0 note: This error originates from a subprocess, and is likely not a problem with pip.
#13 101.0 ERROR: Failed building wheel for pycryptodome
#13 101.0 Running setup.py clean for pycryptodome
#13 104.9 Building wheel for numpy (pyproject.toml): started
#13 158.5 Building wheel for numpy (pyproject.toml): finished with status 'er...
Still failing, but at least I know the root cause now.
Thanks SuccessfulKoala55
Also, https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all gives me only the user ID:
<tasks.Task: {
"id": "xxx",
"name": "Interactive Session",
"user": "6cfef1d32",
"company": "xxxx",
"type": "application",
"status": "in_progress",
.....
}
I can't find an API endpoint to get the user name from a user ID like "user": "6cfef1d32" above.
SuccessfulKoala55 Yeah, that's possible, but then I don't get why a firewall would block the response of only one endpoint. I tried both workers.get_all() and get_stats(), and both worked.
Can you share the snippet you used for tasks.get_all() ?
` from clearml.backend_api.session.client import APIClient
from time import time

# Create an instance of APIClient
client = APIClient()
tasks = client.tasks.get_all() `
This is what I used.
Doc mentions required request Body parameter type. Do I need to add this...
I see it now.
` "5451af93e0bf68a4ab09f654b222ccae": {
"1b790a3da2e8d6cd939cf271694fe81b": {
"metric": ":monitor:gpu",
"variant": "gpu_0_utilization",
"value": 0.0,
"min_value": 0.0,
"max_value": 3.542
},
"409d4e6ad9b69b3224fceeac6e265ddc": {
"metric": ":monitor:gpu",
"variant": "gpu_0_mem_used_gb",
"value": 0.0,
...
` from clearml.backend_api.session.client import APIClient

# Create an instance of APIClient
client = APIClient()
users = client.users.get_all() `
I get
Traceback (most recent call last):
  File "get_all_users.py", line 13, in <module>
    users = client.users.get_all()
AttributeError: 'APIClient' object has no attribute 'users'
However, ` user = Task._get_default_session().send_request("users", "get_all", json={"id": [user_id]}) `
did the job.
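As a sketch of how the result of that call could be turned into an ID-to-name lookup: the helper below assumes the parsed JSON follows the usual ClearML envelope with a `data` object containing a `users` list, each user having `id` and `name`. That shape, the helper name, and the sample user name are assumptions, not something confirmed in this thread.

```python
from typing import Any, Dict


def user_names(payload: Dict[str, Any]) -> Dict[str, str]:
    """Map user id -> user name from a parsed users.get_all response.

    Assumes the payload looks like {"data": {"users": [{"id": ..., "name": ...}]}}.
    Missing pieces yield an empty mapping instead of raising.
    """
    users = payload.get("data", {}).get("users", [])
    return {u["id"]: u.get("name", "") for u in users}


# Hypothetical payload, shaped like the assumed response envelope.
example_payload = {
    "meta": {"result_code": 200},
    "data": {"users": [{"id": "6cfef1d32", "name": "Jane Doe"}]},
}
print(user_names(example_payload))

# Against a live server (not executed here; assumes send_request returns
# an object with a .json() method, as a requests.Response does):
# from clearml import Task
# response = Task._get_default_session().send_request(
#     "users", "get_all", json={"id": ["6cfef1d32"]}
# )
# print(user_names(response.json()))
```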
I see. I am getting an error in the HTML output:
<noscript>Please enable JavaScript to continue using this application.</noscript>
Nice. Does it read those automatically just like it does from ~/clearml.conf? I don't need to call these environment variables within the Python code?
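For reference, a small pre-flight check along these lines can confirm the variables are present before the script imports clearml. It assumes the variables in question are `CLEARML_API_HOST`, `CLEARML_API_ACCESS_KEY` and `CLEARML_API_SECRET_KEY`; the thread does not name them, so treat that list and the helper name as assumptions.

```python
import os
from typing import List, Mapping, Optional

# Assumed set of variables; adjust to whatever your deployment actually uses.
REQUIRED = ("CLEARML_API_HOST", "CLEARML_API_ACCESS_KEY", "CLEARML_API_SECRET_KEY")


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


# Example with an explicit mapping so the result does not depend on the shell.
print(missing_credentials({"CLEARML_API_HOST": "http://localhost:8008"}))
```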
This worked out.
`
from clearml.backend_api.session.client import APIClient

# Create an instance of APIClient
client = APIClient()
project_list = client.workers.get_all()
print(project_list) `
https://clear.ml/docs/latest/docs/references/api/definitions#taskstask_urls
tasks.task_urls()
The documentation there doesn't mention any request parameters.
` from clearml.backend_api.session.client import APIClient
from clearml import Task

# Create an instance of APIClient
client = APIClient()
tasks = client.tasks.task_urls()
print(tasks)
Traceback (most recent call last):
File "get_all_users.py", line 15, in <module>
tasks = client.tasks.task_urls()
AttributeError: 'Tasks' object has no attr...
E.g. "to query tasks that are Running": do you mean status=["in_progress"]? How do I figure out which other values I can use with the status parameter?
Another question:
I want to filter only tasks that started, say, in the last 10 minutes. Is there a parameter for that too?
I found system_tags and all the metrics, including CPU, but I can't find any field that mentions a reported GPU scalar or GPU utilization.
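Until the server-side filter parameters are clear, one fallback is to filter on the client. The sketch below works on plain dicts with `status` and a timezone-aware `started` datetime; converting the API results into that form is assumed and not shown. The status tuple is my recollection of the task status enum on the API definitions page and should be verified against your server version; the function name and sample records are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List, Optional

# Assumed status values (verify against the API definitions page).
KNOWN_STATUSES = (
    "created", "queued", "in_progress", "stopped", "published",
    "publishing", "closed", "failed", "completed", "unknown",
)


def started_within(
    tasks: Iterable[Dict[str, Any]],
    minutes: int,
    status: str = "in_progress",
    now: Optional[datetime] = None,
) -> List[Dict[str, Any]]:
    """Keep tasks with the given status that started in the last `minutes`."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=minutes)
    return [
        t for t in tasks
        if t.get("status") == status
        and t.get("started") is not None
        and t["started"] >= cutoff
    ]


# Hypothetical records with a fixed "now" so the example is repeatable.
fixed_now = datetime(2022, 10, 20, 12, 0, tzinfo=timezone.utc)
records = [
    {"id": "a", "status": "in_progress", "started": fixed_now - timedelta(minutes=5)},
    {"id": "b", "status": "in_progress", "started": fixed_now - timedelta(minutes=30)},
    {"id": "c", "status": "completed", "started": fixed_now - timedelta(minutes=2)},
    {"id": "d", "status": "in_progress", "started": None},
]
print([t["id"] for t in started_within(records, minutes=10, now=fixed_now)])
```

Client-side filtering fetches more than it needs, so it is only reasonable while the number of running tasks is small.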
I see. Dev tools are useful here for finding the API endpoints used for the data, and
https://github.com/allegroai/clearml/blob/master/clearml/task.py#L987 is what I was looking for.
Thanks
Phew. Makes sense. I am testing it by updating FROM in the Dockerfile.
Fingers crossed.
Worked with Bullseye image. Thanks for the suggestion.
Thanks for the reply.
So, all requests should be POST, even when I'm only trying to collect data?
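The API reference does list its endpoints as POST with a JSON body (the doc anchors read `#post-...`). A minimal sketch of building such a request with the standard library, without sending it, follows. The helper name, the `http://localhost:8008` address and the key placeholders are assumptions, as is sending the dates and interval as numbers rather than strings. Whether basic auth is accepted directly or must first be exchanged for a token may depend on the server, so check that against the API docs.

```python
import base64
import json
from typing import Any, Dict
from urllib.request import Request


def build_post(
    api_server: str,
    endpoint: str,
    body: Dict[str, Any],
    access_key: str,
    secret_key: str,
) -> Request:
    """Build (but do not send) a POST request with a JSON body and basic auth."""
    token = base64.b64encode(f"{access_key}:{secret_key}".encode()).decode()
    return Request(
        url=f"{api_server.rstrip('/')}/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder server address and credentials; replace with your own.
req = build_post(
    "http://localhost:8008",
    "workers.get_activity_report",
    {"from_date": 1665867715, "to_date": 1666213315, "interval": 3600},
    "ACCESS_KEY",
    "SECRET_KEY",
)
print(req.get_method(), req.full_url)

# To actually send it (not executed here):
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(json.load(resp))
```

Note that the request targets the API server address, not the web UI address; an HTML page in the response usually means the web UI answered instead.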
For https://clear.ml/docs/latest/docs/references/api/workers#post-workersget_activity_report
when I am trying this
`
import requests, json
from requests.auth import HTTPBasicAuth
request_body = {
"from_date": "1665867715",
"interval": "3600",
"to_date": "1666213315"
}
url = web_server+"/workers.get_activity_report"
response = requests.get(url, auth=HTTPBasicAuth(access_key, secret_key)...