Hi NaughtyHorse47 🙂
I guess it really depends on what you want to do.
Specifically for you, if you want the duration of a task, you can use tasks.get_by_id and look in the response for that info.
Thanks CostlyOstrich36, maybe my question was a bit too broad. So imagine I have a set of tasks that share the same name, for instance task_name = "Prepocess data".
Imagine this task runs in every ML project I have. As an MLOps engineer, I would like to have some insights, such as how long "Prepocess data" takes to run per project.
I can use the API, fetch the tasks, loop over them, and get when each task finished (with Task.get_last_update), but how do I get when it started? Is there a more elegant way to do that?
Thanks a lot in advance :)
I think that data is all returned by tasks.get_by_id, or you can get all tasks from a certain project with a certain name and then dig into that data.
I usually use these two syntaxes to get the lists of tasks:
Option 1:
    from clearml import Task
    custom_task_filter = {...}
    tasks_list = Task.get_tasks(task_filter=custom_task_filter, task_name=name_custom_tasks)
Option 2:
    from clearml.backend_api.session.client import APIClient
    client = APIClient()
    tasks_list_via_api = client.tasks.get_all(...)
In both cases, when I take an element from the list, I cannot find when the task started. Where is that info stored?
For example, in the response of tasks.get_by_id you get the data in data.tasks.0.started and data.tasks.0.completed.
I hope this helps 🙂
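To make the started/completed idea concrete, here is a minimal sketch of turning those two fields into a duration. It assumes you have already pulled a task entry out of the response as a plain dict, and that the timestamps are either datetime objects or ISO-8601 strings; the helper name task_duration_seconds and the example values are hypothetical, only the started and completed field names come from the thread.

```python
from datetime import datetime


def task_duration_seconds(task):
    """Return completed - started in seconds, or None if either field is missing.

    `task` is assumed to be a plain dict holding 'started' and 'completed'
    (hypothetical shape, e.g. one entry extracted from a tasks.get_by_id response).
    """
    started = task.get("started")
    completed = task.get("completed")
    if started is None or completed is None:
        # Task never started, or is still running
        return None
    # Accept ISO-8601 strings as well as datetime objects
    if isinstance(started, str):
        started = datetime.fromisoformat(started)
    if isinstance(completed, str):
        completed = datetime.fromisoformat(completed)
    return (completed - started).total_seconds()


# Made-up example entry, for illustration only
example = {
    "id": "abc123",
    "started": "2021-08-10T10:00:00",
    "completed": "2021-08-10T10:12:30",
}
print(task_duration_seconds(example))  # 750.0
```

A task that is still running has no completed value, so the helper returns None for it rather than guessing.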
If you are using client.tasks.get_all(...), it should be under the started field.
Specifically, you can probably also do:
    queried_tasks = Task.query_tasks(additional_return_fields=['started'])
    print(queried_tasks[0]['id'], queried_tasks[0]['started'])
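For the "per project" part of the question, a rough sketch of the aggregation step could look like the following. It assumes the query results are a list of dicts that also carry project and completed values as datetime objects (for instance by requesting them as additional return fields, which is an assumption here and not something confirmed in this thread); mean_duration_per_project and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_duration_per_project(rows):
    """Group task rows by project and average (completed - started) in seconds.

    Each row is assumed to be a dict with 'project', 'started' and 'completed'
    keys (hypothetical shape); rows missing a timestamp are skipped.
    """
    durations = defaultdict(list)
    for row in rows:
        started = row.get("started")
        completed = row.get("completed")
        if started is None or completed is None:
            continue  # not started yet, or still running
        durations[row["project"]].append((completed - started).total_seconds())
    return {project: mean(values) for project, values in durations.items()}


# Made-up rows, for illustration only
rows = [
    {"id": "t1", "project": "proj_a",
     "started": datetime(2021, 8, 10, 10, 0), "completed": datetime(2021, 8, 10, 10, 10)},
    {"id": "t2", "project": "proj_a",
     "started": datetime(2021, 8, 10, 11, 0), "completed": datetime(2021, 8, 10, 11, 20)},
    {"id": "t3", "project": "proj_b",
     "started": datetime(2021, 8, 10, 12, 0), "completed": None},
]
print(mean_duration_per_project(rows))  # {'proj_a': 900.0}
```

The project value returned by the backend may be an ID rather than a readable name, so a lookup step may be needed before reporting.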
Thanks a lot AgitatedDove14 !!!
I have a question: is there a way I can filter tasks based on the started time?
I guess I cannot do it directly from the task_filter (via Task.get_all or via POST task.get_all()) ... so I can simply use your suggestion to get that!
I am seeing from your code that clearml.task.Task.started() does not return the datetime. Could I get the start time from some hidden attribute?
    def started(self, ignore_errors=True, force=False):
        # type: (bool, bool) -> ()
        """ The signal that this Task started. """
        return self.send(tasks.StartedRequest(self.id, force=force), ignore_errors=ignore_errors)
FreshKangaroo33 you can:
    from datetime import datetime
    from time import time
    Task.query_tasks(..., task_filter=dict(started=['<{}'.format(datetime.utcfromtimestamp(time()))]))
I think this should work
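As a small sketch of how that filter value is put together, the helper below builds the same kind of dict for an arbitrary cutoff. The name started_before is hypothetical, and the '<' prefix plus the plain str() rendering of the datetime simply mirror the snippet above; whether a given server or package version accepts it is not verified here.

```python
from datetime import datetime


def started_before(cutoff):
    """Build a task_filter dict selecting tasks whose 'started' is before `cutoff`.

    Mirrors the '<{timestamp}' form used in the thread; `cutoff` is a naive
    datetime assumed to be expressed in UTC.
    """
    return dict(started=["<{}".format(cutoff)])


# Fixed cutoff so the output is reproducible
task_filter = started_before(datetime(2021, 8, 10, 12, 0, 0))
print(task_filter)  # {'started': ['<2021-08-10 12:00:00']}
```

The resulting dict would then be passed as task_filter=... in the query call shown above.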
Is it something only in the most recent version of clearml? I am using clearml==1.0.5 and it does not seem to work ... so I guess this could be one reason to start thinking about upgrading ....
Wait, you mean the clearml-server? (There is no reason not to upgrade the Python package.)
Nothing against clearml, it is more a general practice of mine: sometimes stable is better than newer!
I remember we had one or two issues when we first upgraded (when you released 1.0.0, and all the ..1, ..2, ..3) ... anyway, I will see how to do that!
You can also just create a venv and run the tests there (with the latest Python package)?
Yeah, I will do that!! Anyway, as usual, thanks a lot Martin!! Maybe it would be nice in a future release to add the duration attribute to the API response, as it is shown in the UI 🙂