Answered
Hello People

Hello people 🙂

I want to use the ClearML client API to analyze the metadata of my experiments. For instance, I would like to get the duration of some tasks I filter. What do you think is the most "clearml-ic" way of doing that?

  
  
Posted 2 years ago

Answers 16


you can also just create a venv and run the tests there (with the latest python package) ?

  
  
Posted 2 years ago

so I guess this could be one reason to start thinking about upgrading ....

Wait you mean the clearml-server ? (there is no reason not to upgrade the python package)

  
  
Posted 2 years ago

FreshKangaroo33 you can:
from datetime import datetime
from time import time
from clearml import Task
Task.query_tasks(..., task_filter=dict(started=['<{}'.format(datetime.utcfromtimestamp(time())), ]))
I think this should work
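A minimal sketch of building that filter for an arbitrary cutoff, following the '<timestamp' convention from the snippet above. The helper name started_before_filter is hypothetical, and whether the server accepts this exact timestamp format is an assumption, not something confirmed in this thread.

```python
from datetime import datetime


def started_before_filter(cutoff):
    """Build a task_filter dict selecting tasks that started before `cutoff`.

    Hypothetical helper: it only formats the filter and does not contact
    the server. The '<' prefix follows the snippet quoted in this thread.
    """
    return dict(started=['<{}'.format(cutoff)])


# Example cutoff (made-up date, for illustration only).
task_filter = started_before_filter(datetime(2021, 5, 1, 12, 0, 0))
print(task_filter)
```

The resulting dict could then be passed as task_filter to Task.query_tasks or Task.get_tasks.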

  
  
Posted 2 years ago

Hmm not sure, try the latest anyhow 🙂

  
  
Posted 2 years ago

Thanks a lot! I will give it a try!

  
  
Posted 2 years ago

Thanks CostlyOstrich36, maybe my question was a bit too broad. So imagine I have a set of tasks that share the same name, for instance
task_name = "Prepocess data"

Imagine this task runs in every ML project I have. As an MLOps engineer, I would like some insights, such as how long "Prepocess data" takes to run per project.

I can use the API, fetch the tasks, loop over them, and get when each task finished (with Task.get_last_update), but how do I get when it started? Is there a more elegant way to do that?

Thanks a lot in advance :)

  
  
Posted 2 years ago

Hi NaughtyHorse47 🙂
I guess it really depends on what you want to do.
Specifically, if you want the duration of a task, you can use tasks.get_by_id and look in the response for that info.

  
  
Posted 2 years ago

Is it something only in the most recent version of clearml? I am using clearml==1.0.5 and it does not seem to work ... so I guess this could be one reason to start thinking about upgrading ....

  
  
Posted 2 years ago

Nothing against ClearML, it is more a general practice of mine: stable is sometimes better than newer!

I remember we had one or two issues when we first upgraded (when you released 1.0.0 and all the ..1, ..2, ..3) ... anyway, I will see how to do that!

  
  
Posted 2 years ago

I think that data is all returned by tasks.get_by_id, or you can get all tasks from a certain project with a certain name and then dig into that data.

  
  
Posted 2 years ago

I usually use these two syntaxes to get the list of tasks:

option 1
from clearml import Task
custom_task_filter = {...}
tasks_list = Task.get_tasks(
    task_filter=custom_task_filter,
    task_name=name_custom_tasks
)
option 2
from clearml.backend_api.session.client import APIClient
client = APIClient()
tasks_list_via_api = client.tasks.get_all(...)
In both cases, when I take an element from the list, I cannot see when the task started. Where is that info stored?

  
  
Posted 2 years ago

I am seeing from your code that clearml.task.Task.started() does not return the datetime. Could I get the start time from some hidden attribute?

def started(self, ignore_errors=True, force=False):
    # type: (bool, bool) -> ()
    """ The signal that this Task started. """
    return self.send(tasks.StartedRequest(self.id, force=force), ignore_errors=ignore_errors)

  
  
Posted 2 years ago

For example, in the response of tasks.get_by_id you get the data in data.tasks.0.started
and
data.tasks.0.completed
I hope this helps 🙂
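As a rough sketch of turning those two fields into a duration: the helper names below are hypothetical, and the assumption that started and completed arrive as ISO-8601 strings (or as datetime objects) is mine, not something confirmed in this thread. The sample dict is made-up data, not a real API response.

```python
from datetime import datetime


def parse_timestamp(value):
    """Return a datetime from either a datetime or an ISO-8601 string."""
    if isinstance(value, datetime):
        return value
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_seconds(task_data):
    """Seconds between 'started' and 'completed', or None if either is missing."""
    started = task_data.get("started")
    completed = task_data.get("completed")
    if not started or not completed:
        return None
    return (parse_timestamp(completed) - parse_timestamp(started)).total_seconds()


# Hypothetical task record with only the two fields of interest.
sample = {"started": "2021-05-01T12:00:00", "completed": "2021-05-01T12:03:30"}
print(duration_seconds(sample))
```

A task that is still running has no completed value, so the helper returns None rather than raising.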

  
  
Posted 2 years ago

In both cases, when I take an element from the list, I cannot see when the task started. Where is that info stored?

If you are using client.tasks.get_all(...), it should be under the started field.
Specifically, you can probably also do:
queried_tasks = Task.query_tasks(additional_return_fields=['started'])
print(queried_tasks[0]['id'], queried_tasks[0]['started'])
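For the per-project question raised earlier in the thread, a sketch of the aggregation step, assuming each row is a dict with project, started and completed keys and that the two timestamps are datetime objects. Requesting those extra fields through additional_return_fields is an assumption, the function name is hypothetical, and the rows below are made-up sample data.

```python
from collections import defaultdict
from datetime import datetime


def mean_duration_by_project(rows):
    """Average run time in seconds per project, skipping unfinished tasks."""
    durations = defaultdict(list)
    for row in rows:
        started, completed = row.get("started"), row.get("completed")
        if started is None or completed is None:
            continue  # task never started or is still running
        durations[row["project"]].append((completed - started).total_seconds())
    return {project: sum(v) / len(v) for project, v in durations.items()}


# Made-up rows standing in for query results.
rows = [
    {"project": "proj-a", "started": datetime(2021, 5, 1, 12, 0, 0), "completed": datetime(2021, 5, 1, 12, 1, 0)},
    {"project": "proj-a", "started": datetime(2021, 5, 2, 9, 0, 0), "completed": datetime(2021, 5, 2, 9, 2, 0)},
    {"project": "proj-b", "started": datetime(2021, 5, 3, 8, 0, 0), "completed": datetime(2021, 5, 3, 8, 0, 30)},
    {"project": "proj-b", "started": datetime(2021, 5, 4, 8, 0, 0), "completed": None},
]
print(mean_duration_by_project(rows))
```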

  
  
Posted 2 years ago

Thanks a lot AgitatedDove14 !!!

I have a question: is there a way to filter tasks based on the started time?

I guess I cannot do it directly from the task_filter (via Task.get_all or via POST task.get_all()) ... so I can simply use your suggestion to get that!

  
  
Posted 2 years ago

Yeah, I will do that! Anyway, as usual, thanks a lot Martin! Maybe it would be nice in a future release to add the duration attribute to the API response, as shown in the UI 🙂

  
  
Posted 2 years ago