Hi StickyMonkey98
I'm (again) having trouble with the lack of documentation regarding Task.get_tasks(task_filter={STUFF}).
Yes we really have to add documentation there... Let me add that to the todo list
How do I filter tasks by time started? It seems there's a "started" property, and the web UI uses "started" as a keyword in the URL query, but task_filter results in an error when I try that... Is there some other filter keyword for filtering by start time?
Last 10 started Tasks:
task_filter={'order_by': ['-started'], 'page_size': 10, 'page': 0}
How do I more generally build filters, without bothering the precious people of this community, considering I can't find proper documentation? I'm kinda hoping for a pointer to the docs that I just missed, but other answers might be helpful as well of course...
This is a tough one ... The documentation lacks these internal queries 😞
Basically, the intuition is to look at the actual RestAPI capabilities.
The task_filter argument passes its keys directly into the get_all request (hence the original cryptic reference to the RestAPI object):
https://clear.ml/docs/latest/docs/references/api/endpoints#post-tasksget_all
BTW: StickyMonkey98 if you feel like writing a few examples I think it will be easy to push into the docs, so that at least we improve iteratively...
Looks good, and of course that's the kind of thing I tried very early on. But I'm still getting the same "unsupported keyword arguments" error, despite the allow-extra-fields thing...
At the moment I'm querying by paging through the tasks as you recommended, and then filtering with standard Python list-comprehension filters... which is less than ideal.
At least let's do that better:
Use Task._query_tasks:
Task._query_tasks(order_by=['-started'], page_size=10, page=0, only_fields=['id', 'started'])
You will get "lighter" objects returned, then you can filter them with code (but the request will be a lot faster)
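A minimal sketch of the "filter them with code" step, assuming each returned object exposes a `started` attribute holding a `datetime` (an assumption; check what your SDK version actually returns). `started_between` is a hypothetical helper name, and `SimpleNamespace` stand-ins replace real query results so the snippet runs without a server:

```python
from datetime import datetime
from types import SimpleNamespace


def started_between(tasks, start, end):
    """Keep tasks whose `started` timestamp falls within [start, end]."""
    return [
        t for t in tasks
        if getattr(t, "started", None) is not None and start <= t.started <= end
    ]


# Stand-ins for the "lighter" objects; real code would use the query result.
tasks = [
    SimpleNamespace(id="a", started=datetime(2021, 8, 15, 12, 0)),
    SimpleNamespace(id="b", started=datetime(2021, 9, 5, 9, 30)),
    SimpleNamespace(id="c", started=None),  # task that never started
]

in_august = started_between(tasks, datetime(2021, 8, 1), datetime(2021, 9, 1))
print([t.id for t in in_august])
```

If the real timestamps are timezone-aware, the `start` and `end` bounds need to be timezone-aware too, otherwise Python refuses the comparison.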
SuccessfulKoala55 any suggestions on improving that?
Well, for instance, it would be nice to mimic the url-query style of -
started:2021-09-03T07:00:00%2B2021-09-10T05:32:00
Try something like:
task_filter={'order_by': ['-started'], 'started': ['2021-08-01T00:00:00', '2021-09-01T00:00:00'], '_allow_extra_fields_': True}
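A small sketch of building that same dictionary from two `datetime` objects, so the timestamp strings do not have to be typed by hand. `started_range_filter` is a hypothetical helper, not part of the SDK; the keys and the two-element `started` list are copied from the suggestion above, and whether the server accepts them depends on the SDK and server version:

```python
from datetime import datetime


def started_range_filter(start, end, newest_first=True):
    """Build a task_filter dict selecting tasks started between start and end."""
    return {
        "order_by": ["-started" if newest_first else "started"],
        # ISO 8601 strings without fractional seconds, e.g. '2021-08-01T00:00:00'
        "started": [
            start.isoformat(timespec="seconds"),
            end.isoformat(timespec="seconds"),
        ],
        "_allow_extra_fields_": True,
    }


task_filter = started_range_filter(datetime(2021, 8, 1), datetime(2021, 9, 1))
print(task_filter)
```

The resulting dictionary would then be passed as `Task.get_tasks(task_filter=task_filter)`.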
What I'm trying to do is filter between two datetimes... Is that possible?
Could you expand?
Ok. Thanks.
Until I get to that, is there an example somewhere regarding using the "all" option?
My (temporary) work-around is using multiple filter queries, but I don't think I'm getting the filters to work as "and" instead of "or"...
So something like this -
tag_filtered_training_tasks = Task.get_tasks(task_filter={'tags': [tag1, tag2, tag3], 'type': ['training']})
should result in very very few tasks, but that's not what seems to be happening.
(I've tried different guesses regarding the usage of "all", but only managed to run into a variety of errors...)
I think this _allow_extra_fields_ was just recently introduced
What version of ClearML SDK are you using?
Huh. Should have guessed 😉
Thanks! Hopefully I'll update to the latest version and get rid of the many lines of work-around code...
Making it into a nicer interface is on our TODO list 🙂
Thanks!
If I get something sensible going I'll share for documentation of course.
Yeah, this support is still not very well documented (and we're working on replacing it with a better interface), but if you want an AND/ALL relation, you should do:
tag_filtered_training_tasks = Task.get_tasks(task_filter={'tags': ["__$all", tag1, tag2, tag3], 'type': ['training']})
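To avoid repeating the marker by hand, the filter can be assembled with a small function. This is only a sketch: `all_tags_filter` is a hypothetical helper name, and the `"__$all"` marker and the `type` key are taken as-is from the reply above:

```python
def all_tags_filter(tags, task_types=("training",)):
    """Build a task_filter dict requiring ALL given tags (AND, not OR)."""
    return {
        # "__$all" as the first list element switches the tag match to AND
        "tags": ["__$all", *tags],
        "type": list(task_types),
    }


task_filter = all_tags_filter(["tag1", "tag2", "tag3"])
print(task_filter)
```

As before, the dictionary would be passed as `Task.get_tasks(task_filter=task_filter)`.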