Some examples of the mess it creates (also posted in the main channel):
- A single project now has multiple subprojects.
- The subprojects have the `.datasets` hidden subproject (with really frustrating project names).
- The subprojects are empty.
- To access the original project, I have to go twice into the same project because of these hidden projects.
- Because of these hidden subprojects, I cannot delete a project that has 0 experiments.
SmugDolphin23 we've been working with this for 2 weeks now, and it creates a lot of junk in our UI. Is there any way to have better control over this?
Ah right, I missed that in the codebase. It just adds the `.datasets` convention to the dataset task.
Can you please provide a minimal example that may make this happen?
UnevenDolphin73 The task shouldn't disappear when using `use_current_task=False`. There might be something else that makes it disappear.
It also happens when `use_current_task=False` though. So the current best approach would be to not combine the task and the dataset?
UnevenDolphin73 Yes, it makes sense. At the moment, this is not possible. When using `use_current_task=True`, the task gets attached to the dataset and moved under `dataset_project/.datasets/dataset_name`. Maybe we could make the task not disappear from its original project in the near future.
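To make the layout described in that message concrete, here is a minimal sketch of where the task would end up. It only mirrors the `dataset_project/.datasets/dataset_name` pattern quoted above; `hidden_dataset_project` is a hypothetical helper written for illustration, not a ClearML API, and the real server-side naming may differ.

```python
def hidden_dataset_project(dataset_project: str, dataset_name: str) -> str:
    """Return the hidden project path described in the thread.

    Assumption: the task is moved under
    <dataset_project>/.datasets/<dataset_name>, as stated in the message above.
    This is a hypothetical helper, not part of the clearml package.
    """
    # Strip a trailing slash so we do not produce "project//.datasets/..."
    return f"{dataset_project.rstrip('/')}/.datasets/{dataset_name}"


# With the names used in the minimal example from this thread:
print(hidden_dataset_project("project", "name_unique"))
```

So with `project_name="project"` and `task_name="name_unique"`, the task would be expected under a hidden `.datasets` subproject rather than directly in `project`, which would match the empty-looking project reported here.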
I'm not too worried about the dataset appearing (or not) in the `Datasets` tab. I would like it (the original task) to not disappear from the original project I assigned it to.
I don't think the version makes the task disappear. You should still see the task in the `Datasets` section. Maybe there is something you do with that task/dataset that makes it disappear (even tho it shouldn't)?
Yes, that one shows up. I forgot to mention we also set the version explicitly, but that just creates a duplicate dataset under `Datasets`, and anyway our main `Task` is now hidden from the original project.
So project `project` exists, but it is empty.
Can you see your task if you run this minimal example UnevenDolphin73 ?
```python
from clearml import Task, Dataset

task = Task.init(task_name="name_unique", project_name="project")
d = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
d.upload()
d.finalize()
```
And this is of course strictly with the update to 1.6.3 (or newer) that should support API 2.20
No task, no dataset, just an empty container with no reference to the task it's attached to.
It seems to me that it should not move the task if `use_current_task=True`?
Yes and no SmugDolphin23
The project is listed, but there is no content and it hides my main task that it is attached to.
UnevenDolphin73 can't you find your task/dataset under the `Datasets` tab?
That is, we have something like:
```python
from clearml import Task, Dataset

task = Task.init(...)
ds = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
# upload files
ds.upload(show_progress=True)
ds.finalize()
# do stuff with task and dataset
task.close()
```
But because the dataset is linked to the task, the task is then moved and effectively becomes invisible 😕
Any thoughts AgitatedDove14 SuccessfulKoala55 ?