Just to make sure we're on the same page, are you referring to the machine statistics, or do ALL scalars fail to show up?
I see. Can you provide a simple standalone code snippet that reproduces this behaviour for you?
Hi @<1775332375794814976:profile|WhimsicalChimpanzee6> , the webUI uses the API under the hood. You can trigger a pipeline via the webUI and see which API calls are made in the browser's developer tools (F12)
Hi @<1798162812862730240:profile|PreciousCentipede43> , can you add logs from the apiserver pod?
You can restore these tasks by copying or moving them from the task__trash collection into the task collection, but the events for these tasks cannot be restored. As for the user who deleted them, unfortunately ClearML does not record this info in Mongo, and without logging to ES there is no place to retrieve it (I can suggest using Kibana to monitor ES). You can try to inspect the Mongo collection url_to_delete. It contains all the links from the deleted tasks that should be removed from the fileserver. If you se...
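To illustrate the copy step, here is a minimal sketch, not an official ClearML procedure. It assumes the collections are reachable through pymongo and that the database is named `backend` (an assumption, check your deployment). The function name `restore_trashed_tasks` is hypothetical. Please back up Mongo before trying anything like this.

```python
def restore_trashed_tasks(db, task_ids):
    """Copy documents from task__trash into task, skipping ids that already exist.

    `db` is expected to behave like a pymongo Database (db["collection"]).
    Returns the list of ids that were copied. Task events are NOT restored.
    """
    trash = db["task__trash"]
    tasks = db["task"]
    restored = []
    for doc in trash.find({"_id": {"$in": list(task_ids)}}):
        # Do not overwrite a task that is already present in the live collection
        if tasks.count_documents({"_id": doc["_id"]}, limit=1) == 0:
            tasks.insert_one(doc)
            restored.append(doc["_id"])
    return restored


# Hypothetical usage (connection URL and database name are assumptions):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["backend"]
#   print(restore_trashed_tasks(db, ["<task id>"]))
```

The sketch copies rather than moves, so the trashed documents stay in place until you have verified the result in the UI.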
Hi LackadaisicalHedgehong78 . It seems that someone/something sent a command to delete a bunch of tasks. Do you have backups?
Hi @<1675675722045198336:profile|AmusedButterfly47> , what is your code doing? Do you have a snippet that reproduces this?
Hi @<1689446563463565312:profile|SmallTurkey79> , I'd suggest opening a github feature request 🙂
Hi @<1543766544847212544:profile|SorePelican79> , I don't think you can track the data inside the dataset. Maybe @<1523701087100473344:profile|SuccessfulKoala55> might have an idea
Hi @<1543766544847212544:profile|SorePelican79> , ClearML can certainly do that. For this you have the Datasets feature.
This will allow you to version and track your data super easily 🙂
Hi StraightParrot3 ,
I'm not sure if thumbnails are supported inside tables. AgitatedDove14 , what do you think?
I guess that's a good point, but it's really only applicable if your training is CPU intensive. If your training is GPU intensive, I guess most of the load goes on the GPU, so running on a VM (EC2 instances for example) shouldn't make much of a difference, but this is worth testing.
I found this article talking about performance
https://blog.equinix.com/blog/2022/01/04/3-reasons-why-you-should-consider-running-containers-on-bare-metal/
But it doesn't really say what the difference in performance is...
Hi @<1564060257435521024:profile|PerfectShrimp1> , this is currently not supported. Maybe open a GitHub feature request for better filtering
Hi @<1669152726245707776:profile|ManiacalParrot65> , you can find the full documentation here - None
The functionality is basically the same as the GCP/AWS ones, but since it is only in the Scale/Enterprise plan I don't think there is any documentation externally
Hi @<1714451218161471488:profile|ClumsyChimpanzee54> , the Azure autoscaler is available only in the Scale/Enterprise plan. It functions the same as the GCP/AWS autoscalers. Basically, it scales from 0 to as many machines as configured and then spins them all down automatically once the workload is over, like you described
ProudElephant77 , can you please add a code snippet of what you did?
It isn't a bug, you have to add the previews manually through reporting. For example:
ds = Dataset.create(...)
ds.add_files(...)
ds.get_logger().report_media(...)
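A slightly fuller sketch of the same idea, in case it helps. The helper names (`pick_preview_files`, `create_dataset_with_previews`), the "preview" title, and the choice of image extensions are assumptions for illustration. Only `Dataset.create`, `add_files`, `get_logger`, `report_media`, `upload` and `finalize` are actual clearml calls.

```python
from pathlib import Path

# Assumption: only these image types are worth showing as previews
PREVIEW_SUFFIXES = {".png", ".jpg", ".jpeg"}


def pick_preview_files(folder, limit=5):
    """Return up to `limit` image files found under `folder`, sorted by path."""
    files = sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in PREVIEW_SUFFIXES
    )
    return files[:limit]


def create_dataset_with_previews(folder, name, project):
    """Create a dataset from `folder` and report a few images as previews."""
    # Imported here so the helper above can be used without clearml installed
    from clearml import Dataset

    ds = Dataset.create(dataset_name=name, dataset_project=project)
    ds.add_files(path=str(folder))
    logger = ds.get_logger()
    for i, image in enumerate(pick_preview_files(folder)):
        # Each reported media item is attached to the dataset's task
        logger.report_media(
            title="preview", series=image.name, iteration=i, local_path=str(image)
        )
    ds.upload()
    ds.finalize()
    return ds.id
```

Calling `create_dataset_with_previews("data/images", "my_dataset", "my_project")` needs a configured ClearML server; the folder, dataset and project names here are placeholders.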
How about when you view it in the datasets view? Also, what version of the clearml package do you have?
At 1 call per second for 12 hours you'll get to numbers close to that. I think you could try increasing the flush threshold - None
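As a quick sanity check of that estimate, the arithmetic looks like this (the rate and duration are the ones mentioned above):

```python
calls_per_second = 1
hours = 12

# 60 seconds per minute, 60 minutes per hour
total_calls = calls_per_second * 60 * 60 * hours
print(total_calls)  # 43200
```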
Might need to refresh page after opening dev tools 🙂
Hi @<1835851148938973184:profile|BattySwan0> , are you hosting your own server or are you using app.clear.ml?
Hi @<1523704157695905792:profile|VivaciousBadger56> , you can configure Task.init(..., output_uri=True) and this will save the models to the clearml file server
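A minimal sketch of what that could look like. The helper names `task_init_kwargs` and `start_task` are hypothetical, and the project/task names in the usage line are placeholders; `Task.init` and its `output_uri` argument are the actual clearml API.

```python
def task_init_kwargs(project_name, task_name, output_uri=True):
    """Build the keyword arguments passed to Task.init.

    output_uri=True uploads models to the default ClearML file server;
    a storage URI string (for example an S3 bucket path) can be passed instead.
    """
    return {
        "project_name": project_name,
        "task_name": task_name,
        "output_uri": output_uri,
    }


def start_task(project_name, task_name):
    """Start a task whose saved models are uploaded to the file server."""
    # Imported here so the helper above works without clearml installed
    from clearml import Task

    return Task.init(**task_init_kwargs(project_name, task_name))
```

With a configured server, `start_task("examples", "upload models")` would then upload models saved by the supported frameworks instead of only recording their local paths.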
Hi @<1594863230964994048:profile|DangerousBee35> , I don't think there is such a mechanism currently. What would the expected/optimal behaviour be in your use case?
Yes. But the services queue doesn't need a GPU, just a simple CPU machine to handle the controllers, which don't take many resources (unless you did something crazy inside the controller like heavy computation)
MuddySquid7 , I couldn't reproduce case 4.
In all cases it didn't detect sklearn.
Did you put anything inside __init__.py?
Can you please zip up the folder from scenario 4 and post it here?
Is there a way to support changing the server address, so that the pictures are loaded from the new server?
Do the links point to a bucket or the fileserver?
Also, I would suggest trying pipelines from decorators, I think it would be much smoother for you
Hi @<1724960468822396928:profile|CumbersomeSealion22> , can you provide a log of such a run?