Are the cloned tasks running? Can you add logs from the HPO and one of the child tasks?
Hi @<1768447000723853312:profile|RipeSeaanemone60> , can you please provide the full log? Is it the pipeline controller that is getting stuck or some step?
Hi GiganticMole91 ,
I see that the storage settings are also available through environment variables, but I'm worried that the environment variables have already been parsed at that time.
I'm not sure I understand. Can you elaborate? How do you run it remotely? Do you raise an instance each time or are your instances persistent?
Hi @<1594863230964994048:profile|DangerousBee35> , do you have some stand-alone code snippet that reproduces this behaviour?
Why does the figure change so drastically? And how can I solve it?
What are you referring to specifically? The data plots seem to be identical.
Sidenote: there seems to be a bug in the plot viewer, as the axes are a bit chaotic.
Do you mean the x/y intersection?
The problem was that the plot I created myself
How was the plot created? Can you give me a small snippet to try and play around with?
TartSeagull57 , what framework are you on? What version of ClearML are you using?
AppetizingMouse58 , might have some input here 🙂
@<1597762318140182528:profile|EnchantingPenguin77> , are you sure you added the correct log? I don't see any errors related to cuda
Hi @<1597762318140182528:profile|EnchantingPenguin77> , I don't see any errors related to CUDA in the log
Hi @<1819543688414498816:profile|ScatteredOctopus61> , what are your and your colleague's user IDs?
FreshKangaroo33 , I'm sorry for the delay. It looks like it will require a feature request. Maybe open a GitHub issue to track it? 🙂
Looks like you're trying to fetch something and then do something with it, but the fetch returned nothing, which is why there's a TypeError regarding a NoneType. Do you know where in the code the traceback occurs?
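A minimal sketch of the pattern that usually produces this kind of error: a lookup returns `None`, and the next line indexes into the result. The names here (`registry`, `fetch_config`, `learning_rate`) are hypothetical and are not taken from the code in question.

```python
from typing import Optional

# Hypothetical stand-in for whatever the real code fetches.
registry = {"model_a": {"lr": 0.01}}


def fetch_config(name: str) -> Optional[dict]:
    """Return the stored config, or None when the name is unknown."""
    return registry.get(name)


def learning_rate(name: str) -> float:
    """Read a value from the fetched config, checking for None first."""
    config = fetch_config(name)
    if config is None:
        # Fail early with a clear message instead of a TypeError further down.
        raise LookupError(f"nothing found for {name!r}")
    return config["lr"]
```

Without the `None` check, `fetch_config("missing")["lr"]` raises a TypeError about a NoneType, so the line in the traceback is usually one step after the fetch that actually failed.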
Hi @<1639799308809146368:profile|TritePigeon86> , what is the use case for passing multiple callbacks? Why not simply put it all in the same function?
I see. I don't think it's supported but I think it would be a great idea for a feature. Maybe open a GitHub feature request?
Hi @<1670964687132430336:profile|SpicyFrog56> , can you please add the full log?
Hi @<1649221402894536704:profile|AdventurousBee56> , I'm not sure I understand. Can you add the full log and explain step by step what's happening?
I'll try soon 🙂
Hi @<1655744373268156416:profile|StickyShrimp60> , I think it would be good to open a GitHub issue if there isn't one 🙂
Hi @<1655744373268156416:profile|StickyShrimp60> , is it possible you're using different ClearML SDK versions?
Hi @<1523701295830011904:profile|CluelessFlamingo93> , I would suggest leaving your details here:
None
Hi @<1547028131527790592:profile|PleasantOtter67> , nothing out of the box. You can, however, quite easily extract all that information and write it to a CSV programmatically.
I think the bigger question is how would you break it down? Each experiment has several nested properties.
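One possible way to break it down, as a rough sketch: flatten each experiment's nested properties into slash-separated column names and write one row per experiment. The sample dictionaries in the usage note are made up for illustration, and how you pull the experiments out of ClearML is left out here.

```python
import csv
import io


def flatten(nested, prefix=""):
    """Turn {"a": {"b": 1}} into {"a/b": 1}, recursing through nested dicts."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def to_csv(experiments):
    """Return CSV text with one row per experiment and one column per flattened key."""
    rows = [flatten(experiment) for experiment in experiments]
    # Union of all column names, kept in first-seen order.
    columns = list(dict.fromkeys(col for row in rows for col in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `to_csv([{"name": "exp1", "params": {"lr": 0.01}}, {"name": "exp2", "metrics": {"loss": 0.5}}])` gives the columns `name`, `params/lr` and `metrics/loss`, with empty cells where an experiment lacks a property.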
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , you need to add the port to the credentials when you input them in the webUI
Or you're thinking only of the current view as it is?
Hi ImmenseMole41 , so your issue is specifically when trying to download compressed csv files? You mentioned that the values are correct when downloading via the StorageManager. Do you get corrupted values somewhere?
Also, how are you saving these csv files?
Hi FreshParrot56 , I'm not sure there is a way to stop it. However, you do need to archive it first and then delete it.
Hi ShallowGoldfish8 ,
You can get specific chunks/files using the part argument:
https://clear.ml/docs/latest/docs/references/sdk/dataset#get_local_copy
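A short sketch of how that could look, assuming the dataset is split into 4 parts and you only want the first one. The helper names `open_dataset` and `fetch_chunk` and the dataset ID placeholder are hypothetical; `part` and `num_parts` are the arguments described on the linked page.

```python
def open_dataset(dataset_id):
    """Look up an existing dataset by ID (requires ClearML to be installed and configured)."""
    # Imported lazily so the rest of the sketch can be read without ClearML installed.
    from clearml import Dataset
    return Dataset.get(dataset_id=dataset_id)


def fetch_chunk(dataset, part, num_parts):
    """Download only one slice of the dataset and return the local folder path.

    `part` is zero-based, so valid values run from 0 to num_parts - 1.
    """
    return dataset.get_local_copy(part=part, num_parts=num_parts)


# Usage (not run here; replace the placeholder with your own dataset ID):
# path = fetch_chunk(open_dataset("<your-dataset-id>"), part=0, num_parts=4)
```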
Hi @<1546665634195050496:profile|SolidGoose91> , I think this capability exists when running pipelines. The pipeline controller will detect spot instances that failed and will retry running them.
Are you using the PRO or the open source auto scaler?