Hi @<1539417873305309184:profile|DangerousMole43> , in that case I think you can simply save the file path as a configuration in the first step, and then in the next step access that file path from the previous step. Makes sense?
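Something like this minimal sketch - assuming each step runs as its own task and the second step knows the first step's task ID (the parameter name, path and ID here are just placeholders):
```python
from clearml import Task

# --- first step: store the file path as a parameter on this step's task ---
task = Task.current_task()
task.set_parameter("General/output_file_path", "/data/processed/output.csv")  # placeholder path

# --- next step: read the path back from the previous step's task ---
prev_task = Task.get_task(task_id="<previous_step_task_id>")  # placeholder task ID
file_path = prev_task.get_parameter("General/output_file_path")
print(file_path)
```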
Also, try adding the following to the bash init script:
python -m pip install -U clearml-agent
and please share the log from the machine
JitteryCoyote63 I'll take a look, thanks a lot for the heads up! 🙂
Hi @<1859043976472956928:profile|UpsetWhale84> , output models are registered in the model repository.
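Roughly how that looks with automatic logging (the project/task names below are just placeholders):
```python
from clearml import Task
from sklearn.linear_model import LogisticRegression
import joblib

task = Task.init(project_name="examples", task_name="train model")  # placeholder names

model = LogisticRegression().fit([[0], [1]], [0, 1])
# The framework save call is picked up automatically and the saved file is
# registered as an output model of this task, visible in the model repository
joblib.dump(model, "model.pkl")
```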
It really depends on the use case when choosing between artifacts and datasets. If the outputs are relevant only within the context of the pipeline, use artifacts; otherwise use datasets.
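To illustrate the difference, a rough sketch (names and paths are placeholders):
```python
from clearml import Task, Dataset

# Pipeline-internal output: attach it to the step's task as an artifact
task = Task.current_task()
task.upload_artifact(name="intermediate_results", artifact_object="/tmp/results.csv")  # placeholder path

# Output that should outlive the pipeline: version it as a dataset
dataset = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")  # placeholder names
dataset.add_files("/tmp/results.csv")
dataset.upload()
dataset.finalize()
```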
Assuming they follow the same setup as the user/secret keys, I'd guess they would work until they expire 🙂
If you're running on GCP, I think using the autoscaler is a far easier and more cost-efficient solution. The autoscaler can spin instances up and down on GCP according to your needs.
It looks like it is running, did the status of the experiment change?
WackyRabbit7 , isn't this what you need?
https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller#get_running_nodes
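If you have the controller object in hand it's just a method call on it - a quick sketch (pipeline name/project/version are placeholders):
```python
from clearml import PipelineController

pipe = PipelineController(name="my_pipeline", project="examples", version="1.0.0")  # placeholder names
# ... add steps and start the pipeline ...

# Returns the names of the steps that are currently executing
print(pipe.get_running_nodes())
```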
Hi @<1540142651142049792:profile|BurlyHorse22> , it looks like an error in your code is causing the traceback. What is the code doing at the point where the traceback occurs?
Hi @<1675675722045198336:profile|AmusedButterfly47> , what is your code doing? Do you have a snippet that reproduces this?
GorgeousMole24 , another note, in the comparison screen itself, you can add other experiments to the comparison view.
Hi WhoppingMole85 , you can actually do that with the logger.
Something along the lines of:
Dataset.get_logger().report_table(title="Data Sample", series="First Ten Rows", table_plot=data1[:10])
Does this help?
Hi SuperiorPanda77 , I'm not sure I understand, can you elaborate on the specific use case?
CharmingStarfish14 , maybe SuccessfulKoala55 can assist
VexedCat68 , what errors are you getting? What exactly is not working, the webserver or apiserver? Are you trying to access the server from the machine you set it up on or remotely?
Hi @<1523701092473376768:profile|SuperiorPanda77> , I think a PR would always be appreciated. I don't see any issues with using task.reset()
You already have it : )
None
Hi @<1837300695921856512:profile|NastyBear13> , can you provide logs from the machine itself? Are you certain it's the same VM? Can you also provide logs from the tasks themselves?
Hi @<1797800424254738432:profile|FlatHippopotamus76> , you can set your image using Task.set_base_docker, I think this is what you're looking for - None
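A quick sketch of that (the project/task names and the image are just examples):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="my_task")  # placeholder names
# The agent will use this docker image when executing the task remotely
task.set_base_docker(docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04")
```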
ScrawnyLion96 , Hey 🙂
Can you provide a screenshot?
Hi @<1566596968673710080:profile|QuaintRobin7> , do you have a self contained snippet that will reproduce this?
Hi BattyDove56 , it looks like your Elasticsearch container is restarting. Is this still the issue? Can you check the container logs to see why it's restarting? I think this might be what's preventing the ClearML server from coming up properly.
RoughTiger69 , do you have a rough estimate on the size that breaks it?
For example:
--packages "tqdm>=2.1" "scikit-learn"
I think that's about it 🙂
Why do you use set_repo manually?
And what was the result from 19:15 yesterday? The 401 error? Please note that's a different set of credentials
FierceHamster54 , please try re-launching the autoscaler, the issue seems to be resolved now
You deleted the model from the directory you ran the code from, but you didn't delete it from the cache folder?