I'm not sure it's possible to hide it via the Task object, but I think such configurations should be saved as env variables or in clearml.conf
Can you please be more specific on the use case?
Which config file? The one sitting locally on your computer? You would still need to transmit that data to the application that is spinning the instances up and down. Maybe a CLI? But that would be adding more complexity on top of it. What do you think?
Hi NervousRabbit2 , what version of ClearML server are you running? Also what clearml version are you using?
@<1541954607595393024:profile|BattyCrocodile47> , that is indeed the suggested method - although make sure that the server is down while doing this
Are you getting some errors? Did you run an agent?
Hi @<1836213542399774720:profile|ConvincingDragonfly85> , from my understanding, as long as you run Task.init() at the start, any cell you run afterwards should update the backend
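Something like this in the first cell (a minimal sketch, the project/task names are just placeholders):
```python
from clearml import Task

# Call this once at the top of the notebook - scalars, plots and console output
# from the cells you run afterwards get attached to this task in the backend
task = Task.init(project_name="my_project", task_name="notebook_experiment")
```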
Hi @<1618418423996354560:profile|JealousMole49> , I'm afraid there is no such capability at the moment. Basically, metrics means any metadata that was saved (scalars, logs, plots, etc.). You can delete some log/metric-heavy experiments/tasks/datasets to free up some space. Makes sense?
Hi @<1570220844972511232:profile|ObnoxiousBluewhale25> , you can use the output_uri parameter in Task.init to set a predetermined output destination for models and artifacts
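Roughly like this (a minimal sketch - the bucket path is just a placeholder, any supported storage URI works):
```python
from clearml import Task

# output_uri sets the default upload destination for models and artifacts
task = Task.init(
    project_name="my_project",
    task_name="training_run",
    output_uri="s3://my-bucket/clearml-outputs",
)
```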
Please do. You can download the entire log from the UI 🙂
Then I think this is something you need to implement in your script to mark the task as failed if the spot goes down
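Something along these lines, for example (just a sketch - the termination check is a placeholder you'd replace with your cloud provider's spot-interruption mechanism):
```python
from clearml import Task

def spot_termination_imminent() -> bool:
    # Placeholder: e.g. poll the provider's spot interruption metadata endpoint
    return False

# Assumes Task.init() was already called earlier in this script
task = Task.current_task()
if spot_termination_imminent():
    # Explicitly mark the task as failed before the instance is reclaimed
    task.mark_failed(status_reason="Spot instance terminated")
```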
Hi UpsetTurkey67 ,
Is this what you're looking for?
https://clear.ml/docs/latest/docs/references/sdk/trigger#add_model_trigger
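Roughly something like this, based on the docs link above (a sketch - the task ID, queue and project names are placeholders, and it's worth double checking the exact parameter names against your clearml version):
```python
from clearml.automation import TriggerScheduler

scheduler = TriggerScheduler()

# Launch a copy of an existing task whenever a model in the project is published
scheduler.add_model_trigger(
    name="retrain-on-publish",
    schedule_task_id="<task_id_to_clone>",
    schedule_queue="default",
    trigger_project="my_project",
    trigger_on_publish=True,
)

scheduler.start()
```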
Hi DrabOwl94 , how did you create/save/finalize the dataset?
Hi EnviousPanda91 , what version of ClearML are you using? Are you running on a self hosted server?
Hi @<1603198163143888896:profile|LonelyKangaroo55> , how are you currently reporting, do you have a code snippet? Are you using the community server or a self hosted one?
Hi @<1845635622748819456:profile|PetiteBat98> , metrics/scalars/console logs are not stored on the files server. They are all stored in Elastic/Mongo. The files server is not required. default_output_uri will point all artifacts to your Azure blob
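For reference, that would look roughly like this in clearml.conf (the account/container below are placeholders):
```
sdk {
    development {
        # Models and artifacts are uploaded here instead of the files server
        default_output_uri: "azure://<account>.blob.core.windows.net/<container>"
    }
}
```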
Hi SoreHorse95 ,
Does ClearML not automatically log all outputs?
Regarding logging, maybe try the following setting in ~/clearml.conf: sdk.network.metrics.file_upload_threads: 16
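i.e. something like this in ~/clearml.conf:
```
sdk {
    network {
        metrics {
            # Number of threads used for uploading metric samples/files
            file_upload_threads: 16
        }
    }
}
```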
AlertCrow40 , by the way, ClearML already has an integrated tool for working in a Jupyter notebook.
With a couple of lines it will open a Jupyter notebook for you to work in. Further reading here: https://clear.ml/docs/latest/docs/apps/clearml_session/
🙂
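For example, something like (the queue name is a placeholder for one of your agent queues):
```
clearml-session --queue default
```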
Hmmm, regarding your issue, you can use the following env vars to define your endpoint:
https://clear.ml/docs/latest/docs/configs/env_vars/#server-connection
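For example (these are the default ports for a local self hosted server - replace with your own endpoints):
```
export CLEARML_API_HOST=http://localhost:8008
export CLEARML_WEB_HOST=http://localhost:8080
export CLEARML_FILES_HOST=http://localhost:8081
```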
What is your use case? Do you want to change the endpoint mid-run?
Hi PricklyRaven28 , can you try with the latest clearml version? 1.7.1
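i.e. pip install clearml==1.7.1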
Hi ScaryLeopard77! Is there a specific reason for the aversion to pipelines? What is the use case?
"continue with this already created pipeline and add the currently run task to it"
I'm not sure I understand, can you please elaborate? (I'm pretty sure it's a pipelines feature)
I'm not sure. Maybe @<1523703436166565888:profile|DeterminedCrab71> might have some input
Looks like you're having some connectivity issues with the files server:
2024-11-14 07:05:30,888 - clearml.storage - INFO - Uploading: 5.00MB / 12.82MB @ 35.88MBs from /tmp/state.vykhyxpt.json
2024-11-14 07:05:31,111 - clearml.storage - INFO - Uploading: 10.00MB / 12.82MB @ 22.36MBs from /tmp/state.vykhyxpt.json
1731567938707 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib...
I'd suggest setting the upload destination correctly and then going through the same steps again
Hi @<1643785593177509888:profile|FrustratingSeagull27> , do you have some sample code that recreates this behavior?
Hi AttractiveShrimp45 , can you please elaborate on what you mean by KPIs artifact?
Can you elaborate on how you did that?
Hi @<1556812486840160256:profile|SuccessfulRaven86> , just to make things easier, can you comment out these 3 lines in the config file? This will cause the SDK to use its default behavior. Afterwards, try with store_code_diff_from_remote: false
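i.e. roughly like this in clearml.conf (just for reference, under the sdk.development section):
```
sdk {
    development {
        # When false, the uncommitted diff is taken against the local commit
        # rather than the remote repository's reference point
        store_code_diff_from_remote: false
    }
}
```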
What do you see in the uncommitted changes section of the experiment?
In the UI, check under the Execution tab in the experiment view and scroll to the bottom. You will have a field called "OUTPUT" - what is in there? Check an experiment that is giving you trouble.