Hi RipeGoose2
When I'm using the set_credentials approach, does it mean trains.conf is redundant? if ...
Yes, this means there is no need for trains.conf; all the important stuff (i.e. the server + credentials) is provided from code.
BTW: when you execute the same code (i.e. code with the set_credentials call) through an agent, the agent's configuration will override what you have there, so you will be able to run the Task later either on-prem or in the cloud without needing to change the code itself 🙂
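For reference, a minimal sketch of the set_credentials flow (the hosts and keys below are placeholders; with the older trains package the import is trains instead of clearml):
```python
from clearml import Task

# Placeholders only -- point these at your own server and credentials
Task.set_credentials(
    api_host="https://api.clear.ml",
    web_host="https://app.clear.ml",
    files_host="https://files.clear.ml",
    key="YOUR_ACCESS_KEY",
    secret="YOUR_SECRET_KEY",
)
task = Task.init(project_name="examples", task_name="no conf file needed")
```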
Yes RipeGoose2, you are totally correct 🙂 if you want the models to be auto-uploaded in the offline session you have to pass output_uri (or default_output_uri).
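Something like this, assuming an S3 bucket as the destination (the bucket name is a placeholder; any storage URI or shared path works):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="offline session",
    output_uri="s3://my-bucket/models",  # placeholder destination
)
```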
When a remote task runs Dataset.get() it is not using the correct URL
BoredHedgehog47 it will get the link the data was registered with when creating the Dataset.
This has nothing to do with the local configuration; it can point to any arbitrary file location on the internet.
It was created there because, at the time of the dataset creation, someone (manually or via the config) set a specific host as the file location, and the files were uploaded to that host (again ...
You mean to design the entire pipeline from YAML?
(this assumes your Tasks know how to process links to artifacts)
Is this what you are after?
(BTW: any reason for working with YAML files instead of coding it?)
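If so, a rough sketch of driving a pipeline from a YAML spec (the YAML schema here is made up for illustration; only PipelineController and add_step are actual ClearML calls):
```python
import yaml
from clearml import PipelineController

# Hypothetical spec: name, project, steps[{name, base_project, base_task, parents, parameters}]
with open("pipeline.yaml") as f:
    spec = yaml.safe_load(f)

pipe = PipelineController(name=spec["name"], project=spec["project"], version="1.0.0")
for step in spec["steps"]:
    pipe.add_step(
        name=step["name"],
        base_task_project=step["base_project"],
        base_task_name=step["base_task"],
        parents=step.get("parents", []),
        parameter_override=step.get("parameters", {}),
    )
pipe.start()
```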
Does clearml resolve the CUDA version from the driver or from conda?
Actually it starts with the default CUDA based on the host driver, but when it installs the conda env it takes it from the "installed packages" (i.e. the one you used to execute the code in the first place)
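If you ever need to pin it manually, I believe the agent section of clearml.conf lets you force a version instead of auto-detecting (values below are examples):
```
agent {
    # force a specific CUDA/cuDNN version instead of auto-detecting from the driver
    cuda_version: "11.2"
    cudnn_version: "8.0"
}
```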
Regarding the link, I could not find the exact version but this is close enough I guess:
None
PricklyJellyfish35
Do you mean the original OmegaConf, before the overrides? Or the configuration files used to create the OmegaConf?
os.environ['CLEARML_PROC_MASTER_ID'] = ''
Nice catch! (I'm assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I think I solved it by deleting the project and running the base_task once before the hyperparameter optimization
So is it working now? Everything is there?
I was hoping that there's a universal flag somewhere. Asking this because I want all the Models and Artifacts to be stored in one place and the users shouldn't have to edit their configuration files.
You mean like make sure all models/artifacts are always uploaded?
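If that's the goal, the per-machine way is sdk.development.default_output_uri in clearml.conf (which, admittedly, still means touching the config once; the bucket path below is a placeholder):
```
sdk {
    development {
        # every Task.init will upload models/artifacts here by default
        default_output_uri: "s3://my-bucket/models"
    }
}
```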
I guess last followup question, is there a way to cap costs?
Scale tier? (I know it is not per usage, but it is probably more than $15 per user 🙂)
Can you fix it locally, just to verify?
ColorfulBeetle67 you might need to configure use_credentials_chain
see here:
https://github.com/allegroai/clearml/blob/a9774c3842ea526d222044092172980ae505e24f/docs/clearml.conf#L85
Regarding the token, I did not find any reference to "AWS_SESSION_TOKEN" in the clearml code; my guess is it is used internally by boto?!
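i.e. something along these lines in clearml.conf (matching the section linked above):
```
sdk {
    aws {
        s3 {
            # let boto resolve credentials via its standard chain
            # (environment variables, shared credentials file, IAM role, ...)
            use_credentials_chain: true
        }
    }
}
```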
I have a client that runs clearml-session and I saw from the agent's logs that the installation of vscode fails.
That makes sense, it downloads vscode at runtime. Do you have an alternative location? Or maybe it is easier to build a container with vscode pre-installed?
Basically it hooks into any torch.save function (monkey patching in realtime)
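Roughly this idea, as an illustrative sketch only (not ClearML's actual implementation):
```python
import torch

_original_save = torch.save  # keep a handle to the real function

def _patched_save(obj, f, *args, **kwargs):
    # a real integration would register/upload the model here
    print(f"intercepted torch.save -> {f}")
    return _original_save(obj, f, *args, **kwargs)

torch.save = _patched_save  # monkey patch at runtime
```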
Is it not possible to say just look at my requirements.txt file and the imports in the script?
I think there is a GitHub Issue for this feature
(basically the issue is, requirements.txt are very often not updated, and have no real version lock, so replicating a working env is always safer)
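In the meantime, if memory serves, you can force the agent to use your requirements file with something like this (called before Task.init):
```python
from clearml import Task

# force the stored "installed packages" to come from the requirements file
# instead of the auto-detected environment
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")
task = Task.init(project_name="examples", task_name="pinned requirements")
```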
That is odd ...
Could you open a GitHub issue?
Is this on any upload? How do I reproduce it?
We are using k8s glue to spawn the job. ...
I think this is actual network latency, nothing to do with the jobs, could it be the server is very far away?
What happens when you manually start a Task from your machine?
Is the latency fixed? Is it just when starting a new Task?
I believe a process is still running in the background. Is it expected? (v0.17.4)
Yes it is expected.
Basically it reports that the resource monitoring did not detect any "iterations"/"steps" reporting, so instead of reporting resources per iteration it reports them based on time. Make sense?
PompousBeetle71 Check the beginning of the log, it should print the configuration, including the access key (excluding the secret); see if it makes sense...
PompousBeetle71 I think what you saw as tags in previous versions was actually system tags; now we also have user tags (i.e. .tags). If you still want to access the system tags, can you try:
InputModel('aabbcc')._get_base_model().data.system_tags
yes
an argument saying "always create from code" can be helpful
ShallowCormorant89 any chance you can open a GitHub issue on that, just so we do not forget?
If we could edit the configuration objects of a pipeline, that would be beneficial too; we're unable to do that from the UI.
Actually you already can: after you clone the pipeline, press on Details, then go to the Configuration tab and edit the pipeline object. The format is HOCON (...
LOL I see a meme waiting for GrumpyPenguin23 😉
Just curious, if ... is a value I can set, where is it used?
It is used when creating a dataset from inside the cluster (i.e. when launching using the clearml k8s glue);
it will have no effect on what users have on their local machines,
i.e. they can always point to a different server.
That said, when users create their initial clearml.conf and copy-paste the info from the web UI, this value (or it might be another one, I'll double check later) will set the initial configuration the c...
Are you saying that in the UI you do not see "confusion matrix" at all, only on the GS bucket?
FrothyShark37 what was different in your script ?
EnviousPanda91 please feel free to PR if it works 🙂
https://github.com/allegroai/clearml/blob/86586fbf35d6bdfbf96b6ee3e0068eac3e6c0979/clearml/binding/frameworks/catboost_bind.py#L114
Nice, that seems to be the issue. Any chance you can open a GitHub issue, so we do not lose track of it?
GreasyPenguin14 I think the default is reporting on failed tasks only? Could that be?