SoggyBeetle95 is this secret a per-Task secret, or is it for the agent itself (i.e. for all Tasks the agent will spin)?
Ohh, if that is the case then it kind of makes sense to store it on the Task itself. Which means the Task object will have to store it, and then the UI will display it :(
I think the actual solution is a vault, per user, which would allow users to keep their credentials on the server; the agent would then pass those to the Task when it spins it up, based on the user. Unfortunately the vault feature is only available in the paid/enterprise version (with RBAC etc.).
Does that make sense?
SoggyBeetle95 you can configure the credentials in the clearml.conf running on the agent machines:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L320
(I'm assuming these are storage credentials)
If you need general-purpose env variables, you can add them here:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L149
with ["-e", "MY_VAR=MY_VALUE"]
SoggyBeetle95 maybe it makes sense to configure the agent with access-all credentials? Wdyt
report_text does not, this is very weird
Okay this seems to be the issue.
Just making sure: the Task status is "running" and task.get_logger().report_text("something") does not report a thing?
Do you see it on your screen?
Can you test without the "Task.debug_simulate_remote_task / init"?
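Something along these lines (a minimal sketch; the project/task names are placeholders):
```python
from clearml import Task

# plain local run: no debug_simulate_remote_task, no CLEARML_TASK_ID tricks
task = Task.init(project_name="examples", task_name="report-text-test")
task.get_logger().report_text("something")  # should appear in the UI console
```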
but perhaps it is worth adding a hint to the docs page to avoid setting the CLEARML_TASK_ID env variable; perhaps I am not the only one to ever try it
Good idea, any thoughts on where? I cannot find a trivial place to put these things
Wait, how did you end up with clearml_task_id = os.environ['CLEARML_TASK_ID'] printing "01b77a220869442d80af42efce82c617"?
This means you are running via an agent?!
I guess we should have obfuscated the name better 😄
LOL, great minds and so on 🙂
Manually I was installing the leap package through python -m pip install . when building the docker container.
NaughtyFish36 what happens if you add /opt/keras-hannd to your "installed packages"? This should translate to "pip install /opt/keras-hannd", which seems like exactly what you want, no?
So could it be that pip install --no-deps . is the missing piece?
what happens if you add "/opt/keras-hannd" to the installed packages?
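If you would rather set it from code than edit the "installed packages" in the UI, something like this should be equivalent (a sketch; the project/task names are placeholders):
```python
from clearml import Task

# must be called before Task.init(); the agent will then effectively run
# "pip install /opt/keras-hannd" when reproducing the task remotely
Task.add_requirements("/opt/keras-hannd")
task = Task.init(project_name="examples", task_name="keras-hannd-test")
```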
Hi @<1542316991337992192:profile|AverageMoth57>
is this a follow-up of this thread? None
@<1542316991337992192:profile|AverageMoth57> it sounds like you should use SSH authentication for the agent, just set force_git_ssh_protocol: true
And make sure you have the SSH keys on the agent's machine
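For reference, the setting lives in the agent's clearml.conf (a minimal sketch):
```
agent {
    # rewrite http(s) git URLs to SSH before cloning, so the SSH keys are used
    force_git_ssh_protocol: true
}
```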
ssh: Could not resolve hostname : Name or service not known
@<1542316991337992192:profile|AverageMoth57> so is this the main issue? This seems unrelated to the Gerrit thing, just a missing .ssh configuration on the agent machine, is that correct?
I don't know whether you have access to the backend,
Creepy, no I do not 🙂
I can't make anything appear in the console part of the UI
clearml_task.logger.report_text("some text")
should work
Hi UpsetTurkey67
The status that you see on the graph is fetched from the pipeline itself (for example, cached). I think what happened is that the pipeline logic has yet to update itself on the status of the running component. If the pipeline is indeed running, it should update the status shortly (you can actually set the polling frequency for that). If for some reason the pipeline Task died, then indeed this is an odd state (that we should probably fix in the UI).
no, I set the env variable CLEARML_TASK_ID myself
Do not set it yourself, this is the issue 🙂
It is used internally, and setting it messes up the internal state; basically it is one of the signals the SDK uses to know there is an agent taking care of things (for example, logging the entire console output).
Use any other variable, for example MY_CLEARML_TASK_ID
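e.g. (a trivial sketch; the variable name is just an example):
```python
import os

# any non-reserved name works; CLEARML_TASK_ID itself is off limits
os.environ["MY_CLEARML_TASK_ID"] = "01b77a220869442d80af42efce82c617"
clearml_task_id = os.environ["MY_CLEARML_TASK_ID"]
```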
now it stopped working locally as well
At least this is consistent 🙂
How so? Is the "main" Task still running?
Hi @<1523703472304689152:profile|UpsetTurkey67>
You mean https://github.com/Lightning-AI/torchmetrics ?
Where are those stored?
Where are they stored? I could not find a backend they work with, what am I missing?
I'm already at 300MB of usage with just 15 tasks
Wow, what do you have there? I would try to download the console logs and see what size you are getting; this is the only thing that makes sense, wdyt?
BTW: to get the detailed size for scalars, maximize the plot (otherwise you are getting "subsampled" data)
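If it helps, a rough way to check the console-log size from code (a sketch; I am assuming Task.get_reported_console_output here, and the task ID is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="your_task_id")
# pull (up to) the full console output and measure it
lines = task.get_reported_console_output(number_of_reports=10000)
print("approx console size:", sum(len(line) for line in lines), "bytes")
```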
UI for some anomalous file,
Notice the metrics are not files/artifacts, just scalars/plots/console
Hi SkinnyPanda43
Let's say that I install the shared libs with pip in editable mode on my development environment; how will the clearml-agent handle those libraries if I submit a job?
So installing packages from local folders with "-e" is in general ill-advised.
But using a full git path should work out of the box: for example, if you run pip install
git+https://github.com/user/repo.git then the agent will be able to install it on the remote machine. The main challenge...
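For example, pinning the shared lib by its git URL so the agent can install it anywhere (a sketch; the repo URL and names are placeholders):
```python
from clearml import Task

# a git URL requirement the agent can resolve on any machine,
# unlike a local "-e" editable path
Task.add_requirements("git+https://github.com/user/repo.git")
task = Task.init(project_name="examples", task_name="shared-lib-test")
```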
Hmm, how do you launch the autoscaler, via code?
Any chance there is an env variable you set to get 1.5.0rc0? Because this is the version that is being used
Hi SkinnyPanda43
This issue was fixed with clearml-agent 1.5.1, can you verify?
Based on what I see, when the ec2 instance starts it installs the latest; could it be this instance is still running?