... indicate the job needs to be run remotely? I’m imagining something like
clearml-task
and you need to specify the queue to push your Task into.
See here: https://clear.ml/docs/latest/docs/apps/clearml_task
Hi @<1673501397007470592:profile|RelievedDuck3>
how can I configure my alerts to be notified when the distribution of my metrics (variables) changes on my heatmaps?
This can be done inside grafana, here is a simple example:
None
Specifically you need to create a new metric that is the distance of the current distribution (i.e. heatmap) from the previous window, then on the distance metric, ...
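To make the "distance from the previous window" idea concrete, here is a minimal Python sketch of one possible distance (total variation between two normalized histograms). This is only an illustration of the calculation: in Grafana you would express it in the query language of your data source, and the bucket counts, the choice of distance, and the alert threshold below are all my assumptions, not something from the example above.

```python
def total_variation(current, previous):
    """Distance in [0, 1] between two histograms given as bucket counts."""
    if len(current) != len(previous):
        raise ValueError("histograms must use the same buckets")
    cur_total, prev_total = sum(current), sum(previous)
    if cur_total == 0 or prev_total == 0:
        raise ValueError("each window needs at least one sample")
    # Normalize counts to probabilities, then take half the L1 difference
    return 0.5 * sum(
        abs(c / cur_total - p / prev_total)
        for c, p in zip(current, previous)
    )


# Made-up bucket counts for two consecutive time windows
previous_window = [10, 30, 40, 20]
current_window = [40, 30, 20, 10]

drift = total_variation(current_window, previous_window)

# Hypothetical threshold; tune it on your own data
ALERT_THRESHOLD = 0.2
should_alert = drift > ALERT_THRESHOLD
```

A value of 0 means the two windows have the same distribution and 1 means they do not overlap at all, so a single fixed threshold on this metric is easy to reason about.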
(apologies I just got to it now)
First of all, kudos on the video, this is so nice!!!
And thanks to you I think I found it:
we have to call serialize before the execute_remotely
(the reason why sometimes it works is that it syncs in the background, so sometimes it's just fast enough and you get the config object)
Let me check if we can push an RC with a ...
but then an error message in the web-app pops up
Fetch parents failed
and the Scheduler task disappears
And the Task is still running? What's the clearml python version and webui version?
No worries, you open the issue on pypa/pip and I will do my best to push forward 🙂
We also have to be realistic: I have a PR that has been waiting for almost a year now (that said, it is a major one and needed to wait until a few more features were merged). Basically what I'm saying is that the best case scenario is a month to get a PR merged
Thanks @<1657918706052763648:profile|SillyRobin38> this is still in the internal git repo (we usually do not develop directly on github)
I want to get familiar with it and, if possible, contribute to the project.
This is a good place to start: None
we are still debating whether to use it directly or as part of Triton, would love to get your feedback
Thanks for checking @<1545216070686609408:profile|EnthusiasticCow4> stable release will be out soon
Hi @<1529633468214939648:profile|CostlyElephant1>
Is it possible to get user ID of the current user
On the Task.data object itself there should be a field named "user", that's the user ID of the owner (creator) of the Task.
You can filter based on this id with
Task.get_tasks(..., task_filter={'user': ["user-id-here"]})
wdyt?
Hi @<1571308003204796416:profile|HollowPeacock58>
parameters = task.connect(config, name='config_params')
It seems that your DotDict does not support the python copy operator?
i.e.
from copy import copy
copy(DotDict())
fails ?
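If it does fail, one common cause (an assumption on my side, I have not seen your DotDict) is a __getattr__ that raises KeyError for missing names. copy/deepcopy probe for optional hooks such as __deepcopy__ with getattr, and only AttributeError counts as "not there". A minimal sketch of a dict-with-dot-access class that copies cleanly; the class below is hypothetical and not your actual implementation:

```python
from copy import copy, deepcopy


class DotDict(dict):
    """Hypothetical dict with attribute access, for illustration only."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # copy/deepcopy look up optional hooks via getattr/hasattr,
            # which only treat AttributeError as "attribute is missing"
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


config = DotDict(lr=0.01, model=DotDict(depth=3))

shallow = copy(config)    # new outer object, nested DotDict is shared
deep = deepcopy(config)   # nested DotDict is duplicated as well
```

If the two calls at the bottom run without an exception on your own class, then copy support is not the problem and we should look elsewhere.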
Hi @<1566596960691949568:profile|UpsetWalrus59>
Could it be the two experiments have the exact same name?
(It sounds like a bug in the UI, but I'm trying to make sure, and also understand how to reproduce)
What's your clearml-server version ?
Why can I only call import_model
import_model actually creates a new Model object in the system
InputModel(id) will "load" a model based on the model id
Make sense ?
Wait @<1715900788393381888:profile|BitingSpider17> are you passing it on a single Task? these values are read by the daemon (i.e. running on the host) which means it is not getting them from the Task context (which leads to zero effect on the mount points)
Notice that in new versions of the clearml-agent the SDK mount point was changed to: sdk_cache: "/clearml_agent_cache"
exactly to solve for the non-root containers:
[None](https://github.com/allegroai/clearml-agent/blob/6b31883e4579...
, it's just a custom module.
Is this your own module ? Is this a local folder we import from ?
Hi @<1538330703932952576:profile|ThickSeaurchin47>
Specifically I’m getting the error “could not access credentials”
Put your minio credentials here:
None
Hi SmugDog62
My guess is that there's an issue with the git repo detector.
Seems like you are correct
What are you getting on the execution tab?
Is the repo correct?
Do you see the notebook in the uncommitted changes?
HighOtter69
Could you test with the latest RC? I think this fixed it:
https://github.com/allegroai/clearml/issues/306
Questions
I want to trigger a retrain task when F1
That means that in inference you are reporting the F1 score, correct?
As part of the retraining I have to train all the models and then have to choose best one and deploy it
Are you passing output_uri to Task.init? Are you storing the model as an artifact?
You can tag your model/task with a "best" tag (and untag the previous one). Then in production, look for the "best" task and get its model
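A rough sketch of the tag/untag logic, so it is clear what I mean. StubTask is a stand-in I made up so the snippet runs without a server, and its f1 attribute is hypothetical (you would read the score from wherever you report it). The get_tags / set_tags / add_tags method names mirror what, to my knowledge, the clearml Task object exposes, please double check against your SDK version.

```python
class StubTask:
    """Stand-in for a ClearML Task so the sketch runs offline."""

    def __init__(self, name, f1, tags=()):
        self.name, self.f1, self._tags = name, f1, list(tags)

    def get_tags(self):
        return list(self._tags)

    def set_tags(self, tags):
        self._tags = list(tags)

    def add_tags(self, tags):
        self._tags += [t for t in tags if t not in self._tags]


def promote_best(tasks, tag="best"):
    """Move `tag` to the task with the highest F1 and return that task."""
    winner = max(tasks, key=lambda t: t.f1)
    for task in tasks:
        if task is not winner and tag in task.get_tags():
            # Untag the previous best
            task.set_tags([t for t in task.get_tags() if t != tag])
    winner.add_tags([tag])
    return winner


def find_best(tasks, tag="best"):
    """What production would do: look up the task carrying the tag."""
    return [t for t in tasks if tag in t.get_tags()]


# Made-up retraining results; "a" was the previous best
runs = [StubTask("a", 0.80, ["best"]), StubTask("b", 0.90), StubTask("c", 0.70)]
winner = promote_best(runs)
```

With real Tasks the lookup in production would be a server-side query by tag instead of a list comprehension, the rest of the flow stays the same.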
Thoughts?
GreasyPenguin14 the demo-server is soon to be deprecated, so we are slow on upgrades there. But you can already see it in the SaaS free tier.
https://app.community.clear.ml/
Is it being used to ssh to the instance?
It is used for the SSH client so it "knows" the SSH server (does that make sense?)
I think it is on the JWT token the session gets from the server
a bit of a hack but should work 🙂
session = task.session # or Task._get_default_session()
my_user_id = session.get_decoded_token(session.token)['identity']['user']
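For context, this is roughly what decoding such a token involves: a JWT is three base64url segments separated by dots, and the middle one is a JSON payload. The sketch below uses only the standard library and a fabricated, unsigned token; the 'identity' / 'user' layout is taken from the snippet above, everything else (helper names, the fake token) is my own illustration and not the clearml implementation.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the payload (claims) of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(obj):
    """Encode a dict the way a JWT segment is encoded (helper for the demo)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fabricated token for demonstration only; the signature segment is empty
fake_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"identity": {"user": "user-id-here"}}),
    "",
])

user_id = decode_jwt_payload(fake_token)["identity"]["user"]
```

Note this only reads the claims, it does not validate anything, which is fine here since the token came from your own authenticated session.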
with tensorboard logging, it works fine when running from my machine, but not when running remotely in an agent.
This is odd, could you send the full Task log?
Hi @<1668065560107159552:profile|VivaciousPenguin20>
I think you are looking at the wrong experiment, this is a 3 year old experiment? This does not seem to be your currently executed experiment, right?
BTW: @<1673501397007470592:profile|RelievedDuck3> we just released 1.3.1 with better debugging, it prints full exception stack on failure to the clearml Serving Session Task.
I suggest you pull the latest image, re-run the docker compose, and check what you have on the serving session Task in the UI
same: Not Found (#404)
May I suggest to DM it to me (so it is not public)
Nice!!!
Are you aware of a limitation of "/events.get_task_events" that prevents fetching some of the images stored on the server
Are you saying you see them in the UI, but cannot access them via the API ?
(this would be strange as the UI is firing the same API requests to the back end)
btw: you can also do cron for that:
@reboot sleep 60 && clearml-agent daemon ...
Hmm that is odd. Let me take a look and ask the guys. Thank you for quickly testing the RC! I'm hoping a new RC with a fix will be there tomorrow, if we can quickly replicate
the parameter datatypes are not being changed when loading them up.
These are the auto logged parameters , inside YOLO, correct?
Just to make sure, you can actually see the value None in the UI, is that correct? (if everything works as expected, you should see empty string there)
JitteryCoyote63 yes this is very odd, seems like a pypi flop ?!
On the website they do say there is 0.5.0 ... I do not get it
https://pypi.org/project/pytorch3d/#history