Reputation
Badges 1
25 × Eureka!Okay that seems to explain it. Now the question is why it installed it in the wrong place.
Can you reproduce this behavior outside of lightning? or in a toy example (because I could not)
Hi PerplexedCow66
I'm assuming an extension for this:
https://github.com/allegroai/clearml-serving/issues/32
Basically JWT can be used as a general access/block all endpoints, which is most efficnely used if handled by k8s loadbalancer (nginx/envoy),
but if you want a per-endpoint check (or maybe do something based on the JWT values)
See adding JWT to FastAPI here:
https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/?h=jwt#oauth2-with-password-and-hashing-bearer-with-jwt-tokens
T...
It does, tested π but you should as well
Hi SmarmySeaurchin8
Could you open a bug on GitHub, so this is not lost? Let's assume 'a' is tracked, how would one change 'a' in the UI?
Hmm and you are getting empty list for thi one:
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
I see them run reliably (no killed), are they running in service mode?
How do you deploy agents, with the clearml k8s glue ?
I'm guessing this is done through code-server?
correct
I'm currently rolling a JupyterHub instance (multiuser, with codeserver inside) on the same machine as clearml-server. Thatβs where tasks are executed etc. so, all browser dev env.
Yeah, the idea with clearml-session each user can self serve themselves the container that works best for them. With a jupyterhub they start to step on each other's toes very quickly ...
Hi DilapidatedDucks58 ,
Just making sure all 8 works have different worker ids? (you can see 8 in the workers page in the UI)
Also, are they running this docker or venv mode?
Hi ObnoxiousStork61
Is it possible to report ie. validation scalars but shifted by 1/2 iteration?
No π these are integers
What's the reason for the shift?
I'm also curious π
, but it seems like I can only trigger a task using a Task scheduler but not a pipeline.
@<1523701132025663488:profile|SlimyElephant79> Maybe we should better state it, but Pipeline is "just" another type of Task. so triggering a Task with the Pipeline ID is essentially triggering the pipeline (do notice you need to select the "services" queue to be used so that the pipeline runs on the correct resource). Make sense ?
Yes it should
here is fastai example, just in case π
https://github.com/allegroai/clearml/blob/master/examples/frameworks/fastai/fastai_with_tensorboard_example.py
BTW: I tested the code you previously attached, and it showed the plot in the "Plots" section
(Tested with latest trains from GitHub)
btw: what's the OS and python version?
AbruptWorm50 can you send full image (X axis is missing from the graph)
Hello guys, i have 4 workers (2 in default and 2 in service queue on same machine)
Hi @<1526734437587357696:profile|ShaggySquirrel23>
I think what happens is one agent is deleting it's cfg file when it is done, but at least in theory each one should have it's own cfg
One last request can you try with the agent's latest RC version 1.5.3rc2 ?
Notice the pipeline step/Task at execution is not aware of the pipeline context
Hi WickedGoat98
Will I need to wrap their execution in python by system calls?
That would probably be the easiest solution π
Then you can plug it into your pipeline as a preprocessing Task:
You can check this example:
https://github.com/allegroai/trains/tree/master/examples/pipeline
Thanks TroubledHedgehog16 for the context.
sdk.development.worker.report_period_sec
Yes please update to the latest version 1.8.0 for full support (to be released today, I think)
https://github.com/allegroai/clearml/blob/f6238b8a0fb662540bca9095cc0c22bd7af483c1/docs/clearml.conf#L196
https://github.com/allegroai/clearml/blob/f6238b8a0fb662540bca9095cc0c22bd7af483c1/docs/clearml.conf#L199
we have have been running agents on 3 on-premise systems.
Do notice that by default an...
and I run agent from local user and I would expect that settings to have effect -v /home/localuser/.ssh:/home/testuser/.ssh
It does not map it directly, it creates a temp copy in the host /tmp folder of the entire ".ssh" folder, than maps this folder inside the container:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/clearml_agent/commands/worker.py#L3422
Notice that the "docker_internal_mounts" section is nested inside the "agent" section ...
Hi ZealousSeal58
What's the clearml version you are using ?
If there was a "debug mode" for viewing the stack trace before the crash that would've been most helpful...
import traceback traceback.print_stack()
function and just seem to be getting an "isadirectory" error?
Can you post here what you are getting ? which clearml version are you using ?!
also tried manually adding
leap==0.4.1
in the task UI which didn't work.
That has to work, if it did not, can you send the log for the failed Task (or the Task that did not install it)?
The environment in the logs does show that leap is being installed potentially from a cache?
- leap @ file:///opt/keras-hannd...
Hi @<1523702652678967296:profile|DeliciousKoala34>
What's the clearml-server version you are working with?
Can you check with the latest RC?
pip3 install clearml==1.9.2rc2
Hi @<1523702786867335168:profile|AdventurousButterfly15>
Make sure you pass output_uri=true in Task.init
It will automatically upload your model to the file server. You can also configure it in the clearml.conf, look for defualt_output_uri
Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 unique combination of title/series per image.
This means that changing the title or the series name will add 100 more images (notice the 100 limit is for previous iterations)
Hmmm could you attach the entire log?
Remove any info that you feel is too sensitive :)