could one also limit the number of CPU cores available?
If you are running in docker mode you can add --cpus=<value> ; see the reference here: https://docs.docker.com/config/containers/resource_constraints/
Just add it to extra_docker_arguments:
https://github.com/allegroai/clearml-agent/blob/2cb452b1c21191f17635bcb6222fa8bfd82afe29/docs/clearml.conf#L142
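A minimal sketch of the matching clearml.conf entry (the --cpus=2 value here is just an example):
agent {
    # passed verbatim to "docker run" for every task container
    extra_docker_arguments: ["--cpus=2"]
}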
That said, the arguments are parsed inside the executed code (i.e. monkey-patched into the frameworks). This allows ClearML to log and change all the arguments, including the default ones, and lets you edit them.
Does that make sense?
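To illustrate (a sketch; the --lr argument and the project/task names are just examples):
from clearml import Task
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.1)

# Task.init hooks argparse, so all arguments (including defaults)
# are logged and can be overridden from the UI before execution
task = Task.init(project_name="examples", task_name="argparse demo")
args = parser.parse_args()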
Finally managed; you keep saying "all projects" but you meant the "All Experiments" project instead. That's a good start
Thanks!
Yes, my apologies you are correct: "all experiments"
The file itself is csv.gz compressed; it's actually the sending back from the file-server that messes things up
(you can test with output_uri=/tmp/folder)
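For example (a sketch, using a local folder as the upload destination):
from clearml import Task

# artifacts and models will be stored under this URI instead of the default file-server
task = Task.init(project_name="examples", task_name="upload test", output_uri="/tmp/folder")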
VirtuousFish83
Could it be that "inplace-abn" needs torch already present while the package itself is being installed?
Not sure if this is considered a bug or not, but I'd happily make an issue on GitHub if needed.
I think we should, at least for the sake of transparency and visibility 🙂
Thanks again for all your help.
My pleasure 🙂
I was not able to reproduce with the example code
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
Let's start small. Do you have Grafana enabled in your docker compose, and can you log in to your Grafana web UI?
Notice that Grafana needs to access the Prometheus container directly, so the easiest way is to have everything in the same docker compose
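As a sketch, assuming the default service names from the clearml-serving docker-compose, the Grafana data source would point at Prometheus by its compose service name:
# Grafana UI -> Settings -> Data sources -> Prometheus
# URL: http://prometheus:9090  (resolvable because both run on the same compose network)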
After you call task.set_initial_iteration(0), what do you get from task.get_initial_iteration()? Is it 0?
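i.e. something like (a sketch; the project/task names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="iteration check")
task.set_initial_iteration(0)
print(task.get_initial_iteration())  # expected: 0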
And can you see your Prometheus in your Grafana?
Yes, you are too quick for the resource monitoring 🙂
AstonishingRabbit13 so is it working now ?
What happens when you call:
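# diagnostic: check what the legacy Jupyter notebook-server JSON parsing returns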
from clearml.backend_interface.task.repo import ScriptInfo
print(ScriptInfo._ScriptInfo__legacy_jupyter_notebook_server_json_parsing(None))
BTW: you can still get race/starvation cases... but at least no crash
ComfortableShark77 are you saying you need "transformers" in the serving container? CLEARML_EXTRA_PYTHON_PACKAGES: "transformers==x.y"
https://github.com/allegroai/clearml-serving/blob/6005e238cac6f7fa7406d7276a5662791ccc6c55/docker/docker-compose.yml#L97
Try to add here:
None
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
The docker container itself does not have the host configured.
Hi @<1545216070686609408:profile|EnthusiasticCow4>
is there a way to get the date from the InputModel?
You should be able to with model._get_model_data()
But I think we should have it all exposed, wdyt?
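Something along these lines (a sketch; _get_model_data() is an internal API, and the exact field name for the date, e.g. created , is an assumption):
from clearml import InputModel

model = InputModel(model_id="<your-model-id>")  # placeholder id
data = model._get_model_data()  # internal backend object describing the model
print(data.created)  # assumed creation-date field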
This is strange, let me see if we can get around it, because I'm sure it worked 🙂
At the top there should be the URL of the notebook (I think)
That is correct.
Obviously once it is in the system, you can just clone/edit/enqueue it.
Running it once is a means to populate the trains-server.
Make sense?
@<1556812486840160256:profile|SuccessfulRaven86> is the issue with Flask reproducible? If so, could you open a GitHub issue so we do not forget to look into it?
what if the preexisting venv is just the system python? My base image is python:3.10.10 and I just pip install all requirements in that image. Does that still not avoid the venv?
it will basically create a new venv inside the container, forking the existing preinstalled stuff (i.e. the new venv already has everything the system python has preinstalled).
Then it will call "pip install" on all the "installed packages" of the Task,
which should just check everything is there and install nothing...
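A minimal sketch of the clearml.conf setting that controls this behavior (assuming a recent clearml-agent):
agent {
    package_manager {
        # let the new venv inherit everything preinstalled in the image's system python
        system_site_packages: true,
    }
}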
UpsetTurkey67 are you saying there is a symlink in the original repository, and when it is copied, the symlink breaks?
Hi ElegantCoyote26
what's the clearml version you are using?
Thanks MinuteGiraffe30 , fix will be pushed later today
It seems there is some async behavior going on. After ending a run, this prompt just hangs for a long time:
2021-04-18 22:55:06,467 - clearml.Task - INFO - Waiting to finish uploads
And there's no sign of updates on the dashboard
Hmm, that could point to an issue uploading the last images (which are larger than regular scalars). Could you try flushing and waiting?
i.e.
from time import sleep

task.flush()  # push any pending reports/uploads
sleep(45)     # give the background uploads time to complete
Can you let me know if I can override the docker image using template.yaml?
No, you cannot.
But you can pass the OS environment variable CLEARML_DOCKER_IMAGE to set a different default one
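For example (a sketch; the image name and queue are just placeholders):
CLEARML_DOCKER_IMAGE=python:3.10.10 clearml-agent daemon --queue default --docker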