BTW: what happens if you pass the same s3://bucket to Task.init output_uri ? I assume you are getting the same access issue ?
(basically python abusing types/casting where the value can be both str/bool on the same argparse argument)
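To make the str/bool case concrete, here is a minimal sketch of an argparse argument that can come back as either type. The `--resume` flag and the `str_or_bool` helper are made up for illustration; they are not part of any library.

```python
import argparse


def str_or_bool(value: str):
    """Return True/False for boolean-looking text, otherwise the raw string."""
    lowered = value.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    return value


parser = argparse.ArgumentParser()
# Hypothetical flag: either an on/off switch or a checkpoint path
parser.add_argument("--resume", type=str_or_bool, default=False)

as_bool = parser.parse_args(["--resume", "true"]).resume
as_path = parser.parse_args(["--resume", "ckpt/last.pt"]).resume
```

The same argument ends up as a `bool` in one call and a `str` in the other, which is exactly what makes it awkward to log and to override with a single type.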
TrickyRaccoon92 actually Click is on the to do list as well ...
Hi WorriedParrot51 , what do you mean by "call get_parameters_as_dict() from agent" ?
Do you mean like change the trains-agent to run the task differently?
Or inside your code while the trains agent runs it?
From the code itself (regardless of how you run it) you can always call task.get_parameters_as_dict() and get the current state of the parameters (i.e. from the backend if running with trains-agent, or copied from the code if running manually).
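A rough sketch of reading one value back, assuming the parameters come back as a nested dict keyed by section. The `get_param` helper, the "Args" section and the "lr" name are assumptions for illustration, and values coming from the backend may be strings.

```python
def get_param(task, section: str, name: str, default=None):
    """Look up one value in the nested dict returned by get_parameters_as_dict()."""
    params = task.get_parameters_as_dict()  # assumed shape: {"Args": {"lr": "0.1"}}
    return params.get(section, {}).get(name, default)


def example():
    # Imported lazily so the helper above can be used without the SDK installed.
    from clearml import Task

    # Assumes Task.init() already ran in this process; otherwise this is None.
    task = Task.current_task()
    return get_param(task, "Args", "lr")
```

The same call works whether the script runs manually or under the agent; only the source of the values differs.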
The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.
Add your private repo to the extra index section (agent.package_manager.extra_index_url) in the clearml.conf.
Yey! okay let me make sure we add this feature to the Task.init arguments so one can control it from code 🙂
I thought the dataset was only linked to the fileserver and not to the specific url used to upload it.
ShinyRabbit94 yep exactly! the idea is that you can actually do the storage on any solution (S3/GS etc.) the file server is just the default one 🙂
Yes, in tandem with the experiments (because they constantly log to the server).
That said, with 0.16 we added offline mode, so you can run in offline mode, then import the experiment into the system.
Could you manually configure the ~/trains.conf ?
(Just copy paste the section from the UI)
then try to run: trains-agent list
Ohh! I see now
@<1526371965655322624:profile|NuttyCamel41> the "backend: pytorch" setting is not really supported because it does not use the optimized Triton engine (which is the reason to run the Triton server)
In order to use pytorch you need to convert it to torchscript and then deploy, see example here:
[None](https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/examples/pytor...
Hi @<1715900760333488128:profile|ScaryShrimp33>
hi everyone! I’m trying to save my model’s weights to storage. And I can’t do it.
See example here: None
or
task.update_output_model(model_path="/path/to/model.pt")
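A small sketch around that call, assuming the weights file already exists on disk. The `register_weights` wrapper and the project/task names are made up for illustration; where the file is uploaded depends on the Task's output destination (e.g. output_uri).

```python
from pathlib import Path


def register_weights(task, weights_path: str) -> str:
    """Check the file exists, then attach it as the task's output model."""
    path = Path(weights_path)
    if not path.is_file():
        raise FileNotFoundError(f"weights file not found: {path}")
    task.update_output_model(model_path=str(path))
    return str(path)


def example():
    # Imported lazily so the helper above can be used without the SDK installed.
    from clearml import Task

    # Placeholder project/task names
    task = Task.init(project_name="examples", task_name="save weights")
    return register_weights(task, "/path/to/model.pt")
```

Failing early on a missing file makes it easier to tell a path problem apart from a storage access problem.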
C will be submitted to a different queue and I don’t care as much
Is there a way to define “task affinity” in this way?
Hi RoughTiger69 ,
when you say Task affinity, you mean, I want C to be executed next to A/B ? Affinity as a concept doesn't really exist, it can be abstracted to a queue, where you have agents pulling from multiple queues. Then C can be pushed to one of the queues (in theory you might be able to programmatically control the queue of C), wdyt?
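One way to picture the queue idea, as a sketch only: keep a mapping from step to queue so A and B land on the same queue and C goes elsewhere. The queue names, `STEP_QUEUES` and both helpers are hypothetical; only `Task.enqueue` is the SDK call.

```python
# Hypothetical mapping from pipeline step to queue name
STEP_QUEUES = {"A": "gpu_shared", "B": "gpu_shared", "C": "cpu_misc"}


def queue_for(step: str, default: str = "default") -> str:
    """Pick the queue for a step, falling back to a default queue."""
    return STEP_QUEUES.get(step, default)


def enqueue_step(task, step: str):
    # Imported lazily so queue_for() can be used without the SDK installed.
    from clearml import Task

    return Task.enqueue(task, queue_name=queue_for(step))
```

An agent listening on "gpu_shared" would then pick up both A and B, which is as close to affinity as the queue abstraction gets.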
OutrageousSheep60 before I can answer, maybe you can explain why "zipping" them does not fit your workflow ?
I see TightElk12
You can always set the OS environment variables CLEARML_API_HOST, CLEARML_WEB_HOST and CLEARML_FILES_HOST with the correct configuration. Or you can simply set CLEARML_NO_DEFAULT_SERVER=1, which will prevent any usage of the default demo server. wdyt?
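If you prefer to do it from Python instead of the shell, a minimal sketch could look like this. The `point_sdk_at_server` helper and the localhost URLs are placeholders, and I am assuming the variables need to be in place before clearml is imported.

```python
import os


def point_sdk_at_server(api: str, web: str, files: str, block_demo_server: bool = True) -> None:
    """Set the ClearML server environment variables for this process."""
    os.environ["CLEARML_API_HOST"] = api
    os.environ["CLEARML_WEB_HOST"] = web
    os.environ["CLEARML_FILES_HOST"] = files
    if block_demo_server:
        # Never fall back to the default demo server
        os.environ["CLEARML_NO_DEFAULT_SERVER"] = "1"


# Placeholder addresses; replace with your own server
point_sdk_at_server(
    api="http://localhost:8008",
    web="http://localhost:8080",
    files="http://localhost:8081",
)
```

Call it at the very top of the entry script, before any `from clearml import ...` line.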
Hi ConvolutedBee40
If we deploy a task to clearml-server, will it automatically scale?
The way it works is with agents and agent glue, basically using k8s as a resource allocator and the clearml agent as orchestrator, did that answer the question ?
If I directly edit the OmegaConf in the UI then the port changes correctly
This will only work if you change the Hydra/allow_omegaconf_edit to True in the UI. Did you?
I meant even just a link to a blank comparison and one can then add the experiments from that view
Just making sure you are aware, once you are in comparison you can always add Tasks (any Task):
Notice you can press on the "Add experiments", then select Any experiment (including all projects! as filters)
Notice you need to remove all filters (right side red x on the filter Icon)
Hi JumpyDragonfly13
I don't know why I'm getting 172.17.0.2
I think it (the remote jupyter Task) fails to get the correct IP address of the server.
You can manually correct it by going to the DevOps project, looking for the running Task there, then under Configuration/Properties changing external_address to the actual IP 10.19.20.15
Once that is done, re-run the clearml-session , it will suggest to connect to the running session, it should work....
BTW:
I'd like...
@<1527459125401751552:profile|CloudyArcticwolf80> what are you seeing in the Args section ?
what exactly is not working ?
What's the trains-server version?
PompousParrot44 with pleasure. If during your search for a solution you come across something that solves it, and might integrate to the agent, do not hesitate to suggest it :)
Hi RipeGoose2
I just tested the hydra example, seems to work when you add the offline right after the import:
` from clearml import Task
Task.set_offline(True) `
Make sure you have the S3 credentials in your agent's clearml.conf :
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L210
LovelyHamster1 from the top, we have two steps:
1. We run the code "manually" (i.e. without the agent). This step creates the experiment (Task) and automatically fills in the "installed packages" (which are in the same format as a regular requirements.txt).
2. An agent runs a cloned copy of the experiment (Task). The agent creates a new venv on the agent's machine, then uses the "Installed packages" section as a replacement for the regular "requirements.txt" and installs everything fro...
Hi CluelessElephant89
When you edit the args (General section) in the UI, you are editing the args for "remote execution"
(i.e. when executed by the agent, the args dict will get the values from the UI, as opposed to "manual execution" where the UI gets the values from code)
In order to simulate the "remote execution" inside your development environment
Try:
` from clearml import Task
# simulate remote execution of a specific Task instance
Task.debug_simulate_remote_task(task_id='R...
Hi LovelyHamster1 ,
you mean totally ignore the "installed packages" section, and only use the requirements.txt ?
Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install
Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
SmallBluewhale13 in your code what are you getting when you print the version:
` from clearml import __version__
print(__version__) `
Another thought, see what happens if you remove the .save and .close and stay with the .show, maybe the close call somehow interferes with it ?
Otherwise, if you can test one of the shap examples and see whether they fail in your setup, that's another avenue to take for reproducing the issue