CooperativeSealion8 let me know if you managed to solve the issue; also feel free to send the entire trains-server log. I'm assuming one of the dockers failed to boot...
Hi @<1590514584836378624:profile|AmiableSeaturtle81>
I think you should use add_external_files instead of add_files (which is for local files)
You cannot change the 8008 port; it has to be 8008 externally (i.e. from the client side).
You can, however, use subdomains, but only these will work: api.mydomain.com, app.mydomain.com, files.mydomain.com
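If it helps, the client-side clearml.conf would then point at those three subdomains; a rough sketch (the hostnames are placeholders, and https assumes you terminate TLS at your proxy):

```
api {
    web_server: https://app.mydomain.com
    api_server: https://api.mydomain.com
    files_server: https://files.mydomain.com
}
```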
Hmmm why don't you use "series" ?
(Notice that with iterations, there is a limit to the number of images stored per title/series , which is configurable in trains.conf, in order to avoid debug sample explosion)
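For reference, the limit lives under the sdk.metrics section of trains.conf; a sketch of the relevant entry (the default value shown here is an assumption):

```
sdk {
    metrics {
        # maximum number of debug samples (images) kept per title/series combination
        file_history_size: 100
    }
}
```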
We should probably make sure it is properly stated in the documentation...
CloudyHamster42
RC probably in a few days, but notice that it will just remove the warnings, I still can't reproduce the double axis issue.
It will be helpful if you could send a small script to reproduce the problem.
Maybe this example code can help ? https://github.com/allegroai/trains/blob/master/examples/manual_reporting.py
Where are you seeing this message?
CourageousLizard33 if the two series are on the same graph, just click on the series in the legend, you can enable/disable it, and the scale will adjust automatically.
Regarding grouping: this is a feature that can be turned off. The idea is that we split the tag into title/series, so if you have the same prefix the TF scalars are grouped on the same graph; otherwise they end up on graphs with different titles. That said, you can force it to have a series per graph, like in TB. Makes sense?
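The prefix-based grouping can be sketched in a few lines; this is an illustration of the idea, not ClearML's actual parsing code:

```python
def split_tag(tag: str):
    """Split a TF-style scalar tag into (title, series).

    Tags sharing the prefix before the last '/' get the same
    title and therefore land on the same graph; a tag without
    a separator becomes its own graph.
    """
    if "/" in tag:
        title, _, series = tag.rpartition("/")
        return title, series
    return tag, tag

print(split_tag("train/loss"))      # -> ('train', 'loss')
print(split_tag("train/accuracy"))  # -> ('train', 'accuracy')  (same graph as above)
print(split_tag("lr"))              # -> ('lr', 'lr')  (its own graph)
```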
Hi SmugTurtle78
Unfortunately there is no actual filtering for these logs, because they are so important for debugging and visibility. I have to ask, what's the use case to remove some of them ?
Hi @<1534706830800850944:profile|ZealousCoyote89>
We'd like to have pipeline A trigger pipeline B
Basically a Pipeline is a Task (of a specific Type), so you can have pipeline A function clone/enqueue the pipelineB Task, and wait until it is done. wdyt?
Hi ThankfulOwl72, check out the TrainsJob object. It should essentially do what you need:
https://github.com/allegroai/trains/blob/master/trains/automation/job.py#L14
Hi @<1643060801088524288:profile|HarebrainedOstrich43>
try this RC and let me know if it works 🙂
pip install clearml==1.13.3rc1
Hi SubstantialElk6
We will be running some GUI applications so is it possible to forward the GUI to the clearml-session?
If you can directly access the machine running the agent, yes you could. If not, a reverse proxy is in the works 😉
We have a rather locked down environment so I would need a clear view of the network view and the ports associated.
Basically all connections are outgoing only, with the exception of the clearml-server (listening on ports 8008, 8080, 8081)
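For reference, those three ports map to the clearml-server components roughly like this in the default docker-compose (the exact container-side ports are an assumption):

```yaml
services:
  apiserver:
    ports:
      - "8008:8008"   # API server: SDK clients and agents connect here
  webserver:
    ports:
      - "8080:80"     # Web UI
  fileserver:
    ports:
      - "8081:8081"   # file/artifact storage
```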
Hi @<1704304350400090112:profile|UpsetOctopus60>
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_kubernetes_helm
Just use the helm charts, it's the easiest way.
try these values:
import os
from clearml import Task

os.environ.update({
    'CLEARML_VCS_COMMIT_ID': '<commit_id>',
    'CLEARML_VCS_BRANCH': 'origin/master',
    'CLEARML_VCS_DIFF': '',
    'CLEARML_VCS_STATUS': '',
    'CLEARML_VCS_ROOT': '.',
    'CLEARML_VCS_REPO_URL': '<repo_url>',
})
task = Task.init(...)
Hi DeliciousBluewhale87
Yes that should have worked, can you verify the task status ?
print(Task.get_task(...).get_status())
SourOx12
Run this example:
https://github.com/allegroai/clearml/blob/master/examples/reporting/scalar_reporting.py
Once, then change line #26 to:
task = Task.init(project_name="examples", task_name="scalar reporting", continue_last_task=True)
and run again.
Follow-up question: how does ClearML "inject" the argparse arguments before the task is initialized?
It patches the actual parse_args call. To make sure it works, you just need to make sure argparse was imported before the actual call takes place.
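A minimal sketch of the patching mechanism; this illustrates the idea of wrapping parse_args, not ClearML's actual implementation:

```python
import argparse

# Keep a reference to the original method, then replace it with a wrapper.
_original_parse_args = argparse.ArgumentParser.parse_args

def _patched_parse_args(self, args=None, namespace=None):
    parsed = _original_parse_args(self, args=args, namespace=namespace)
    # A real integration could record or override the parsed values here,
    # e.g. injecting parameters fetched from the server.
    print("intercepted:", vars(parsed))
    return parsed

argparse.ArgumentParser.parse_args = _patched_parse_args

# Any parser created after the patch is transparently intercepted:
parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
args = parser.parse_args([])  # prints: intercepted: {'lr': 0.001}
```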
I had to do another workaround, since when torch.distributed.run called its ArgumentParser, it was getting the arguments from my script (and from my task) instead of the ones I passed it
Are you saying...
Since I can't use the torchrun command (from my tests, clearml won't use it on the clearml-agent), I went with the
@<1556450111259676672:profile|PlainSeaurchin97> did you check this example?
Hmm yeah I can see why...
Now that I think about it, at least in theory the second process that torch creates should inherit from the main one, and as such Task.init is basically "ignored"
Now I wonder why your first version of the code did not work?
Could it be that we patched the argparser on the subprocess and that we should not have?
Hi @<1726410010763726848:profile|DistinctToad76>
Why not just report scalars? You can use the x-axis as "iterations" if this is running in real time to collect the prompts.
If this is a summary then just report a scatter plot (you can also specify the names of the axis and the series)
Hi @<1523706266315132928:profile|DefiantHippopotamus88>
The idea is that clearml-server acts as a control plane and can sit on a different machine; obviously you can run both on the same machine for testing. Specifically, it looks like the clearml-serving is not configured correctly, as the error points to an issue with the initial handshake/login between the triton containers and the clearml-server. How did you configure the clearml-serving docker compose?
Hi @<1523704207914307584:profile|ObedientToad56>
What would be the right way to extend this with, let's say, a custom engine that is currently not supported?
as you said 'custom' 🙂
This is actually a custom engine (see (3) in the readme, and the preprocessing.py implementing it). I think we should actually add a specific example for a custom engine so this is more visible. Any thoughts on what would...
@<1734020162731905024:profile|RattyBluewhale45> could you attach the full Task log? Also what do you have under "installed packages" in the original manual execution that works for you?
We’d be using https in production
Nice 🙂
@<1687653458951278592:profile|StrangeStork48> , I was reading this thread trying to understand what exactly is the security concern/fear here, and I'm not sure I fully understand. Any chance you can elaborate ?
Assuming it was hashed, the seed would be stored on the same server, so knowing both would allow me the same access, no?
In the sidebar you get the titles of the graphs; then when you click on them you can see the different series on the graphs themselves