I didn't get to test it on the cloud yet and am still trying to make final adjustments
if I don't have internet connection on the other machine, can I just copy the artifacts and transfer them to my local machine?
Hi AgitatedDove14 , regarding the slider feature, do you know when it will be released?
AgitatedDove14 the option you mentioned just before sounds much better for me. I must admit I find the name of the method confusing. I came across it before but thought it's only relevant for credentials
hi AgitatedDove14 , when I'm using the set_credentials approach does it mean the trains.conf is redundant? if the file doesn't exist on the machine, will it be an issue? if not, what defaults should I assume for the rest of the values?
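For reference, a minimal sketch of the set_credentials approach being asked about, assuming the `trains` package and its `Task.set_credentials` classmethod called before `Task.init`. The hosts, key and secret are placeholders, and the helper names `build_credentials` and `init_with_credentials` are hypothetical. It does not settle which defaults apply to the remaining trains.conf values.

```python
def build_credentials(api_host=None, web_host=None, files_host=None,
                      key=None, secret=None):
    """Return keyword arguments for Task.set_credentials, dropping unset values."""
    values = {
        "api_host": api_host,
        "web_host": web_host,
        "files_host": files_host,
        "key": key,
        "secret": secret,
    }
    return {name: value for name, value in values.items() if value}


def init_with_credentials(project_name, task_name, credentials, task_cls=None):
    """Set credentials in code, then create the task.

    task_cls exists only so the sketch can be exercised without a server;
    by default the real trains Task class is imported lazily.
    """
    if task_cls is None:
        from trains import Task as task_cls  # assumption: trains is installed
    # Credentials have to be set before Task.init opens the session.
    task_cls.set_credentials(**credentials)
    return task_cls.init(project_name=project_name, task_name=task_name)


if __name__ == "__main__":
    # Placeholder values, not real endpoints or keys.
    creds = build_credentials(
        api_host="http://localhost:8008",
        web_host="http://localhost:8080",
        files_host="http://localhost:8081",
        key="<access key>",
        secret="<secret key>",
    )
    print(sorted(creds))
```

Only the five connection values are passed here; anything else that normally lives in trains.conf is left to whatever the library falls back to.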
AgitatedDove14 a single experiment, that is being paused and resumed.
inconsistency in the reporting: when resuming at the 10th epoch, for example, and doing an extra epoch, the clearml iteration count is wrong for debug images and monitored metrics.. somehow not for the scalar reporting
Thanks AgitatedDove14 , well if a machine doesn't set the default_output_uri, the default behavior for model checkpoints for example is to just register without uploading. So in the case that the default_output_uri is not defined the offline task folder will not have the artifacts for uploading (not included in the zip file created by offline package).. or am I missing something?
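For context, the offline flow under discussion can be sketched roughly as below, assuming the `clearml` package with `Task.set_offline` and `Task.import_offline_session`. The helper names and the zip path are hypothetical, and the sketch does not answer whether non-uploaded checkpoints end up inside the zip.

```python
def record_offline(project_name, task_name, task_cls=None):
    """Run on the machine without server access; results are stored locally.

    task_cls exists only so the sketch can be exercised without clearml;
    by default the real clearml Task class is imported lazily.
    """
    if task_cls is None:
        from clearml import Task as task_cls  # assumption: clearml is installed
    # Offline mode has to be switched on before the task is created.
    task_cls.set_offline(offline_mode=True)
    task = task_cls.init(project_name=project_name, task_name=task_name)
    # ... training and reporting code would go here ...
    task.close()
    return task


def import_on_server_machine(session_zip, task_cls=None):
    """Run later, on a machine that can reach the server, with the copied zip."""
    if task_cls is None:
        from clearml import Task as task_cls  # assumption: clearml is installed
    return task_cls.import_offline_session(session_folder_zip=session_zip)
```

The first function would run on the isolated machine; the resulting session zip is then copied over by hand and passed to the second function, e.g. `import_on_server_machine("/path/to/offline_session.zip")` with a path of your own.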
thanks SuccessfulKoala55 , the question arose after trying to follow the instructions you attached. it seems that installing Docker on Windows 10 Home is somewhat problematic
AgitatedDove14 it is happening on an offline network, so it would be tricky to set up, but we will try. so far the errors we observed were either:
Calling upload callback when starting upload: maximum recursion depth exceeded
Or
something like pending for upload (might be because we archived a run while it was uploading)
AgitatedDove14 , I want multiple machines to access the synced state of the optimizer. which is part of the internals of the optimizer... and then report the results back to the optimizer such that the study object of the optimizer keeps track of the results and the next sample will be aware of all previous studies
so it sounds like there is no known issue related to this
Hi AgitatedDove14 the thing I had in mind is having access to trains logger exclusive features like the https://allegro.ai/docs/logger.html#trains.logger.Logger.report_plotly and .report_table for example.. It can be done by explicitly getting the trains default logger, but I was wondering if there is some kind of combined interface to capture properties of both in one object, especially because I came across the deprecated TrainsLogger
and latest pre release hydra
Hi AgitatedDove14 , path to the config file for trains manual execution
edit: tweaked it a little bit for my use-case:
is_demo_server = 'http://demoapi.trains.allegro.ai' in Session.get_api_server_host()
is_server_available = requests.get(Session.get_api_server_host() + "/debug.ping").status_code == 200
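A slightly more defensive variant of the check above, as a sketch only: it takes the API host as a plain string (in practice the value of `Session.get_api_server_host()`), uses the standard library instead of requests, and treats connection errors and timeouts as "not available" instead of raising. The function names and the 3 second timeout are my own choices.

```python
import urllib.error
import urllib.request

DEMO_API_HOST = "http://demoapi.trains.allegro.ai"


def is_demo_server(api_host):
    """True when the configured API host points at the public demo server."""
    return DEMO_API_HOST in api_host


def is_server_available(api_host, timeout=3.0):
    """Ping <api_host>/debug.ping and report whether it answered with HTTP 200."""
    url = api_host.rstrip("/") + "/debug.ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, timeout, non-2xx answer or malformed URL.
        return False
```

With this, an unreachable server simply gives `False`, so the caller can decide to skip reporting or fall back to something else.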
yes that's what I meant.. this is good, thanks
I'm doing this instead
I think the latter. the specific use-case I'm talking about is running experiments on one machine, and using a local server on another machine to read the "logs" \ artifacts
by WebApp you mean the public online one? I might be confusing stuff
by communication, do you mean that the artifacts are streamed from the machine running the experiments to the local server?
can it be done "offline", i.e. after the experiments run, collect the results and view them in my local server?
yes, I have limited access to the machine that is running the experiment. I can't setup a server there. but I want to collect the results and view them later
I refer to all the info that is accessible through the WebApp
much appreciated, thanks!