Sorry again for those walls of text. I just thought that a detailed explanation of how model naming for remote models works with Ignite handlers could be helpful to somebody in the future (because I spent quite some time trying to figure out why checkpoints that were working perfectly fine locally started to overwrite one another when I added output_uri).
And one last question on top of that (sorry!), regarding the concept of OUTPUT MODELS and MODEL NAMES. For this example, I used only one saver to save the last 2 checkpoints. When a model is uploaded for the first time, the MODEL NAME in the UI is full and correct (as you can see in the first screenshot), but when it is overwritten in the following epochs, the MODEL NAME shows only the name of the experiment, and therefore all the info which was stored in the filename (like e...
Just to demonstrate the workaround I described, I will attach an example from the UI showing how it looks at the moment. Here I used 2 savers with n_saved=2, one with filename_prefix=str(date.today()) + "_val_neg_img_loss" and the other with filename_prefix=str(date.today()), therefore there are 4 output models in total. If I hadn't added "_val_neg_img_loss" to one prefix there would be only 2 models, even though (as you can see in the screenshot) in the model name the _val_neg_img_loss was used al...
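A minimal sketch of the two-saver workaround described above, assuming the savers are Ignite Checkpoint handlers. The helper names (saver_prefixes, build_checkpoint_handlers), the "last"/"best" labels, and the use of score_name are illustrative assumptions, not taken from the original setup. The only point it makes is that each saver gets its own filename_prefix, so the two savers do not share output model slots.

```python
from datetime import date


def saver_prefixes(day: date) -> dict:
    """Return one distinct filename prefix per saver (labels are hypothetical)."""
    base = str(day)
    return {
        "last": base,                         # saver keeping the most recent checkpoints
        "best": base + "_val_neg_img_loss",   # saver keeping checkpoints ranked by a metric
    }


def build_checkpoint_handlers(to_save, save_handler, score_function, day=None):
    """Build two Checkpoint handlers with n_saved=2 and distinct prefixes.

    `save_handler` is whatever saver the experiment already uses (assumed to be
    configured elsewhere, e.g. with an output_uri).
    """
    # Imported lazily so the prefix helper can be used without Ignite installed.
    from ignite.handlers import Checkpoint

    prefixes = saver_prefixes(day or date.today())
    last = Checkpoint(
        to_save, save_handler, filename_prefix=prefixes["last"], n_saved=2
    )
    best = Checkpoint(
        to_save,
        save_handler,
        filename_prefix=prefixes["best"],
        n_saved=2,
        score_function=score_function,
        score_name="val_neg_img_loss",  # assumption: metric name reused as score name
    )
    return last, best
```

With 2 savers and n_saved=2 each, this matches the 4 output models mentioned above, as long as the two prefixes differ.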
Apparently, the main cause was a big git diff file (about 10 MB because of the Jupyter Notebook), which somehow prevented any other upload to the server (not sure why, probably there is a timeout on any upload operation). I would much appreciate it if somebody knows or can guess why the git diff was causing such an issue.
@<1523701087100473344:profile|SuccessfulKoala55> very sorry to bother you via the mention! Just wanted to ask if you (or perhaps somebody else from the team) have any idea what could be the reason for the missing scalars here?
Thank you for the suggestion! These are the logs I've collected after running this sample. And just for context, I can see console logs under the console tab, output models, as well as a git diff uploaded to artifacts, but no scalars.
@<1523701205467926528:profile|AgitatedDove14> Am I right that the command from the link above is missing the URL at the end? And if so, port 9200 should be opened first, right?
I tried what I described above and got:
{"error":"Incorrect HTTP method for uri [/] and method [POST], allowed: [GET, DELETE, PUT, HEAD]"
@<1523701205467926528:profile|AgitatedDove14> I guess the main issue is the loss of the model file name, especially when the model is saved based on a metric value. As in the screenshots above, in the UI the Model Name is just the experiment name after the first epoch, and not the name of the actual model file (which is different from the stored file name on the server, got it). So to understand from which epoch these weights were saved, you would need to manually go to model...
Yes, I specified both "NEW_ADDRESS" and "OLD_ADRESS", but what I am talking about is that the command is missing a <url>. If you check the command, there are --header, --request and --data arguments but no <url>, so it returns curl: (2) no URL specified
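To illustrate the point about the missing <url>, here is a rough Python sketch of the same kind of request, with the header, method, and data each mapped to their curl counterpart and the URL made an explicit, required argument. The endpoint path and the payload are hypothetical placeholders, not the real command from the link. The check for a non-root path is an assumption based on the "Incorrect HTTP method for uri [/] and method [POST]" error quoted earlier.

```python
import json
from urllib.parse import urlsplit
from urllib.request import Request


def build_post_request(url: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request, refusing incomplete URLs."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        # Same situation as curl reporting that no URL was specified.
        raise ValueError("no URL specified")
    if parts.path in ("", "/"):
        # A POST to the bare root was rejected by the server in the message above.
        raise ValueError("URL has no path after host:port")
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),       # curl --data
        headers={"Content-Type": "application/json"},   # curl --header
        method="POST",                                  # curl --request
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_post_request(
    "http://localhost:9200/example-index/_example",
    {"example": "payload"},
)
```

Sending the request would then be a matter of passing it to urllib.request.urlopen once port 9200 is reachable.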