
I don’t think it’s possible: the checkpoints are output models, and as such they can be accessed from the task, but from the model class I can only get the URL of the last version (which is what I expected from using update_weights, since it sounds like it replaces the old weights with the new ones). Also, even if I could get the URLs of the old checkpoints, how would I delete them from the file server? StorageManager doesn’t seem to have any method for deleting remote files
Yeah, but it doesn’t make sense for the name of the file being uploaded to have any influence over the process, since the models specifically have a name to identify them. Moreover, if those files remain on the server but are inaccessible through the API, that’s a bug in any case, I’d say
Using the webserver? On the PC I’m accessing the server from I don’t even have a clearml.conf file in my home directory, nor would I expect it to be used anyway… Is there any other instance of such a configuration file on the server that I’m not aware of?
I set this in the webserver section of the docker-compose, but it didn’t help (the double quotes are there because of a problem in the parsing of the arguments; it has already been reported on GitHub, where I read about this fix):
environment:
  WEBSERVER__fileBaseUrl: '"http://192.168.1.83:8081"'
  WEBSERVER__useFilesProxy: 'true'
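For context, this is roughly where those overrides sit in the stock docker-compose.yaml (a sketch; the `webserver` service name comes from the default ClearML deployment, and the IP is just the one from my setup):

```yaml
services:
  webserver:
    # ...rest of the stock service definition left unchanged...
    environment:
      WEBSERVER__fileBaseUrl: '"http://192.168.1.83:8081"'
      WEBSERVER__useFilesProxy: 'true'
```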
The container is still running and doesn’t show any log entry when I start the trigger scheduler remotely
The URL there has localhost as the host instead of the IP
The log of the “clearml-agent-services” container only contained this:
I noticed the problem in the preview section of the dataset files: they cannot be shown because they point to localhost, but if I click on “open image” and then replace localhost with the server IP, everything works as expected
The version of the SDK is 1.17.1
Both the web app and server versions are 2.0.0-613; the API is 2.31
Yes, in fact, if I take the URLs of the files that the webserver provides and replace the localhost part with that IP, I can view the underlying data from my browser
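The manual fix I keep doing by hand can be sketched as a small helper (the function name and the example URL are mine, not from the ClearML API; it just swaps the host of a URL while keeping the scheme, port, and path):

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_host(url: str, new_host: str) -> str:
    """Replace the host of a URL, keeping scheme, port, path, and query."""
    parts = urlsplit(url)
    port = f":{parts.port}" if parts.port else ""
    return urlunsplit(
        (parts.scheme, f"{new_host}{port}", parts.path, parts.query, parts.fragment)
    )

# Example: point a localhost files-server URL at the server's LAN IP
print(rewrite_host("http://localhost:8081/files/dataset/img.png", "192.168.1.83"))
# http://192.168.1.83:8081/files/dataset/img.png
```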
Exactly, I guess that’s the problem, probably caused by start_remotely() accidentally terminating the process. The server is 2.0.0-613
I see how this can be a problem, at least when someone wants to migrate the machine to a new domain/IP. In any case, a warning in the documentation would be useful: the default deployment shown uses localhost, but it breaks as soon as someone tries to access that data from the local network instead
I only used the env variables I mentioned (I also checked inside the docker-compose.yaml and noticed that only CLEARML_HOST_IP doesn’t have a default value, so I tried setting only that env variable, but the result didn’t change). I don’t have any other configuration besides the apiserver.conf in /opt/clearml/config with the users. I definitely haven’t seen any configuration.json file so far. PS: the docker-compose.yaml is just the one from the repo, without any changes to it.
Do I maybe need to do something more that I wasn’t aware of to start the trigger scheduler on the services queue? Or is it better at this point to just run a script manually on the machine and call start() instead of start_remotely()?
I just checked, and indeed the clearml.conf file of the user that I used to upload the dataset has the various server hosts set to localhost (since for this testing that user was on the same machine as the server). Is this expected behavior? I would have thought that such a config only influences the connection between the user’s CLI/SDK and the server, but that once the data is uploaded it’s the server’s duty to provide the right URL for whoever is accessing the data
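For the record, a sketch of what the relevant clearml.conf section would look like if the client pointed at the LAN IP instead of localhost (the ports are the ClearML defaults; the IP is the one from my setup and is an assumption for anyone else’s deployment):

```
api {
    api_server: http://192.168.1.83:8008
    web_server: http://192.168.1.83:8080
    files_server: http://192.168.1.83:8081
}
```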