Yes, I was expecting that it was already working like that. So far, I modified the code to set DOCKER_ROOT_CONF_FILE to what I want!
I saw this part of the configuration file, but I don't know exactly which key is used as the mount point for the configuration file.
I think I have my answer; this is hard-coded in the agent:

```python
base_cmd += (
    (['--name', name] if name else [])
    + ['-v', conf_file + ':' + DOCKER_ROOT_CONF_FILE]
    + (['-v', host_ssh_cache + ':' + mount_ssh] if host_ssh_cache else [])
    + ...
)
```
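Roughly what my modification looks like, as a sketch (the CLEARML_AGENT_DOCKER_CONF_FILE variable name below is made up for illustration, it is not an official clearml-agent setting; base_cmd, name, conf_file, host_ssh_cache and mount_ssh come from the agent code quoted above):

```python
import os

# Hypothetical override: read the target path from an env var,
# falling back to the original hard-coded default.
DOCKER_ROOT_CONF_FILE = os.environ.get(
    "CLEARML_AGENT_DOCKER_CONF_FILE",  # made-up variable name
    "/root/clearml.conf",              # the original hard-coded value
)

base_cmd += (
    (['--name', name] if name else [])
    + ['-v', conf_file + ':' + DOCKER_ROOT_CONF_FILE]
    + (['-v', host_ssh_cache + ':' + mount_ssh] if host_ssh_cache else [])
)
```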
But after all these modifications, I succeeded in using the clearml-agent. 👍 Great job! Thanks to the ClearML team.
The mount point for the clearml.conf, i.e. '-v', '/tmp/.clearml_agent.qy2xyt21.cfg:/root/clearml.conf', because the Docker container I use runs as a non-root user and doesn't have access to /root.
Maybe this is defined on the clearml-server side? I use my own server installed on another Linux box using docker-compose.
I use a proxy and the port is 80; do I need to write it?
Yes, so far I came back to the old address 🙂
Hello. I think an Issue should at least be opened. Modifications in my code need to be generalized before creating a pull request.
Yes, I even got an "upload finished" message and the whole process runs to the end.
For the clearml package I use version 1.4.1.
only a "upload failed" and no data in my S3 bucket
As far as I know, a server gets a SIGPIPE on a socket when a client dies too soon or the connection is closed by the user, but I don't know who gets the broken pipe. Is it the ClearML fileserver (which masters the upload, I guess)? Is it due to my MinIO server? Who is the client that died before the upload finished?
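To see who gets it, here is a tiny standalone illustration (nothing ClearML-specific): the broken pipe always hits the side that is still writing after the peer has closed.

```python
import socket

# Toy server/client pair on localhost.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()
cli.close()  # the "client that died too soon"

try:
    for _ in range(100):  # keep writing until the kernel reports the dead peer
        conn.sendall(b"x" * 65536)
except (BrokenPipeError, ConnectionResetError) as exc:
    print("the writer gets the error:", exc)
```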
But I don't want to create a new dataset; my dataset exists and has already been downloaded by a previous task.
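For reference, getting an existing dataset with the clearml SDK looks something like this (project and name below are placeholders):

```python
from clearml import Dataset

# Fetch the existing dataset instead of creating a new one.
dataset = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")
local_path = dataset.get_local_copy()  # cached, so a previous download is reused
print(local_path)
```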
The addresses seem strange; is this the hostname?
I use the nip.io service to have subdomains, clearml.domain, api.domain and file.domain, that all point to the same host.
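For illustration, with these endpoints the SDK can also be pointed at the server programmatically instead of through clearml.conf (the credentials below are placeholders):

```python
from clearml import Task

Task.set_credentials(
    api_host="http://api.10.68.0.250.nip.io",
    web_host="http://clearml.10.68.0.250.nip.io",
    files_host="http://file.10.68.0.250.nip.io",
    key="<access_key>",     # placeholder
    secret="<secret_key>",  # placeholder
)
```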
What I don't understand is the list of artifacts that were not deleted.
I tried, but it was not so easy, because there is a Python executable, "update_from_env", that empties the configuration.json file. So I created a file in /mnt/external_files/configs and my configuration.json was read.
DeterminedCrab71 You're right. If I understand correctly, when HTTP.FILE_BASE_URL is undefined, the file to delete is described as "misc" instead of "fc", so I guess the system is unable to launch the deletion of the file.
Files are stored on the same box where Docker is running, and there is a mount point between the fileserver container and the host itself.
No particular information in the console (no error), and no network error either.
I don't understand why, after specifying /root/clearml.conf, a copy is required to /root/default_clearml.conf. I modified this copy in the code to one that takes a user mount point and copies it to the home directory, ~/clearml.conf.
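Roughly, the replacement copy looks like this (the /mnt path below is just an example of a user-chosen mount point):

```python
import os
import shutil

# Source: wherever the configuration file is mounted inside the container
# (illustrative path, not the agent's actual default).
mounted_conf = "/mnt/clearml/clearml.conf"
# Destination in the current user's home, readable by a non-root user.
home_conf = os.path.expanduser("~/clearml.conf")

shutil.copyfile(mounted_conf, home_conf)
```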
I tried to modify all the docker_internal_mounts entries, but the mount point for the clearml.conf file still remains the same. Maybe it is defined on the server side?
My files (fs) are deleted, but I have the same issue as reported by SuperiorPanda77, with some undefined value that is said not to be deleted. I guess that since my command deleteFileServerSources works but exits with some strange return value, the other commands in the chain, addFailedDeletedFiles and deleteProjectFromRoot, are not executed (file src/app/webapp-common/shared/entity-page/entity-delete/base-delete-dialog.effects.ts).
It seems that I should define this variable through an environment variable in ConfigurationService.globalEnvironment.
I was unable to define FILE_BASE_URL inside the Docker container. I modified the HTTP constant in app.constants.ts with hard-coded values, compiled the webapp again (npm), and replaced it in my Docker container, and now it works...
ClearML results page: http://clearml.10.68.0.250.nip.io/projects/300ec77013504f51a7f295226c3f7e40/experiments/5418cf58b64f425a9a17fbd4af6cfee8/output/log

```
Traceback (most recent call last):
  File "/app/.clearml/venvs-builds/3.8/code/__init__.py", line 287, in <module>
    [train_data, test_data, train_loader, test_loader, nb_class] = import_data(root_database, train_path, test_path,
  File "/app/.clearml/venvs-builds/3.8/code/__init__.py", line 153, in import_data
...
```
As my ClearML server is run using Docker, I have no idea where http://clearml.10.68.0.250.nip.io/projects/300ec77013504f51a7f295226c3f7e40/experiments/5418cf58b64f425a9a17fbd4af6cfee8/output/log is exactly stored.
My configuration.json is { "fileBaseUrl": "http://file.10.68.0.250.nip.io" }, but HTTP.FILE_BASE_URL still remains undefined. Is something missing?
sudo docker logs clearml-fileserver
This gives no info at all. Maybe I should increase the log level to debug. The only message I got is about "werkzeug", the default server module of Flask, which shouldn't be used in a production deployment (by the way, why not use gunicorn as the entrypoint in docker-compose?).
My artifacts are now deleted, but the directories where the artifacts were stored are not deleted.
In docker-compose, the image was allegroai/clearml:latest when I pulled the Docker images. When I launch it, after installation the WebApp shows the following information: "WebApp: 1.3.1-169 • Server: 1.3.1-169 • API: 2.17".