Yes, it looks like this. I just wanted to understand whether it should be this slow, or whether I did something wrong.
SuccessfulKoala55 it is still stuck on the same line. Should it be like this?
Hi SuccessfulKoala55 ,
I took the server down:
` [ec2-user@ip-172-31-26-41 ~]$ sudo docker-compose -f /opt/clearml/docker-compose.yml down
WARNING: The CLEARML_HOST_IP variable is not set. Defaulting to a blank string.
WARNING: The CLEARML_AGENT_GIT_USER variable is not set. Defaulting to a blank string.
WARNING: The CLEARML_AGENT_GIT_PASS variable is not set. Defaulting to a blank string.
Stopping clearml-webserver ... done
Stopping clearml-agent-services ... done
Stopping clearml-apiserver...
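The warnings in the output above say that docker-compose substituted blank strings for three variables. A minimal sketch for listing which of them are unset before running docker-compose could look like this; the helper name `unset_vars` is made up for illustration, and whether these variables actually need values depends on the setup.

```python
import os

# Variable names copied from the docker-compose warnings above.
COMPOSE_VARS = [
    "CLEARML_HOST_IP",
    "CLEARML_AGENT_GIT_USER",
    "CLEARML_AGENT_GIT_PASS",
]


def unset_vars(names=COMPOSE_VARS, env=None):
    """Return the names that are missing or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    for name in unset_vars():
        print(f"{name} is not set; docker-compose will use a blank string")
```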
I updated to the new version 0.16.1 a few weeks ago and it worked using elastic_upgrade.py
Hi DeterminedCrab71 .
Thanks 🙂
Thanks AgitatedDove14 ,
I need to check with my boss that it is OK to share more code, will let you know..
But I will give 0.16 a try when it is released.
🙏
AgitatedDove14 Maybe I need to change something here: apiserver.conf
to increase the number of workers?
Hi SuccessfulKoala55 , yes, for now I would like to start by moving what is inside /opt/trains/data/fileserver,
because as I understand it the logs and graphs are saved in Elastic, so I think it will not be easy to move them as well, right?
Regarding moving the fileserver to S3, what is the best way to move the old data to S3?
I think if I move all of /opt/trains/data/fileserver to S3,
the trains-server will not know about it, right?
If I mount the S3 bucket on the trains-server and link the mount to /opt/trains/data/fileserver, will it work?
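To make the question concrete, here is a minimal sketch of how files under /opt/trains/data/fileserver could map to S3 object URIs. It only plans the mapping and uploads nothing. The bucket name `my-trains-bucket` and the helpers `s3_uri_for` and `plan_upload` are assumptions for illustration, not part of trains. The copy itself could be done with a tool such as `aws s3 sync`. Any URLs the server has already stored would presumably still point at the old fileserver address.

```python
from pathlib import Path, PurePosixPath

FILESERVER_ROOT = Path("/opt/trains/data/fileserver")  # path from the thread
BUCKET = "my-trains-bucket"  # hypothetical bucket name


def s3_uri_for(local_file, root=FILESERVER_ROOT, bucket=BUCKET, prefix=""):
    """Map one file under the fileserver root to an s3:// URI."""
    rel = Path(local_file).relative_to(root)
    key = PurePosixPath(prefix, *rel.parts)  # keep the same relative layout
    return f"s3://{bucket}/{key}"


def plan_upload(root=FILESERVER_ROOT, bucket=BUCKET, prefix=""):
    """Return {local path: S3 URI} for every file below root (no upload)."""
    root = Path(root)
    return {
        str(path): s3_uri_for(path, root, bucket, prefix)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    for local, remote in plan_upload().items():
        print(local, "->", remote)
```

Keeping the relative layout identical on both sides means a later rewrite of stored URLs would only need to swap the prefix.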
I just need it to run the docker and run the command inside it, no?
Thanks I am basing my docker on https://github.com/facebookresearch/detectron2/blob/master/docker/Dockerfile
Thanks, I will upgrade my instance type and then add more workers. Where do I need to configure it?
I haven't tried trains-agent yet. Does it support using AWS Batch?
Thanks I will upgrade the server for now and will let you know
Thanks!! you are the best..
I will give it a try when the runs finish
Oh, I understand. Can you give me a short explanation of how to change the metadata?
Will it still work if I keep trains.conf like this and also mount the S3 bucket?
Hi AgitatedDove14 ,
Sorry for the late response, it was late in my country 🙂.
This is what I am getting:
` appuser@219886f802f0:~$ sudo su root
root@219886f802f0:/home/appuser# whoami
root `
Thanks, I will make sure that all the Python packages are installed as root.
And will let you know if it works
I did it just because FAIR did it in the detectron2 Dockerfile
It is now stuck after:
` 2021-03-09 14:54:07
task 609a976a889748d6a6e4baf360ef93b4 pulled from 8e47f5b0694e426e814f0855186f560e by worker ov-01:gpu1
2021-03-09 14:54:08
running Task 609a976a889748d6a6e4baf360ef93b4 inside default docker image: MyDockerImage:v0
2021-03-09 14:54:08
Executing: ['docker', 'run', '-t', '--gpus', '"device=1"', '-e', 'CLEARML_WORKER_ID=ov-01:gpu1', '-e', 'CLEARML_DOCKER_IMAGE=MyDockerImage:v0', '-v', '/tmp/.clearml_agent.jvxowhq4.cfg:/root/clearml.conf', '-v', '/...
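The `Executing:` line above is printed as a Python-style argument list. One way to turn such a list into a shell-quoted string for manual debugging is `shlex.join` (Python 3.8+). This sketch uses only the arguments visible before the log is cut off, so the result is a fragment and not the full command the agent ran; `to_shell` is a made-up helper name.

```python
import shlex

# Only the arguments visible in the log before it is truncated;
# the remaining arguments are unknown and intentionally left out.
visible_args = [
    "docker", "run", "-t",
    "--gpus", '"device=1"',
    "-e", "CLEARML_WORKER_ID=ov-01:gpu1",
    "-e", "CLEARML_DOCKER_IMAGE=MyDockerImage:v0",
]


def to_shell(args):
    """Quote each argument so the string can be pasted into a POSIX shell."""
    return shlex.join(args)


if __name__ == "__main__":
    print(to_shell(visible_args))
```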
So I asked my boss and DevOps, and they said that for now we can use the root
user inside the docker image.