or have I got this wrong, and it's the clearml-agent that needs to read/write to S3?
is there any documentation for connecting to an S3 bucket?
Thanks. Where is the configuration stored on the server? Currently we have deployed an EC2 instance using the marketplace AMI - if this demo is successful we would be looking at splitting the environment into the different AWS services - logging to S3, use of Secrets Manager, Elasticsearch, Redis, MongoDB etc.
I've looked through the documentation, but didn't initially spot anything that would help with doing this (granted I may have overlooked something)
Some ideas, not all directly related to this:
- Passwords should be encrypted before being stored
- A mechanism on the server application to add/remove users, avoiding having to SSH on to the server would be nice
- Some level of permissions in the application would be nice - Admin/Owner/Viewer restrictions which would dictate what users can do and give finer control
yep, in most of them:
/opt/clearml/config
    apiserver.conf
    clearml.conf
/opt/clearml/data/elastic_7
    /nodes
/opt/clearml/data/fileserver
    <empty>
/opt/clearml/data/mongo/configdb
    <empty>
/opt/clearml/data/mongo/db
    collection/index files, /diagnostic.data, /journal etc
/opt/clearml/data/redis
    dump.rdb
/opt/clearml/logs
    apiserver.log.x, filserver.log (0 bytes)
no, that's what i'm trying to do
Morning, we got to 100% used which is what triggered this investigation. When we initially looked at overlay2 it was using 8GB, so now seems to be acceptable.
thanks @AlertBlackbird30 this is really informative. Nothing seems to be particularly out of the ordinary though
3.7G /var/lib/
3.7G /var/lib/docker
3.0G /var/lib/docker/overlay2
followed by a whole load of files that are a few hundred KBs in size, nothing huge though
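For anyone wanting a similar per-directory breakdown without du, here is a minimal sketch, assuming Python 3 is available on the instance. The function names are made up for illustration. It sums apparent file sizes rather than allocated blocks, so the totals will not match du exactly.

```python
import os

def dir_sizes(root):
    """Map each immediate subdirectory of root to the total bytes of files beneath it."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # Skip symlinks so linked files are not counted twice.
                    if not os.path.islink(path):
                        total += os.lstat(path).st_size
                except OSError:
                    # File vanished or is unreadable; ignore it.
                    pass
        sizes[entry.path] = total
    return sizes

def largest(root, top=10):
    """Return the top (path, bytes) pairs, biggest first."""
    ranked = sorted(dir_sizes(root).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]
```

Calling largest("/var/lib/docker") would need to be run as root, since the Docker directories are not readable by a normal user.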
incidentally we turn off the server every evening as it's not used overnight, we've not faced issues with it starting up in the morning or noticed any data loss
so am I right in thinking it's just the mount points that are missing, based on the output of df above?
yep still referring to the S3 credentials, somewhat familiar with boto and IAM roles
Hi @AgitatedDove14
Yes the clearml-server AMI - we want to be able to back it up and encrypt it on our account
Host key verification failed.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
- What are the credentials that are referred to in the logs? Where do I get these?
- How do I ensure this container is run automatically if it keeps restarting using the docker-compose file?
- If I run
sudo docker run -it allegroai/clearml-agent-services:latest
manually, do I need to put this in a crontab? My knowledge of running docker containers is a bit limited.
- Does this service need to run if I already have a clearml-agent running on a separate instance?
Is there a way you can allow our account to make a copy of the AMI and store it privately?
thanks, i'll try that out
have 2 listeners set up. LB 80 > instance 8080 and LB 443 > instance 8080
ok i'll try that out thanks
Thank you very much 🙂 I don't think our Data team ever use this container so I will stop it for now and comment it out of the compose file
not yet, going to try and fix it today.
if I do a df I see this, which is concerning:
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  928K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/nvme0n1p1   20G  7.9G   13G  40% /
tmpfs           790M     0  790M   0% /run/user/1000
so it looks like the mount points are not created. When do these g...
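A quick way to confirm which mount points are absent from the df output above is sketched below. The helper names are hypothetical, and which paths should be separate mounts is an assumption here: /opt/clearml is used only because it is where the config and data directories live. Mount points containing spaces would not parse correctly.

```python
def mounted_on(df_output):
    """Return the set of mount points from the last column of df output."""
    lines = df_output.strip().splitlines()
    # Skip the header row; the mount point is the last whitespace-separated field.
    return {line.split()[-1] for line in lines[1:] if line.strip()}

def missing_mounts(df_output, expected):
    """Return the expected mount points that df did not list, in the given order."""
    present = mounted_on(df_output)
    return [mount for mount in expected if mount not in present]

# The df output quoted in the message above.
DF_OUTPUT = """\
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 928K 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/nvme0n1p1 20G 7.9G 13G 40% /
tmpfs 790M 0 790M 0% /run/user/1000
"""

# Assumption: /opt/clearml is expected to be its own mount point.
EXPECTED = ["/", "/opt/clearml"]

print(missing_mounts(DF_OUTPUT, EXPECTED))  # prints ['/opt/clearml']
```

On the server itself the same check could be fed live output, for example from subprocess.run(["df", "-h"], capture_output=True, text=True).stdout.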
our setup currently consists of an EC2 instance for clearml-server and one for clearml-agent. We're not using a load balancer at the moment.
Hi @AgitatedDove14 I tried this out, but I keep getting connection timeouts in the browser when connecting to the ELB. The instance is showing as InService and passing the health check. Is there any other configuration I need to do in the clearml.conf to make this work?
After making the change yesterday to the docker-compose file, the server is completely unusable - this is all I see for the /dashboard screen