is there any documentation for connecting to an S3 bucket?
Thanks. Although it's AWS-related, the context was an error we see within ClearML: "ValueError: Insufficient permissions for None"
yep, still referring to the S3 credentials; somewhat familiar with boto and IAM roles
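For anyone following along: the SDK-side S3 credentials normally live in clearml.conf under sdk.aws.s3. A minimal sketch (the key, secret, and region values are placeholders; if you rely on an instance IAM role instead, leaving them empty so boto falls back on its default credential chain may be what you want, but treat that as an assumption to verify):

```
sdk {
    aws {
        s3 {
            # default credentials used for all buckets (placeholders)
            key: "AWS_ACCESS_KEY"
            secret: "AWS_SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```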
not yet, going to try and fix it today.
if I do a df I see this, which is concerning:
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 928K 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/nvme0n1p1 20G 7.9G 13G 40% /
tmpfs 790M 0 790M 0% /run/user/1000
so it looks like the mount points are not created. When do these g...
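If the data directories really were never created, something along these lines should recreate the layout (a sketch; the subdirectory names are assumptions based on the default /opt/clearml layout, and the temp-dir fallback is only for safe dry runs — on the real server you would set CLEARML_ROOT=/opt/clearml and run with sudo):

```shell
# Sketch: recreate the default ClearML server data/log directories.
# Subdirectory names are assumed from the standard /opt/clearml layout.
CLEARML_ROOT="${CLEARML_ROOT:-$(mktemp -d)/clearml}"   # use /opt/clearml on a real server
mkdir -p \
  "$CLEARML_ROOT/data/elastic_7" \
  "$CLEARML_ROOT/data/mongo/db" \
  "$CLEARML_ROOT/data/mongo/configdb" \
  "$CLEARML_ROOT/data/redis" \
  "$CLEARML_ROOT/data/fileserver" \
  "$CLEARML_ROOT/logs" \
  "$CLEARML_ROOT/config"
# list what was created
find "$CLEARML_ROOT" -type d | sort
```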
no, that's what i'm trying to do
Hi @<1523701205467926528:profile|AgitatedDove14> I tried this out, but I keep getting connection timeouts in the browser when getting to the ELB. The instance is showing as InService and passing the health check. Is there any other configuration I need to do in the clearml.conf to make this work?
have 2 listeners set up: LB 80 > instance 8080 and LB 443 > instance 8080
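One thing worth checking: the ClearML server exposes three services on separate ports (web UI on 8080, API on 8008, file server on 8081), so forwarding everything to 8080 only reaches the web UI. The client-side clearml.conf would then point at the ELB for each endpoint, roughly like this (the hostname is a placeholder, and this assumes you add ELB listeners forwarding to 8008 and 8081 as well):

```
api {
    # placeholders - replace with your ELB DNS name
    web_server: http://your-elb-dns-name
    api_server: http://your-elb-dns-name:8008
    files_server: http://your-elb-dns-name:8081
}
```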
Host key verification failed.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
thanks @<1523715084633772032:profile|AlertBlackbird30> this is really informative. Nothing seems to be particularly out of the ordinary though
3.7G /var/lib/
3.7G /var/lib/docker
3.0G /var/lib/docker/overlay2
followed by a whole load of files that are a few hundred KBs in size, nothing huge though
Yep, I've done all that; it didn't seem to work until I set the deploy key to write
Just by chance I set the SSH deploy keys to write access and now we're able to clone the repo. Why would the SSH key need write access to the repo to be able to clone?
- What are the credentials that are referred to in the logs? Where do I get these?
- How do I ensure this container is run automatically if it keeps restarting using the docker-compose file?
- If I run sudo docker run -it allegroai/clearml-agent-services:latest manually, do I need to put this in a crontab? My knowledge of running docker containers is a bit limited.
- Does this service need to run if I already have a clearml-agent running on a separate instance?
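On the crontab question: rather than scheduling it, docker-compose can keep the container running for you with a restart policy. A sketch (the image name comes from the command above; the service name and the rest of the file are assumptions — adjust to match your actual compose file):

```yaml
services:
  agent-services:
    image: allegroai/clearml-agent-services:latest
    # docker restarts the container after a crash and after a daemon/host restart
    restart: unless-stopped
```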
I added this to each of the containers
logging:
  options:
    max-file: "5"
    max-size: "10m"
Thanks. Where is the configuration stored on the server? Currently we have deployed an EC2 instance using the marketplace AMI - if this demo is successful we would be looking at splitting the environment into the different AWS services: logging to S3, use of Secrets Manager, Elasticsearch, Redis, MongoDB, etc.
I've looked through the documentation, but didn't initially spot anything that would help with doing this (granted I may have overlooked something)
that should be the case; we have default_output_uri set to an S3 bucket
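For reference, that setting sits in clearml.conf under sdk.development; a minimal sketch (the bucket name and path are placeholders):

```
sdk {
    development {
        # placeholder bucket/path - artifacts and models get uploaded here
        default_output_uri: "s3://my-bucket/clearml"
    }
}
```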
thanks, i'll try that out
strange, I used one of the publicly available AMIs for ClearML (we did not upgrade from the Trains AMI as started fresh)
Thank you very much 🙂 I don't think our Data team ever uses this container, so I will stop it for now and comment it out in the compose file
no, they are still restarting. I've looked in /opt/clearml/logs/apiserver.log; no errors
ok i'll try that out thanks
or have I got this wrong, and it's the clearml-agent that needs to read/write to S3?
Some ideas, not all directly related to this:
- Passwords should be encrypted before being stored
- A mechanism on the server application to add/remove users, avoiding having to SSH on to the server would be nice
- Some level of permissions in the application would be nice: Admin/Owner/Viewer restrictions which would dictate what users can do and give finer control
it looks like clearml-apiserver and clearml-fileserver are continually restarting
so am I right in thinking it's just the mount points that are missing, based on the output of df above?