[ec2-user@ip-10-0-0-95 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  880K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/nvme0n1p1  8.0G  6.5G  1.5G  82% /
tmpfs           790M     0  790M   0% /run/user/1000
Can you share a few words about how the non-pip-freeze mechanism of detecting packages works?
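(For context, my rough understanding - not clearml's actual implementation, just a toy sketch of the idea: instead of dumping the whole environment with pip freeze, the default mode statically inspects the script for imports and records only those packages.)
` import ast

# Toy sketch only -- NOT clearml's actual implementation. The idea behind
# import-based detection: walk the script's AST and keep just the top-level
# packages it actually imports, instead of freezing the whole environment.
def detect_imported_packages(script_path):
    with open(script_path) as f:
        tree = ast.parse(f.read())
    packages = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            packages.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            packages.add(node.module.split(".")[0])
    return sorted(packages)

print(detect_imported_packages("my_script.py")) `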
Okay Jake, so that basically means I don't have to touch any server configuration regarding the file server on the trains server? It will simply be ignored, and all I/O initiated by clients with the right configuration will cover for that?
I want to collect the dataframes from the red tasks and display them in the pipeline task
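In case a sketch helps, something along these lines should work, assuming each step task uploaded its dataframe as an artifact (the artifact name data_frame and the hard-coded task IDs below are placeholders, not anything from your pipeline):
` from clearml import Task

# Hypothetical IDs of the finished step tasks -- replace with the real
# ones, e.g. collected from the pipeline's step references.
step_task_ids = ["<task-id-1>", "<task-id-2>"]

pipeline_task = Task.current_task()
logger = pipeline_task.get_logger()

for task_id in step_task_ids:
    step = Task.get_task(task_id=task_id)
    # Assumes each step uploaded its dataframe as an artifact
    # named "data_frame" (that name is an assumption).
    df = step.artifacts["data_frame"].get()
    logger.report_table(title="step results", series=step.name,
                        iteration=0, table_plot=df) `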
I just tried setting the conf option in the section Martin mentioned - it works perfectly
To be clearer - how do I refrain from using the built-in file server altogether, and use MinIO for any storage need?
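A minimal sketch of the client-side piece, assuming the MinIO host, port and bucket below are placeholders, and that the matching credentials for that host are configured under sdk.aws.s3.credentials in the conf file:
` from clearml import Task

# Route artifact/model uploads to MinIO instead of the built-in file
# server. Host, port and bucket are placeholders; the matching access
# key/secret go under sdk.aws.s3.credentials in clearml.conf.
task = Task.init(
    project_name="examples",
    task_name="minio-storage",
    output_uri="s3://minio.example.com:9000/clearml-bucket",
) `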
But does it disable the agent? Or will the tasks still wait for the agent to dequeue them?
Could be. My point is that, in general, the ability to attach a named scalar (without an iteration/series dimension) to an experiment is valuable and basic when you want to track a metric across different experiments
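For illustration, a hedged sketch - newer clearml versions expose a single-value API (availability depends on your version, so treat that as an assumption), and otherwise a scalar pinned at iteration 0 is the usual workaround:
` from clearml import Task

task = Task.init(project_name="examples", task_name="summary-metrics")
logger = task.get_logger()

# Single named value, no iteration dimension (newer clearml versions).
logger.report_single_value(name="test_accuracy", value=0.93)

# Workaround on older versions: a scalar pinned at iteration 0.
logger.report_scalar(title="summary", series="test_accuracy",
                     value=0.93, iteration=0) `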
the current versions of the images I'm using:
I'd go for
` from trains.utilities.pyhocon import ConfigFactory

# CONF_FILE_PATH is the path to your trains.conf (or any HOCON file)
config = ConfigFactory.parse_file(CONF_FILE_PATH) `
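Once parsed, values can be read with dotted paths, e.g. ` config.get("api.web_server") ` (the key name there is just an example, not necessarily one from your conf).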
I'm quite confused... The package is not missing - it's in my environment, and executing tasks normally (python my_script.py ...) works
Any news on this? This is kind of creepy - it's something so basic, and I can't trust my prediction pipeline because sometimes it fails randomly for no apparent reason
Does that mean that the AWS autoscaler in trains manages EC2 scaling directly, without using the AWS built-in EC2 auto scaler?
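If I understand correctly, yes - it talks to the EC2 API directly rather than going through an Auto Scaling group. A sketch of that kind of direct call (using boto3; the AMI, instance type and region are placeholders, and this is not the autoscaler's actual code):
` import boto3

# Direct EC2 spin-up, the kind of call an autoscaler can make without an
# Auto Scaling group. All values below are placeholders.
ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="g4dn.xlarge",
    MinCount=1,
    MaxCount=1,
) `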
TimelyPenguin76, this can safely be set to s3://, right?
` alabaster==0.7.12
appdirs==1.4.4
apturl==0.5.2
attrs==21.2.0
Babel==2.9.1
bcrypt==3.1.7
blinker==1.4
Brlapi==0.7.0
cachetools==4.0.0
certifi==2019.11.28
chardet==3.0.4
chrome-gnome-shell==0.0.0
clearml==1.0.5
click==8.0.1
cloud-sptheme==1.10.1.post20200504175005
cloudpickle==1.6.0
colorama==0.4.3
command-not-found==0.3
cryptography==2.8
cupshelpers==1.0
cycler==0.10.0
Cython==0.29.24
dbus-python==1.2.16
decorator==4.4.2
defer==1.0.6
distlib==0.3.1
distro==1.4.0
distro-info===0.23ubuntu1
doc...
AgitatedDove14 clearml version on the Cleanup Service is 0.17.0
TimelyPenguin76 this fixed it - setting detect_with_pip_freeze to true solves the issue
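For reference, there also seems to be a per-task equivalent of that conf flag (my assumption - check it against your clearml version):
` from clearml import Task

# Force the recorded requirements to come from a full pip freeze of the
# environment instead of import analysis. Must run before Task.init().
Task.force_requirements_env_freeze(force=True)
task = Task.init(project_name="examples", task_name="pip-freeze-requirements") `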
So the scale will also appear?
Oh, I get it - that also makes sense with the docs directing this at inference jobs and avoiding GPUs, because of the 1-N thing
I'm really confused - I'm not sure what is wrong, or what the relationship is between the templates, the agent, and all of those things
For the meantime, I'm giving up on the pipeline thing and I'll write a bash script to orchestrate the execution, because I need to deliver and I don't feel this is going anywhere
On a final note, I'd love for this to work as expected; I'm just not sure what you need from me. A fully reproducible example will be hard, because obviously this is proprietary code. What ...
and in the UI configuration I didn't understand where permission management comes into play
You should try ` trains-agent daemon --gpus device=0,1 --queue dual_gpu --docker --foreground ` and if it doesn't work, try quoting the device list: ` trains-agent daemon --gpus '"device=0,1"' --queue dual_gpu --docker --foreground `
Which permissions should it have? I would like to avoid full EC2 access if possible, and grant only the necessary permissions
This is what I meant should be documented - the permissions...
