Oh I get it, that also makes sense with the docs directing this at inference jobs and avoiding GPU - because of the 1-N thing
the worst part of debugging this is waiting for the Docker container to install TensorFlow over and over again
AgitatedDove14 this is still not fixed for me, even though I upgraded to server 1.1... Does the client require an update as well? Should I open an issue about this?
so putting the docs aside, what permissions should I give to the IAM role associated with the trains autoscaler?
I think you are talking about separate problems - the "WARNING DIFF IS TOO LARGE" is only a UI issue, meaning you can't see the diff in the UI - correct me if I'm wrong about this
Maria seems to be saying that the execution FAILS when she has uncommitted changes, which is not the expected behavior - am I right, Maria?
(I'm working with Maria)
essentially, what Maria says is that when she has a script with uncommitted changes and executes it remotely, the script that actually runs on the remote machine does not include the uncommitted changes
e.g.: her `git status` is clean, she makes some changes to `script.py` and executes it remotely. What gets executed remotely is the original `script.py` and not the modified version she has locally
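To rule out the local side, it might help to confirm that git actually reports the modified file as uncommitted right before the remote execution. Here is a minimal sketch for that check; the function names `parse_porcelain` and `uncommitted_paths` are made up for illustration and are not part of trains, and it assumes `git` is on the PATH:

```python
import subprocess
from typing import List


def parse_porcelain(status_output: str) -> List[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is two status characters, a space, then the path.
    Renames appear as "old -> new" and are returned unsplit.
    """
    paths = []
    for line in status_output.splitlines():
        if len(line) < 4:
            continue  # too short to hold a status plus a path
        paths.append(line[3:])
    return paths


def uncommitted_paths(repo_dir: str = ".") -> List[str]:
    """Run git in repo_dir and list files with uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)
```

If `script.py` shows up in `uncommitted_paths()` locally but the remote run still uses the committed version, then the diff is getting lost somewhere between the client and the agent.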
SuccessfulKoala55 here it is
I assume that at some points in the execution, the client (where the task is running) is sending JSONs to the mongo service, and that is what we see in the web UI.
Since we are talking about a case where there is no internet available, maybe these could be dumped into files/stdout so the user can insert them manually later.
The manual insertion UX could be something like a CLI copy-paste or an endpoint for files - but since your UX is so good I'm sure you'll figure this part out better
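Roughly what I have in mind, as a minimal sketch: each event is appended to a local file as one JSON line, and the file is read back later when a connection is available. The names `dump_event` and `load_events` and the event fields are hypothetical, I'm not claiming this is how the trains client works internally:

```python
import json
import time
from pathlib import Path
from typing import Any, Dict, List


def dump_event(path: str, event: Dict[str, Any]) -> None:
    """Append one event to a JSON-lines file, stamped with the current time."""
    record = dict(event, timestamp=time.time())
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_events(path: str) -> List[Dict[str, Any]]:
    """Read back all events, in the order they were written."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The file produced by `dump_event` could then be copied off the offline machine, and whatever import mechanism exists on the server side would iterate over `load_events(...)`.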
alabaster==0.7.12
appdirs==1.4.4
apturl==0.5.2
attrs==21.2.0
Babel==2.9.1
bcrypt==3.1.7
blinker==1.4
Brlapi==0.7.0
cachetools==4.0.0
certifi==2019.11.28
chardet==3.0.4
chrome-gnome-shell==0.0.0
clearml==1.0.5
click==8.0.1
cloud-sptheme==1.10.1.post20200504175005
cloudpickle==1.6.0
colorama==0.4.3
command-not-found==0.3
Another thing I noticed just now: it happens on my personal computer, but when I execute the same pipeline from the exact same commit with the exact same data on another host, it works without these problems
Cool, now I understand the auto detection better
yeah I guessed so
what should I paste here to diagnose it?
AgitatedDove14 permanent. I want to start with a CLI interface that allows me to add users to the trains server
So just to be clear - the file server has nothing to do with the storage?
I tried what you said in the previous response, setting `sdk.aws.s3.key` and `sdk.aws.s3.secret` to the ones in my MinIO. Yet when I try to download an object, I get the following:
` >>> result = manager.get_local_copy(remote_url="s3://*******:9000/test-bucket/test.txt")
2020-10-15 13:24:45,023 - trains.storage - ERROR - Could not download s3://*****:9000/test-bucket/test.txt , err: SSL validation failed for https://*****:9000/test-bucket/test.txt [SSL: WRONG_VERSION_NU...
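As far as I know, this kind of SSL error usually shows up when the client attempts a TLS handshake against a port that only serves plain HTTP. A minimal sketch to check whether the MinIO port speaks TLS at all; `speaks_tls` is a hypothetical helper using only the standard library, and certificate verification is switched off because this is a probe only, not something to use for real transfers:

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TLS handshake with host:port succeeds.

    Returns False when the handshake fails with an SSL error, which is
    what happens when the port serves plain HTTP. Other network errors
    (refused connection, timeout) are left to propagate.
    """
    context = ssl.create_default_context()
    # Probe only: skip certificate checks so self-signed certs still count as TLS.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except ssl.SSLError:
        return False
```

If `speaks_tls("<minio-host>", 9000)` returns False, the endpoint is plain HTTP and the client needs to be told not to use SSL for it. If I remember the config right, that is what a per-host entry under `sdk.aws.s3.credentials` with `secure: false` is for.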
Okay Jake, so that basically means I don't have to touch any server configuration regarding the `file-server` on the trains server. It will simply get ignored and all I/O initiated by clients with the right configuration will cover for that?