Okay, I'll make sure we always quote ", since it seems to work either way.
We will release an RC soon, with this fix.
Sounds good?
Hi OutrageousGrasshopper93
Are you working with venv or docker mode?
Also note that if you need all GPUs you can pass --gpus all
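To make the two ways of requesting GPUs concrete, here is a minimal sketch that builds the docker `--gpus` argument from either "all" or an explicit device list, but never both. `build_gpus_args` is a hypothetical helper written for illustration, not something the agent exposes; the inner double quotes around `device=...` follow the docker CLI convention for comma-separated device lists.

```python
from typing import List, Optional, Sequence


def build_gpus_args(device_ids: Optional[Sequence[int]] = None,
                    all_gpus: bool = False) -> List[str]:
    """Build the docker `--gpus` argument (hypothetical helper, for illustration).

    Request either all GPUs or explicit device IDs, never both at once.
    """
    if all_gpus and device_ids:
        raise ValueError("choose either all GPUs or explicit device IDs, not both")
    if all_gpus:
        return ["--gpus", "all"]
    if device_ids:
        # Inner double quotes keep the comma-separated list as a single value
        ids = ",".join(str(i) for i in device_ids)
        return ["--gpus", f'"device={ids}"']
    return []  # no GPU requested


print(build_gpus_args(all_gpus=True))
print(build_gpus_args(device_ids=[0, 1]))
```

The returned list is meant to be spliced into an argv list (e.g. for `subprocess.run`), so no extra shell quoting is needed.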
The Cloud Access section is in the Profile page.
Any storage credentials (S3 for example) are only stored on the client side (never on the trains-server); this is the reason we need to configure them in the trains.conf. When the browser needs to access those URLs (e.g. downloading an artifact) it also needs the secret/key, so it automatically displays a popup requesting them and stores them in this section. Notice they are stored in the browser session (as a cookie).
BTW:
Error response from daemon: cannot set both Count and DeviceIDs on device request.
Googling it points to a docker issue (which makes sense considering):
https://github.com/NVIDIA/nvidia-docker/issues/1026
What is the host OS?
Hi OutrageousGrasshopper93
When the Task is executed on a worker, the presence of spaces breaks the URLs and from the UI I cannot access to the resources on the bucket
You are saying the URLs generated in a remote execution are "broken" and on local execution are working, even though it is the same project/task name ?
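For reference, this is roughly what "spaces breaking a URL" looks like and how percent-encoding avoids it. It is a standalone sketch using only the Python standard library; the bucket, project and task names are made up, and this is not necessarily how the SDK builds its URLs.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_object_url(url: str) -> str:
    """Percent-encode the path of a storage URL; scheme and bucket stay untouched."""
    parts = urlsplit(url)
    # quote() keeps "/" as-is and turns each space into %20
    return urlunsplit(parts._replace(path=quote(parts.path)))


# Hypothetical bucket / project / task names containing spaces
print(quote_object_url("s3://my-bucket/My Project/my task/artifact.pkl"))
# -> s3://my-bucket/My%20Project/my%20task/artifact.pkl
```

Note that calling it on an already encoded URL would encode the `%` signs a second time, so it should be applied exactly once.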
Maybe we should rename it?! it actually creates a Task but will not auto connect it...
Hmm, seems like everything is working. Can you check in the UI if you see the serving session ID in the DevOps project? Maybe there are two, and you configured one and the docker-compose is running another?
Hmm, notice that it does store symlinks to parent data versions (to save on multiple copies of the same file). If you call get_mutable_local_copy() you will get a standalone copy.
I guess I just have to make sure that total memory usage of all parallel processes are not higher than my gpu's memory.
Yep, unfortunately I'm not aware of any way to do that automatically
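Doing it manually is simple arithmetic though. A minimal sketch, assuming you already know the peak GPU memory of a single process (the numbers below are made up, and `max_parallel_processes` is just an illustrative helper):

```python
def max_parallel_processes(gpu_memory_mb: int,
                           per_process_mb: int,
                           reserve_mb: int = 0) -> int:
    """How many processes fit on one GPU without exceeding its memory.

    reserve_mb is optional headroom kept free (e.g. for the driver / context).
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = max(gpu_memory_mb - reserve_mb, 0)
    return usable_mb // per_process_mb


# Hypothetical numbers: a 16 GB GPU, ~5 GB peak usage per process
print(max_parallel_processes(16_000, 5_000))                   # 3
print(max_parallel_processes(16_000, 5_000, reserve_mb=1_500)) # 2
```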
BeefyCow3 see this https://allegroai-trains.slack.com/archives/CTK20V944/p1593077204051100 :)
yey working
Hi IrritableGiraffe81
I have a package called
feast[redis]
in my requirements.txt file.
This means feast is installing additional packages. Once the agent is done installing everything, it basically calls pip freeze and stores back all the packages, including versions
Now the question is, how come redis is not installed.
Notice that the Task already has the autodetected packages (it basically ignores requirements.txt as it is often incomplete, missing, or just wrong)
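If you want to check from inside the running code what actually got installed, something along these lines gives a pip-freeze-like snapshot. This is a standalone sketch using only the standard library (`importlib.metadata`), not what the agent itself runs; `installed_packages` and `is_installed` are names I made up for the example.

```python
from importlib import metadata
from typing import Dict


def _normalize(name: str) -> str:
    """Lower-case and unify separators so 'Foo_Bar' and 'foo-bar' compare equal."""
    return name.lower().replace("_", "-")


def installed_packages() -> Dict[str, str]:
    """Return {normalized name: version} for every installed distribution."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that carry no name
            packages[_normalize(name)] = dist.version
    return packages


def is_installed(name: str) -> bool:
    """True if a distribution with this name is present in the environment."""
    return _normalize(name) in installed_packages()


# e.g. verify that the `redis` extra of `feast[redis]` was really pulled in
print("redis installed:", is_installed("redis"))
```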
...
Hi UpsetTurkey67
You mean https://github.com/Lightning-AI/torchmetrics ?
Where are those stored?
In theory task.tags.remove(tag) might also work, but I'm not sure if it will automatically be updated on the backend
okay, let me know if it works
That's the question I want to raise too,
No file size limit
Let me try to run it myself
Hi SoreHorse95
I am exploring hiding our clearml server behind
Do you mean adding an additional reverse proxy to authenticate clearml-server from outside?
I'll make sure they get back to you
Really stoked to start using it and introduce a more sane ML ops workflow at my workplace lol.
Totally with you
... would that be a Model Registry Store plugin?
YES please
So we actually just introduced "Applications" into the clearml free tier, https://app.community.clear.ml/applications
Allowing you to take any Task in the system and make it an "application" (a python script running on one of the service agents), with the ability to configu...
it is just local copy so you can rerun and reconfigure
DM me the entire log, I would assume this is something with the configuration
Different question. How can I pass the PYTHONPATH env variable to a task run by the agent (so python can find classes inside my subdirectories)?
Hi HelpfulHare30
By default the working directory will be added to the python path. This means that if under Execution I have Working Dir: "." and Script: "src/script.py", the root of the git repo will be added to the python path.
BTW: in the next RC you will be able to add a flag to the agent to always add the git repo root to the python path
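In the meantime, if the working directory is not the repo root, a possible workaround is to add the root to sys.path at the top of the script yourself. A minimal sketch, assuming the script sits one folder below the repo root (as in "src/script.py"); `add_repo_root_to_path` is a made-up helper, not part of the agent:

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file: str, levels_up: int = 1) -> Path:
    """Insert the folder `levels_up` levels above the script's folder into sys.path.

    For "<repo>/src/script.py" and levels_up=1 this is "<repo>".
    """
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# At the top of src/script.py you would call:
#     add_repo_root_to_path(__file__)
# The path below is only a stand-in so the example can run anywhere.
print(add_repo_root_to_path("/some/repo/src/script.py"))
```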
Can the ClearML file server be configured to use any kind of storage? For example HDFS, or even a database, etc.
DeliciousBluewhale87 long story short, no. The file server will just store/retrieve/delete files from a local/mounted folder
Is there any way we can scale this file server when our data volume explodes? Maybe it wouldn't be an issue in the K8s environment anyway. Or can it also be configured such that all data is stored in HDFS (which helps with scalability)? I would su...
You might need to play around a bit, it might be that you need StorageHelper.get('gs://bucket') and then helper.list('folder/*')
Let me know what worked
Hmm so is the problem having the gituser inside the code? or the k8s_glue print ?