Ok thanks! And for this?
Would it be possible to support such a use case? (i.e. have the clearml-agent set up a different Python version when a task needs it?)
Interesting! Something like that would be cool, yes! I just realized that custom plugins in Mattermost are written in Go; that could be a good hackday for me to learn Go.
Should I try to disable dynamic mapping before doing the reindex operation?
Adding back ClearML logging with matplotlib.use('agg') uses more RAM, but nothing that suspicious
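For reference, this is roughly the setup I mean; project/task names are placeholders and I'm relying on ClearML's automatic matplotlib binding to pick up the figures:
```python
import matplotlib
matplotlib.use("agg")  # must be set before `import matplotlib.pyplot`

from clearml import Task
task = Task.init(project_name="debug", task_name="agg-ram-check")  # placeholder names

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [1, 4, 9])
plt.show()  # with ClearML's matplotlib auto-binding this should be captured as a plot
```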
So I created a symlink /opt/train/data -> /data
Or even better: would it be possible to have support for HTML files as artifacts?
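To make the ask concrete, something like this is what I have in mind (file and task names are hypothetical); today the HTML ends up as a generic artifact, the question is about first-class support, e.g. rendering it in the UI:
```python
from clearml import Task

task = Task.init(project_name="reports", task_name="html-artifact")  # placeholder names

# Today an HTML report can be attached as a plain (generic) artifact;
# the ask is for ClearML to treat/preview it as HTML rather than a blob.
task.upload_artifact(name="evaluation_report", artifact_object="report.html")
```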
AgitatedDove14 WOW, thanks a lot! I will dig into that!
AgitatedDove14 Yes exactly, I tried the fix suggested in the GitHub issue (urllib3>=1.25.4) and the ImportError disappeared.
and with this setup I can use the GPU without any problem, meaning that the wheel does contain the CUDA runtime
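(For reference, this is the kind of quick check I mean, assuming a PyTorch wheel:)
```python
# Quick sanity check that the installed wheel ships a usable CUDA runtime
# (assuming it is a PyTorch wheel; adjust for other frameworks).
import torch

print(torch.__version__)          # wheel version
print(torch.version.cuda)         # CUDA runtime version bundled with the wheel
print(torch.cuda.is_available())  # True if the GPU is actually usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```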
And I am wondering if only the main process (rank=0) should attach the ClearMLLogger or if all the processes within the node should do that
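To illustrate, a rank-0-only attach would look roughly like this (assuming this is the pytorch-ignite ClearMLLogger and that torch.distributed is already initialized; import path and names may differ):
```python
import torch.distributed as dist

def is_main_process() -> bool:
    # rank 0 only (or single-process runs where torch.distributed isn't initialized)
    return not dist.is_available() or not dist.is_initialized() or dist.get_rank() == 0

clearml_logger = None
if is_main_process():
    from ignite.contrib.handlers.clearml_logger import ClearMLLogger  # path may differ per ignite version
    clearml_logger = ClearMLLogger(project_name="distributed", task_name="rank0-only")  # placeholder names
    # ...then attach output/metric handlers to the trainer only on this process
```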
I've set dynamic: "strict" in the template of the logs index and I was able to keep the same mapping after doing the reindex
Hi AgitatedDove14, that's super exciting news!
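For the record, this is roughly the shape of it (index/template names below are placeholders, and I'm using the Python requests library rather than curl):
```python
import requests

ES = "http://localhost:9200"  # assumed local Elasticsearch

# 1) template that switches dynamic mapping to "strict" for the logs indices
template = {
    "index_patterns": ["events-log-*"],  # placeholder pattern
    "mappings": {
        "dynamic": "strict",
        "properties": {},  # the real field mapping goes here
    },
}
requests.put(f"{ES}/_template/events-log", json=template).raise_for_status()

# 2) reindex the old index into a new one that picks up the template
body = {"source": {"index": "events-log-old"}, "dest": {"index": "events-log-new"}}
requests.post(f"{ES}/_reindex?wait_for_completion=true", json=body).raise_for_status()
```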
Regarding the two outstanding points:
In my case, I'd maintain a client Python package that takes care of the pre/post processing of each request, so that I only send the raw data to the inference service and I post-process the raw output of the model returned by the inference service. But I understand why it might be desirable for the users to have these steps happen on the server. What is challenging in this context? Defining how t...
I am already trying with the latest version of pip.
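To sketch what I mean by the client package (everything here is hypothetical, just to show where the pre/post processing lives):
```python
import requests


class InferenceClient:
    """Hypothetical client-side wrapper: pre/post processing stays here,
    the inference service only ever sees and returns raw arrays."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint  # e.g. "http://inference-service:8080/predict" (hypothetical)

    def preprocess(self, text: str) -> list:
        # whatever feature extraction the model expects, done on the client
        return [float(len(text))]

    def postprocess(self, raw_output: list) -> str:
        # map the raw model output back to something meaningful for the caller
        return "positive" if raw_output[0] > 0.5 else "negative"

    def predict(self, text: str) -> str:
        features = self.preprocess(text)
        resp = requests.post(self.endpoint, json={"inputs": features})
        resp.raise_for_status()
        return self.postprocess(resp.json()["outputs"])
```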
Ok, now I would like to copy from one machine to another via scp, so I copied the whole /opt/trains/data folder, but I got the following errors:
Could be, but not sure -> from 0.16.2 to 0.16.3
now I can do nvcc --version and I get: Cuda compilation tools, release 10.1, V10.1.243
Thanks! I will investigate further, I am thinking that the AWS instance might have been stuck for an unknown reason (becoming unhealthy)
ubuntu18.04 is actually 64 MB, I can live with that.
Thanks a lot, I will play with that!
No worries! I asked more to be informed; I don't have a real use case behind it. This means that you guys internally catch the argparser object somehow, right? Because you could also simply use sys.argv to find the parameters, right?
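Just to illustrate the sys.argv point with a hypothetical script:
```python
import sys
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()

# running `python train.py --lr 0.01 --epochs 5` gives:
print(sys.argv[1:])  # ['--lr', '0.01', '--epochs', '5']  -> raw strings only
print(vars(args))    # {'lr': 0.01, 'epochs': 5}          -> typed values, plus defaults when omitted
```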
Hi AgitatedDove14, how should we proceed to fix this bug? Should I open an issue on GitHub? Should I try to make a minimal reproducible example? It's blocking me atm.
(by console you mean in the dashboard, right? Or the terminal?)
Note: I can verify that post_packages is picked up correctly by the trains-agent, since in the experiment log I see:
agent.package_manager.type = pip
agent.package_manager.pip_version = ==20.2.3
agent.package_manager.system_site_packages = true
agent.package_manager.force_upgrade = false
agent.package_manager.post_packages.0 = PyJWT==1.7.1
PS: in the new env, I've set num_replicas: 0, so I'm only talking about primary shards...
Yeah, the config is not appearing in the web UI anymore with this method.
No, I agree, it's probably not worth it.