Hi CostlyOstrich36, thanks. I will check with the Enterprise team then.
We are using the k8s glue to spawn the job. Could you describe in detail the steps that take place when the above code executes?
Hi, the latest k8sglue-example.py was last committed about 4 months ago. Are you referring to that version?
Hi SuccessfulKoala55, is there a channel here that posts version updates?
Oh, this means I have been using the latest agent, which is v1.0.0. The problems are still there.
Hi AgitatedDove14, I changed everything to CUDA 10.1 and tried again, with the same error. The relevant section is as follows. I made sure torch==1.6.0+cu101 and torchvision==0.8.2+cu101 are in the PyPI repo, but the same error still came up.
` # Python 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
boto3 == 1.14.56
clearml == 0.17.4
numpy == 1.19.1
torch == 1.6.0
torchvision == 0.7.0
Detailed import analysis
**************************
IMPORT PACKAGE boto3
clearml.storage: 0
IMPORT PACKAG...
AlertBlackbird30, actually the log says 10.2.
docker_cmd = nvidia/cuda:10.2-devel-ubuntu18.04 -e GIT_SSL_NO_VERIFY=true
I can't seem to find a fix for this. I ended up using an image that comes with torch installed.
Hi AgitatedDove14, which version should I change it to? I'm currently on v0.17.2rc3.
AgitatedDove14, would you elaborate on this resolution process?
Sorry AgitatedDove14, I missed your reply. So this means that in the community version, when I have an experiment that uses clearml and the clearml datasets SDK, the dataset ID that was used will not be reflected on the clearml experiment in any way, which makes it impossible to establish data lineage/provenance (e.g. linking the data used to the experiment). This feature is, however, available in the Enterprise version as HyperDatasets. Am I correct?
Code example.
` from clearml import Task, Logger
tas...
Thanks, this would be a good alternative until the Enterprise version comes in. How is this different from argparser, btw?
I meant the dataset id.
Hi, it makes sense to automate this part just like you automate the rest of the MLOps flow. Since you already support data versioning/lineage, data provenance (how the data ties to the experiment and acts as a model source) should be in too. I agree, though, that technically it's probably not possible to tell whether users actually used the indicated datasets after they do a datasets.get_copy().
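One interim way to make the link visible is to record the dataset IDs on the task as ordinary parameters. Below is a minimal sketch, assuming the clearml SDK's `Task.init`, `Task.connect`, `Dataset.get` and `get_local_copy` calls; the helper names, project name and task name are hypothetical. It only records which IDs the code asked for, not whether the copied data was actually read.

```python
from typing import Dict, Iterable


def dataset_params(dataset_ids: Iterable[str]) -> Dict[str, str]:
    """Build a flat dict such as {"dataset_0": "<id>", ...} for logging."""
    return {f"dataset_{i}": ds_id for i, ds_id in enumerate(dataset_ids)}


def link_datasets(task, dataset_ids: Iterable[str]) -> Dict[str, str]:
    """Attach the dataset IDs to the task so they appear with its parameters.

    `task` is any object with a `connect(dict)` method, e.g. a clearml Task.
    """
    params = dataset_params(dataset_ids)
    task.connect(params)
    return params


def run_experiment(dataset_id: str) -> str:
    """Hypothetical usage; needs clearml installed and a configured server."""
    # Imported lazily so the helpers above work without clearml installed.
    from clearml import Dataset, Task

    task = Task.init(project_name="examples", task_name="dataset lineage sketch")
    link_datasets(task, [dataset_id])
    # Returns the path of a local, read-only copy of the dataset.
    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```

With this, the IDs should show up in the experiment's configuration, so they can be looked up from the experiment instead of from the code commit.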
Sorry AgitatedDove14 can you bump me to that thread?
The context for my question is that I realise I'll need to catalogue all the dataset IDs people create separately in a spreadsheet. And for each experiment, I'll need to go into the code commit to see which ID is being used. On the other hand, I thought I'd seen advertised use cases where the experiment can be linked directly to the dataset ID being used. My memory's a bit rusty on how it was done.
AgitatedDove14, I'm Jax, not Manoj! lol. 😅 😅
Okay, this part I missed: why would you need to add an additional "catalog" when you have the UI?
Yeah, this is the part I am trying to reconcile. I don't see any UI for datasets. Or is this a feature of HyperDatasets and I just mixed them up?
Try setting docker_force_pull: true
under the agent section of your agent's clearml.conf.
What's the difference between --template-yaml and --overrides-yaml? I used the latter to make sure the GPU is passed in.
We are deploying ClearML Server via docker-compose.
For ClearML-Agent, we have the choice of Docker or K8s, with K8s preferred (using the glue).
For K8S, we can't get the glue to work ( https://clearml.slack.com/archives/CTK20V944/p1614525898114200?thread_ts=1613923591.002100&cid=CTK20V944 ) so we can't make an assessment of whether it actually works for us.
Thanks 👍. Should I create an issue on GitHub?
Unfortunately it's not. The problem previously encountered with the docker method surfaced again. In this case, the BASE DOCKER IMAGE
nvidia/cuda:10.1-runtime-ubuntu18.04 --env GIT_SSL_NO_VERIFY=true
is not taking effect with the k8s glue.
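For reference, that string carries two things: the image name and extra container arguments. Here is a small sketch of the split, assuming shell-style whitespace separation. `split_base_docker` is a hypothetical helper for checking which part gets dropped, not how the agent or the glue actually parses the field.

```python
import shlex
from typing import List, Tuple


def split_base_docker(spec: str) -> Tuple[str, List[str]]:
    """Split "image arg1 arg2 ..." into (image, [args])."""
    parts = shlex.split(spec)
    if not parts:
        raise ValueError("empty docker spec")
    return parts[0], parts[1:]


image, args = split_base_docker(
    "nvidia/cuda:10.1-runtime-ubuntu18.04 --env GIT_SSL_NO_VERIFY=true"
)
print(image)  # nvidia/cuda:10.1-runtime-ubuntu18.04
print(args)   # ['--env', 'GIT_SSL_NO_VERIFY=true']
```

Comparing these two parts against the pod spec that the glue creates would show whether the image, the arguments, or both are being ignored.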
Thanks, it's attached.
I also noticed that the status in ClearML is always 'pending', unlike the others, which say 'Running'. Is this a side effect of using the k8s glue?
ok thanks.
yeah, someone should call them out.