SubstantialElk6

117 Questions, 310 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

282 × Eureka!

Answers 310

0 Hi, V1 Of Agent Seems To Have Removed Agent.Package_Manager.Force_Repo_Requirements_Txt. Is This Still Available In Other Forms?

yup. in this case it wasn't root. Removing that USER and -u in pip solves the problem. However, in our production images, we are required to remove root access.
` FROM nvidia/cuda:10.1-cudnn7-devel

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y
python3-opencv ca-certificates python3-dev git wget sudo ninja-build
RUN ln -sv /usr/bin/python3 /usr/bin/python

create a non-root user

ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} a...

3 years ago

0 Hi, I Am Working On Creating Retraining Pipelines In Production. The Way I'M Doing This Is To Install Clearml-Server On My Production. Then I Recreate The Ingestion, Preprocessing And Training/Opt Tasks Into A Clearml-Pipeline. Thereafter, I Would Call

Hi CostlyOstrich36 , thanks. I will check with the Enterprise team then.

2 years ago

0 We'Re Working On Clearml Serving Right Now And Are Very Interested In What You All Are Searching For In A Serving Engine, So We Can Make The Best Serving Engine We Can

Do you mean by this that you want to be able to seamlessly deploy models that were tracked using ClearML experiment manager with ClearML serving?

Ideally that's best. Imagine that i used Spacy (Among other frameworks) and i just need to add the one or two lines of clearml codes in my python scripts and i get to track the experiments. Then when it comes to deployment, i don't have to worry about Spacy having a model format that Triton doesn't recognise.

Do you want clearml serving ...

2 years ago

0 Hi, Just To Check. Does The K8S Glue Install Torch By Default? I'M Getting

Hi AgitatedDove14 , that's what i am trying to figure out as well. The task has nothing to do with torch, and the requirements.txt doesn't have any torch packages as well.

3 years ago

0 Hi, Is There A Command I Can Use To Generate A Report That Can

Can i dig into the mongodb or ES to pull these data?

3 years ago

0 Hi, I Started My Agent Using. Clearml-Agent Daemon --Gpus 0 --Queue Gpu --Docker --Foreground, With The Following Parameters In Clearml.Conf.

That didn't work as well...

3 years ago

0 Hi, My Devsecops Team Has Raised Some Issues Of Us Deploying Clearml For Use. In Particular, They Are Not Happy With Docker.Sock Configuration As It Would Potentially Expose The Entire Cluster To Unauthorised View. Can We Do Without It?

Unfortunately it's not. The problem previously encountered with the docker method surfaced again. In this case, the BASE DOCKER IMAGE
nvidia/cuda:10.1-runtime-ubuntu18.04 --env GIT_SSL_NO_VERIFY=true is not taking effect with the k8s glue.

3 years ago

0 Hi, We Are Having An Interesting Issue Here. We Serve Many Users And Each User Has Their Own Credentials In Accessing The Private Git Repo. We Can'T Seem To Find A Way For The End User To Pass In Their Git Credentials When They Run Their Codes In Both Age

Hi AgitatedDove14 , i was refering to
task.set_base_docker("nvcr.io/nvidia/tensorflow:19.11-tf2-py3 --env TRAINS_AGENT_GIT_USER=git_username_here --env TRAINS_AGENT_GIT_PASS=git_password_here")The above will give error
skipping docker argument TRAINS_AGENT_GIT_USER=git_username_here (only -e --env supported) TRAINS_AGENT_GIT_PASS=git_username_here (only -e --env supported)

3 years ago

0 Hi, How Can I Make A Stage In A Clearml Pipeline Non-Blocking? The Scenario Is That Stages Downstream Needed Runtime Info From The First Stage, However The First Stage Needs To Continue Running To Act As A Monitor For The Other Downstream Stages.

The first stage is a rank0 pytorch script. The downstream stages are rankN scripts, they are waiting for the IP address of the first stage. But the first stage doesn’t return, it simply waits for the rankN scripts to connect to it. But in this case, the rankN scripts doesn’t start. So its probably necessary to have just a single stage.

If i were to start a single rank0, and subsequent rankN tasks, it would be rather messy on ClearML Dashboard. Best to have either a single clearml application...

one year ago

0 Hi, Is There A Way I Can Supply Credentials To Clearml-Data (Cli And Python) Without Going Thru The Clearml.Conf?

Thanks TimelyPenguin76 , is there an env var for the S3 connection as well?

3 years ago

0 Hi, Several Changes Occurred Recently And I Would Like To Know If There'S A Way To Verbose Catch All The Printout That Happening Within A K8S Glue Spawned Pod. We Have An Issue Where All Of Our New Remote_Execution Tasks Are Stuck In The 'Pending' Stage.

I did notice that in the tmp folder, .clearml_agent.xxxxx.cfg does not exists.

3 years ago

Ok. That brings me back to the spawned pod. At this point, clearml-agent and its config would be a controbuting factor. Is the absence of /tmp/.clearml_agent.xxxxxx.cfg an issue?

3 years ago

0 Hi I'M Using Clearml Datasets. How Do I Tell From The Clearml Ui Which Datasets Version Am I Using?

Sorry AgitatedDove14 i missed your reply. So this means that in the community version, when i have an experiment using clearml and it uses clearml datasets SDK, the dataset id that was used will not be reflected on the clearml experiment in any way, thus making it impossible to establish Data Lineage/Provenance. (E.g. Link data used to experiment). This feature is however available in the Enterprise Version as HyperDatasets. Am i correct?
Code example.
` from clearml import Task, Logger
tas...

3 years ago

Alright i will try that.

3 years ago

This would be solved if --env GIT_SSL_NO_VERIFY=true is passed to the k8s pod that's spawned to run the job. Currently its not.

3 years ago

0 We'Re Working On Clearml Serving Right Now And Are Very Interested In What You All Are Searching For In A Serving Engine, So We Can Make The Best Serving Engine We Can

I think a related question is, ClearML replies heavily on Triton (Good thing) but Triton only support a few frameworks out of the box. So this 'engine' need to make sure its can work with Triton and use all its wonderful features such as request batching, GPU reuse...etc.

2 years ago

0 Hi, I Have Been Getting The Following For A While. Is There A More Detailed Log I Can Look Into? This Happens On Both Https And Http.

Hi.

We tried as advised above and it still didn't work.
Host: http://ecs.ai:443
output_uri = S3://ecs.ai:443/bucketname

This time round the client gave this error.
Botocore.exceptions.connectiinclosederror: connection was closed before we received a valid response from endpoint URL: ' http://ecs.ai/bucketname/.clearml.test '.

It's quite apparent that whatever clearml passed to boto3 ends up as a http call instead of https, which is wrong.

3 years ago

0 Hi, I Cant Access The Clearml Ui, Is There Something Wrong With Your Servers?

Having same issues. Looks like Google DNS can't resolve the DNS at all.
` %nslookup app.clear.ml - 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53

** server can't find app.clear.ml: SERVFAIL `

2 years ago

0 Hi, I Have Been Getting The Following For A While. Is There A More Detailed Log I Can Look Into? This Happens On Both Https And Http.

Ok thanks. that explains alot. We have been doing this wrongly the whole time, thinking that the clearml.conf on the client side would be acknowledged by the remote agent execution. In reality, only the API section is utilised.

3 years ago

0 Hi, I Was Using The K8S Glue And It Worked Fine On One Project But Didn'T Work On Another. At The Point Just Before A Git Clone Was Executed, I Get The Error

Hi, so you meant i need to installl virtualenv in my base image?

3 years ago

AgitatedDove14 , will these be fixed?
Passing env via the code Passing env via template yaml

3 years ago

0 Hi, Is There Any Code Examples Of How Dataops Is Being Established?

Yeah that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.

3 years ago

0 Hi, I Would Like To Check What Would Be The Recommended Hardware Specs For The Server Host Clearml Server. I Had One Configured With 32 Cpu Cores, 64Gb Ram And I Noticed That If We Have A Surge In Remote Task Creation, The Following Delays Occurs.

The server is running only the ClearML components. Could you advise on the ELB part, how should we optimise it?

3 years ago

0 Hi, We Are Using Gitlab And It Is A Security Requirement To Use Ssh Keys To Access The Repos For Each Individual. We Are Also Using K8S Glue. Is There Any Provisions To Do This Seamlessly?

Thanks SuccessfulKoala55 . I can try my hand on a patch. But the pod spinning is handled by the k8s glue, which has no link to the client side. How should the client pass the key over to k8s glue during runtime via clearml server?

3 years ago

0 Can I Ask How Often Does The Hosted Clearml Reset? I'M In A Hackathon And Thought Of Using It.

ah ok, so if i see Jax's workspace on https://app.community.clear.ml/dashboard , then i'm on the right track? How regular does this reset then?

3 years ago

0 Hi, Clearml Console Leaks Credentials Passed In As Env Vars. The Issue Remains With Clearml Version==1.1.1.135 - 1.1.1 - 2.1.4 (As Listed On The Profile Page) I Am Using K8S Glue And The Clearml.Conf Has The Following In The Agent Section.

I also see this on my logs, noting that the config is read in but its still printing the supposedly hidden keys on the logs and UI.
agent.hide_docker_command_env_vars.enabled = true agent.hide_docker_command_env_vars.extra_keys.0='TRAINS_AGENT_GIT_USER' ..... docker_cmd=harbor.ai/public/detectron2:v3 --env TRAINS_AGENT_GIT_USER=gituser

3 years ago

0 Hi I Saw This On The Clearml-Agent Docs But Other Than The Docker Image, I'M Not Sure How To Integrate This With Clearml Py And Clearml-Server. Please Advise.

Ok, that seems clearer, thanks.

3 years ago

0 Prev, I Worked With Clearml (1 Year Back) And Back Then, We Config Seldon Core For The Deployment And Clearml For The Training.. Now There Is Clearml-Serving, Does It And Can It Fulfill A Similar Objective ?

Hi, by deployment strategies I meant by canary, blue-green...etc..etc. I figured this should be done by clearml-serving and maybe seldon as well.

2 years ago

0 Hi, The `

ah thanks. Hopefully the old ones get flushed out by Google soon.

3 years ago

0 Hi

Thanks. Which brings me to the question. How does ClearML deal with all the CVEs? What is your process for response?

3 years ago

Show more results