They don't have the same version. I do notice that if the client is using Python 3.8, remote execution will try to use that same version, even though the docker image doesn't have it installed.
This is probably the whole script.
` kubectl get nodes
pip install clearml-agent
python k8s_glue_example.py `
Can you please verify that you have all the required packages installed locally?
It's not installed on the image that runs the experiment, but it's reflected in the requirements.txt.
What is the setting of agent.package_manager.system_site_packages?
True.
The apply.yaml template is not working (e.g. the arguments env is not passed to the container), which is why I tried the code approach instead.
Hi, I was reading this thread and wondered which versions of clearml-server and clearml-agent this took effect with?
In the ClearML config that's being run by the ClearML container?
I can't seem to find a fix for this. I ended up using an image that comes with torch installed.
I would say yes, otherwise the vscode feature is only available on internet-connected premises due to the hard-coded URL used to download vscode.
Here's my two cents worth.
I thought it was really nice to start off the topic highlighting 'pipelines'; it's unfortunately one of the most missed components when people start off with ML work. Your article mentioned drifts and how the MLOps process covers them. I thought there are two more components that are important and deserve some mention. Retraining pipelines: ML engineers tend not to give much thought to how they want to transition a training pipeline in development to an automated retraining pipe...
Yeah that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.
To note, the latest code has been pushed to the GitLab repo.
The doc also mentioned preconfigured services with selectors in the form of "ai.allegro.agent.serial=pod-<number>" and a targetPort of 10022. Would you have any examples of how to do this?
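To make the question concrete, here's my guess (a sketch only, with hypothetical names and namespace, using the kubernetes Python client) at what one such per-pod service could look like:
` # Hypothetical sketch of one per-pod Service matching the selector/targetPort the doc describes.
from kubernetes import client, config

config.load_kube_config()

pod_number = 1  # presumably one Service per pod, pod-1 .. pod-N
svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="clearml-session-pod-%d" % pod_number),  # name is a guess
    spec=client.V1ServiceSpec(
        selector={"ai.allegro.agent.serial": "pod-%d" % pod_number},
        ports=[client.V1ServicePort(port=10022, target_port=10022)],
    ),
)
client.CoreV1Api().create_namespaced_service(namespace="clearml", body=svc)  # namespace is a guess `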
No issues. I know it's hard to track open threads with Slack. I wish there were a plugin for this too. 🙂
Any idea where I can find the relevant API calls for this?
Yup, in this case it wasn't root. Removing that USER and the -u in pip solves the problem. However, in our production images, we are required to remove root access.
` FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y \
    python3-opencv ca-certificates python3-dev git wget sudo ninja-build
RUN ln -sv /usr/bin/python3 /usr/bin/python
# create a non-root user
ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} a...
I'm having the same problem. Are you using the latest clearml-agent? Does your docker image run as a root user by default?
After some churning, this is the answer: change it in the clearml.conf generated by clearml-agent init.
` default_docker: {
# default docker image to use when running in docker mode
image: "nvidia/cuda:10.1-runtime-ubuntu18.04"
# optional arguments to pass to docker image
# arguments: ["--ipc=host", ]
arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `
This is strange then. Is it possible for the clearml logs to register a successful save into S3 storage when actually it isn't saved? For example, I've seen in past experience certain S3 clients that saved onto a local folder called 's3:/' instead of putting the data on S3 storage itself.
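In the past I've sanity-checked this with something like the following (a rough sketch, assuming boto3 is configured and using a placeholder artifact URI copied from what ClearML reports):
` # Sanity check that an object really landed on S3 and not in a local 's3:/' folder.
import boto3
from urllib.parse import urlparse

uri = "s3://my-bucket/artifacts/model.pt"  # placeholder: paste the URI ClearML logged
parsed = urlparse(uri)

s3 = boto3.client("s3")
# head_object raises botocore.exceptions.ClientError if the key does not exist on S3.
s3.head_object(Bucket=parsed.netloc, Key=parsed.path.lstrip("/"))
print("object exists on S3:", uri) `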
Previously we had similar issues when we switched the images used in the agent. You might want to check on that.
Setting the credentials on the agent machine means the users cannot use their own credentials, since a k8s glue agent serves multiple users.
Referencing your suggestion, we can configure output_uri on task.set_base_docker(), but how should we do this for the credentials?
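For reference, this is roughly what I had in mind (a sketch only, assuming the single-string task.set_base_docker() form and hypothetical project/bucket names; note that anything passed as docker env args is visible in the task's execution details in the UI):
` # Sketch: per-user credentials injected into the container the agent spawns,
# alongside a per-task output_uri. All names are placeholders.
import os
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="remote-run",
    output_uri="s3://my-bucket/artifacts",  # placeholder bucket
)

# Forward the submitting user's own credentials instead of the agent machine's.
task.set_base_docker(
    "nvidia/cuda:10.2-devel-ubuntu18.04"
    " -e AWS_ACCESS_KEY_ID=%s -e AWS_SECRET_ACCESS_KEY=%s"
    % (os.environ["AWS_ACCESS_KEY_ID"], os.environ["AWS_SECRET_ACCESS_KEY"])
) `
Not great security-wise, but at least the credentials stay per-user rather than per-agent.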
Hi, I changed it, but it still points to https://files.pythonhosted.org/packages.
OK, thanks. This would mean that increasing the disk space for my ClearML server is the only option, as we are not at liberty to delete anything.
Hi, clearml-agent==0.17.2rc3 did work. I'm on a 1.19 k8s cluster, and I get this error when a task is pulled. Is the glue not compatible with 1.19?
` Pulling task 3a90802d1dfa4ec09fbccba0beffbaa8 launching on kubernetes cluster
Pushing task 3a90802d1dfa4ec09fbccba0beffbaa8 into temporary pending queue
Kubernetes scheduling task id=3a90802d1dfa4ec09fbccba0beffbaa8
kubectl output:
Flag --replicas has been deprecated, has no effect and will be removed in the future.
Flag --generator has been depre...
Hi AgitatedDove14, that's what I am trying to figure out as well. The task has nothing to do with torch, and the requirements.txt doesn't have any torch packages either.
AlertBlackbird30, actually the log says 10.2:
docker_cmd = nvidia/cuda:10.2-devel-ubuntu18.04 -e GIT_SSL_NO_VERIFY=true
I meant the dataset id.
Congrats on v1.0. 🎉
So the clearml-agent daemon needs higher privileges?
I managed to find out why. The docker image I'm using does not run as the root user, hence the error. But I'm wondering why this is the case, as docker best practices do indicate we should use a non-root user in production images.