SubstantialElk6
Moderator
117 Questions, 310 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

282 × Eureka!
0 Votes 1 Answers 1K Views
[Distributed Training] Hi, i have a ClearML setup with K8SGlue that spins up pods of 4 GPUs when picking tasks off the clearml queue. We would now want to pr...
one year ago
0 Votes 1 Answers 1K Views
Hi, would like to check. So an agent pulled a docker image and install the pip dependencies on it. What if I have OS library dependencies as well? (Apt insta...
3 years ago
0 Votes 3 Answers 1K Views
3 years ago
0 Votes 5 Answers 1K Views
Hi, i would like to ask around if anyone has following languages working with ClearML? It can be direct from ClearML SDK or via any indirect method. Julia R ...
3 years ago
0 Votes 6 Answers 1K Views
Hi, we are planning to move on to openshift. Can I ask if k8s-glue supports openshift?
3 years ago
0 Votes 30 Answers 1K Views
3 years ago
0 Votes 0 Answers 611 Views
Hi, is there a way to export ClearML experiments into a file package and import them on another ClearML instance?
10 months ago
0 Votes 11 Answers 1K Views
Hi, i shifted my clearml setup to an on-premise disconnected env, which has a pip repo setup. I noted this warning, Trying pip install: /root/.clearml/venvs-...
3 years ago
0 Votes 8 Answers 1K Views
Hi, I've a few questions on clearml-session. We will be running some GUI applications so is it possible to forward the GUI to the clearml-session? We have a ...
3 years ago
0 Votes 0 Answers 892 Views
Hi, we are encountering an increasing number of cases where it takes quite a while before actual training (GPU utilisation) can be done. After observing, thi...
one year ago
0 Votes 1 Answers 914 Views
Hi. For the experiment scalar tab, there's a gpu resource graph. The gpu mem used is in percentage, is it possible to display as absolute GB instead? Reason ...
one year ago
0 Votes 6 Answers 1K Views
Hi, i notice a new behaviour with clearml-agent=1.1.0. When it is installing the packages in requirements.txt, it failed with. clearml_agent: ERROR: HTTPSCOn...
3 years ago
0 Votes 2 Answers 1K Views
2 years ago
0 Votes 9 Answers 1K Views
Hi, we are having issues with clearml-session for vscode. Apparently it's hardcoded to download from https://github.com/microsoft/vscode-python/releases but ...
3 years ago
0 Votes 8 Answers 1K Views
Hi we have had some crashes on ClearML server and it was caused by ClearML uploading the models into ClearML server (by default). Is it possible to have an o...
2 years ago
0 Votes 29 Answers 1K Views
Hi, I started my agent using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, with the following parameters in clearml.conf. default_docker:...
3 years ago
0 Votes 14 Answers 1K Views
So i bumped onto this comparison shared by dagshub. It kinda placed ClearML in a rather bad position compared to everything else in the industry. https://dag...
3 years ago
0 Votes 22 Answers 1K Views
3 years ago
0 Votes 9 Answers 1K Views
Hi, just to check. Does the k8s glue install torch by default? I'm getting Warning: could not resolve python wheel replacement for torch==1.8.0 even though i...
3 years ago
0 Votes 8 Answers 1K Views
Hi, if i've ClearML agents installed on several servers, each with a single GPU. How can I train a gpt2 model that would require multiple GPUs?
one year ago
0 Votes 4 Answers 1K Views
Hi recently upgraded all the clearml, clearml-server, clearml-agent. Now running k8s glue with clearml-agent=1.0.1rc1. python3 k8s_glue_example.py --queue 1b...
3 years ago
0 Votes 8 Answers 983 Views
I'm just getting this in my agent run task. Would appreciate if someone can advise where externalrequirement is pointing at. RequirementsManager handler rais...
3 years ago
0 Votes 4 Answers 1K Views
Hi, I'm running clearml agents via K8s glue. I noticed that the agent is not pulling latest images even though docker_force_pull is set to true. A kubectl de...
3 years ago
0 Votes 1 Answers 1K Views
Thought i would share this. Something to think about over the new year. 🙂 https://www.thoughtworks.com/content/dam/thoughtworks/documents/whitepaper/tw_whit...
2 years ago
0 Votes 6 Answers 1K Views
3 years ago
0 Votes 1 Answers 1K Views
Hi, in your latest changelog. There's a new function. Task.launch_multi_node() for distributed experiment execution In the context of using with K8S glue, wi...
one year ago
0 Votes 0 Answers 1K Views
Hi, we are working on a mini project to 'integrate' ClearML Datasets with CKAN. Wondering if the community could share some ideas.
2 years ago
0 Hi We Have Had Some Crashes On Clearml Server And It Was Caused By Clearml Uploading The Models Into Clearml Server (By Default). Is It Possible To Have An Overriding Config So Clients Can Never Upload To Clearml Server Itself As Default?

Hi SuccessfulKoala55 , would they need the fileserver to route to minio then? E.g.

This will ensure that any actions by clearml-data and models are saved into the S3 object store.
api {
    files_server: s3://ecs.ai:80/clearml-data/default
}

aws {
    s3 {
        credentials {
            host: http://ecs.ai:80
            ## Insert the iam credentials provided by your SAs here.
        }
    }
}

But if a user forgets to do the above, they will be saved on ClearML server. If I switch off f...
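
To make the `files_server` value above easier to read, here is a minimal Python sketch that splits it into endpoint, bucket, and prefix. It assumes the `s3://host:port/bucket/prefix` layout used for non-AWS S3 endpoints; `split_files_server` is a hypothetical helper written for illustration, not part of the ClearML SDK.

```python
from urllib.parse import urlparse


def split_files_server(uri: str) -> dict:
    """Split an s3://host:port/bucket/prefix URI into its parts (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    # Assumption: the first path segment is the bucket, the rest is a key prefix.
    bucket, _, prefix = parsed.path.lstrip("/").partition("/")
    return {
        "endpoint": parsed.netloc,
        "bucket": bucket,
        "prefix": prefix,
    }


parts = split_files_server("s3://ecs.ai:80/clearml-data/default")
print(parts)
```

A check like this could be run on the client side to confirm that the endpoint in `files_server` names the same host and port as the entry under `aws.s3.credentials`.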

2 years ago
0 Hi, Several Changes Occurred Recently And I Would Like To Know If There's A Way To Verbose Catch All The Printout That Happening Within A K8S Glue Spawned Pod. We Have An Issue Where All Of Our New Remote_Execution Tasks Are Stuck In The 'Pending' Stage.

Hi, i don't think clearml agent actually ran at that point in time. All i can see in the pod is
apt install of libpthread-stubs, libx11, libxau and libxcb1 packages.
pip install of clearml-agent.
After the above are successful, the pod just hangs there.

3 years ago
0 Can I Ask How Often Does The Hosted Clearml Reset? I'm In A Hackathon And Thought Of Using It.

Hi, is this currently not working? http://app.community.clear.ml ? I noticed that the clearml UI will cache on the browser and if the backend is not running, it's not clear to the user that something is wrong (except for broken pages).

3 years ago
0 Hi, Can I Ask How I Can Make Clearml-Datasets In Comparison With Pytorch Datasets/Dataloader? In Particular, Pytorch Dataloaders Would Be Able To Batch Pull And Then Preprocess Data Using Multi-Cpus, Feed It Into The Training Loop And Achieve As High Util

Thanks CostlyOstrich36 , how do i know how the parts are indexed in the first place? Or rather, how are chunks and parts defined? Say in the context of images, videos, text documents...etc.

2 years ago
0 Hi, We Have Recurring Disk Space Issues On Our Clearml Server (Drop Of Many Gb In A Few Days). After Some Analysis, We Noted

Thanks SuccessfulKoala55 , how might I do this clean up? Does this increase with more use of ClearML? And to add, we save all artifacts onto a remote S3 server.

2 years ago
0 Hi, I Shifted My Clearml Setup To An On-Premise Disconnected Env, Which Has A Pip Repo Setup. I Noted This Warning,

Hi AgitatedDove14 , what version i should change it to? I'm currently on v0.17.2rc3.

3 years ago
0 Hi, Trying To Understand Clearml-Session. I Have An Agent Running On A Machine Monitoring A Queue Then I Ran Clearml-Session --Queue Myqueu --Docker Torch-Image. The Clearml Session Ended Up Tunneling Into The Physical Machine That My Agent Is Running

Hi, I was expecting to see the container rather than the actual physical machine. For example, in the file panel on the left of the jupyter panel, I see the file contents of the physical machine. I was expecting this to be the container.

3 years ago
0 Hi, V1 Of Agent Seems To Have Removed Agent.Package_Manager.Force_Repo_Requirements_Txt. Is This Still Available In Other Forms?

I managed to find out why. The docker image I'm using is not set as root user, thus the error. But I'm wondering why this is the case, as docker best practices do indicate we should use a non-root user on production images.

3 years ago
0 Hi, V1 Of Agent Seems To Have Removed Agent.Package_Manager.Force_Repo_Requirements_Txt. Is This Still Available In Other Forms?

It's actually in your documentation. It's removed since 0.17 apparently.
https://allegro.ai/clearml/docs/docs/release_notes/ver_0_17.html#clearml-agent-0-17-2

And this is my logs, it tried to install something and encountered permission denied. It wouldn't if it obeyed the force_repo_requirements_txt.

1620664917916 Kahs-MacBook-Pro.local info ClearML Task: created new task id=024a421c0e174650a1c7ff64af756c26 ClearML results page: `
1620664920359 Kahs-MacBook-Pro.local info ClearML Mon...

3 years ago
0 Hi, V1 Of Agent Seems To Have Removed Agent.Package_Manager.Force_Repo_Requirements_Txt. Is This Still Available In Other Forms?

yup. in this case it wasn't root. Removing that USER and -u in pip solves the problem. However, in our production images, we are required to remove root access.
FROM nvidia/cuda:10.1-cudnn7-devel

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y \
    python3-opencv ca-certificates python3-dev git wget sudo ninja-build
RUN ln -sv /usr/bin/python3 /usr/bin/python

# create a non-root user

ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} a...

3 years ago
0 Hi, We Are Having An Interesting Issue Here. We Serve Many Users And Each User Has Their Own Credentials In Accessing The Private Git Repo. We Can't Seem To Find A Way For The End User To Pass In Their Git Credentials When They Run Their Codes In Both Age

Hi AgitatedDove14 . I'm trying out passing env via the code instead.
task.set_base_docker("nvcr.io/nvidia/tensorflow:19.11-tf2-py3 --env TRAINS_AGENT_GIT_USER=git_username_here --env TRAINS_AGENT_GIT_PASS=git_password_here")
So the strange thing is when my k8sglue pulls a task, this happens.
` Pulling task xxxxxxxxxx launching on kubernetes cluster
Pushing task xxxxxxxxxx into temporary pending queue
Kubernetes scheduling task id=xxxxxxxxxxxx
skipping docker argument TRAINS_AGENT_GIT_USE...
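
The string passed to `task.set_base_docker(...)` above is an image name followed by `--env KEY=VALUE` pairs. A minimal sketch of assembling that string from a dict is shown here; `build_base_docker` is a hypothetical helper, and the placeholder values are the ones from the post. It only builds the string and does not change how the k8s glue treats the arguments.

```python
import shlex


def build_base_docker(image: str, env: dict) -> str:
    """Return '<image> --env K=V ...' with each K=V pair shell-quoted (hypothetical helper)."""
    args = [image]
    for key, value in env.items():
        # shlex.quote leaves plain values untouched and quotes ones with spaces or symbols.
        args += ["--env", shlex.quote(f"{key}={value}")]
    return " ".join(args)


docker_cmd = build_base_docker(
    "nvcr.io/nvidia/tensorflow:19.11-tf2-py3",
    {
        "TRAINS_AGENT_GIT_USER": "git_username_here",
        "TRAINS_AGENT_GIT_PASS": "git_password_here",
    },
)
print(docker_cmd)
```

The resulting `docker_cmd` is the same string as in the call above, so it could be passed to `task.set_base_docker(docker_cmd)`. Quoting matters mainly when a real password contains spaces or shell metacharacters.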

3 years ago
0 Hi, We Are Using Gitlab And It Is A Security Requirement To Use Ssh Keys To Access The Repos For Each Individual. We Are Also Using K8S Glue. Is There Any Provisions To Do This Seamlessly?

Hi, scenario as follows.

1. client.py runs task.execute_remotely(queue='myqueue', exit_process=True)
2. The API section of clearml.conf at client side is read in.
3. Client side calls clearml server and inserts the task into the queue.
4. K8S glue retrieves the task from the queue.
5. K8S glue spawns a K8S pod.
6. K8S pod performs git clone.
7. Error: ssh keys not found.
Each individual has their own key in the gitlab profile and gitlab is configured to only work via ssh.
We can't place the key in the image as this is as good as ...

3 years ago