AgitatedDove14
Moderator
49 Questions, 8122 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

25 × Eureka!
Hello Everyone, I Am Having An Issue With File Paths When Running Clearml Agent. Somehow, Directories Are Being Created Inside The

Hi @<1634001100262608896:profile|LazyAlligator31>

Is this because the code repo is being recreated in this directory?

Yes this is correct πŸ™‚
Basically the entire code base + venv is installed there, to make sure it does not interfere with the "system" preinstalled environment
(it also allows for caching on the host machine πŸ™‚ )

one year ago
Hi, I Have A Small Question Regarding K8S Clearml-Serving Behavior. I Have In My Cluster One Gpu Of 16Gb Ram, And Another One Of 24 Gb Ram. I Have A Llm Model Fitting The 24Gb But Not The 16Gb Gpu. When I Call The Endpoint, How Will I Know To Which Gpu I

Hi @<1556812486840160256:profile|SuccessfulRaven86>
Every clearml-serving session (you can have multiple different "sessions") is assumed to be homogeneous, meaning it will serve the same models on as many nodes as possible, supporting multiple models per pod.
In your example I think the easiest is to create two serving sessions: one with a node selector for the 24GB node and another for the 16GB node, wdyt?

one year ago
Hi, Can I Use Clearml As A Tool For Deploying Models In A Private Network? Did Not Manage To Understand From The Docs.

is the clearml server a worker I can serve models on?

The serving is done by one of the clearml-agents.
Basically you spin up an agent, and this agent then spins up the model-serving engine container (fully managed).
(1) install and run clearml-agent, (2) run the clearml-session CLI to configure and spin up the serving engine.

4 years ago
Is There Documentation On The Clearml Slurm Enterprise Integration?

Hi @<1523711619815706624:profile|StrangePelican34>
Hmm, I think this is missing from the docs, let me ping the guys about that πŸ™

one year ago
Hi, I'm Following The Instructions For

OutrageousSheep60

I found the task in the UI - and in the UNCOMMITTED CHANGES execution section there is "No changes logged"

This is the issue.

and then run the session via docker:

clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 \
  --packages "clearml" "tensorflow>=2.2" "keras" \
  --queue MY_QUEUE \
  --verbose

Are you running "clearml-session" from your machine? (i.e. not from inside a docker)?...

3 years ago
Hi, I'm Following The Instructions For

clearml_agent: ERROR: Can not run task without repository or literal script in script.diff

This is odd ...

OutrageousSheep60 when you launch clearml-session it tells you the session ID (which is also a Task ID); can you look for it in the UI and check that there is something in the repo/uncommitted-changes section?
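If it helps, here is a minimal sketch of doing the same check programmatically instead of via the UI (a sketch only: "<session-id>" is a placeholder for the ID clearml-session prints, and the uncommitted changes are assumed to be stored under the task's script.diff field):

from clearml import Task

# the session ID printed by clearml-session is a regular Task ID
task = Task.get_task(task_id="<session-id>")
# uncommitted changes live in the task's script section
print(task.data.script.diff or "No changes logged")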

3 years ago
Hi, I'm Following The Instructions For

OK - the issue was the firewall rules that we had.

Nice!

But now there is an issue with the

Setting up connection to remote session

OutrageousSheep60 this is just a warning, basically saying we are using the default signed SSH server key (has nothing to do with the random password, just the identifying key being used for the remote ssh session)
Bottom line, I think you have everything working πŸ™‚

3 years ago
Hi All, I Would Like To Somehow Prevent Clearml Caching From Caching A Task That Hasn't Uploaded Artifacts (Using Cache_Executed_Step In

But I am considering just failing the task.

This will of course work: just raise an exception in the Task itself, and protect the call in the pipeline logic function with try/except.

Regarding the second option, try to nullify the hash on the Component Task:

# inside the running Task component
# if we do not want this run to be reused from the cache
Task.current_task()._set_runtime_properties({"pipeline_job_hash": None})
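Putting the two options together, here is a minimal sketch assuming a decorator-based pipeline (the component/pipeline names and the fail flag are hypothetical, used only to simulate the "no artifacts" case):

from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=True)
def process(fail=False):
    if fail:
        # no artifacts produced: nullify the cache hash so this run is never reused ...
        Task.current_task()._set_runtime_properties({"pipeline_job_hash": None})
        # ... and fail the component Task itself
        raise RuntimeError("no artifacts produced")
    return {"rows": 100}

@PipelineDecorator.pipeline(name="cache-demo", project="examples")
def logic():
    try:
        result = process(fail=False)
    except Exception:
        result = None  # the pipeline logic survives a failed component
    return result

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    logic()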
one year ago
Help Please, After Creating My Data Drift Monitoring Dashboard Using Clearml Serving And Grafana, How Can I Configure My Alerts To Be Notified When The Distribution Of My Metrics (Variables) Changes On My Heatmaps?

Try to break it into parts to understand what produces the error, for example:
increase(test12_model_custom:Glucose_bucket[1m])
increase(test12_model_custom:Glucose_sum[1m])
increase(test12_model_custom:Glucose_bucket[1m])/increase(test12_model_custom:Glucose_sum[1m])
and so on

one year ago
Hey Guys, I'm Trying To Run An Experiment Using Trains-Agent. I Have A Custom Docker Image With Nightly Versions Of Pytorch And Our Own Library Installed From A Private Repo. I Was Assuming That These Packages Will Be Automatically Available To Trains Dur

Hi DilapidatedDucks58 ,
Are you running in docker or venv mode?
Do the workers share a folder on the host machine?
It might be a syncing issue (not directly related to the trains-agent but to the fact that you have 4 processes trying to simultaneously access the same resource)

BTW: the next trains-agent RC will have a flag (default off) for torch-nightly repository support πŸ™‚

5 years ago
Hey, Thanks For The Great Logging Tool

CloudyHamster42
The RC is probably a few days away, but note that it will just remove the warnings; I still can't reproduce the double-axis issue.

It would be helpful if you could send a small script that reproduces the problem.

Maybe this example code can help ? https://github.com/allegroai/trains/blob/master/examples/manual_reporting.py

5 years ago
<image>

Let me know if it solved it; if it did, I'll make sure we push the RC.

4 years ago
Hello Everyone, I'm Curious To Know If It's Possible To Prevent Uploading A Duplicate Endpoint. For Instance, If An Endpoint Has Already Been Uploaded Using The

Thanks for answering, Yes, this is exactly what I wanted

Hmm, should be possible. How slow is the update that we are trying to save time on?

one year ago
Hi Folks

Thanks @<1550289509273309184:profile|CooperativeBeetle24> !
Is this an error with the CLI not working with a certain version of numpy?
Any chance you can PR the fix?

2 years ago
Does K8S Glue Support Running Service Agent? Slightly Confused Here

Do you mean to spin up a pod with the agent inside it (daemon in services mode)?
Or to connect the services queue to the k8s cluster (i.e. define a pod template that uses CPU with not a lot of RAM)?

3 years ago
Hey Trains Riders, This Must Be Something Simple I Am Missing, But Still I Couldn't Realize What The Problem Is. I Am Trying To Run Trains-Agent On My Experiments. Setup Of The Server And The Agent Is Fine, But I Am Struggling To Run Real Experiments (Not

Hi ColossalDeer61 ,

Xxx is the module where my main experiment script resides.

So I think there are two options:
1. Assuming you have a similar folder structure:
main_folder
  package_folder
  script_folder
    script.py
Then if you set the "working directory" in the execution section to "." and the entry point to "script_folder/script.py", your code could do:
from package_folder import ABC
2. After cloning the original experiment, you can edit the "installed packages", and ad...

5 years ago
Does K8S Glue Support Running Service Agent? Slightly Confused Here

I want to use services queue for running services, and I want to do it on k8s

So yes, as a standalone pod with the agent in venv mode (as opposed to docker mode)
Does that make sense to you?

3 years ago
Does K8S Glue Support Running Service Agent? Slightly Confused Here

I guess it won’t due to the nature of services?

Correct, k8s glue works differently; that said, I would actually use the helm chart to spin up a pod with the agent in services mode and venv mode.

3 years ago
Adding

Makes sense to add it to docker run by default if GPUs are mentioned in agent.

I think this is an arch thing, --privileged is not needed on ubuntu flavor, that said you can always have it if you add it here:
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L149

clearml-agent daemon --gpus 0 --queue default --docker
But docker still sees all GPUs.

Yes, --gpus should be enough. Are you sure regarding the --privileged flag?

3 years ago
Can We Report A Pandas Table With Styling To Be Retained In The Webui? It Would Be Nice To Report E.G.

SmugLizard25 are you saying that with the latest version it does not work?
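For reference, a minimal sketch of reporting a pandas DataFrame to the WebUI (the project/task names are placeholders; whether styling applied via df.style survives the round-trip is exactly the open question in this thread):

import pandas as pd
from clearml import Task

task = Task.init(project_name="examples", task_name="table reporting")
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
# report_table renders the DataFrame as a table in the task's PLOTS section
task.get_logger().report_table(title="My Table", series="pandas", iteration=0, table_plot=df)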

2 years ago
I Saw Some Talk Of Clearml + Kedro On Reddit. Is That A Good Approach?

one can containerise the whole pipeline and run it pretty much anywhere.

Does that mean the entire pipeline will be running on the instance spinning up the container?
From here, this is what I understand:
https://kedro.readthedocs.io/en/stable/10_deployment/06_kubeflow.html

My thinking was I can use one command and run all steps locally while still registering all "nodes/functions/inputs/outputs etc" with clearml such that I could also then later go into the interface and clone an...

4 years ago