It completed after the max_job limit (10)
Yep, this is Optuna "testing the waters"
Hi EnviousStarfish54
Artifacts are stored per experiment, which means that, storage-wise, every experiment uploading an artifact (even one with the same file content as a previous execution) will create a new file on the central storage (by default, the trains-server)
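For illustration, a minimal sketch of the upload side (project / file names here are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="artifact-demo")
# Each execution that uploads this artifact creates a new copy on the
# central storage, even if "data.csv" is identical to a previous run's file
task.upload_artifact(name="dataset", artifact_object="data.csv")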
As for the preferred way to share data / artifacts: where is your trains server? Local? Cloud? How do you access it from home? VPN?
but it is not optimal if one of the agents is only able to handle tasks of a single queue (e.g. if the second agent can only work on tasks of type B).
How so?
Thanks SmallDeer34 !
Would you like us to? How about a footnote/acknowledgement?
How about a reference / footnote?
@misc{clearml,
  title  = {ClearML - Your entire MLOps stack in one open-source tool},
  year   = {2019},
  note   = {Software available from },
  url    = { },
  author = {allegro.ai},
}
WackyRabbit7
I do 'pkill -f trains' but it's the same...
If you need to debug and test, run with --foreground and just hit Ctrl-C to end the process (it will never switch to background...). Helps?
ShaggyHare67 notice that the services queue is designed to run CPU-based tasks like monitoring etc.
For the actual training you need to run your trains-agent on a GPU machine.
Did you run trains-agent init? It will walk you through the configuration (git user/pass included).
If you want to manually add them, you can see an example of the configuration file in the link below.
You can find it at ~/trains.conf
https://github.com/allegroai/trains-agent/blob/master/docs/tr...
BTW:
Task.add_requirements('tensorflow', '2.2') will make sure you get the specified version 🙂
My bad, you have to pass it to the container itself:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L149
extra_docker_arguments: ["-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]
os.environ['CLEARML_PROC_MASTER_ID'] = ''
Nice catch! (I'm assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I think I solved it by deleting the project and running the base_task once before the hyperparameter optimization
So is it working now? Everything is there?
This line 🙂
None
Notice that Triton (and therefore clearml-serving) needs the PyTorch model to be converted into TorchScript, so that the Triton backend can load it
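For example, a minimal sketch of the conversion (the model here is a trivial stand-in for your own):
import torch
import torch.nn as nn

class MyModel(nn.Module):  # stand-in for your actual model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = MyModel().eval()
example_input = torch.randn(1, 4)
# trace (or torch.jit.script) the model into TorchScript
scripted = torch.jit.trace(model, example_input)
scripted.save("model.pt")  # this TorchScript file is what gets served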
pipe.start_locally() will run the DAG compute part on the same machine, whereas pipe.start() will start it on a remote worker (if it is not already running on a remote worker)
Basically, "pipe.start()" executed via an agent will start the compute (no overhead)
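Roughly like this (a sketch; project / queue names are assumptions):
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")
pipe.add_step(name="step_a", base_task_project="examples", base_task_name="task A")

pipe.start_locally()            # run the DAG compute part on this machine
# pipe.start(queue="services")  # or enqueue the controller to run on an agent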
does that help?
🙂
Okay, but we should definitely output an error on that
I'm hoping I can find an end-to-end solution that also includes experiment management
Well of course biased here, but ClearML with the hyperdatasets is probably the most complete one.
Specifically for model performance analysis I would add the voxel open-source tool to dissect specific results, but the combination of the abstraction and query capabilities of hyperdatasets, orchestration, and experiment management is really unmatched.
(and again of course I'm biased, but really there is n...
Hi @<1532532498972545024:profile|LittleReindeer37>
Does Hydra support notebooks? If it does, can you point to an example?
Hi WickedGoat98
Regardless of the ingress configuration (which it seems you have the hang of), the API instance itself needs to be configured with a persistent volume (the web / file servers do not need direct access to the API server).
Can you get the API to run properly ?
Regarding the trains-agent:
once you have the API/Web/File server configured, you can configure it the way the trains-agent-services is configured inside the docker-compose (e.g. set the environment variable with the c...
https://github.com/allegroai/clearml/blob/master/clearml/automation/trigger.py
Example coming soon, with docs :)
ImmensePenguin78 it might be... Let me check, worst case sync after the weekend 🙂
(pypi does contain 1.2.0rc4 and we are finalizing tests so that we can release a stable 1.2.0)
Hi @<1715900788393381888:profile|BitingSpider17>
Notice that you need __ (double underscore) to represent "." when converting clearml.conf entries into environment variables;
this means agent.docker_internal_mounts.sdk_cache
becomes CLEARML_AGENT__AGENT__DOCKER_INTERNAL_MOUNTS__SDK_CACHE
None
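For example (a sketch; the mount path is a placeholder), this is the mapping in code form; normally you would export the variable in the shell before launching the agent:
import os

# agent.docker_internal_mounts.sdk_cache  ->  "." replaced with "__"
os.environ["CLEARML_AGENT__AGENT__DOCKER_INTERNAL_MOUNTS__SDK_CACHE"] = "/custom/cache"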
Hi SpicyOtter88
plt.plot([0, 1], [0, 1], 'r--', label='')
It cannot have a legend without a label, so it gives it an "anonymous" label; I think it should just get "unlabeled 0", wdyt?
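i.e. passing a non-empty label avoids the anonymous entry (a minimal sketch):
import matplotlib.pyplot as plt

plt.plot([0, 1], [0, 1], "r--", label="baseline")  # explicit, non-empty label
plt.legend()
plt.show()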
so all models are part of the same experiment and have the experiment name in their name.
Oh, that explains it: (1) you can use the model filename to control the model name in ClearML, or (2) you can disable the autologging and manually upload the model, then you control the model name
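A minimal sketch of option (2), assuming a pytorch model (all names are placeholders):
from clearml import Task, OutputModel

task = Task.init(
    project_name="examples",
    task_name="manual-model-upload",
    auto_connect_frameworks={"pytorch": False},  # disable the pytorch auto-logging
)

output_model = OutputModel(task=task, name="my-model-name")  # you control the name
output_model.update_weights(weights_filename="model.pt")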
wdyt?
Yes, could you send the full log? Screen grab?
Hi ConvolutedSealion94
Yes 🙂
my_seed = 123
Task.set_random_seed(my_seed)  # disable setting the random number generators by passing None
task = Task.init(...)
Hi DepressedChimpanzee34
I think the main issue here is slow response time from the API server. I "think" you can increase the number of API server processes, but considering the 16GB, I'm not sure you have the headroom.
At peak usage, how much free RAM do you have on the machine?
Hi IrritableJellyfish76
https://clear.ml/docs/latest/docs/references/sdk/task#taskget_tasks
task_name (str) – The full name or partial name of the Tasks to match within the specified project_name (or all projects if project_name is None). This method supports regular expressions for name matching. (Optional)
You are right, this is a bit confusing, I will make sure that we add in the docstring an examp...
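In the meantime, something like this (a sketch; project / pattern are placeholders):
from clearml import Task

# task_name is matched as a regular expression against task names
tasks = Task.get_tasks(project_name="examples", task_name=r"^train.*")
for t in tasks:
    print(t.id, t.name)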
AttributeError: 'PosixPath' object has no attribute 'loc'
SarcasticSquirrel56 I'm assuming the artifact is pandas and you forgot to either import it before or add it as a requirement for the Task 🙂
This is causing the artifact .get() method to revert to returning the local path to the artifact, instead of actually de-serializing it.
(We should print a warning though, I'll make sure we do 🙂)
EDIT: basically clearml failed to realize you also need pandas because it was never imported ...
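A minimal sketch of the fix (task id / artifact name are placeholders):
import pandas as pd  # importing pandas first lets clearml de-serialize the artifact
from clearml import Task

task = Task.get_task(task_id="<source-task-id>")
df = task.artifacts["data frame"].get()  # now a DataFrame, not a local file path
print(df.head())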
My question was about the automatically uploaded models. Those that were uploaded by clearml client.
So there is a way to add a callback, would that work?
https://github.com/allegroai/clearml/blob/cf7361e134554f4effd939ca67e8ecb2345bebff/clearml/binding/frameworks/__init__.py#L137
def callback(_, model_info):
    model_info.name = "my new name"
    return model_info
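And registering it would look roughly like this (a sketch, assuming WeightsFileHandler.add_pre_callback at the linked line is the registration hook):
from clearml import Task
from clearml.binding.frameworks import WeightsFileHandler

def callback(_, model_info):
    model_info.name = "my new name"  # rename the auto-logged model
    return model_info

task = Task.init(project_name="examples", task_name="rename-models")
WeightsFileHandler.add_pre_callback(callback)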
Hi @<1649221394904387584:profile|RattySparrow90>
Are the models I defined to be served e.g. via the CLI downloaded to the serving pod?
Yes, this is done automatically and online (i.e. when you update using the CLI/API), based on the models/endpoints you set
So that they are physically lying there as a file I can see in the filesystem?
They are, and cached there
Or is it more the case that the pod gets the model when needed/when an API call for this model is incoming?
I...
This workflow however is the only way I have found to easily fix my previous "Module not found" errors
Hmm okay, makes sense.
Did you try to set these?
or even hack the sys.path with something like
import sys, os
sys.path.insert(0, os.path.abspath(os.path.dirname(__file__) + "/../"))