Hi GleamingGrasshopper63
How well can the ML Ops component handle job queuing on a multi-GPU server?
This is fully supported 🙂
You can think of queues as a way to simplify resource allocation for users (you can do more than that, but let's start simple)
Basically you can create a queue per type of GPU, for example a list of queues could be: on_prem_1gpu, on_prem_2gpus, ..., ec2_t4, ec2_v100
Then when you spin up the agents, you attach each agent to the "correct" queue for its machine type.
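For example, creating those queues programmatically could look roughly like this (a rough sketch assuming the APIClient queues.create call; you can also just create queues from the UI, and the names are the illustrative ones above):

from clearml.backend_api.session.client import APIClient

client = APIClient()

# one queue per resource type; agents on matching machines listen on the matching queue
for queue_name in ["on_prem_1gpu", "on_prem_2gpus", "ec2_t4", "ec2_v100"]:
    client.queues.create(name=queue_name)

# on each machine you would then run something like:
#   clearml-agent daemon --queue on_prem_1gpu --gpus 0
# so the agent only pulls jobs meant for that resource type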
Int...
Try adding this environment variable: export TRAINS_CUDA_VERSION=0
How can I turn off git diff uploading?
Sure, see here
None
Please send the full log, I just tested it here, and it seems to be working
Hmm what do you have here?
os.system("cat /var/log/studio/kernel_gateway.log")
@<1541954607595393024:profile|BattyCrocodile47> first let me say I ❤ the dark theme you have going on there, we should definitely add that 🙂
When I run
python set_triggers.py; python basic_task.py
, they seem to execute, b
Seems like you forgot to start the trigger, i.e.
None
(this will cause the entire script of the trigger inc...
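Something along these lines (a rough sketch assuming the TriggerScheduler from clearml.automation; the trigger parameters and task ID are just illustrative):

from clearml.automation import TriggerScheduler

# check for new events every few minutes
trigger = TriggerScheduler(pooling_frequency_minutes=3)

# illustrative trigger: enqueue a copy of a template task whenever a model is published
trigger.add_model_trigger(
    name="model publish trigger",
    schedule_task_id="<template_task_id>",
    schedule_queue="default",
    trigger_project="examples",
    trigger_on_publish=True,
)

# without this call the triggers are only registered, never executed
trigger.start()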
Hi @<1523701797800120320:profile|SteadySeagull18>
...the job -> requeue it from the GUI, then a different environment is installed
The way it works is: in the "originating" (i.e. first manual) execution, only the directly imported packages are listed (not the derivative packages that are required by the original packages)
But when the agent is reproducing the job, it creates a whole clean venv for the experiment, installs the required packages, then pip resolves the derivatives, and ...
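If you prefer the originating run to store the full pip freeze (derivative packages included) rather than only the directly imported ones, something like this should do it (a minimal sketch; assumes Task.force_requirements_env_freeze is available in your clearml version and is called before Task.init):

from clearml import Task

# record the complete `pip freeze` of the local environment
# instead of only the directly imported packages
Task.force_requirements_env_freeze(force=True)

task = Task.init(project_name="examples", task_name="full-env-capture")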
GiddyTurkey39 can you ping the server address?
(just making sure, this should be the IP of the server not 'localhost')
Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?
BTW if you are upgrading from an old version of the server, I would recommend upgrading through every version in between (a few of them have migration scripts that need to run)
Should work, follow the backup process, and restore into a new machine:
None
To get all the image metrics: client.events.get_task_metrics(tasks=['6adb929f66d14731bc76e3493ab89d80'], event_type='training_debug_image')
metric=image is the name shown in the debug images dropdown
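Putting it together, a small sketch (the task ID is the example one from above; the print is just to inspect the raw response):

from clearml.backend_api.session.client import APIClient

client = APIClient()

# fetch all debug-image events reported by this task
res = client.events.get_task_metrics(
    tasks=["6adb929f66d14731bc76e3493ab89d80"],
    event_type="training_debug_image",
)
print(res)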
FYI: ssh -R 8080:localhost:8080 -R 8008:localhost:8008 -R 8081:localhost:8081 replace_with_username@ubuntu_ip_here
solved the issue 🙂
Hi UptightBeetle98
The hyperparameter example assumes you have agents ( trains-agent ) connected to your account. These agents pull the jobs from the queue (where they are now, i.e. pending), set up the environment for each job (venv or docker+venv), and execute the job with the specific arguments the optimizer chose.
Make sense?
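For reference, a rough sketch of what the optimizer side could look like (illustrative values; base_task_id is the template experiment the optimizer clones, and execution_queue is the queue your agents listen on):

from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange

optimizer = HyperParameterOptimizer(
    base_task_id="<template_task_id>",
    hyper_parameters=[
        UniformIntegerParameterRange("General/batch_size", min_value=16, max_value=128, step_size=16),
    ],
    objective_metric_title="validation",
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    execution_queue="default",           # agents listening on this queue will run the trials
    max_number_of_concurrent_tasks=2,    # roughly matches the number of available agents
)
optimizer.start()
optimizer.wait()
optimizer.stop()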
Hi @<1544853695869489152:profile|NonchalantOx99>
I would assume the clearml-server configuration / access key is misconfigured in your copy of example.env
JitteryCoyote63 you mean from code?
HandsomeCrow5 check the latest RC, I just ran the same code and it worked 🙂
That said, it might be a different backend, I'll test with the demo server
JitteryCoyote63 S3 should work. Go to your profile page and check whether you already have some old credentials there, maybe this is the issue.
Hi ApprehensiveFox95
You mean removing the argparse arguments from code?
Or post-execution in the UI?
Sure: task = Task.init(..., auto_connect_arg_parser={'arg_not_to_log': False})
This will cause all argparse arguments to be automatically logged (and later editable), with the exception of the argument arg_not_to_log
Notice that if you have --arg-something, to exclude it add 'arg_something': False to the dict
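A minimal end-to-end sketch (argument names are just for illustration):

from argparse import ArgumentParser
from clearml import Task

parser = ArgumentParser()
parser.add_argument("--batch-size", type=int, default=32)
parser.add_argument("--arg-something", type=str, default="do-not-log-me")

# note: the CLI dash becomes an underscore in the exclusion dict
task = Task.init(
    project_name="examples",
    task_name="argparse exclusion",
    auto_connect_arg_parser={"arg_something": False},
)

args = parser.parse_args()  # --batch-size is logged and editable, --arg-something is not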
now it stopped working locally as well
At least this is consistent 🙂
How so? Is the "main" Task still running?
because comparing experiments using graphs is very useful. I think it is a nice to have feature.
So currently when you compare the graphs you can select the specific scalars to compare, and it updates in real time!
You can also bookmark the actual URL and it is fully reproducible (i.e. full state is stored)
You can also add custom columns to the experiment table (with the metrics) and sort / filter based on them, and create a summary dashboard (again, like all pages in the web app, the URL is...
YEY 🙂 🙂
It should be fairly easy to write such a daemon:
from time import time
from datetime import datetime

from clearml.backend_api.session.client import APIClient

client = APIClient()
timestamp = time() - 60 * 60 * 2  # last 2 hours
tasks = client.tasks.get_all(
    status=["in_progress"],
    only_fields=["id"],
    order_by=["-last_update"],
    page_size=100,
    page=0,
    created=[">{}".format(datetime.utcfromtimestamp(timestamp))],
)
...
references:
[None](https://clear.ml/...
I guess I would need to put this in the extra_vm_bash_script param of the auto-scaler, but it will reboot in a loop, right? Isn't there an easier way to achieve that?
You can edit the extra_vm_bash_script
which means the next time an instance is booted the bash script will be executed.
In the meantime, you can ssh to the running instance and change the ulimit manually, wdyt?
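And for the next instances, a rough idea of what you could paste into that extra_vm_bash_script field (illustrative values, not tested; shown here as the string the autoscaler configuration expects):

# illustrative only: raise the open-file limit on every newly booted instance
extra_vm_bash_script = """
echo '* soft nofile 65535' >> /etc/security/limits.conf
echo '* hard nofile 65535' >> /etc/security/limits.conf
"""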