PanickyMoth78

34 Questions, 167 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

166 × Eureka!

Questions 34
Answers 167

0 Votes

20 Answers

2K Views

0 Votes 20 Answers 2K Views

Task Struck At

task struck at task.flush(wait_for_uploads=True) : I've been running a model training task - a variation on this clearml dataset example: https://github.com/...

tensorboard

2 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi. I'M Using

Hi. I'm using @PipelineDecorator.component to define a task from a function (to run in a pipeline) I'd like to get the task object within this function so th...

clearml

3 years ago

0 Votes

27 Answers

2K Views

0 Votes 27 Answers 2K Views

Hi. I'M Running This Little Pipeline:

Hi. I'm running this little pipeline: from clearml.automation.controller import PipelineDecorator from clearml import TaskTypes @PipelineDecorator.component(...

clearml

3 years ago

0 Votes

9 Answers

2K Views

0 Votes 9 Answers 2K Views

Hi. Help

Hi. Help 🥺 I have a clearml.Datase which I can't get

clearml

3 years ago

0 Votes

7 Answers

2K Views

0 Votes 7 Answers 2K Views

I Have 5 Unarchived Pipeline Runs That Were Defined With This Decorator:

I have 5 unarchived pipeline runs that were defined with this decorator: @PipelineDecorator.pipeline( name="fastai_image_classification_pipeline", project="l...

clearml

3 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Hi. I Spent Some Time This Week Trying To Optimise File Transfer Time In And Out Of Processes That Use Google'S Gcs (In Vertex Ai Pipelines). It Seems That In The Case Where I Have A Lot Of Very Small Files, It Made More Sense To Tar.Gz Them And Send A Bi

Hi. I spent some time this week trying to optimise file transfer time in and out of processes that use google's gcs (in vertex ai pipelines). It seems that i...

clearml

3 years ago

0 Votes

22 Answers

2K Views

0 Votes 22 Answers 2K Views

I Started Two Pipelines (Using Aws Autoscaler In App.Clear.Ml ). The Pipelines Ran Concurrently, Using The Same Pipeline Code. Both Failed In The Same Component Half-Way Though The Pipeline Run With:

I started two pipelines (using AWS autoscaler in app.clear.ml ). The pipelines ran concurrently, using the same pipeline code. Both failed in the same compon...

aws mlops

3 years ago

0 Votes

4 Answers

1K Views

0 Votes 4 Answers 1K Views

Hi. I'M Using Clearml Agent 1.16.1 My Code Is Running A Multi-Process Pool With "Spawn" (See

Hi. I'm using clearml agent 1.16.1 My code is running a multi-process pool with "spawn" (see here for why) from multiprocessing import get_context ... with g...

mlops

one year ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Suppose I Use A Pipeline Decorator To Define A Pipeline:

suppose I use a pipeline decorator to define a pipeline: @PipelineDecorator.pipeline(name='my-pipeline', project='my-project', version='0.2') def my_pipeline...

clearml

3 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hi. First Time User Here

Hi. First time user here 👋 I have experienced a problem following the getting started documentation. I opened an account on https://app.clear.ml/ I then fol...

clearml

3 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi. I'Ve Noticed That My Clearml.Conf Has Both:

Hi. I've noticed that my clearml.conf has both: agent.git_user="" agent.git_pass=""and agent { ... git_user: "" git_pass: "" ... }What's the difference? Shou...

mlops

3 years ago

0 Votes

7 Answers

2K Views

0 Votes 7 Answers 2K Views

Hi. I Have A

Hi. I have a problem accessing repo code in pipeline components running in an AWS autoscaler (first attempts at doing this) My local clearml.conf file has ag...

mlops

3 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Is There Some Built-In Way In Clearml To Trigger Further Action On Task Fail (Or Pipeline Fail)?

Is there some built-in way in clearml to trigger further action on task fail (or pipeline fail)?

clearml

3 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

I Am Using The Aws Autoscaler And I Wish To Set My Files Server To Be Gs. I Tried To Do So By Having This In The Additional Clearml Configuration Window:

I am using the AWS autoscaler and I wish to set my files server to be gs. I tried to do so by having this in the ADDITIONAL CLEARML CONFIGURATION window: api...

mlops

3 years ago

0 Votes

22 Answers

2K Views

0 Votes 22 Answers 2K Views

Hi. I'M Encountering A Problem With

Hi. I'm encountering a problem with model.name At least, for models that where auto-magically uploaded. I see it in my own code but you can see it if you run...

clearml

2 years ago

0 Votes

9 Answers

2K Views

0 Votes 9 Answers 2K Views

Hi. I Have A Question About Pipelines And Their Generated Dependency Graphs. I Took The Code Of The Clearml Pipeline From Decorator Example:

Hi. I have a question about pipelines and their generated dependency graphs. I took the code of the clearml pipeline from decorator example: https://github.c...

clearml

3 years ago

0 Votes

14 Answers

2K Views

0 Votes 14 Answers 2K Views

Bug?

Bug? dataset name is ignored if use_current_task=True

clearml

2 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi. Suppose I Want To Report On What My Task Has Done By Having It Generate A Markdown (.Md) File With Links To Some "Local" Figure Files. Looking At The Reporting Documentation, The Closest Thing I Found Is The

Hi. Suppose I want to report on what my task has done by having it generate a markdown (.md) file with links to some "local" figure files. looking at the rep...

clearml

2 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hi. Looking Into Clearml Support For Datasets, I'D Like To Understand How To Work With Large Datasets And Cases Where Not All The Data Is Downloaded At Once. (E.G. 1. Each Training Epoch Is Performed On A (Preferably Random) Sample Of The Data That Is Dow

Hi. Looking into clearml support for datasets, I'd like to understand how to work with large datasets and cases where not all the data is downloaded at once....

clearml

3 years ago

0 Votes

13 Answers

2K Views

0 Votes 13 Answers 2K Views

Another Question On The Topic Of How A Remote Execution Of A Pipeline Kills The Calling Process (Previously Discussed

Another question on the topic of how a remote execution of a pipeline kills the calling process (previously discussed https://clearml.slack.com/archives/CTK2...

clearml

3 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hi. Shoulf This Command Succeed In The Presence Of Project

Hi. Shoulf this command succeed in the presence of project lavi-testing and absence of dataset tmp_datset within it? from clearml import Dataset tmp_dataset ...

clearml

3 years ago

0 Votes

16 Answers

3K Views

0 Votes 16 Answers 3K Views

Hi. Question About Dataset Upload Errors: When Uploading A

Hi. Question about Dataset upload errors: When uploading a clearml.Dataset created with output_uri=" gs://lavi_test/datasets after adding 20 files of size 50...

gcp

2 years ago

0 Votes

14 Answers

2K Views

0 Votes 14 Answers 2K Views

Hi. I Have A Job That Processes Images And Creates ~5 Gb Of Processed Image Files (Lots Of Small Ones). At The End - It Creates A

Hi. I have a job that processes images and creates ~5 GB of processed image files (lots of small ones). At the end - it creates a clearml.Dataset and perform...

clearml

2 years ago

0 Votes

25 Answers

2K Views

0 Votes 25 Answers 2K Views

Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

Autoscaler parallelization issue: I have an AWS Autoscaler set up with a resource that has a max of 3 instances assigned to the default queue I've given it a...

clearml

3 years ago

0 Votes

8 Answers

2K Views

0 Votes 8 Answers 2K Views

Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In Aws Ec2 Instance. My

Hi (again... sorry for asking so many questions) Question about using google cloud storage in a clearml agent running in AWS ec2 instance. my clearml.conf ha...

mlops

3 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

I Have A Training Task That Auto-Magically Saves A Model For Me To Gcs

I have a training task that auto-magically saves a model for me to GCS task = Task.init( project_name=project_name, task_name=f"Image classification training...

clearml

2 years ago

0 Votes

8 Answers

3K Views

0 Votes 8 Answers 3K Views

Hi There I'M Trying Out Clearml. I Saw Mention That Clearml Can Capture Tensorboard Output So I Tried It With This Little Script (Image Below). The Events File Is Filled, The Clearml Task Is Created, And Marked Complete However There Is Nothing In The Sc

Hi there I'm trying out clearml. I saw mention that clearml can capture tensorboard output so I tried it with this little script (image below). The events fi...

tensorboard

3 years ago

0 Votes

14 Answers

2K Views

0 Votes 14 Answers 2K Views

Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

Hi there. I'm trying to switch pipeline code from a local run using PipelineDecorator.run_locally()to a slightly-less-local run using PipelineDecorator.set_d...

clearml

3 years ago

0 Votes

30 Answers

2K Views

0 Votes 30 Answers 2K Views

Hi. I'D Like To Try The Gcp Autoscaler.

Hi. I'd like to try the GCP autoscaler. What permissions does the service account that I provide to clearml need? (and what GCP API should I enable in the GC...

clearml

3 years ago

0 Votes

8 Answers

2K Views

0 Votes 8 Answers 2K Views

Hello Community. I'D Like To Try The Aws Autoscaler (I Actually Prefer To Try The Gcp One But I Think It'S Broken Or, At Least, I'Ve Failed To Make It Work So Far) I Can'T Find Documentation On What Permissions Would Be Required From An Aws Sub-Account

Hello community. I'd like to try the AWS autoscaler (I actually prefer to try the GCP one but I think it's broken or, at least, I've failed to make it work s...

mlops

3 years ago

Show more results

0 Hello! When Trying To Use Clearml Datasets With Google Cloud Storage With The Authorized User Credentials It Will Fail And Say Some Fields Are Missing From The Json. This Isn'T An Issue If The User Is Using A Service Account Json Key, Is A Service Account

nice !

2 years ago

0 Hi. I'M Using Clearml Agent 1.16.1 My Code Is Running A Multi-Process Pool With "Spawn" (See

Oh, cool. So would this then report the activities of the spawned processes to the same task as that of the spawning process?

one year ago

0 Is There Some Built-In Way In Clearml To Trigger Further Action On Task Fail (Or Pipeline Fail)?

Yes.
Some mechanism that would allow for followup code execution. Ideally in a way that would not be susceptible to the same things that may cause a task to fail.

3 years ago

0 Hi. Help

It seems to be doing ok on the app side:
I didn't realise Datasets had tasks associated with them but there is one and it seems to be doing ok.
I've attached it's log file which only mentions skipping one file (a warning)

3 years ago

0 Hi. I Have A Job That Processes Images And Creates ~5 Gb Of Processed Image Files (Lots Of Small Ones). At The End - It Creates A

would setting the max_workers to 1 be a (slower) workaround?

2 years ago

0 Hi. I Have A Question About Pipelines And Their Generated Dependency Graphs. I Took The Code Of The Clearml Pipeline From Decorator Example:

Sure. It is a minor change from the code in the clearml examples for pipelines.
I just repeat the last two pipeline steps from that code in a loop (x3)
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py

3 years ago

0 Hi. I'M Just Starting Out Here Trying To Evaluate Clearml Ease Of Use. I'D Like To Understand Whether Clearml (Paid Service) Can Receive Access To A Gcp Project And Use Gke To Spin Clusters Up And Workers Or Would That Be On The Customer To Manage.

or, barring that, something similar on AWS?

3 years ago

0 Hi. First Time User Here

In case anyone else is interested. We found two alternative solutions:
Repeating the first steps but from within a Docker container ( docker run -it --rm python:3.9 bash ) worked for me.alternatively
The example tasks (or at least those I've tried) that appear in the clear ml examples within a new workspace have clearml==0.17.5 (an old clearml version) listed in "INSTALLED PACKAGES". Updating the clearml package within the task to 1.5.0 let me run the clear-ml agent daemon lo...

3 years ago

0 Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In Aws Ec2 Instance. My

I now get this error:
2022-07-18 21:51:29,168 - clearml.storage - ERROR - Failed creating storage object Reason: [Errno 2] No such file or directory: '~/gs.cred'
to be clear, I replaced <this is your GCP storage credentials file> with the contents of that file, escaping every " with a \" and removing newlines.

3 years ago

0 Hi. I'M Encountering A Problem With

To be specific there is "model name" which is not unique , and there is model-key which is unique to the Task

not sure why the two fields don't simply match. I guess that there may be situations where file name (without the full path) may be used several times.

2 years ago

0 Hi. I'M Encountering A Problem With

Ooh nice.
I wasn't aware task.models["output"] also acts like a dict.
I can get the one I care about in my code with something like task.models["output"]["best_model"]
however can you see the inconsistency between the key and the name there:

2 years ago

0 I Am Using The Aws Autoscaler And I Wish To Set My Files Server To Be Gs. I Tried To Do So By Having This In The Additional Clearml Configuration Window:

I tried the first option and it worked 🙂 🙏

3 years ago

0 Hi. Help

I had several pipeline components getting it and uploading files to is concurrently.
Can Datsets handle that?

3 years ago

0 I Started Two Pipelines (Using Aws Autoscaler In App.Clear.Ml ). The Pipelines Ran Concurrently, Using The Same Pipeline Code. Both Failed In The Same Component Half-Way Though The Pipeline Run With:

Hey Alon,
See
https://clearml.slack.com/archives/CTK20V944/p1658892624753219
I was able to isolate this as a bug in clearml 1.6.3rc1 that can be reproduced outside of a task / app simply be doing get_local_copy() on a dataset with parents.

3 years ago

0 Hi. I'M Encountering A Problem With

another weird thing:
Before my training task is done:
print(task.models['output'].keys())outputs
odict_keys(['Output Model #0', 'Output Model #1', 'Output Model #2'])
after task.close()
I can do:
task = Task.get_task(task_id) for i in range(100): print(task.models["output"].keys())which prints
odict_keys(['Output Model #0', 'Output Model #1', 'Output Model #2'])in the first iteration
and prints the file names in the latter iterations:
` od...

2 years ago

0 Hi. I'D Like To Try The Gcp Autoscaler.

I'll do a clean relaunch of everything (scaler and pipeline)

3 years ago

0 Hi. Help

essentially, several running processes were performing:
model_evals_dataset = Dataset.get( dataset_project=dataset_project, dataset_name=f"model_evals", ) model_evals_dataset.add_files(run_eval_path) model_evals_dataset.upload()

3 years ago

0 Hi. Help

silly me. I deleted my gs credentials file :man-facepalming:

3 years ago

0 Hi. I'M Encountering A Problem With

I imagine that one workaround is to
Disable automatic model uploads Perform manual model upload (with the correct name).Can you point me to how to do these?

2 years ago

0 I Have 5 Unarchived Pipeline Runs That Were Defined With This Decorator:

In fact, all my projects seems empty of tasks.

3 years ago

0 Hi. I'D Like To Try The Gcp Autoscaler.

Trying to switch to a resources using gpu-enabled VMs failed with that same error above.
Looking at spawned VMs, they were spawned by the autoscaler without gpu even though I checked that my settings ( n1-standard-1 and nvidia-tesla-t4 and https://console.cloud.google.com/compute/imagesDetail/projects/ml-images/global/images/c0-deeplearning-common-cu113-v20220701-debian-10?project=ml-tooling-test-external image for the VM) can be used to make vm instances and my gcp autoscaler...

3 years ago

0 Hi. I'M Encountering A Problem With

BTW:

If I try to find the right model in the

task.models["output"]

(this time there is just one but in my code there may be several) it appears with the

(see other attached screenshot).

What would make sense here ? (I have to be honest I'm not sure).

If the model was saved with a file name (is that the trigger for auto-upload?), I think it makes sense for the model name to match the file name (not the task name), especially when there may be ...

2 years ago

Unfortunately, waiting a while did not make this go away 🙂

3 years ago

0 Hi. I Have A

AgitatedDove14
Adding adding repo and repo_branch to the pipeline.component decorator worked (and I can move on to my next issue 🙂 ).
I'm still unclear on why cloning the repo in use happens automatically for the pipeline task and not for component tasks.

3 years ago

0 Hi. Question About Dataset Upload Errors: When Uploading A

I have google-cloud-storage==2.6.0 installed

2 years ago

(I'm going to stop the autoscaler, terminate all the instances and clone the autoscaler and retry it all from the beginning)

3 years ago

0 Is There Some Built-In Way In Clearml To Trigger Further Action On Task Fail (Or Pipeline Fail)?

I suppose one way to perform this is with a https://clear.ml/docs/latest/docs/references/sdk/scheduler that kicks off a health check task (check exit state of executed tasks). It seems more efficient to support a triggered response to task fail.

3 years ago

0 Hi I'M Looking Into How Clearml Supports Datasets And Dataset Versioning And I'M A Bit Confused. Is Dataset Versioning Not Supported At All In The Non-Enterprise Or Is Versioning Available By A Different Mechanism? I See That

console output shows uploads of 500 files on every new dataset. The lineage is as expected, each additional upload is the same size as the previous ones (~50mb) and Dataset.get on the last dataset's ID retreives all the files from the separate parts to one local folder.
Checking the remote storage location (gs://) shows artifact zip files, each with 500 files

3 years ago

Another issue, may, or may not be related.
Running another pipeline (to see if I can reproduce the issue with simple code), it looks like the autoscaler has spun down all the instances for the default queue while a component was still running.
Both the pipline view and the "All experiment" view shows the component as running.
The component's console show that last command was a docker run command

3 years ago

0 Hi. First Time User Here

thanks, I'll DM

3 years ago

Show more results