PanickyMoth78

33 Questions, 165 Answers

Active since 10 January 2023

Last activity one month ago

Reputation

Badges 1

164 × Eureka!

Questions 33
Answers 165

0 Votes

3 Answers

635 Views

0 Votes 3 Answers 635 Views

Hi. Shoulf This Command Succeed In The Presence Of Project

Hi. Shoulf this command succeed in the presence of project lavi-testing and absence of dataset tmp_datset within it? from clearml import Dataset tmp_dataset ...

clearml

one year ago

0 Votes

2 Answers

614 Views

0 Votes 2 Answers 614 Views

Hi. I'M Using

Hi. I'm using @PipelineDecorator.component to define a task from a function (to run in a pipeline) I'd like to get the task object within this function so th...

clearml

one year ago

0 Votes

1 Answers

595 Views

0 Votes 1 Answers 595 Views

Suppose I Use A Pipeline Decorator To Define A Pipeline:

suppose I use a pipeline decorator to define a pipeline: @PipelineDecorator.pipeline(name='my-pipeline', project='my-project', version='0.2') def my_pipeline...

clearml

one year ago

Show more results

0 Hi. Shoulf This Command Succeed In The Presence Of Project

That would be a better message however, I must have misunderstood the meaning of auto_create=True
I thought that flag made the get function into a "get-or-create"

one year ago

0 I Started Two Pipelines (Using Aws Autoscaler In App.Clear.Ml ). The Pipelines Ran Concurrently, Using The Same Pipeline Code. Both Failed In The Same Component Half-Way Though The Pipeline Run With:

now trying with added lines as Alon suggested:
` @PipelineDecorator.component(
return_values=["run_model_path", "run_info"],
cache=True,
task_type=TaskTypes.training,
repo="git@github.com:shpigi/clearml_evaluation.git",
repo_branch="main",
packages="./requirements.txt",
)
def train_image_classifier_component(
clearml_dataset,
backbone_name,
image_resize: int,
batch_size: int,
run_model_uri,
run_tb_uri,
local_data_path,
num_epochs: int,
)...

one year ago

the same occures when I run a single training component instead of two

one year ago

I get the same error with those added lines

one year ago

correct

one year ago

0 Another Question On The Topic Of How A Remote Execution Of A Pipeline Kills The Calling Process (Previously Discussed

I've also not figured out how to modify the examples above to wait for one pipline to end before the next begins

one year ago

TimelyPenguin76 , Could the problem be related to an error in the log of the previous step (which completed successfully)?
` 2022-07-26 04:25:56,923 - clearml.Task - INFO - Waiting to finish uploads
2022-07-26 04:26:01,447 - clearml.storage - ERROR - Failed uploading: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/clearml-evaluation/o?uploadType=multipart (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_M...

one year ago

Another issue, may, or may not be related.
Running another pipeline (to see if I can reproduce the issue with simple code), it looks like the autoscaler has spun down all the instances for the default queue while a component was still running.
Both the pipline view and the "All experiment" view shows the component as running.
The component's console show that last command was a docker run command

one year ago

Unfortunately, waiting a while did not make this go away 🙂

one year ago

0 Another Question On The Topic Of How A Remote Execution Of A Pipeline Kills The Calling Process (Previously Discussed

yes
here is the true "my_pipeline" declaration:
` @PipelineDecorator.pipeline(
name="fastai_image_classification_pipeline",
project="lavi-testing",
target_project="lavi-testing",
version="0.2",
multi_instance_support="",
add_pipeline_tags=True,
abort_on_failure=True,
)
def fastai_image_classification_pipeline(
run_tags: List[str],
i_dataset: int,
backbone_names: List[str],
image_resizes: List[int],
batch_sizes: List[int],
num_train_epochs: i...

one year ago

0 Another Question On The Topic Of How A Remote Execution Of A Pipeline Kills The Calling Process (Previously Discussed

on the same topic. What if (I were able to iterate and) I wanted the pipelines calls to be blocking so that the next pipeline executes only after the previous one completes?

one year ago

0 Another Question On The Topic Of How A Remote Execution Of A Pipeline Kills The Calling Process (Previously Discussed

oops, should it have been multi_instance_support=True ?

one year ago

0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

(fixed small typo in code above just now)

one year ago

0 Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In Aws Ec2 Instance. My

My local environment has clearml version 1.6.3rc0
and agents in aws were started with the AWS Autoscaler which has no explicit place for google credentials.

I see a place for Additional ClearML Configuration in the AWS autoscaler UI which I suspect may help but I don't see how I can pass a secrets file along with my agent.

one year ago

0 Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In Aws Ec2 Instance. My

For anyone following, you can "inject" a credentials json file for a google cloud service account so at to get access to your google cloud storage from agents on aws ec2 instances that are managed by the AWS autoscaler by providing the following in the ADDITIONAL CLEARML CONFIGURATION when starting the autoscaler:
` sdk.google.storage.credentials_json: "/root/gs.cred"
sdk.google.storage.project: "<my-gcp-project-id>"
files {
gsc {
contents: """<copy-paste the contents of yo...

one year ago

0 Hi. I Have A

in order for the autoscaler to access your git , in the wizard you have to provide the git user/token

git_pass has the token
Perhaps I should have mentined that I start the AWS autoscaler with the https://app.clear.ml/applications/aws-autoscaler/ .

Hmm, how does the decorator of the component looks like ? meaning did you specify a repo/branch/commit there

Neither my pipeline decorator not my component specify any repos:

` # pipeline
@PipelineDecorator.pipeline(
name=...

one year ago

0 Hi. I'M Running This Little Pipeline:

one year ago

0 Hi There I'M Trying Out Clearml. I Saw Mention That Clearml Can Capture Tensorboard Output So I Tried It With This Little Script (Image Below). The Events File Is Filled, The Clearml Task Is Created, And Marked Complete However There Is Nothing In The Sc

Would you expect this fastai callback to work?
(Uses SummaryWriter):
https://github.com/fastai/fastai/blob/d7f4863f1ee3c0fa9f2d9feeb6a05f0625a53696/fastai/callback/tensorboard.py
It seems to have failed as well (but I'd need to check more carefully)

one year ago

0 Hi. I'D Like To Try The Gcp Autoscaler.

I believe n1-standard-8 would work for that. I initially just tried going with the autoscaler defaults which has gpu on but that n1-standard-1 specified as the machine

one year ago

0 Hi. I'D Like To Try The Gcp Autoscaler.

Trying to switch to a resources using gpu-enabled VMs failed with that same error above.
Looking at spawned VMs, they were spawned by the autoscaler without gpu even though I checked that my settings ( n1-standard-1 and nvidia-tesla-t4 and https://console.cloud.google.com/compute/imagesDetail/projects/ml-images/global/images/c0-deeplearning-common-cu113-v20220701-debian-10?project=ml-tooling-test-external image for the VM) can be used to make vm instances and my gcp autoscaler...

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

I'm on clearml 1.6.2
The jupyter notebook service and two clear-ml agents ( version1.3.0, one in queue "default" and one in queue "services" and with --cpu-only flag) ) are all running inside a docker container

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

If I run from terminal, I see:
ValueError: Task object can only be updated if created or in_progress [status=stopped fields=['configuration']]

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

Hi Martin. See that ValueError https://clearml.slack.com/archives/CTK20V944/p1657583310364049?thread_ts=1657582739.354619&cid=CTK20V944 Perhaps something else is going on?

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

actually, re-running pipeline_from_decorator.py a second time (and a third time) from the command line seem to have executed without the that ValueError so maybe that issue was some fluke.
Nevertheless, those runs exit prior to line
print('process completed')
and I would definitely prefer the command executing_pipeline to not kill the process that called it.
For example, maybe, having started the pipeline I'd like my code to also report having started the pipeline to som...

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

What I think would be preferable is that the pipeline be deployed and that the python process that deployed it were allowed to continue on to whatever I had planned for it to do next (i.e. not exit)

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

first, thanks for having these discussions. I appreciate this kind of support is an effort 🙏
Yes. i perfectly understand that once a pipeline job (or a task) is sent off in this manner, it executes separately (and, most likely in a different machine) from the process that instantiated it.
I still feel strongly that such a command should not be thought of as a fire and exit operation. I can think of several scenarios where continued execution of the instantiating process is desired:
I ...

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

Hmm interesting, so like a callback?!

like https://github.com/allegroai/clearml/blob/bca9a6de3095f411ae5b766d00967535a13e8401/examples/pipeline/pipeline_from_tasks.py#L54-L55 pipe-step level callbacks? I guess that mechanism could serve. Where do these callbacks run? In the instantiating process? If so, that would work (since the callback function can be any code I wish, right?)

I might want to dispatch other jobs from within the same process.

This is actually something t...

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

Thanks ! 🎉
I'll give it a try.
I think that clearml should be able to do parameter sweeps using pipelines in a manner that makes use of parallelisation.
If that's not happening with the new RC, I wonder how I would do a parameter sweep within the pipelines framework.

For example - how would this task-based example be done with pipelines?
https://github.com/allegroai/clearml/blob/master/examples/automation/manual_random_param_search_example.py

I'm thinking of a case where you want t...

one year ago

0 Hi There. I'M Trying To Switch Pipeline Code From A Local Run Using

Thanks for the fix and the mock HPO example code !
Pipeline behaviour with the fix is looking good.
I see the point about changes to data inside the controller possibly causing dependencies for step 3 (or, at least, making it harder for the interpreter to know).

one year ago

0 Hi. I'M Running This Little Pipeline:

one year ago

Show more results