Hi John. Sort of. It seems that archiving a pipeline does not also archive the tasks it contains, so /projects/lavi-testing/.pipelines/fastai_image_classification_pipeline
is a very long list.
Did you mean that I was running in CPU mode? I tried both, but I'll try CPU mode with that base docker image.
Thanks AgitatedDove14
Setting max_workers to 1 prevents the error (but, I assume, it may come at the cost of slower sequential uploads).
My main concern now is that this may happen within a pipeline, leading to unreliable data handling.
If Dataset.upload() does not crash or return a success value that I can check, and if Dataset.get_local_copy() also does not complain as it retrieves partial data - how will I ever know that I lost part of my dataset?
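For now I'm guarding against this with something like the following check (just a sketch of what I mean; the verification logic is my own, not something ClearML provides):
```python
from pathlib import Path
from clearml import Dataset

# Sketch: after uploading/finalizing, pull a fresh local copy and compare it
# against the dataset's own file listing so a silent partial upload shows up.
ds = Dataset.get(dataset_project="lavi-testing", dataset_name="my_dataset")
local_root = Path(ds.get_local_copy())
expected = set(ds.list_files())
actual = {str(p.relative_to(local_root)) for p in local_root.rglob("*") if p.is_file()}
missing = expected - actual
if missing:
    raise RuntimeError(f"local copy is missing {len(missing)} of {len(expected)} files")
```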
Trying to switch to a resource using GPU-enabled VMs failed with that same error above.
Looking at the spawned VMs, they were spawned by the autoscaler without a GPU, even though I checked that my settings (n1-standard-1, nvidia-tesla-t4, and the https://console.cloud.google.com/compute/imagesDetail/projects/ml-images/global/images/c0-deeplearning-common-cu113-v20220701-debian-10?project=ml-tooling-test-external image for the VM) can be used to create VM instances, and my gcp autoscaler...
Hi TimelyPenguin76
Thanks for working on this. The clearml GCP autoscaler is a major feature for us to have. I can't really evaluate clearml without some means of instantiating multiple agents on GCP machines, and I'd really prefer not to have to set up a k8s cluster with agents and manage scaling it myself.
I tried the settings above with two resources, one for default queue and one for the services queue (making sure I use that image you suggested above for both).
The autoscaler started up...
I can't find version 1.8.1rc1, but I believe I see a relevant change in the code of Dataset.upload in 1.8.1rc0.
I have a task where I create a dataset, but I also create a set of matplotlib figures, some numeric statistics, and a pandas table that describe the data, which I wish to have associated with the dataset and viewable from the clearml web page for the dataset.
For a component, task = Task.current_task() will get me the task object, right?
This does not work for a pipeline. Is a pipeline a task?
Edit: the same works for a pipeline too.
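Roughly what I'm doing inside that task (a sketch; the titles and the placeholder figure/table are mine):
```python
import pandas as pd
from matplotlib import pyplot as plt
from clearml import Task

# Sketch: report the dataset-describing figures and stats on the current task
# so they show up in the web UI next to the dataset-creating task.
task = Task.current_task()
logger = task.get_logger()

fig, ax = plt.subplots()
ax.hist([1, 2, 2, 3])  # placeholder figure
logger.report_matplotlib_figure(
    title="class distribution", series="train", figure=fig, iteration=0
)

stats = pd.DataFrame({"mean": [0.5], "std": [0.1]})  # placeholder stats
logger.report_table(
    title="dataset stats", series="summary", iteration=0, table_plot=stats
)
```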
Here are screenshots of a VM I started with a GPU and one started by the autoscaler with the settings above but whose GPU is missing (both in the same GCP zone, us-central1-f). I may have misconfigured something, or perhaps the autoscaler is failing to specify the GPU requirement correctly. :shrug:
Oops, should it have been multi_instance_support=True?
I'll try a more carefully checked run a bit later but I know it's getting a bit late in your time zone
Actually, re-running pipeline_from_decorator.py a second time (and a third time) from the command line seems to have executed without that ValueError, so maybe that issue was some fluke.
Nevertheless, those runs exit prior to the line print('process completed')
and I would definitely prefer the command executing_pipeline
to not kill the process that called it.
For example, maybe, having started the pipeline, I'd like my code to also report having started the pipeline to som...
You can have parents as one of the @PipelineDecorator.component args. The step will be executed only after all the parents are executed and completed.
Is there an example of using parents some place? I'm not sure what to pass, and also how to pass a component from one pipeline that was just kicked off to execute remotely (which I'd like to block on) to a component of the next pipeline's run.
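For reference, this is what I'm guessing it looks like (a sketch; I'm assuming the names in parents are just the component function names):
```python
from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["dataset_id"])
def prepare_data():
    return "some-dataset-id"

# Should run only after prepare_data completes, even though it takes no
# argument from it (with an argument the dependency would be inferred anyway).
@PipelineDecorator.component(parents=["prepare_data"])
def train():
    pass

@PipelineDecorator.pipeline(name="parents_example", project="lavi-testing", version="0.1")
def my_pipeline():
    prepare_data()
    train()
```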
Thanks,
Just to be clear, you are saying the "random" results are consistent over runs ?
yes !
By re-runs I mean re-running this script (not cloning the pipeline)
I'm on clearml 1.6.2
The jupyter notebook service and two clearml agents (version 1.3.0; one in queue "default" and one in queue "services", with the --cpu-only flag) are all running inside a docker container
I was doing it with the task that I had been using. Mostly for logging arguments that control what the dataset will contain.
On the same topic: what if (I were able to iterate and) I wanted the pipeline calls to be blocking, so that the next pipeline executes only after the previous one completes?
Yeah. I was only using the task for the process of creating the dataset.
My code does start out with a step that checks for the existence of the dataset, returning it if it exists (search by project name/dataset name/version) rather than recreating it.
I noticed the name mismatch when that check kept failing me...
I think that init-ing the encompassing task with the relevant dataset name still allows me to search for the dataset by dataset_name=task_name / project_name (shared by both datas...
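i.e. something along these lines (a sketch; the names are placeholders):
```python
from clearml import Dataset

# Sketch of the "return it if it exists, otherwise create it" step.
def get_or_create_dataset(project: str, name: str) -> Dataset:
    try:
        # Dataset.get raises if no matching dataset is found.
        return Dataset.get(dataset_project=project, dataset_name=name)
    except ValueError:
        return Dataset.create(dataset_project=project, dataset_name=name)
```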
What I think would be preferable is that the pipeline be deployed and that the python process that deployed it be allowed to continue on to whatever I had planned for it to do next (i.e. not exit).
I believe n1-standard-8 would work for that. I initially just tried going with the autoscaler defaults, which have the GPU on, but with n1-standard-1 specified as the machine.
I have google-cloud-storage==2.6.0
installed
yes
Here is the true "my_pipeline" declaration:
```python
@PipelineDecorator.pipeline(
    name="fastai_image_classification_pipeline",
    project="lavi-testing",
    target_project="lavi-testing",
    version="0.2",
    multi_instance_support="",
    add_pipeline_tags=True,
    abort_on_failure=True,
)
def fastai_image_classification_pipeline(
    run_tags: List[str],
    i_dataset: int,
    backbone_names: List[str],
    image_resizes: List[int],
    batch_sizes: List[int],
    num_train_epochs: i...
```
I don't mind assigning to the task the same name that I'd assign to the dataset. I just think that the create function should expect dataset_name to be None in the case of use_current_task=True (or allow the dataset name to differ from the task name).
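i.e. what I'd like to be able to write is roughly this (a sketch of the call I have in mind, not of current behavior):
```python
from clearml import Task, Dataset

# The encompassing task carries the name I also want for the dataset.
task = Task.init(project_name="lavi-testing", task_name="my_dataset")

# What I'd like: with use_current_task=True, dataset_name could stay None and
# the dataset would simply reuse the current task (and its name).
ds = Dataset.create(
    dataset_project="lavi-testing",
    use_current_task=True,
)
```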
I now get this error:
2022-07-18 21:51:29,168 - clearml.storage - ERROR - Failed creating storage object
Reason: [Errno 2] No such file or directory: '~/gs.cred'
To be clear, I replaced <this is your GCP storage credentials file> with the contents of that file, escaping every " with a \" and removing newlines.
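For reference, this is roughly how I produced the escaped string (my own one-off helper, nothing ClearML-specific):
```python
# Sketch: flatten the GCP service-account JSON into a single line with escaped
# quotes, which is what I pasted into the autoscaler configuration field.
with open("gs.cred") as f:  # example path
    raw = f.read()
escaped = raw.replace("\n", "").replace('"', '\\"')
print(escaped)
```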
In fact, all my projects seem empty of tasks.
multi_instance_support=True lets me run the pipeline again 👍
The second run prints out the same (non) "random" numbers as the first run
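e.g. a loop like this now goes through (a sketch; the argument values are made up, and multi_instance_support=True is set in the decorator as above):
```python
# Sketch: with multi_instance_support=True the decorated pipeline function can
# be called more than once from the same process.
for i_dataset in range(2):
    fastai_image_classification_pipeline(
        run_tags=["test"],
        i_dataset=i_dataset,
        backbone_names=["resnet34"],
        image_resizes=[224],
        batch_sizes=[16],
        num_train_epochs=1,
    )
```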
Switching the base image seems to have failed with the following error:
2022-07-13 14:31:12 Unable to find image 'nvidia/cuda:10.2-runtime-ubuntu18.04' locally
attached is a pipeline task log file