PanickyMoth78
Moderator
34 Questions, 167 Answers
Active since 10 January 2023
Last activity one year ago

Reputation: 0

Badges (1)
166 × Eureka!
0 Hi There. I'm Trying To Switch Pipeline Code From A Local Run Using

Thanks ! 🎉
I'll give it a try.
I think that clearml should be able to do parameter sweeps using pipelines in a manner that makes use of parallelisation.
If that's not happening with the new RC, I wonder how I would do a parameter sweep within the pipelines framework.

For example - how would this task-based example be done with pipelines?
https://github.com/allegroai/clearml/blob/master/examples/automation/manual_random_param_search_example.py

I'm thinking of a case where you want t...
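For concreteness, here is a rough sketch of how such a sweep might look with PipelineDecorator. The component, parameter names, and values are hypothetical stand-ins, not taken from the linked example:

```python
# Rough sketch: one pipeline step per parameter combination.
from clearml import PipelineDecorator


@PipelineDecorator.component(return_values=["accuracy"], cache=False)
def train_component(learning_rate: float, batch_size: int):
    # Stand-in for a real training run.
    accuracy = 1.0 - learning_rate - batch_size * 1e-4
    return accuracy


@PipelineDecorator.pipeline(name="param_sweep", project="examples", version="0.0.1")
def sweep_pipeline():
    # Steps with no dependencies between them are scheduled concurrently,
    # so the sweep parallelizes across whatever agents serve the queue.
    results = [
        (lr, bs, train_component(learning_rate=lr, batch_size=bs))
        for lr in (0.01, 0.05, 0.1)
        for bs in (64, 128)
    ]
    # Reading a returned value makes the controller wait for that step.
    for lr, bs, acc in results:
        print(f"lr={lr} bs={bs} accuracy={acc}")


if __name__ == "__main__":
    PipelineDecorator.run_locally()  # or set_default_execution_queue("default") to use agents
    sweep_pipeline()
```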

3 years ago
0 Task Stuck At

there may have been some interaction between the training task and a preceding dataset creation task :shrug:

2 years ago
0 Task Stuck At

no retry messages
CLEARML_FILES_HOST is gs
CLEARML_API_HOST is a self-hosted clearml server (in Google Compute Engine).

Note that earlier in the process the code uploads a dataset just fine

2 years ago
0 Hi. Should This Command Succeed In The Presence Of Project

That would be a better message. However, I must have misunderstood the meaning of auto_create=True;
I thought that flag made the get function into a "get-or-create".
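For reference, my understanding of the call under discussion; a sketch assuming it was Dataset.get (whose auto_create flag does exist), with hypothetical names:

```python
from clearml import Dataset

# Hypothetical reconstruction of the call being discussed.
# With auto_create=True, the expectation was "get-or-create":
# return the dataset if it exists, otherwise create an empty one.
ds = Dataset.get(
    dataset_project="my_project",  # hypothetical
    dataset_name="my_dataset",     # hypothetical
    auto_create=True,
)
```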

3 years ago
0 Hi. I'm Encountering A Problem With

To be specific, there is "model name", which is not unique, and there is model-key, which is unique to the Task.

Not sure why the two fields don't simply match. I guess there may be situations where the file name (without the full path) is used several times.

2 years ago
0 Hi. I'm Running This Little Pipeline:

I found that instead of returning some_returned_url (which triggers zipping and saving of the files under that url), I can wrap it in a dict: {"the url": some_returned_url}, which then lets me pass the url back to the pipeline, and only that dict gets uploaded (e.g. {'run_datasets_path': Path('/data/my_datasets_path/run_id_1')}). I can divert all files that I do want uploaded and tracked by clearml to gs:// by adding at the start of the task-function: Logger.current_logger().se...
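A minimal sketch of that dict-wrapping workaround (component and path names are hypothetical, not from the original run):

```python
from pathlib import Path
from clearml import PipelineDecorator


@PipelineDecorator.component(return_values=["run_info"])
def make_datasets(run_id: str):
    datasets_path = Path("/data/my_datasets_path") / run_id
    # Returning the Path itself would trigger zipping and uploading of the
    # folder as the step artifact; wrapping it in a dict means only this
    # small dict is uploaded, and the controller still receives the path.
    return {"run_datasets_path": datasets_path}
```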

3 years ago
0 Hi. I'm Running This Little Pipeline:

Is there a way to set the default upload destination for all tasks in my ~/clearml.conf?

.. yes, by setting files_server: gs://clearml-evaluation/
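For reference, that setting lives under the api section of ~/clearml.conf; a sketch of the stanza, with the bucket name from this thread:

```
api {
    # default destination for files/artifacts uploaded by tasks
    files_server: "gs://clearml-evaluation/"
}
```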

3 years ago
0 I Started Two Pipelines (Using AWS Autoscaler In app.clear.ml). The Pipelines Ran Concurrently, Using The Same Pipeline Code. Both Failed In The Same Component Half-Way Through The Pipeline Run With:

After restarting the autoscaler and the instances and running a single pipeline, I still get the same error:
clearml.utilities.locks.exceptions.LockException: [Errno 11] Resource temporarily unavailable

3 years ago
0 I Started Two Pipelines (Using AWS Autoscaler In app.clear.ml). The Pipelines Ran Concurrently, Using The Same Pipeline Code. Both Failed In The Same Component Half-Way Through The Pipeline Run With:

now trying with added lines as Alon suggested:

```python
@PipelineDecorator.component(
    return_values=["run_model_path", "run_info"],
    cache=True,
    task_type=TaskTypes.training,
    repo="git@github.com:shpigi/clearml_evaluation.git",
    repo_branch="main",
    packages="./requirements.txt",
)
def train_image_classifier_component(
    clearml_dataset,
    backbone_name,
    image_resize: int,
    batch_size: int,
    run_model_uri,
    run_tb_uri,
    local_data_path,
    num_epochs: int,
)...
```

3 years ago
0 Hi. I'd Like To Try The GCP Autoscaler.

so..
I restarted the autoscaler with this configuration object:
```
[{"resource_name": "cpu_default", "machine_type": "n1-standard-1", "cpu_only": true, "gpu_type": null, "gpu_count": 1, "preemptible": false, "num_instances": 5, "queue_name": "default", "source_image": "projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20220131", "disk_size_gb": 100},
 {"resource_name": "cpu_services", "machine_type": "n1-standard-1", "cpu_only": true, "gpu_type": null, "gpu_count": 1, "preemptible": fa...
```

3 years ago
0 Hi There. I'm Trying To Switch Pipeline Code From A Local Run Using

Thanks for the fix and the mock HPO example code!
Pipeline behaviour with the fix is looking good.
I see the point about changes to data inside the controller possibly causing dependencies for step 3 (or, at least, making it harder for the interpreter to know).

3 years ago
0 Autoscaler Parallelization Issue: I Have An AWS Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

Note that the same model files were previously also generated by a non-parallelized version of the same pipeline without the out-of-space error, but a storage manager was downloading zip files in that version as well (maybe these files were downloaded and removed as the object reference counts went to 0?)

3 years ago
0 Hi There I'm Trying Out Clearml. I Saw Mention That Clearml Can Capture Tensorboard Output So I Tried It With This Little Script (Image Below). The Events File Is Filled, The Clearml Task Is Created, And Marked Complete However There Is Nothing In The Sc

here is the code in text if you feel like giving it a try:
```python
import tensorboard_logger as tb_logger
from clearml import Task

task = Task.init(project_name="great project", task_name="test_tb_logging")
task_tb_logger = tb_logger.Logger(logdir='./tb/run1', flush_secs=2)
for i in range(10):
    task_tb_logger.log_value("some_metric", 42, i)
task.close()
```
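One thing worth checking (an assumption on my part, not confirmed in the thread): clearml's auto-logging hooks TensorBoard's own SummaryWriter, while tensorboard_logger is a separate package, so a variant like this might be what the capture machinery expects:

```python
from torch.utils.tensorboard import SummaryWriter  # or tensorboardX
from clearml import Task

task = Task.init(project_name="great project", task_name="test_tb_logging")
writer = SummaryWriter(log_dir="./tb/run1", flush_secs=2)
for i in range(10):
    writer.add_scalar("some_metric", 42, i)
writer.close()
task.close()
```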

3 years ago
0 Hi. I'd Like To Try The GCP Autoscaler.

that's strange because, opening the currently running autoscaler config, I see this:

3 years ago
0 Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In AWS EC2 Instance. My

I now get this error:
2022-07-18 21:51:29,168 - clearml.storage - ERROR - Failed creating storage object Reason: [Errno 2] No such file or directory: '~/gs.cred'
to be clear, I replaced <this is your GCP storage credentials file> with the contents of that file, escaping every " with a \" and removing newlines.
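If it helps anyone doing the same, a small (hypothetical) helper that produces the escaped one-line string with Python's json module:

```python
import json

# Read the GCP service-account credentials file (path is hypothetical)
# and emit a quoted single-line string with newlines removed and every
# inner quote escaped, ready to paste into the config field.
with open("gs.cred") as f:
    print(json.dumps(f.read().replace("\n", "")))
```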

3 years ago
0 Hi. Question About Dataset Upload Errors: When Uploading A

I can't find version 1.8.1rc1, but I believe I see a relevant change in the code of Dataset.upload in 1.8.1rc0.

3 years ago
0 Is There Some Built-In Way In Clearml To Trigger Further Action On Task Fail (Or Pipeline Fail)?

I suppose one way to perform this is with a scheduler ( https://clear.ml/docs/latest/docs/references/sdk/scheduler ) that kicks off a health-check task (checking the exit state of executed tasks). It seems more efficient to support a triggered response to task failure.
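One built-in option worth checking is clearml.automation.TriggerScheduler, which can fire a callback on task status changes; a rough sketch (the exact parameter names should be checked against the SDK docs for your version):

```python
from clearml.automation import TriggerScheduler


def on_task_failed(task_id):
    # Hypothetical handler: launch a health-check task or send an alert.
    print(f"task {task_id} failed")


# Sketch only; signatures per my reading of the clearml SDK.
trigger = TriggerScheduler(pooling_frequency_minutes=2.0)
trigger.add_task_trigger(
    name="fail-watch",
    schedule_function=on_task_failed,  # called with the triggering task's id
    trigger_project="my_project",      # hypothetical project name
    trigger_on_status=["failed"],
)
trigger.start()
```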

3 years ago
0 Hi. I Have A Question About Pipelines And Their Generated Dependency Graphs. I Took The Code Of The Clearml Pipeline From Decorator Example:

I think this should be a valid use of pipelines. For example, at some step I choose to sweep across several values of some parameter, and the rest of the steps are duplicated for each value of that parameter.
The additional edges in the graph suggest that these steps somehow contain dependencies that I do not wish them to have.

3 years ago
0 Is There Some Built-In Way In Clearml To Trigger Further Action On Task Fail (Or Pipeline Fail)?

There may be cases where failure occurs before my code starts to run (and, perhaps, after it completes)

3 years ago
0 Hi. I'd Like To Try The GCP Autoscaler.

Hi TimelyPenguin76
Thanks for working on this. The clearml GCP autoscaler is a major feature for us to have. I can't really evaluate clearml without some means of instantiating multiple agents on GCP machines, and I'd really prefer not to have to set up a k8s cluster with agents and manage scaling it myself.

I tried the settings above with two resources, one for default queue and one for the services queue (making sure I use that image you suggested above for both).
The autoscaler started up...

3 years ago
0 Hi There. I'm Trying To Switch Pipeline Code From A Local Run Using

What I think would be preferable is that the pipeline be deployed and that the python process that deployed it be allowed to continue on to whatever I had planned for it to do next (i.e. not exit).

3 years ago
0 Hi. I'd Like To Try The GCP Autoscaler.

I'll do a clean relaunch of everything (scaler and pipeline)

3 years ago
0 Hi. I'd Like To Try The GCP Autoscaler.

I can try switching to gpu-enabled machines just to see if that path can be made to work, but the services queue shouldn't need a gpu, so I hope we figure out running the pipeline task on cpu nodes.

3 years ago