PanickyMoth78
Moderator
34 Questions, 167 Answers
Active since 10 January 2023
Last activity: one year ago

Reputation: 0
Badges: 1
166 × Eureka!
Hi. I have a job that processes images and creates ~5 GB of processed image files (lots of small ones). At the end - it creates a…

I tried playing with those parameters on my laptop to no great effect.

Here is code you can use to reproduce the issue:

`
import os
from pathlib import Path

from tqdm import tqdm
from clearml import Dataset, Task

def dataset_upload_test(project_id: str, bucket_name: str):
    def _random_file(fpath, sizekb):
        file_size_in_bytes = 1024 * sizekb
        with open(fpath, "wb") as fout:
            fout.write(os.urandom(file_size_in_bytes))

    def random_dataset(dataset_path, num_files, file...
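The snippet is cut off above; a minimal self-contained sketch of how the continuation might look, reusing the imports from the snippet (file counts, names, sizes, and the upload target are assumptions, not the original code):
`
# Hypothetical continuation sketch; ~10,000 x 500 KB files gives roughly 5 GB
def _random_file(fpath, sizekb):
    with open(fpath, "wb") as fout:
        fout.write(os.urandom(1024 * sizekb))

def random_dataset(dataset_path, num_files, file_size_kb):
    dataset_path = Path(dataset_path)
    dataset_path.mkdir(parents=True, exist_ok=True)
    for i in tqdm(range(num_files)):
        _random_file(dataset_path / f"file_{i:05d}.bin", file_size_kb)

def dataset_upload_test(project_id: str, bucket_name: str):
    random_dataset("/tmp/upload_test", num_files=10_000, file_size_kb=500)
    ds = Dataset.create(dataset_name="upload-test", dataset_project=project_id)
    ds.add_files("/tmp/upload_test")
    ds.upload(output_url=f"gs://{bucket_name}")  # bucket name is a placeholder
    ds.finalize()
`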
2 years ago
Hi. I'd like to try the GCP autoscaler.

I'll try a more carefully checked run a bit later but I know it's getting a bit late in your time zone

3 years ago
I started two pipelines (using the AWS autoscaler in app.clear.ml). The pipelines ran concurrently, using the same pipeline code. Both failed in the same component half-way through the pipeline run with:

here is the log from the failing component:
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/utilities/locks/portalocker.py", line 140, in lock fcntl.flock(file_.fileno(), flags) BlockingIOError: [Errno 11] Resource temporarily unavailable

3 years ago
Hi. I have a job that processes images and creates ~5 GB of processed image files (lots of small ones). At the end - it creates a…

That job was using clearml 1.8.3 so I take it that setting max_workers to 1 would not make a difference?
Looking at the docs:
https://clear.ml/docs/latest/docs/references/sdk/dataset/#upload
they say that max_workers defaults to the number of cores, but looking at the log it does seem like it's doing one chunk every 5 minutes (a long time for a 500 MB upload from a node running in GCP...)
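If it helps anyone trying the same thing, a minimal sketch of forcing a single upload worker (dataset name, project, and path are placeholders; whether max_workers is honored depends on the SDK version):
`
from clearml import Dataset

# Sketch only: create a dataset and upload with parallelism pinned to 1
ds = Dataset.create(dataset_name="upload-test", dataset_project="tests")
ds.add_files("/path/to/processed_images")
ds.upload(max_workers=1)  # per the docs, the default is the number of cores
ds.finalize()
`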

2 years ago
Hi there. I'm trying to switch pipeline code from a local run using…

If I run from terminal, I see:
ValueError: Task object can only be updated if created or in_progress [status=stopped fields=['configuration']]
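For anyone hitting the same thing, a sketch of what seems to trigger it, with a possible (unverified) workaround that re-opens the task before editing; the task id and config name are hypothetical:
`
from clearml import Task

task = Task.get_task(task_id="<stopped-task-id>")  # a task that already stopped

# On a stopped task this raises the ValueError above:
# task.connect_configuration({"foo": "bar"}, name="my_config")

# Possible workaround (assumption: re-opening the task is acceptable here)
task.mark_started(force=True)
task.connect_configuration({"foo": "bar"}, name="my_config")
task.mark_stopped()
`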

3 years ago
Hi. I'm encountering a problem with…

anyhow - looks like the keys are simple enough to use (so I can just ignore the model names)

2 years ago
Autoscaler parallelization issue: I have an AWS autoscaler set up with a resource that has a max of 3 instances assigned to the…

Thanks 🙂
I wonder if it'll also include the fix that went into the RC I was using there (1.6.3rc0)

3 years ago
Hi (again... sorry for asking so many questions). Question about using Google Cloud Storage in a ClearML agent running in an AWS EC2 instance. My…

For anyone following: you can "inject" a credentials JSON file for a Google Cloud service account, so as to get access to your Google Cloud Storage from agents on AWS EC2 instances managed by the AWS autoscaler, by providing the following in the ADDITIONAL CLEARML CONFIGURATION when starting the autoscaler:
`
sdk.google.storage.credentials_json: "/root/gs.cred"
sdk.google.storage.project: "<my-gcp-project-id>"
files {
  gsc {
    contents: """<copy-paste the contents of yo...

3 years ago
Hi there. I'm trying to switch pipeline code from a local run using…

First, thanks for having these discussions. I appreciate that this kind of support is an effort 🙏
Yes, I perfectly understand that once a pipeline job (or a task) is sent off in this manner, it executes separately (and most likely on a different machine) from the process that instantiated it.
I still feel strongly that such a command should not be thought of as a fire-and-exit operation. I can think of several scenarios where continued execution of the instantiating process is desired:
I ...

3 years ago
Hi. I have a job that processes images and creates ~5 GB of processed image files (lots of small ones). At the end - it creates a…

Q: is there an equivalent env var for sdk.google.storage.pool_connections / pool_maxsize? My jobs are running remotely and not within a ClearML agent at the moment, so they get their ClearML config through env vars.

2 years ago
Hi. I'd like to try the GCP autoscaler.

"Is there any chance the experiment itself has a docker image specified?"

It does not, as far as I know. The decorators do not have docker fields specified.

3 years ago
Hi. I'm looking into how ClearML supports datasets and dataset versioning and I'm a bit confused. Is dataset versioning not supported at all in the non-enterprise version, or is versioning available by a different mechanism? I see that…

Console output shows uploads of 500 files on every new dataset. The lineage is as expected: each additional upload is the same size as the previous ones (~50 MB), and Dataset.get on the last dataset's ID retrieves all the files from the separate parts into one local folder.
Checking the remote storage location (gs://) shows artifact zip files, each with 500 files.
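A sketch of the parent/child versioning flow described above (dataset names, project, and paths are placeholders):
`
from clearml import Dataset

# Each new version lists its predecessor as a parent and adds one batch of files
parent = Dataset.get(dataset_project="tests", dataset_name="images")
child = Dataset.create(
    dataset_name="images",
    dataset_project="tests",
    parent_datasets=[parent.id],
)
child.add_files("/path/to/new_batch")  # e.g. 500 new files, ~50 MB
child.upload()
child.finalize()

# Fetching the newest version pulls the files from all the separate parts
local_folder = Dataset.get(dataset_id=child.id).get_local_copy()
`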

3 years ago
I have 5 unarchived pipeline runs that were defined with this decorator:

I can find the tasks in the "all experiments" project, but there are over 500 tasks there (I guess it includes the archived tasks as well), so that's not much help.
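One way to narrow that search down (a sketch; assumes the runs live in a known project, and that excluding the "archived" system tag via the filter works as documented):
`
from clearml import Task

tasks = Task.get_tasks(
    project_name="my-pipelines-project",  # hypothetical project name
    task_filter={"system_tags": ["-archived"]},  # leave out archived tasks
)
print([t.name for t in tasks])
`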

3 years ago
Hi. I'm running this little pipeline:

Hi again.
Thanks for the previous replies and links, but I haven't been able to find the answer to my question: how do I prevent the content of a URI returned by a task from being saved by ClearML at all?

I'm using this simplified snippet (which avoids fastai and large data):
`
from clearml.automation.controller import PipelineDecorator
from clearml import TaskTypes

@PipelineDecorator.component(
    return_values=["run_datasets_path"], cache=False, task_type=TaskTypes.data_processing
)
def ma...

3 years ago
Hi. Help…

I had several pipeline components getting it and uploading files to it concurrently.
Can Datasets handle that?

3 years ago
Hi. I'm encountering a problem with…

Another weird thing: before my training task is done,
print(task.models['output'].keys())
outputs
odict_keys(['Output Model #0', 'Output Model #1', 'Output Model #2'])
After task.close() I can do:
task = Task.get_task(task_id)
for i in range(100):
    print(task.models["output"].keys())
which prints
odict_keys(['Output Model #0', 'Output Model #1', 'Output Model #2'])
in the first iteration, and prints the file names in the later iterations:
` od...

2 years ago
Hi. I have a question about pipelines and their generated dependency graphs. I took the code of the ClearML pipeline from the decorator example:

I imagine that these phantom dependencies will prevent parallelization. Is there a workaround?

3 years ago