JumpyClams73
Moderator
10 Questions, 57 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

57 × Eureka!
0 Votes
10 Answers
948 Views
Hi, I've just started to evaluate ClearML for internal use at my org and am wondering if there's any way to import data from old experiments into the dashboar...
2 years ago
0 Votes
3 Answers
1K Views
Hi, I'm looking for documentation on GCP autoscalers. When I search on the docs site, it shows me the AWS autoscaler but not the GCP one. Can someone point m...
2 years ago
0 Votes
2 Answers
907 Views
Hi, I'm looking at https://clear.ml/docs/latest/docs/webapp/webapp_exp_tuning/#base-docker-image where it says To add, change, or delete a base Docker image:...
2 years ago
0 Votes
1 Answers
969 Views
2 years ago
0 Votes
8 Answers
959 Views
2 years ago
0 Votes
2 Answers
1K Views
Hi, For the CML SaaS Pro tier - are the first 3 users still free and I'll only be charged for any additional users?
2 years ago
0 Votes
18 Answers
979 Views
Hi, I'm trying to run the following API call # Imports ... client = APIClient() resp = client.events.get_scalar_metrics_and_variants("MY_TASK_ID") but it erro...
2 years ago
0 Votes
29 Answers
945 Views
Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my...
2 years ago
0 Votes
30 Answers
1K Views
2 years ago
0 Votes
6 Answers
974 Views
2 years ago
0 Hi, I'm trying to run the following API call

Also tagged you SuccessfulKoala55
Thanks for the quick support!

2 years ago
0 Hi, I'm trying to run the following API call

I think there's some confusion here. I'm not running the server. My metrics are getting logged to the CML cloud.

2 years ago
0 Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my training job fails due to GPU OOM issues, CML marks the job as

No, we currently don't handle it gracefully. It just crashes. But we do use Hydra, which does sort of arrest that exception first. I'm wondering if it's Hydra causing this issue. I'll look into it later today.

2 years ago
0 Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my training job fails due to GPU OOM issues, CML marks the job as

I didn't check with the toy task. I thought the error codes might be an issue here, so I was just looking for the difference. I'll check for that too.
But for my Hydra task, it's always marked completed, never failed.

2 years ago
0 Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my training job fails due to GPU OOM issues, CML marks the job as

clearml's callback is never called

Yeah, I suspect that's what might be happening, which is why I was asking how and where exactly in the CML code that happens. Once I know, I can place breakpoints in the critical regions and debug to see what's going on.

2 years ago
0 Hi, I'm trying to clone and queue experiments for running them on my workers. I am able to successfully clone and queue the task, but seems like the task does not pass the correct parameters to my Python script on the worker. We use Hydra for configuring

I thought the agent created a new conda env and installed all packages, recorded during initial task run, from scratch (except for caching with venv). Is that not the case?

2 years ago
0 Hello, when I clone and enqueue a task using the web-console, is there any way to add a pre-execution hook to that cloned task? More specifically, my code uses a bunch of resources off the local disk which are set up independently of the code itself. When I

The Agent pulls the Task, and then reproduces it, and now it will execute the extra_docker_shell_script that was put in the configuration file. Does this imply the former? The env is fully set up, then the script is run, then the experiment is started by calling the executable?

2 years ago
0 Hi, I'm trying to clone and queue experiments for running them on my workers. I am able to successfully clone and queue the task, but seems like the task does not pass the correct parameters to my Python script on the worker. We use Hydra for configuring

Yes, it seems like the command line args are recorded now, but the connect call with my parameter dictionary now fails with an exception:
` Error executing job with overrides: ['model_name=all-test', ...]
Traceback (most recent call last):
File "/home/binoydalal/miniconda3/envs/DS974/lib/python3.9/site-packages/clearml/binding/hydra_bind.py", line 146, in _patched_task_function
return task_function(a_config, *a_args, **a_kwargs)
....
File "/home/binoydalal/miniconda3/envs/DS974/li...

2 years ago
0 Hi, I'm trying to clone and queue experiments for running them on my workers. I am able to successfully clone and queue the task, but seems like the task does not pass the correct parameters to my Python script on the worker. We use Hydra for configuring

Thanks for getting back, Martin. The Hydra example fails when I try to queue it to my local with
Starting Task Execution:
Traceback (most recent call last):
File "hydra_example.py", line 10, in <module>
@hydra.main(config_path="config_files", config_name="config")
AttributeError: module 'hydra' has no attribute 'main'

2 years ago
0 Hello, when I clone and enqueue a task using the web-console, is there any way to add a pre-execution hook to that cloned task? More specifically, my code uses a bunch of resources off the local disk which are set up independently of the code itself. When I

I'm looking at the docs on docker mode and running the script. Is this script run after the venv and code dir are set up, or immediately after the container starts but before the environment for running the experiment is set up?

2 years ago
0 Hi, I'm trying to run the following API call

the CML free SaaS offering. It'll probably hit https://app.clear.ml/api if I'm not wrong

2 years ago
0 Hi, I'm trying to clone and queue experiments for running them on my workers. I am able to successfully clone and queue the task, but seems like the task does not pass the correct parameters to my Python script on the worker. We use Hydra for configuring

Could it be the script itself is using vanilla sys.argv and not Argparser ? (edited)

Thanks for bringing this up. Our code uses fire to parse command line args and then sort of hands off to hydra, so yes it does use sys.argv initially. Is this a possible issue?

2 years ago
0 Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my training job fails due to GPU OOM issues, CML marks the job as

AgitatedDove14 finally had a chance to properly look into it and I think I know what's going on
When running any task with hydra, hydra wraps the called method in its own https://github.com/facebookresearch/hydra/blob/a559aa4bf6807d5e3a82e065987825fa322351e2/hydra/_internal/utils.py#L211 . When the task throws any exception, it triggers the except block of this method which handles the exception.
CML marks a task as failed only if whatever exception the task generated was not ha...
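
A toy sketch of the behavior described in this post, making no assumptions about the real internals of ClearML or Hydra: a tracker that reports "failed" only when an exception escapes the task, and a wrapper that handles every exception itself. `ToyTracker` and `swallow_exceptions` are hypothetical names made up for illustration.

```python
# Toy model only: ToyTracker and swallow_exceptions are hypothetical names,
# not part of ClearML or Hydra.


class ToyTracker:
    """Marks a run 'failed' only when an exception escapes the callable."""

    def __init__(self):
        self.status = "created"

    def run(self, fn):
        try:
            fn()
        except Exception:
            # The tracker saw the exception, so it can record a failure.
            self.status = "failed"
        else:
            # Nothing escaped, so the run looks like a normal completion.
            self.status = "completed"
        return self.status


def swallow_exceptions(fn):
    """Wrap fn so that every exception is handled inside the wrapper."""

    def wrapped():
        try:
            fn()
        except Exception as exc:
            print(f"handled inside wrapper: {exc!r}")

    return wrapped


def train():
    # Stand-in for a training step that runs out of GPU memory.
    raise RuntimeError("CUDA out of memory")


unwrapped_status = ToyTracker().run(train)
wrapped_status = ToyTracker().run(swallow_exceptions(train))
print(unwrapped_status, wrapped_status)
```

In this sketch the same failing `train` function ends up with a different status depending only on whether the wrapper handled the exception before the tracker could see it.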

2 years ago
0 Hi, I'm trying to run the following API call

Sorry if I sounded curt. Didn't mean to. To clarify, I've created my account using Google SSO on http://app.clear.ml , and am currently on the Free tier. I am pushing all my data onto CML's servers. This error happens when I try to query those servers for the metrics and variants for a particular task of mine.

2 years ago
0 Hi, I'm using ClearML's hosted free SaaS offering. I'm running model training in PyTorch on a server and pushing metrics to CML. I've noticed that anytime my training job fails due to GPU OOM issues, CML marks the job as

Sorry for the delay, CostlyOstrich36. Here are the relevant lines from the console:
` ...
File "/home/binoyloaner/miniconda3/envs/DS974/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/binoyloaner/miniconda3/envs/DS974/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
File "/home/binoyloaner/miniconda3/envs/DS974/lib/python3....

2 years ago
0 Hi, I've just started to evaluate ClearML for internal use at my org and am wondering if there's any way to import data from old experiments into the dashboard. Anyone have any thoughts on this?

We have run experiments in the past (before I put ClearML into my code) that logged scalars, plots, etc. to a local TensorBoard. Is there any way to import this data to ClearML cloud for tracking, visualization and comparison?

2 years ago