Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
SmallDeer34
Moderator
21 Questions, 155 Answers
  Active since 10 January 2023
  Last activity 8 months ago

Reputation

0

Badges 1

132 × Eureka!
0 Votes
3 Answers
972 Views
0 Votes 3 Answers 972 Views
Question: has anyone done anything with Ray or RLLib, and ClearML? Would ClearML be able to integrate with those out of the box? https://medium.com/distribut...
3 years ago
0 Votes
3 Answers
1K Views
0 Votes 3 Answers 1K Views
So, I'm trying to do a several-step process, but it needs to run on a GPU queue in ClearML. How would I do that? Specifically, here's what I'm trying to do, ...
3 years ago
0 Votes
9 Answers
1K Views
0 Votes 9 Answers 1K Views
3 years ago
0 Votes
10 Answers
1K Views
0 Votes 10 Answers 1K Views
So, I did a slew of pretrainings, then finetuned those pretrained models. Is there a way to go backwards from the finetuning Task ID to the pretraining Task ...
3 years ago
0 Votes
21 Answers
1K Views
0 Votes 21 Answers 1K Views
2 years ago
0 Votes
7 Answers
976 Views
0 Votes 7 Answers 976 Views
Question about https://allegro.ai/clearml/docs/rst/references/clearml_python_ref/task_module/task_task.html#clearml.task.Task.upload_artifact : Let's say I g...
3 years ago
0 Votes
30 Answers
985 Views
0 Votes 30 Answers 985 Views
Hello! Getting credential errors when attempting to pip install transformers from git repo, on a GPU Queue. fatal: unable to write credential store: Device o...
3 years ago
0 Votes
6 Answers
1K Views
0 Votes 6 Answers 1K Views
Currently trying to figure out how to extend clearML's automagical reporting to JoeyNMT. https://github.com/joeynmt/joeynmt/blob/master/joey_demo.ipynb is a ...
3 years ago
0 Votes
13 Answers
989 Views
0 Votes 13 Answers 989 Views
How, if at all, should we cite ClearML in a research paper? Would you like us to? How about a footnote/acknowledgement?
3 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
Here's the original Colab notebook. It can import torch without error: https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how...
3 years ago
0 Votes
18 Answers
1K Views
0 Votes 18 Answers 1K Views
Second: is there a way to take internally tracked training runs and publish them publicly, e.g. for a research paper? "Appendix A: training runs can be found...
2 years ago
0 Votes
18 Answers
1K Views
0 Votes 18 Answers 1K Views
Is there any way to: within the UI, select and compare the scalars for more than 10 experiments? I'd like to do something like: select these 10 run in such a...
3 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
So, this is something I've noticed, this line always seems to crash my Colab Notebooks: Task.current_task().completed()
3 years ago
0 Votes
28 Answers
1K Views
0 Votes 28 Answers 1K Views
3 years ago
0 Votes
8 Answers
1K Views
0 Votes 8 Answers 1K Views
So I'm in a Colab notebook, and after running my Trainer(), how do I upload my test metrics to ClearML? ClearML caught these metrics and uploaded them: train...
3 years ago
0 Votes
3 Answers
1K Views
0 Votes 3 Answers 1K Views
OK, we've got a GPU Queue setup on one of our local machines. I managed to run a script on it, which was intended to download a clearML dataset stored in s3....
3 years ago
0 Votes
13 Answers
1K Views
0 Votes 13 Answers 1K Views
Hello, there's a particular metric (perplexity) I'd like to track, but clearML didn't seem to catch it. Specifically, this "Evaluation" section of run_mlm.py...
3 years ago
0 Votes
7 Answers
1K Views
0 Votes 7 Answers 1K Views
Is there any way to get just one dataset folder of a Dataset? e.g. only "train" or only "dev"?
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
3 years ago
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
OK, next question, I've got some training args that I'd like to manually upload and have them show up in the attached place, under Configuration. It is a Hug...
3 years ago
0 Votes
10 Answers
1K Views
0 Votes 10 Answers 1K Views
Hello! I'm just starting out with ClearML, and I seem to be having some sort of conflict between clearml and torch , at least in Colab In this guide ( https:...
3 years ago
0 My Nth Question For The Day

AgitatedDove14 I'm making some progress on this. I've currently got the situation that my training run saved all of these files, and Task.get_task(param['TaskA']).models['output''][-1] gets me just one of them, training_args.bin . Then -2 gets me another, rng_state.pth

If I just get Task.get_task(param['TaskA']).models['output'] , I end up getting a huge list of, like, ` [<clearml.model.Model object at 0x7fec2841c880>, <clearml.model.Model object at 0x7fec2841...

3 years ago
0 My Nth Question For The Day

Interesting, I wasn't aware of the possibilities you outline there at the end, where you, like, programmatically pull all the results down for all the tasks. Neat!

A more complex version of this which I'm trying to figure out:

I trained a model using TaskA. I need to now pull that model down from the saved artifacts of TaskA and fine-tune it in TaskB That finetuning in TaskB spits out a metric.
Is there a way to do this all elegantly? Currently my process is to manually download the model...

3 years ago
0 Two Questions Today. First, Is There Some Way To Calculate The Number Of Gpu-Hours Used For A Project? Could I Select All Experiments And Count Up The Number Of Gpu-Hours/Gpu-Weeks? I Realize I Could Do This Manually By Looking At The Gpu Utilization Grap

Here's the hours/days version, corrected now lol:
gpu_hours = {} gpu_days = {} for gpu_type, gpu_time_seconds in gpu_seconds.items(): gpu_time_hours = gpu_time_seconds/3600 gpu_hours[gpu_type] = gpu_time_hours gpu_days[gpu_type] = gpu_time_hours/24

2 years ago
0 Hello, There'S A Particular Metric (Perplexity) I'D Like To Track, But Clearml Didn'T Seem To Catch It. Specifically, This "Evaluation" Section Of Run_Mlm.Py In The Transformers Repo:

Yeah that should work. Basically in --train_file it needs the path to train.txt, --validation_file needs the path to validation.txt, etc. I just put them all in the same folder for convenience

3 years ago
0 So, Here'S A Question. Does Clearml Automatically Save Everything Necessary To Continue Training A Pytorch Language Model? Specifically, I'Ve Been Looking At The Checkpoint Folders Created When I'M Training A Huggingface Robertaformaskedlm. I Checked What

OK, neat! Any advice on how to edit the training loop to do that? Because the code I'm using doesn't offer easy access to the training loop, see here: https://github.com/huggingface/transformers/blob/040283170cd559b59b8eb37fe9fe8e99ff7edcbc/examples/pytorch/language-modeling/run_mlm.py#L469

trainer.train() just does the training loop automagically, and saves a checkpoint once in a while. When it saves a checkpoint, clearML uploads all the other files. How can I hook into... whatever ...

3 years ago
0 So I'M In A Colab Notebook, And After Running My Trainer(), How Do I Upload My Test Metrics To Clearml? Clearml Caught These Metrics And Uploaded Them:

AgitatedDove14 yes, I called init and tensorboard is installed. It successfully uploaded the metrics from trainer.train(), just not from the next cell where we do trainer.predict

3 years ago
0 Hello, I'M Not Getting Training Metrics Tracked By Clearml When I Execute The A Training Script Remotely, But I Get Them If I Run Locally. Is It Because I Have A Task.Init() In The File? What Happens When You Remotely Run A Script Which Has An Init() In I

Tried it. Updated the script (attached) to add it to the main function instead. Then ran it locally. Then aborted the job. Then "reset" the job on clearML web interface and ran it remotely on a GPU queue. as you can see in the log (attached) there is loss happening, but it's not showing up in the scalars (attached picture):

edit: where I ran it after resetting

3 years ago
0 So, I'M Trying To Do A Several-Step Process, But It Needs To Run On A Gpu Queue In Clearml. How Would I Do That? Specifically, Here'S What I'M Trying To Do, Is It Possible?

Gave it a try, it seems our GPU Queue doesn't have the S3 creds set up correctly. Making a separate thread about that

3 years ago
0 Hello, I'M Not Getting Training Metrics Tracked By Clearml When I Execute The A Training Script Remotely, But I Get Them If I Run Locally. Is It Because I Have A Task.Init() In The File? What Happens When You Remotely Run A Script Which Has An Init() In I

Before I enqueued the job, I manually edited Installed Packages thus:
boto3 datasets clearml tokenizers torchand added
pip install git+to the setup script.

And the docker image is
nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu18.04

I did all that because I've been having this other issue: https://clearml.slack.com/archives/CTK20V944/p1624892113376500

3 years ago
3 years ago
0 Hello, I'M Not Getting Training Metrics Tracked By Clearml When I Execute The A Training Script Remotely, But I Get Them If I Run Locally. Is It Because I Have A Task.Init() In The File? What Happens When You Remotely Run A Script Which Has An Init() In I

SuccessfulKoala55 I think I just realized I had a misunderstanding. I don't think we are running a local server version of ClearML, no. We have a workstation running a queue/agents, but ClearML itself is via http://app.pro.clear.ml , I don't think we have ClearML running locally. We were tracking experiments before we setup the queue and the workers and all that.

IrritableOwl63 can you confirm - we didn't setup our own server to, like, handle experiment tracking and such?

3 years ago
0 Hello, I'M Not Getting Training Metrics Tracked By Clearml When I Execute The A Training Script Remotely, But I Get Them If I Run Locally. Is It Because I Have A Task.Init() In The File? What Happens When You Remotely Run A Script Which Has An Init() In I

Long story, but in the other thread I couldn't install the particular version of transformers unless I removed it from "Installed Packages" and added it to setup script instead. So I took to just throwing in that list of packages.

3 years ago
Show more results compactanswers