My other question is: how does it decide what to upload automatically? It picked up almost everything, just not trainer_state.json, which I'm actually not sure is necessary.
Presumably the correct way to do this is to fork the transformers library, make the change, and add that version to my requirements.txt
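For reference, pinning a fork in requirements.txt would look something like this (the user name and branch are placeholders for wherever the fork actually lives):

    transformers @ git+https://github.com/myuser/transformers.git@my-fix-branch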
CostlyOstrich36 I get some weird results for "active duration".
For example, several of the experiments show that their active duration is more than 90 days, but I definitely didn't run them that long.
Alas, no luck. It uploaded the same things and did not upload trainer_state.json.
As an alternate solution, if I could group runs and get stats across the group, that would be cool
Oh look, the blue setting is best!
Oh, that's cool, didn't know about that:
Something like this is what I'm looking for.
Well, in my particular case the training data's got, like, 200 subfolders, each with 2,000 files. I was just curious whether it was possible to pull down one of the subsets.
No, not specifically 20; in fact, more than 20.
AgitatedDove14 Yes, I see the scalars. Screenshot attached.
Code to reproduce: I'll try to come up with a sample you will be able to run. But the code we're using is basically just https://github.com/huggingface/transformers/blob/f6e254474cb4f90f8a168a599b9aaf3544c37890/examples/pytorch/language-modeling/run_mlm.py
OK, so if I've got, like, 2x16GB GPUs and 2x32GB, I could allocate all the 16GB GPUs to one queue? And all the 32GB ones to another?
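For reference, here's roughly the setup I have in mind: one clearml-agent per GPU, with the agents grouped by queue (assuming clearml-agent daemon's --gpus, --queue, and --detached flags behave the way I think they do; the queue names are made up):

    # one agent per 16GB GPU, both pulling from a "16gb" queue
    clearml-agent daemon --gpus 0 --queue 16gb --detached
    clearml-agent daemon --gpus 1 --queue 16gb --detached
    # one agent per 32GB GPU, both pulling from a "32gb" queue
    clearml-agent daemon --gpus 2 --queue 32gb --detached
    clearml-agent daemon --gpus 3 --queue 32gb --detached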
I know the documentation says you can give it a wildcard or a pathlib Path, but I'm still not quite sure how to tell it "top-level files only, not subfolders".
Oh, I forgot to mention: you'll also need to pip install tensorboard.
It would certainly be nice to have. Lately I've heard of groups that train on slices of a dataset for distributed training, or who "stream" data.
Well, now that I know, I can just work around it by creating a folder with no subfolders and uploading that. But... 🤔 perhaps allow the interface to take in a list or generator? As in:

    files_to_upload = [f for f in output_dir.glob("*") if f.is_file()]
    Task.current_task().upload_artifact(
        "best_checkpoint",
        artifact_object=files_to_upload,
    )
And then it could zip up the list and name it "best_checkpoint"?
Yeah! So if given a folder, it adds everything in the folder. But if given a list or iterable, it iterates over the Paths and zips them all up.
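In the meantime, here's a sketch of the workaround I have in mind, minus the extra folder: zip up the top-level files myself and upload the archive as a single artifact (the paths and artifact name are placeholders):

    import zipfile
    from pathlib import Path
    from clearml import Task

    output_dir = Path("output")  # placeholder: wherever the checkpoint files land
    zip_path = output_dir / "best_checkpoint.zip"

    # zip only the top-level files, skipping subfolders (and the zip itself)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in output_dir.glob("*"):
            if f.is_file() and f != zip_path:
                zf.write(f, arcname=f.name)

    Task.current_task().upload_artifact("best_checkpoint", artifact_object=zip_path)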
Oh yeah, that's been bugging me for a while
Or at least not conveniently
Then when I queue up a job on the 1x16gb queue, it would run on one of the two GPUs?
We do have the paid tier, I believe. Is there anywhere we can go to read up some more on this stuff, btw?
Oh, that's a neat tip! I just set that in the Task settings? I didn't know that was possible
I think the model state is just post-training-loop (not inside the loop), no?
trainer_state.json gets updated every time a "checkpoint" gets saved. I've got that set to once per epoch.
My testing indicates that if training gets interrupted, I can resume from a saved checkpoint folder that includes trainer_state.json. It uses that info to determine which data to skip, where to pick back up again, etc.
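For context, the resume on our end is basically the HF Trainer's resume_from_checkpoint (the checkpoint path here is a placeholder, and trainer is the Trainer instance that run_mlm.py builds):

    # the checkpoint folder must contain trainer_state.json for the
    # skip-ahead logic to work
    trainer.train(resume_from_checkpoint="output/checkpoint-12345")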
Could I use "register artifact" to get it to update every time a new checkpoint is created?
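Something like this is what I'm imagining, if register_artifact really does re-upload when the object changes. My understanding is it's meant for pandas DataFrames, so this sketch reloads trainer_state.json's log_history into one on every save (the hook name and paths are made up):

    import json
    import pandas as pd
    from clearml import Task

    # hypothetical hook, called each time a checkpoint gets saved
    def on_checkpoint_saved(checkpoint_dir):
        with open(f"{checkpoint_dir}/trainer_state.json") as f:
            state = json.load(f)
        # log_history is the list of per-step/per-epoch log dicts
        df = pd.DataFrame(state["log_history"])
        Task.current_task().register_artifact("trainer_state", df)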
I'm not sure I follow. Can you elaborate on what you mean? Pseudo stack?