DizzyButterfly4

Moderator

3 Questions, 12 Answers

Active since 01 June 2023

Last activity one month ago

Badges 1

12 × Eureka!

0 Votes

4 Answers

301 Views

Hi, It Appears That What Clearml Calls Histograms Are Actually Bar Graphs. Does Clearml Have Real Histograms?

Hi, it appears that what ClearML calls histograms are actually bar graphs. Does ClearML have real histograms?

clearml

4 months ago

0 Votes

11 Answers

154 Views

Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

Hi, I see some strange behavior where the training fails when running on ClearML (loss = nan) compared to running w/o ClearML. This is entirely reproducible...

clearml

one month ago

0 Votes

7 Answers

1K Views

Hi All, I'M New To Clearml, And Trying To Understand The Set_Metadata Method Of The Dataset Class. The Documentation Is Almost Non-Existent Does Anyone Have An Example Script? How Would I Add Data Using E.G. Pandas. I Assume One Column Would Relate To The

Hi All, I'm new to ClearML, and trying to understand the set_metadata method of the Dataset class. The documentation is almost non-existent. Does anyone have ...

clearml

one year ago

0 Hi All, I'M New To Clearml, And Trying To Understand The Set_Metadata Method Of The Dataset Class. The Documentation Is Almost Non-Existent Does Anyone Have An Example Script? How Would I Add Data Using E.G. Pandas. I Assume One Column Would Relate To The

John, regarding documentation: what is mainly missing is how the dataframe / numpy / dict is structured. Yes, an example would be helpful.

one year ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

The only difference between the two runs is that in one run project_name is an empty string (in which case all is OK), and in the other run project_name has a value.

one month ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

The project is many thousands of lines long. It fails in the model.fit TF command. The only thing different from other versions which work is the loss function, which I share below. The relevant class is BoundaryWithCategoricalDiceLoss, which is called with boundary_loss_type = "GRAD". When I use the loss with boundary_loss_type = "MSE", all works fine. This class is a subclass of CategoricalDiceLoss, which is a subclass of keras.losses.Loss.

from typing import Dict, Itera...

one month ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

This is what I get when running on ClearML. Notice the nan in the loss:
Epoch 1/150
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739804333.538008 890492 service.cc:145] XLA service 0x7f19b80029d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1739804333.538068 890492 service.cc:153] StreamExecutor device (0): NVIDIA GeForce RTX 2080 ...

one month ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

This is what I get when running w/o clearml
Epoch 1/150
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739806371.262488 897794 service.cc:145] XLA service 0x7fc058066d20 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1739806371.262578 897794 service.cc:153] StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5
2025-02-17 15:32:51.772357: I tensor...

one month ago

Thanks John. But does the metadata relate to the entire dataset or to individual elements in the dataset?
For example, let's say I have a dataset of images, and I would like to attach metadata to each image, e.g. a "type" field which could have values 1, 2, 3, 4, 5...
How would the dataframe be constructed? I assume one column would contain an ID identifying the image, and the other column would be "type". If this is the case, what would the ID be?
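One possible layout for such a table, as a minimal sketch only: it assumes the ID is each image's path relative to the dataset root, which the thread does not confirm. The helper name `build_metadata_rows`, the column names, and the sample file names are all hypothetical.

```python
from typing import Dict, List, Mapping


def build_metadata_rows(image_types: Mapping[str, int]) -> List[Dict[str, object]]:
    """Return one row per image: the relative file path as ID plus its "type" value."""
    rows = []
    for relative_path in sorted(image_types):
        rows.append({"file": relative_path, "type": image_types[relative_path]})
    return rows


# Hypothetical example data: paths relative to the dataset root, types in 1..5.
rows = build_metadata_rows({"images/cat_001.png": 1, "images/dog_007.png": 3})

# Assumption: the table could then be attached to the dataset as a whole, e.g.
#   dataset.set_metadata(pandas.DataFrame(rows), metadata_name="image_types")
# Nothing in this sketch relies on ClearML linking a row back to a specific file.
```

Keeping the path as a plain column means the table stays usable as dataset-level metadata even if no per-item linkage exists.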

one year ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

The code that generates this is the fit method in TF
model.fit(train_dataset, validation_data=val_dataset, epochs=cfg.fit.epochs, callbacks=callbacks, verbose=2)

Clearml is activated in the usual way:
task = Task.init(project_name=project_name, task_name=name, output_uri=True, auto_connect_frameworks={'tensorflow': False}, **kwargs)
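To compare the two runs, a small helper can scan a saved console log and report the first epoch whose training loss is nan. This is a rough sketch, not part of the original code: it assumes Keras-style `verbose=2` lines containing `loss: <value>` after an `Epoch N/M` header, and the function name `first_nan_epoch` is made up for illustration.

```python
import math
import re
from typing import Iterable, Optional

# Training loss in a progress line such as "... - loss: 0.6931 - val_loss: 0.70".
# Requiring whitespace (or line start) before "loss:" keeps "val_loss:" from matching.
_LOSS_RE = re.compile(r"(?:^|\s)loss:\s*(\S+)")
_EPOCH_RE = re.compile(r"^Epoch\s+(\d+)/\d+")


def first_nan_epoch(log_lines: Iterable[str]) -> Optional[int]:
    """Return the epoch number of the first nan training loss, or None if there is none.

    An epoch of 0 means a nan loss appeared before any "Epoch N/M" header was seen.
    """
    epoch = 0
    for line in log_lines:
        header = _EPOCH_RE.match(line.strip())
        if header:
            epoch = int(header.group(1))
            continue
        found = _LOSS_RE.search(line)
        if not found:
            continue
        try:
            value = float(found.group(1))
        except ValueError:
            continue  # not a number, e.g. a truncated line
        if math.isnan(value):
            return epoch
    return None
```

Running it on the log from each run (for example via `first_nan_epoch(open(path))`, with `path` pointing at a saved log) shows whether the nan appears from the first epoch or only later.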

one month ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

This is what I get when running the exact same training session without clearml
Epoch 1/150
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739806371.262488 897794 service.cc:145] XLA service 0x7fc058066d20 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1739806371.262578 897794 service.cc:153] StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5
2025-02-17 15:3...

one month ago

0 Hi, I See Some Strange Behavior Where The Training Fails When Running On Clearml (Loss = Nan) Compared To Running W/O Clearml. This Is Entirely Reproduceable. Has Anyone Seen This?

This is what I get when running on Clearml. Notice the nan in the loss
Epoch 1/150
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739804333.538008 890492 service.cc:145] XLA service 0x7f19b80029d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1739804333.538068 890492 service.cc:153] StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5
2025-02-17 14:58:54.0217...

one month ago

0 Hi, It Appears That What Clearml Calls Histograms Are Actually Bar Graphs. Does Clearml Have Real Histograms?

A bar graph takes a set of numbers and assigns a bar to each number, where the height of the bar represents the number. This is what you see in the example in the link I attached.
A histogram takes a set of numbers, divides them into bins, and plots the number of samples in each bin.
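The binning step that distinguishes the two plots can be sketched in plain Python. This is an illustration under simple assumptions (equal-width bins, the maximum value counted in the last bin); `histogram_counts` is a hypothetical helper, not a ClearML function.

```python
from typing import List, Sequence, Tuple


def histogram_counts(samples: Sequence[float], num_bins: int) -> Tuple[List[float], List[int]]:
    """Split samples into equal-width bins and count the samples in each bin."""
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if not samples:
        raise ValueError("samples must not be empty")
    lo, hi = min(samples), max(samples)
    # Fall back to a width of 1.0 when all samples are equal, to avoid dividing by zero.
    width = (hi - lo) / num_bins or 1.0
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for x in samples:
        # Clamp so the maximum sample lands in the last bin instead of one past it.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    return edges, counts
```

A bar graph would draw one bar per raw value; a histogram draws one bar per bin, with the counts returned here as the bar heights.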

4 months ago

0 Hi, It Appears That What Clearml Calls Histograms Are Actually Bar Graphs. Does Clearml Have Real Histograms?

I disagree that the difference is minute 🙂 They are fundamentally different plots. The term used by ClearML is misleading.
I will use the manual plotting feature with matplotlib.

4 months ago

OK, so if I understand correctly, you can only add metadata to items in the enterprise version.
Just to confirm, the screenshot in the dataops pages here refers to the enterprise version?

one year ago