Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 6 months ago

Reputation

0

Badges 1

25 × Eureka!
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

So the TB issue was reported images were not logged.
We are now talking about the caching, which is actually a UI thing which clearml-server version are you using ?
And where are the images stored (the default files server or is it S3/GS etc.) ?

3 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

Okay I found it, this is due to the fact the newer versions are sending the events/images in a subprocess (it used to be a thread).
The creation of the object is done on he main process, updating file index (round robin manner), but the check itself, happens on the subprocess., which is not "aware" of the used indexes (i.e. it is always 0, hence when exceeding the history side, it skips it)

3 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

why doesn't this happen on my other experiments?

same 100+ reports ?
(My new theory is that calling Task.reload() will fix it, and it might be called internally for the other experiments, like when reporting models/artifacts)
Could that be the case ?

3 years ago
0 Hi, I Would Like To Bring Awareness

@<1523701066867150848:profile|JitteryCoyote63>
I just created a new venv and run

pip install "torch==1.11.0.*" --extra-index-url 

Then started python:

import torch
torch.cuda.is_available()

And I get True

what are you getting?

one year ago
0 Hi, I Would Like To Bring Awareness

Hi @<1523701066867150848:profile|JitteryCoyote63>
RC is out,

pip3 install clearml-agent==1.5.3rc3

Then in pytorch_resolve: "direct"
None

Let me know if it worked

one year ago
0 Hi, I Would Like To Bring Awareness

Hi @<1523701066867150848:profile|JitteryCoyote63>
Thank you for bringing it! can you verify with the latest clearml-agent 1.5.3rc2 ?

one year ago
0 Hi, I Would Like To Bring Awareness

Hi @<1523701066867150848:profile|JitteryCoyote63>

Could you please push the code for that version on github?

oh seems like it is not synced, thank you for noticing (it will be taken care immediately)
Regrading the issue:
Look at the attached images
None does not contain a specific wheel for cuda117 to x86, they use the pip defualt one
![image](https://clearml-web-assets.s3.amazonaws.com/scoold/images/TT9ATQXJ5-F05744CK09L/screenshot...

one year ago
0 Hi, I Would Like To Bring Awareness

No, I think the default version already supports cuda 117

one year ago
0 Hi, I Would Like To Bring Awareness

So I suppose clearml-agent is not responsible, because it finds a wheel for torch 1.11.0 with cu117.

The thing is, the agent used to do all the heavy parsing because pytorch never actually had a pip compatible artifactory
But now they do, so the agent basically passed the parsing to pip and just added the correct additional pytorch pip repo.
It seems we need to switch back... wdyt?

one year ago
0 Hi, I Would Like To Bring Awareness

The wheel you download from pip, for example this one torch-1.11.0-cp38-cp38-manylinux1_x86_64.whl
is actually both CPU and cuda 117

one year ago
0 Hi, I Would Like To Bring Awareness

if this is the case pytorch really messed things up, this means they removed packages
Let me check something

one year ago
0 Hi, I Would Like To Bring Awareness

I am not sure what switching back will solve, here the wheel should have been correct, it's just the architecture of the card that is incompatible

So I tested the "old" code that did the parsing and matching, and it did resolve to the correct wheel (i.e. found that there is no 117 only 115 and installed this one)
I think we should switch back, and have a configuration to control which mechanism the agent uses , wdyt?

one year ago
0 Hi Guys, I Am Having Some Trouble Running Some Training Scripts With The Agent Functionality:

When we enqueue the task using the web-ui we have the above error

ShallowGoldfish8 I think I understand the issue,
basically I think the issue is:
task.connect(model_params, 'model_params')Since this is a nested dict:
model_params = { "loss_function": "Logloss", "eval_metric": "AUC", "class_weights": {0: 1, 1: 60}, "learning_rate": 0.1 }The class_weights is stored as a String key, but catboost expects "int" key, hence it fails.
One op...

2 years ago
0 Hi All, I'M New With Clearml And I Have A Question. I Have A Modular Code, And When I'M Trying To Run It In A Remote Machine With The Agent, I Get An Error On The Line 'From X Import Y', Which Says That There Isn'T Such Module X. Any Help? Thanks.

Hi BroadMole64

'from X import Y', which says that there isn't such module X. any help? thanks.

can you see package X under the "Execution" tab "Installed Packages" section ?
(think of this section as requirements.txt section, in order for the agent to install the package on the remote machine it should have it listed there)

3 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

The problems comes from ClearML that thinks it starts from iteration 420, and then adds again the iteration number (421), so it starts logging from 420+421=841

JitteryCoyote63 Is this the issue ?

3 years ago
0 Hi, Currently We Can Add "Tags" On Experiments. When Filtering The Tags In The Dashboard, It Seems To Default To Filter As A "Or" Condition, Is It Possible To Search With "And" Condition, Such As Search With "Dataset_Version1 + Nn_Model"

Hi EnviousStarfish54
Verified with the frontend / backend guys.
Backend allows to search for "all" tags, and frontend will add a toggle button for the UI to select or/all for the selected Tags.
Should be part of the next release

3 years ago
0 Is The App/Ui/Backend Customizable? Any Tutorials For That?

CleanWhale17 per your request :)

An automated ML Pipeline 👍 Automated Data Source Integration 👍 Data Pooling and Web Interface for Manual Annotation of Images(Seg. / Classif) [Allegro Enterprise] or users integrate with open-source Storage of Annotation output files(versioned JSON) 👍 Online-Training  Support(for Dataset Shifts) [Not Sure what you mean] Data Pre-processessing (filter/augment) [Allegro Enterprise] or users integrate with open-source Data-set visualization(stats...

4 years ago
0 Is The App/Ui/Backend Customizable? Any Tutorials For That?

CleanWhale17 what is " Online-Training  Support(for Dataset Shifts" ?

4 years ago
0 Is The App/Ui/Backend Customizable? Any Tutorials For That?

As I'm a Full-stack developer at Core. I'd be looking to extend the TRAINS Frontend and Backend APIs to suit my need of On-Prem data storage integration and lots of other customization for Job Scheduler(CRON)/Dataset Augmentation/Custom Annot. tool etc.

That is awesome! Feel free to post a specific question here, and I'll try to direct to the right place 🙂

Can you guide me to one such tutorial that's teaching how to customize the backend/front end with an example?

You mean l...

4 years ago
0 Hi All, I'M Using Clearml 1.0.3 With Clearml-Server <1 (How Do I Get The Current Running Version?) In Pytorch-Lightning I Use Ddp And I See Multiple Tasks (As The Number Of Gpus) Being Created And Remaining In Draft Mode. Is It A Problem Running Clearml

Task.init should be called before pytorch distribution is called, then on each instance you need to call Task.current_task() to get the instance (and make sure the logs are tracked).

3 years ago
Show more results compactanswers