Hi SubstantialElk6
Quick update: once clearml 1.1 is out, we will push the clearml-data improvement supporting chunks per version (i.e. packaging the changeset into multiple zip files, instead of the single one the current version uses).
Regarding (1), the storage limit on the server:
Ideally, we should be able to specify the batch size that we want to download, or even better, tie this in with the training by parallelising the data download, data preprocessing and batch training.
With the nex...
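For context, a rough sketch of how the chunked flow could look once it lands (parameter names are illustrative and may still change):
```
from clearml import Dataset

# create a new version and pack the changeset into multiple zip chunks
ds = Dataset.create(dataset_name='my_dataset', dataset_project='datasets')
ds.add_files('/data/changeset/')
ds.upload(chunk_size=512)  # e.g. ~512MB per zip instead of a single archive
ds.finalize()

# later, fetch only a part of the version instead of the full changeset
local_path = Dataset.get(dataset_id=ds.id).get_local_copy(part=0, num_parts=4)
```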
Also, the IDs as an entry in the Configuration will not be clickable in the web interface, right?
No, but on the other hand, it will be editable if you clone the Task.
Which brings me to a different scenario,
In the original one, the Main Task created the Dataset, i.e. Output Dataset (and stored it both ways).
I can think of a situation where the Task is using the Dataset as input (say preprocessing or training); then we might want to enable users to clone and change the input dataset. wdyt?
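To make the scenario concrete, a minimal sketch (the 'input_dataset_id' entry is hypothetical) of connecting the input dataset ID so a cloned Task can override it:
```
from clearml import Dataset, Task

task = Task.init(project_name='examples', task_name='preprocess')

# expose the input dataset ID as an editable configuration entry;
# cloning the Task then lets you swap the dataset before launching
config = {'input_dataset_id': '<dataset-id>'}
task.connect(config, name='datasets')

local_path = Dataset.get(dataset_id=config['input_dataset_id']).get_local_copy()
```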
You need to use tf.summary.image and not summary_ops_v2.image
Fixed on main branch (see github issue), RC later today
The image needs to be in range [0, 1], not [0, 255] (matplotlib and TensorBoard can handle either one).
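For reference, a minimal working sketch (float images in [0, 1], NHWC shape):
```
import numpy as np
import tensorflow as tf

writer = tf.summary.create_file_writer('./logs')
with writer.as_default():
    # float images must be in [0, 1]; shape is (batch, height, width, channels)
    img = np.random.rand(1, 64, 64, 3).astype(np.float32)
    tf.summary.image('toy', img, step=0, max_outputs=10)
```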
Is there code to reproduce it?
So it seems to get the "hint" from the type:
This will work:
```
tf.summary.image('toy255', (ex * 255).astype(np.uint8), step=step, max_outputs=10)
```
wdyt, should it actually check min/max and manually cast it ?
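Roughly what I have in mind (just a sketch of the heuristic, not what the binding currently does):
```
import numpy as np

def normalize_image(img):
    # guess the range from the dtype/values and cast to uint8
    if img.dtype != np.uint8:
        if float(img.max()) <= 1.0:
            img = img * 255.0
        img = img.astype(np.uint8)
    return img
```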
StaleButterfly40 are you sure you are getting the correct image on your TB (toy255) ?
I get the same "white" image in both TB & ClearML 😞
I mean, can you install it with something like:
```
pip install git+...
```
Basically the agent will install the main repository, and any git submodules. But it cannot install multiple repositories, as reconstructing the combined directory structure becomes too ambiguous.
wdyt?
Hi ConvolutedChicken69
but when running the script it only clones the repo the clearml task is on; how can it also get the other repo?
Do you have a wheel or a git repo you can install it from?
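If it is pip-installable from git, you could hint the agent to install it as a requirement, something like this (hypothetical URL; note add_requirements must be called before Task.init):
```
from clearml import Task

# ask the agent to pip-install the second repo alongside the main one
Task.add_requirements('git+https://github.com/org/other_repo.git')

task = Task.init(project_name='examples', task_name='multi repo')
```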
Hi BattyLizard6
Not that I'm aware of, which TF version are you using, and which clearml version?
Is there code that can reproduce it?
BTW: see if this works:
```
$ CLEARML_API_HOST_VERIFY_CERT=0 clearml-init
```
Hi Team, I'm currently trying to install ClearML-Server on a PowerPC server with RedHat 7.
You are a brave man LividCrab90 !
Are there dockerfiles for the ClearML-Server stack somewhere?
The main issue is replacing the DB containers; do you have elastic/mongo/redis images for PowerPC?
```
Task.current_task().connect(training_args, name='huggingface args')
```
And you should be able to change them when launching remotely 😉
SmallDeer34 btw: "set_parameters_as_dict" will replace all the arguments (and is one-way) ...
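A quick sketch of the difference (values here are illustrative):
```
from clearml import Task

task = Task.init(project_name='examples', task_name='hf trainer')

# two-way: connected values are overridden from the UI when running remotely
training_args = {'learning_rate': 5e-5, 'epochs': 3}
task.connect(training_args, name='huggingface args')

# one-way: replaces all the task's parameters, UI edits do not flow back
# task.set_parameters_as_dict({'learning_rate': 5e-5, 'epochs': 3})
```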
Then in theory (since the backend is python based) you just need to find a base docker image to build it on.
UnevenDolphin73 go to the profile page, I think at the bottom right corner you should see it
(Also ctrl-F5 to reload the web application, if you upgraded the server 🙂 )
DisgustedDove53 , TrickySheep9
I'm all for it!
I can think of two options here: (1) use the k8s glue + apply template with ports mode, see the discussion at https://clearml.slack.com/archives/CTK20V944/p1628091020175100
(2) create an interface (queue) to launch an arbitrary job on the k8s cluster, with the full pod definition on the Task. This would allow the clearml-session to set everything up from the get-go.
How would you interface with the k8s operator, and what exactly will it do?
(BTW: the reas...
MiniatureCrocodile39 from the screenshot I imagine you are running inside a docker container; this means that when you restart the container, the configuration file is lost.
Could that be the case ?
And your ~/clearml.conf?
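If that is the case, you can persist the configuration by mounting it from the host when starting the container, something like (paths are illustrative):
```
docker run -it -v $HOME/clearml.conf:/root/clearml.conf <your-docker-image>
```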
but I cannot compare between them
I think we noticed it, and this will be fixed in the next server update (again, some plotly.js issue there)
Did you experience any performance drop using forkserver?
No, seems to be working properly for me.
If yes, did you test the variant suggested in the pytorch issue? If yes, did it solve the speed issue?
I haven't tested it; that said, it seems like a generic optimization of the DataLoader.
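i.e. something along these lines (untested on my end; 'forkserver' is a standard DataLoader multiprocessing context):
```
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 16))

# use forkserver as the worker start method instead of the default fork
loader = DataLoader(dataset, batch_size=32, num_workers=4,
                    multiprocessing_context='forkserver')
```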
RobustSnake79 let's assume that the trace figure above is probably too much to get into the WebUI; which simpler figures would still have value in your scenario?
Three options:
1. In your code: Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI, after you clone a Task, under the Execution tab: "Output" > "Destination"
In all cases output_uri can be:
/mnt/share/folder (if you have a shared folder between all machines)
http://trains-server:8081/
gs://bucket
azure://bucket/
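For example, option 2 in ~/clearml.conf would look something like (bucket name is illustrative):
```
sdk {
    development {
        # every Task.init() will upload models to this destination by default
        default_output_uri: "s3://my-bucket/models"
    }
}
```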
I mean what is the actual link?
file:// is a path to a file.
If your machine cannot access that path you get an error.
For example:
file:///home/user/file.bin
translates to /home/user/file.bin
If you do not have the file /home/user/file.bin on your machine you get an error.
GrievingTurkey78 make sense ?
Note that by default trains/clearml will not upload your weights file anywhere; it will only do that if you set "output_uri" to a specific location.