AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Questions 49
Answers 8124

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hi Guys/Gals, If You Want To Checkout The Latest Rc We Have 0.15.0Rc0 Out :

Hi Guys/Gals, If you want to checkout the latest RC we have 0.15.0rc0 out : pip install trains==0.15.0rc0 pip install trains-agent==0.15.0rc0Many of the impr...

clearml

5 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

There Is No V1.0 Release Without A Prompt V1.0.1 Following It, And We Are No Different

🙏 There is no v1.0 release without a prompt v1.0.1 following it, and we are no different 😊 pip install clearml==1.0.1

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!here> New video is out :slightly_smiling_face: Cloud Autoscalers are awesome <https://www.youtube.com/watch?v=j4XVMAaUt3E>

New video is out 🙂 Cloud Autoscalers are awesome https://www.youtube.com/watch?v=j4XVMAaUt3E

clearml

3 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Well To Be Honest, We Kind Of Thought It'S Redundant. Basically Storing Artifacts In Experiments And Having Them Retrieved Quickly From The Code Itself Was Way More Convenient For Us Then To Manually Have To Do Clone/Pull Of The Data... Example: Create Da

Well to be honest, we kind of thought it's redundant. Basically storing artifacts in experiments and having them retrieved quickly from the code itself was w...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

you set it :slightly_smiling_face:

you set it 🙂

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of Trains :smile_cat: ) <https://twitter.com/PyTorch/status/1272919483980500999>

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of...

clearml

5 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi :robot_face: , humans We have the new documentation site up and running 🎉 None 🎊 This is still a work in progress, so we keep the previous version alive...

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!everyone> Trains v0.14.2 is out (<https://github.com/allegroai/trains/releases/tag/0.14.2|Change log>) Highlights: <https://github.com/allegroai/trains/blob/master/trains/storage/manager.py#L13|trains.storage.StorageManager> - with caching for any http

Trains v0.14.2 is out ( https://github.com/allegroai/trains/releases/tag/0.14.2 ) Highlights: https://github.com/allegroai/trains/blob/master/trains/storage/...

clearml

5 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi ClearML v0.17.1 and ClearML-Agent v0.17.0 are now the official packages & repositories 🎉 🎊 👋 🛤️ This new name brings on many changes, mainly replace a...

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Finally

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Https://M.Facebook.Com/Story.Php?Story_Fbid=2484620658505570&Id=1620822758218702&Refid=52&__Tn__=-R

https://m.facebook.com/story.php?story_fbid=2484620658505570&id=1620822758218702&refid=52&tn=-R

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

New Rc For Trains-Agent Is Out

New RC for trains-agent is out pip install trains-agent==0.13.2rc1

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<https://allegro.ai/docs>

https://allegro.ai/docs

clearml

5 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Quick Note: V1.3.1 Caused Pipelinedecorator Tasks To By Default Disable The Automagic Frameworks Connection, This Bug Is Solved In The Latest Rc

Quick note: v1.3.1 caused PipelineDecorator Tasks to by default disable the automagic frameworks connection, this bug is solved in the latest RC pip install ...

clearml

3 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Lstmeow Is Back! Bots/Gals/Guys Feel Free To

LSTMeow is back! Bots/Gals/Guys feel free to 👍 None

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

docs are up

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

New releases: ```pip install trains==0.13.3``` <https://github.com/allegroai/trains/releases/tag/0.13.3> ```pip install trains-agent==0.13.2``` <https://github.com/allegroai/trains-agent/releases/tag/0.13.2>

New releases: pip install trains==0.13.3https://github.com/allegroai/trains/releases/tag/0.13.3 pip install trains-agent==0.13.2https://github.com/allegroai/...

clearml

5 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

This Is Usually Due To Enterprise Level Issued Https Certificates Not Part Of The Local Installation (Basically Any Python Generated Ssl Request Will Fail)

This is usually due to enterprise level issued https certificates not part of the local installation (basically any python generated SSL request will fail)

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

YEY!!!! *Download as CSV* :exploding_head:

YEY!!!! Download as CSV 🤯

clearml

3 years ago

Show more results

0 Hey, I'M Running A Pipeline, And 1 Stage Passed - But The Next One Failed. I Fixed The Bug For The Second One - Is There Any Way To Retry The Pipeline From The Failure?

Hi CleanPigeon16
Yes there is, when you are cloning the pipeline in the UI, go to the Configuration/Pipeline/continue_pipeline and change it to True

4 years ago

0 Hey, I'M Running A Pipeline, And 1 Stage Passed - But The Next One Failed. I Fixed The Bug For The Second One - Is There Any Way To Retry The Pipeline From The Failure?

Thanks CleanPigeon16
Could you verify Task "d1d361d1059c4f0981200f59d7683773" exists (and not archived)?

4 years ago

0 Hey, I'M Running A Pipeline, And 1 Stage Passed - But The Next One Failed. I Fixed The Bug For The Second One - Is There Any Way To Retry The Pipeline From The Failure?

Is there an option to do this from a pipeline, from within the

add_step

method? Can you link a reference to cloning and editing a task programmatically?

Hmm, I think there is an open GitHub issue requesting a similar ability , let me check on the progress ...

nope, it works well for the pipeline when not I don't choose to continue_pipeline

Could you send the full log please?

4 years ago

0 Hey, I'M Running A Pipeline, And 1 Stage Passed - But The Next One Failed. I Fixed The Bug For The Second One - Is There Any Way To Retry The Pipeline From The Failure?

The pipeline stores the state of it's previous run, specifically the executed steps.
In our case the executed step was reset (I assume) so it cannot find the output model you are referring to, hence crashing
CleanPigeon16 make sense ?

4 years ago

0 Hi All! I'M Using Clearml With Hydra As Configuration Manager. I'M Trying To Rerun A Task By Overriding Some Of The Configurations From The Ui. I Tried To Change The Config_Name Args In The Args Section And Also The Omegaconf Configuration In Configuratio

If I edit directly the OmegaConf in the UI than the port changes correctly

This will only work if you change the Hydra/allow_omegaconf_edit to True in the UI. Did you?

4 years ago

0 Hi, Can We Upload Our Project Repository To Trains Server? If We Can, How Should We Do? I Know When We Write "Task.Init()", It Uploads Our Experiment Into Server, But It Also Run The Experiment. However, I Want To Upload All My Experiments In Draft Status

MysteriousBee56 when you execute your code once it will appear in the server (with all fields pre-populated based on your setup/git etc.) once it is there you can "clone" them and move them around.
Is this what you mean?
A bit of background, the idea behind Trains is that the environment definition (i.e,. git repo packages etc, code entry arguments etc.) is collected when executing the code. This avoids the tedious task of generating and maintaining YAML/Json configuration files.
What is exa...

5 years ago

0 Whats The Main Difference Between Creating A Task And Using Init?

SweetGiraffe8 Task.init will autolog everything (git/python packages/console etc), for your existing process.
Task.create purely creates a new Task in the system, and lets' you manually fill in all the details on that Task
Make sense ?

4 years ago

0 Hi! I Am Trying To Build And Run A Pipeline. I Pass My Dataset As Parameter Of Pipeline:

I pass my dataset as parameter of pipeline:

@<1523704757024198656:profile|MysteriousWalrus11> I think you were expecting the dataset_df dataframe to be automatically serialized and passed, is that correct ?
If you are using add_step, all arguments are simple types (i.e. str, int etc.)
If you want to pass complex types, your code should be able to upload it as an artifact and then you can pass the artifact url (or name) for the next step.

Another option is to use pipeline from dec...

2 years ago

0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

is how you would create different queues,

SarcasticSquirrel56 you can create them from the UI, when the server is already running
(if you are saying, how do I create them in the first installaiton, then yes you are correct, this is possible in the helm chart, I think 😞 )

3 years ago

0 Hi, And Thanks For The Great System. I'Ve Been Training Using

Great! btw: final v1.2.0 should be out after the weekend

3 years ago

0 Hi. I Have A

Hi PanickyMoth78

My local

clearml.conf

file has agent's

git_user

and

git_pass

defined as in my

in order for the autoscaler to access your git , in the wizard you have to provide the git user/token

The component agent's log has:

Executing task id [90de043e354b4b28a84d5cc0788fe63c]: repository = branch = version_num =Hmm, how does the decorator of the component looks like ? meaning did you specify a repo/branch/commi...

3 years ago

0 For Clearml Serving, If I Am Trying To Deploy 100 Models On A Gpu That Can Handle 5 Concurrently, But Each One Will Be Sporadically Used (Fine Tuned Models Trained For Different Customers), Can Clearml-Serving Automatically Load And Unload Models Based Up

Triton server does not support saving models off to normal RAM for faster loading/unloadingCorrect, the enterprise version also does not support RAM caching

Therefore, currently, we can deploy 100 models when only 5 can be concurrently loaded, but when they are unloaded/loaded (automatically by ClearML), it will take a few seconds because it is being read from the the SSD, depending on the size.

Correct, there is also deserializing CPU time (imaging unpickling 20GB file, this takes ...

one year ago

0 Hi, I Try To Write An Article On Medium About Clearml And Face Some A Problem With Plotly Figures. When Displaying The Figure Locally In A Browser Works Fine, But On The Cleaml Server (I Use The Free Tier Service) The Plot Is Empty And Has The Title 'Unkn

Okay, I was able to reproduce it (this is odd) let me check ...

4 years ago

0 I’M Using Catboost For Training, But Sadly It Does Not Have A Native Integration With Clearml (Xgboost And Lightgbm Do Have Integrations). But Catboost Writes Down Training Logs In Tensorboard Format (Into A

Hmm I think everything is generated inside the c++ library code, and python is just an external interface. That means there is no was to collect the metrics as they are created (i.e. inside the c++ code), which means the only was to collect them is to actively analyze/read the tfrecord created by catboost 😞
Is there a python code that does that (reads the tfrecords it creates) ?

4 years ago

0 Hey Has Anyone Managed To Capture Darts Logging With Clearml When Using The Temporal Fusion Transformers ? Even When Overriding Their Trainer With A Custom Pytorch Lightning Trainer It Seems That Clearml Cannot Retrieve The Iteration Log...

No I was was pointing out the lack of one

Sounds like a great idea, could you open a github issue (if not already opened) ? just so we do not forget

set the pytorch lightning trainer argument

log_every_n_steps

to

1

(default

50

) to prevent the ClearML iteration logger from timing-out

Hmm that should not have an effect on the training time, all logs are send in the background, that said checkpoints might slow it a bit (i.e.; i...

2 years ago

0 Hi! Is There Something Happening With The

https://github.com/allegroai/clearml/commit/737ca91d2a6f42b5f1ef19815fac93e183ba5398

4 years ago

0 I Have A Self-Hosted Clearm-Server And And Clearml-Agent Started With

Correct :)

4 years ago

0 Hi Guys! What Is The Best Way To Access Artifacts From Other Step Of The Pipeline? I Have Step One Returning Dataframe And Step Two Takes It As An Input But When First Step Is Cached I Only Get An Artifact Url. So How Should I Read It From Artifacts Stora

None

2 years ago

0 Hey, Trying To Use Trains-Agent To Run An Experiment On My Computer. When Trying To Execute A Job From The Queue On My Agent Im Getting An Error That Numpy Is Not Installed. How Do I Have The Trains-Agent Install My

Hi CloudyHamster42

how do i have the trains-agent install my

requirements.txt

file from my repo when creating the environment?

BTW if you clear all "the installed packages", then trains-agent will user requirements.txt and update back all the packages in the UI

4 years ago

0 Hi, Can You Help Me Pls, I Got: Environment Setup Completed Successfully Starting Task Execution: Traceback (Most Recent Call Last): File "Agro_Api.Py", Line 13, In From Help_Models.Consts Import Urls Importerror: No Module Named 'Help_Models'

Check the log usually ~/.trains/venv/...

5 years ago

0 Hi Everyone, I'M Running Into A Weird Error When Trying To Clone And Run And Task That Has Completed Successfully. I Have A Test Task That Loads A Dummy Dataset And Trains A Toy Model With Pytorch. When Running Remotely, I Use My Own Docker Image That Has

@<1533620191232004096:profile|NuttyLobster9> I think we found the issue, when you are passing a direct link to the python venv, the agent fails to detect the python version and since the python version is required for fetching the correct torch it fails to install it. This is why passing CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE=none because it skipped resolving the torch / cuda version (that requires parsing the python version)

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Makes total sense!
Interesting, you are defining the sub-component inside the function, I like that, this makes the code closer to how this is executed!

2 years ago

0 I’M Trying To Use

But these changes haven’t necessarily been merged into main. The correct behavior would be to use the forked repo.

So I would expect the agent to pull from your fork, is that correct? is that what you want to happen ?

4 years ago

0 Hi Folks, A Question Regarding The Clearml-Agent With K8S Glue. In The Agents We Mount An Nfs Volume So That Some Artifacts And Data Would Be Available For Training. I Have Seen That The K8S Glue Runs As Root (I Guess To Be Able To Spawn New Pods?), But

For example, for some of our models we create pdf reports, that we save in a folder in the NFS disk

Oh, why not as artifacts ? at least you will be able to access from the web UI, and avoid VFS credential hell 🙂

Regrading clearml datasets:
https://www.youtube.com/watch?v=S2pz9jn26uI

3 years ago

0 Hi Everyone, I Wanna Implement Some Sort Of A Tool That Checks Every N Minutes For A New Data In Db Or Maybe S3 Bucket And Whenever The New Data Appears It Runs Some Preprocessing And Then Starts Training. Is It A Good Idea To Use Clearml Agent Services F

Hi ContemplativeGoat37

it a good idea to use ClearML Agent Services for such things?

Yes! it is exactly the kind of thing it was designed to do 🙂

4 years ago

0 Hi! Is There Something Happening With The

This is what I just used:
` import os
from argparse import ArgumentParser

from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

from clearml import Task

parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args =...

4 years ago

0 When I Do

So you could change it down the road if infra/hosting changes.

Internally this is doable and Enterprise edition supports it, at the end this is stored in DBs 🙂

Also in this case, I'm uploading the data to the public file server URL, but my k8 pod can't reach that for security reasons.

Yes, this is solvable as well (again sorry for pointing it, but only in the enterprise version), where you can specify per client or globally:
` path_substitution = [
# Replace regis...

2 years ago

0 If I Clone A Task, I Suppose All Artifacts Are Not Cloned With It, Even If They Are Registered, Right?

Very lacking wrt to how things interact with one another

If I'm reading it correctly, what you are saying is that some of the "big picture" / holistic approach on how different parts interact with one another is missing, is that correct?

I think ClearML would benefit itself a lot if it adopted a documentation structure similar to numpy ecosystem

Interesting thought, what exactly would you suggest we "borrow" in terms of approach?

3 years ago

0 Clearml-Session Fails Ssh Tunneling. It Does Not Use Key Auth, Instead Sets Up Some Weird Password And Then Fails To Auth:

Btw it seems the docker runs in

network=host

Yes, this is so if you have multiple agents running on the same machine they can find a new open port 🙂

I can telnet the port from my mac:

Okay this seems like it is working

2 years ago

0 Hi Everyone, Is There A Way To Increase The Cache Size Of Each Clearml Task? I'M Running An Experiment And Many Artifacts Are Downloaded. My Dataloader Fails To Load Some Of The File Since They Are Missing, Although They Were Downloaded. I Guess There Is

Are you suggesting the conf file did not set the default size? It sounds like a bug, can you verify?

3 years ago

Show more results