Very Cool!
BTW guys, are you using task.models[] to continue from the last checkpoint? Or is it task.artifacts[]?
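i.e. something along these lines (a rough sketch on my side; the project/task names are placeholders):
```
# a hedged sketch: resuming from the last output model of a previous task
from clearml import Task

prev_task = Task.get_task(project_name="my_project", task_name="my_training_task")
# models registered on the task are exposed under task.models["output"]
last_model = prev_task.models["output"][-1]
checkpoint_path = last_model.get_local_copy()  # downloads the checkpoint file locally
```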
The latest TAO doesn't use python for fine tuning, rather it uses the CLI entirely
It's a good question, but I think the CLI actually just runs python code under the hood (the CLI is their interface). Generally speaking, I'm pretty sure it will not be complicated to convert the TLT integration to support TAO (Nvidia helps with that, and I think we had a similar process with Nvidia Clara/MONAI)
BTW: how are you using Nvidia TAO ?
UnevenDolphin73 are you positive, is this reproducible? What are you getting?
packages are updated, and I don't know which python version I get, + changing the python version of the OS is not really recommended
Wait I'm confused, this is inside a container, no?
and the python version running my code should not depend on the python version running the clearml-agent (especially for experiments running in containers)
Generally speaking you are correct, but some packages will not have the same version for all python versions
Specifically in this case I think...
HungryArcticwolf62 a transformer model is, in the end, a pytorch/tf model with pre/post processing.
The pytorch/tf model inference is done with Triton (probably the most efficient engine today), while clearml runs the pre/post processing on a different CPU machine (making sure we fully utilize all the HW). Does that answer the question?
Latest docs here:
https://github.com/allegroai/clearml-serving/tree/dev
expect a release after the weekend 😉
I'm running agent inside docker.
So this means venv mode...
Unfortunately I cannot attach the logs right now; I will attach them a little later.
No worries, feel free to DM them if you feel this is too much to post here
Hi StickyBlackbird93
Yes, this agent version is rather old (clearml_agent v1.0.0)
it had a bug where the pytorch aarch64 wheel broke the agent (by default the agent in docker mode will use the latest stable version, but not in venv mode)
Basically upgrading to the latest clearml-agent version should solve the issue:
`pip3 install -U clearml-agent==1.2.3`
BTW for future debugging, this is the interesting part of the log (Notice it is looking for the correct pytorch based on the auto de...
Hi @<1523702000586330112:profile|FierceHamster54>
Nope 🙂 nothing to worry about.
That said do notice the open-source file-server is not secure, this does not mean it will spill data on the server, but it does mean that you should probably put it behind a VPN or use S3/GCP/Azure if this is open to the public internet
PompousBeetle71 if this is argparse and the type is defined, the trains-agent will pass the equivalent in the same type; with str that amounts to ''. Make sense?
Hmm... That's what happens with the exception of None/'' if type is str... There is no way to differentiate in the UI.
This is why we opted for this behavior: type=str will "cast" everything to str so you always get str, while not specifying a type will leave the variable as-is... If you have an idea on how to support both, feel free to suggest 🙂
i.e. run `pip install --upgrade trains`
I mean , the python package, not the trains-server version
Hi PompousBeetle71
I remember it was an issue, but it was solved a while ago. Which Trains version are you using?
And the trains version?
PompousBeetle71 Could you check with 0.14.3 that just released?
Hmm, I still wonder what the "correct" answer is for most people. Is an empty string in argparse redundant anyhow? Will someone ever use it?
PompousBeetle71 is this ArgParser argument or a connected dictionary ?
HandsomeCrow5 if you want to edit the Task object you can just use:
```
internal_task_representation = task.data
internal_task_representation.execution.script = ...
task._edit(execution=internal_task_representation.execution)
```
This will make sure you do not need to worry about API version etc. the Task object will take care of it.
BTW: it seems a few more people wanted this ability, maybe we should add a proper .edit method to Task. Thoughts?
You can already sort and filter experiments based on any hyper parameter or metric that the experiment reports, there is no need for any custom language query. Also all created filter/sorted table can be shared exactly as they are, so you can create leaderboards and share specific filters. You can also use the search bar in order to filter based on experiment name / comment. Tags will be added soon as well 🙂
Example of custom columns is here (the screen grab is a bit old, now there is als...
Hi StickyMonkey98
I'm (again) having trouble with the lack of documentation regarding Task.get_tasks(task_filter={STUFF}).
Yes we really have to add documentation there... Let me add that to the todo list
How do I filter tasks by time started? It seems there's a "started" property, and the web ui uses "started" as a key-word in the url query, but task_filter results in an error when I try that...Is there some other filter keyword for filtering by start-time??
last 10 started ...
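Something along these lines should work (a hedged sketch; the task_filter fields are passed through to the tasks.get_all API, so treat the exact field names as assumptions):
```
# a hedged sketch: fetching the last 10 started tasks via task_filter
from clearml import Task

tasks = Task.get_tasks(
    project_name="my_project",      # placeholder project name
    task_filter={
        "order_by": ["-started"],   # newest "started" first
        "page": 0,
        "page_size": 10,
    },
)
for t in tasks:
    print(t.id, t.name)
```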
Hi BeefyHippopotamus73
. I checked the template task and the list of “Installed Packages” indeed does not have one of my required packages in the list.
Basically the "installed packages" list is auto-populated based on the packages directly imported in your code base.
Could it be that you do not directly import snowflake-connector-python, and it is a derivative package (i.e. required by a different package)?
BTW: when you clone your Task in the UI you can edit and add the missing packages,...
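If it is indeed an indirect dependency, you could also force it into the requirements from code (a hedged sketch; the project/task names are placeholders, and the call must come before Task.init):
```
# a hedged sketch: explicitly adding a package that is not directly imported
from clearml import Task

# must be called before Task.init so it lands in "Installed Packages"
Task.add_requirements("snowflake-connector-python")
task = Task.init(project_name="my_project", task_name="my_task")
```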
Hi @<1570220858075516928:profile|SlipperySheep79>
I think this is more complicated than one would expect. But as a rule of thumb, console logs and metrics are the main ones. I hope that helps. Maybe sort by number of iterations in the experiment table?
BTW: probably better to ask in the channel
The -m src.train is just the entry script for the execution; all the rest will be taken care of by the Configuration section (whatever you pass after it will be ignored if you are using argparse, as it auto-connects with ClearML)
Make sense ?
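For context, a minimal sketch of what I mean (a hypothetical src/train.py relying on the argparse auto-connect):
```
# a minimal sketch of a hypothetical src/train.py using the argparse auto-connect
from argparse import ArgumentParser
from clearml import Task

# Task.init hooks argparse, so the arguments are logged to the Configuration section
task = Task.init(project_name="examples", task_name="train")

parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()  # when executed by the agent, values from the UI override these
```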
Basically run the agent in virtual environment mode. JumpyDragonfly13 try this one (notice no --docker flag):
```
clearml-agent daemon --queue interactive --create-queue
```
Then from the "laptop" try to get a remote session with:
```
clearml-session
```
Our remote machine is Windows 10
JumpyDragonfly13 seems like the Windows 10 + docker is the issue (that would explain the OCI error)
Is this relevant ?
https://github.com/microsoft/WSL/issues/5100
Just making sure I understand, basically same ArgParser support we already have, but for python-fire
(which is the ability to automatically log the arguments, and then change them when executed by trains-agent), correct?
If this is the case, are you familiar with the implementation of python-fire? What I'm looking for is where exactly the parsing happens, so we could patch it and log/override values.
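For reference, a minimal python-fire usage sketch (the parsing itself happens inside the fire.Fire() call; the exact internal hook point is what we would need to find):
```
# a minimal python-fire example; all argument parsing happens inside fire.Fire()
import fire

def train(lr: float = 0.001, epochs: int = 10):
    """Hypothetical entry point, e.g. `python train.py --lr 0.01 --epochs 5`"""
    print(f"training with lr={lr}, epochs={epochs}")

if __name__ == "__main__":
    fire.Fire(train)
```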
Could you try running your code not from the git repository?
I have a theory, you never actually added the entry point file to the git repo, so the agent never actually installed it, and it just did nothing (it should have reported an error, I'll look into it)
WDYT?
If this doesn't help:
Go to your ~/clearml.conf file; at the bottom of the file you can add agent.python_binary and set it to the location of python3.6 (you can run which python3.6 to get the full path):
`agent.python_binary: /full/path/to/python3.6`
Could you manually configure the ~/trains.conf ?
(Just copy paste the section from the UI)
then try to run: `trains-agent list`