Hi GrievingKoala83
Two tasks are created, but the training does not begin; both tasks stay in the running state indefinitely.
Can you print something after the `task.launch_multi_node(args.nodes)` call?
- I'm assuming the two Tasks are running and are blocked on the "Trainer" class
If `args.gpus=2` and `args.nodes=2` are specified, three tasks are created.
This is really odd, can you add some prints with the task ID and rank after the ...
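For example, a minimal sketch of the debug prints I have in mind (assuming the usual flow where `launch_multi_node()` returns the node configuration dict; project/task names are placeholders):
```
import os
from clearml import Task

task = Task.init(project_name="examples", task_name="multi-node-debug")
config = task.launch_multi_node(2)  # i.e. args.nodes in your script

# print the task id and rank so each node's log can be told apart
print("task id:", task.id)
print("launch_multi_node returned:", config)
print("RANK:", os.environ.get("RANK"), "NODE_RANK:", os.environ.get("NODE_RANK"))
```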
Any chance you can share the Log?
(feel free to DM it so it will not end up public)
Your account has 2FA enabled, so you must use a personal access token instead of a password.
I'm assuming you have created the personal access token and used it, not the password.
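For reference, the token goes wherever the password would, e.g. in the agent section of `clearml.conf` (values are placeholders):
```
agent {
    git_user: "my-username"
    git_pass: "my-personal-access-token"  # the PAT, not the account password
}
```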
- This then looks for a module called `foo`, even though it's just a namespace

I think this is the issue, are you using Python package namespaces?
(this is a PEP feature that is very rarely used, and I have seen it break too many times)
Assuming you have `from foo.mod import ...`, what are you seeing in `pip freeze`? I'd like to see if we can fix this, and better support namespaces.
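To illustrate what I mean by a namespace package (a sketch, all names hypothetical):
```
# PEP 420 implicit namespace package layout:
#
#   foo/            <- no __init__.py here, so "foo" is only a namespace
#       mod/
#           __init__.py
#
# the import works at runtime, but dependency analysis may only detect "foo",
# which is not an installable package on its own
from foo.mod import something  # "something" is a placeholder
```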
LovelyHamster1 what do you mean by "assume the permissions of a specific IAM Role"?
In order to spin up an EC2 instance (AWS autoscaler) you have to have the correct credentials; to pass those credentials you must create a key/secret pair for the autoscaler. There is no direct support for IAM Roles. Make sense?
I use a YAML config for data and model. Each of them would be a nested YAML (could be more than 2 layers), so it won't be a flexible solution and I would need to manually flatten the dictionary.
Yes, you are correct, the recommended option would be to store it with `task.connect_configuration` ;
its goal is to store these types of configuration files/objects.
You can also store the YAML file itself directly, just pass a `Path` object instead of a dict/string.
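A minimal sketch (file name and project/task names are placeholders):
```
from pathlib import Path

import yaml
from clearml import Task

task = Task.init(project_name="examples", task_name="config-demo")
# store the yaml file itself as a configuration object; the returned path
# points at the local (possibly remotely fetched) copy of the file
config_path = task.connect_configuration(Path("model.yaml"), name="model")
model_cfg = yaml.safe_load(Path(config_path).read_text())
```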
Was I right to put the credentials in `clearml.conf` on the machine I am starting the agent on?
AdventurousButterfly15 Yes exactly!
You should be able to see that in the Task's log (the entire configuration appears at the top of the log); can you see the git user there?
Yes EnviousStarfish54, the comparison is line by line, and everything is compared against the left experiment (as in any multi-experiment comparison, you have to set a baseline, which here is always the left column; note that you can reorder the columns and the comparison will update accordingly).
BTW:
```
In [4]: str('\.')
Out[4]: '\\.'

In [5]: str(('\.', ))
Out[5]: "('\\\\.',)"
```
This is just Python str casting.
Hi ResponsiveCamel97
Let me explain how it works: essentially it creates a new venv inside the docker, inheriting all the packages from the main system packages.
This allows it to use the installed packages if the versions match, and to upgrade/change them if you need, all without rebuilding a new container. Make sense?
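The relevant setting, for reference, is in the agent's `clearml.conf` (a sketch; I believe this is already the default when running in docker mode):
```
agent {
    package_manager {
        # let the venv created inside the container see the system packages
        system_site_packages: true
    }
}
```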
Nice! I'll see if we can have better error handling for it, or solve it altogether 🙂
And as far as I can see there is no built-in mechanism to load objects other than the model file inside the Preprocess class, right?
Well, actually this is possible. Let's assume you have another Model that is part of the preprocessing; then you could have:
```
def preprocess(self, body, state, collect_custom_statistics_fn=None):
    # lazily load the second model once, and cache it on the instance
    if not getattr(self, "_preprocess_model", None):
        self._preprocess_model = joblib.load(Model(model_id).get_weights())
    ...
```
Something like that should work.
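(Here `Model` is `clearml.Model` and `model_id` is the ID of that second model; caching it on `self` avoids reloading on every request.)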
PungentLouse55 I'm checking something here, you might have stumbled on a bug in parameter overriding. Will update here soon ...
```
if project_name is None and Task.current_task() is not None:
    project_name = Task.current_task().get_project_name()
```
This should have fixed it, no?
I can then programmatically choose which file to import with importlib. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
Sadly no 😞
It analyzes the running code; if it decides the script is not self-contained, it will analyze the entire repo ...
I just saw that `Task.create` takes ...

`Task.create` is Not `Task.init`. It is meant to allow you to create new Tasks (think Jobs) from ...
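A minimal sketch (repo/script values are placeholders):
```
from clearml import Task

# Task.create registers a new draft Task pointing at a repo/script,
# without executing anything locally (unlike Task.init)
task = Task.create(
    project_name="examples",
    task_name="remote-job",
    repo="https://github.com/me/my-repo.git",
    script="train.py",
)
```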
GiganticTurtle0 can you please add a github issue with feature request to clearml-agent? I think this is a great use case!
And the clearml-server version?
What do you have in the `.netrc` on the machine, in the `machine` section?
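For reference, a `.netrc` entry looks something like this (host and values are placeholders):
```
machine github.com
  login my-username
  password my-personal-access-token
```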
TrickySheep9 is this a conda package or a wheel you are installing manually ?
Hi UnevenDolphin73
If you "remove" the lock file the agent will default to pip.
You can hack it with uncommitted changes section?
Thanks for the details TroubledJellyfish71 !
So the agent should have automatically resolved this line: `torch == 1.11.0+cu113`
into the correct torch version (based on the CUDA version installed, or the CPU version if no CUDA is installed).
Can you send the Task log (console) as executed by the agent (and failed)?
(you can DM it to me, so it's not public)
Hmm, I just tested on the community version and it seems to work there. Let me check with the frontend guys. Can you verify it works for you on https://app.community.clear.ml/ ?
LOL AlertBlackbird30 had a PR and pulled it 🙂
Major release due next week; after that we will put a roadmap on the main GitHub page.
Anything specific you have in mind ?
I love the new docs layout!
Thank you and thank docusaurus, they rock!
Hi DepressedChimpanzee34
How do I reproduce the issue ?
What are we expecting to get there ?
Is that a Colab issue or a hyper-parameter encoding issue?
And when you retrieve just this file, does it work?
(Maybe the file is corrupted for some reason?)
Is this consistent on the same file? can you provide a code snippet to reproduce (or understand the flow) ?
Could it be two machines are accessing the same cache folder ?