Hi Jake 👍,
Maybe the content is cached? The repo isn't big. I didn't realize the log was missing content. I believe I copied everything but I'll double check in a moment.
✨ It works ✨
Thanks @<1523701205467926528:profile|AgitatedDove14> 😁
I'm not sure why the logs were incomplete. I think part of the reason it wasn't pulling from the repo was that it was pulling from the cache. I cleared the ClearML cache for that project and reran it. This should be the full log.
Is there currently a way to bind the same GPU to multiple queues? I believe the agent complained the last time I tried (which was a while ago).
I just checked the clearml.conf and I'm not specifying any version of Python for the agents.
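i.e. if I'm reading the file right, the relevant part of the agent section is just left at its default, something like this (assuming I have the field name right):

agent {
    # empty, so the agent falls back to the default system interpreter
    python_binary: ""
}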
This is odd: the ordering of the files is different, and some appear to be missing from the preview. But as far as I can tell the files themselves aren't different. What am I missing here?
As far as I can tell, everything that's running is on our own hardware. Is there some way to see which application instances are active?
From the logs it looks like the HPO application finds a worker from the queue, attempts to serialize the config sent to the worker, and crashes because of the version conflict with Pyro4. But I don't think we control any of that. I might be misunderstanding something. 🙃
Thanks Martin. I read this method as "getting the data associated with the model training" not "getting metadata for the model". This is what I'm looking for.
Alright, I deleted everything in the ClearML web app, waited a day, and tried again; it seems to be showing a configuration object in the configuration section of the scheduler task again. I honestly don't know what changed. Maybe some strange caching on the server side got cleaned up.
@<1523701205467926528:profile|AgitatedDove14> Question: Does the schedule_function option in the TaskScheduler.add_task() method run at the time the task is scheduled to execute? So if I pass a functi...
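Roughly what I'm asking about, as a sketch (assuming the clearml.automation.TaskScheduler API; the function body, queue name, and times are placeholders):

from clearml.automation import TaskScheduler

def create_task_to_run():
    # Is this called when the schedule fires, or once up front when add_task() is called?
    ...

scheduler = TaskScheduler()
scheduler.add_task(
    schedule_function=create_task_to_run,
    queue="default",  # placeholder queue name
    hour=6,
    minute=0,
)
scheduler.start_remotely(queue="services")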
I found I was having this issue as well. I don't have an alias defined in a pipeline, but I do in a task, and I get the same error. I'm not hosting my own server but using the free web service at the moment.
Depending on the framework you're using, it'll just hook into the model save operation, so it captures every time you save a model, which will probably happen every epoch for at least part of the training. If you want to do this with the existing framework, you could change the checkpointing so that it only keeps a copy of the best model in memory and leaves the write operation for last. The risk is that if training crashes, you'll lose your best model.
Optionally, you could also disable the ClearML integration with...
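Something like this is what I mean by keeping the best model in memory and writing it once at the end (a rough sketch; train_one_epoch / validate / the model and loaders are placeholders for your own loop):

import copy
import torch

best_score = float("-inf")
best_state = None

for epoch in range(num_epochs):
    train_one_epoch(model, train_loader, optimizer)  # placeholder
    score = validate(model, val_loader)              # placeholder

    if score > best_score:
        best_score = score
        # keep a copy of the best weights in memory instead of writing a file every epoch
        best_state = copy.deepcopy(model.state_dict())

# single save at the end, so only one checkpoint gets written (and captured)
if best_state is not None:
    torch.save(best_state, "best_model.pt")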
Hi again @<1523701435869433856:profile|SmugDolphin23> ,
The approach you suggested seems to be working, albeit with one issue. It correctly identifies the different versions of the dataset when new data is added, but I get an error when I try to finalize the dataset:
Code:
if self.task:
    # get the parent dataset from the project
    parent = self.clearml_dataset = Dataset.get(
        dataset_name="[LTV] Dataset",
        dataset_project=...
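The rest of the flow looks roughly like this (a sketch, not my exact code; the path is a placeholder), and it's the finalize() call at the end that throws the error:

from clearml import Dataset

# create a child version on top of the parent fetched above
child = Dataset.create(
    dataset_name="[LTV] Dataset",
    dataset_project=parent.project,
    parent_datasets=[parent.id],
)
child.add_files("path/to/new/data")  # placeholder path
child.upload()
child.finalize()  # <- this is where I get the error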
I will add a GitHub issue. Is this part open source? Could I make a PR?
In the meantime I still need to implement this with the current version of ClearML. So the only way would be to have one variable per parent? Is there a smarter way to work around it?
If I wanted to do this with the ID, how would I approach it?
Yeah, it's because it's just hooking into the save operation and capturing the output, regardless of the parent call.
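If you want it to stop capturing those saves altogether, I believe you can turn off the framework hook when initializing the task, something like this (project/task names are placeholders):

from clearml import Task

task = Task.init(
    project_name="my_project",  # placeholder
    task_name="my_task",        # placeholder
    # disable just the PyTorch save hook; other frameworks keep reporting
    auto_connect_frameworks={"pytorch": False},
)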
I think the PR is a good idea. I read the contribution guidelines; they talk about referencing an issue. Did you want me to duplicate this issue on the repo, or is it enough to link to this thread?
Hi @<1523701087100473344:profile|SuccessfulKoala55> - We tried to delete some additional hyperparameter tunings, but it doesn't seem to have affected the stored metrics. It's not clear to me what is occupying all the metric storage space.
I'm using Pro. Sorry for the delay, I didn't notice I never sent the response.
We have a server with many agents running on it, because in many cases training can be spread across several agents; a single agent doesn't use up all the resources available on the server.
Thanks, that's exactly what I was looking for.
@<1539780284646428672:profile|PoisedElephant79> Are you sure you're not simply referring to the get operation? That seems to exclude archived datasets. But I don't see anything like that for the list_datasets operation.
I had 2 archived datasets and 0 unarchived ones. When I ran the following command:
Dataset.list_datasets(dataset_project=self.task.get_project_name(), only_completed=True)
It returned two entries for the two archived datasets.