AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

0 Votes 0 Answers 1K Views

Hello Everyone!

clearml

4 years ago

0 Votes 0 Answers 1K Views

@YummyWhale40 you are saying the example code is not working when running with the demo server? Also I think I was able to view your experiment on the demo server, and do get the Scalars without any issues...

clearml

4 years ago

0 Votes 0 Answers 1K Views

https://m.facebook.com/story.php?story_fbid=2484620658505570&id=1620822758218702&refid=52&__tn__=-R

clearml

4 years ago

0 Votes 1 Answers 921 Views

Gals, Guys & :robot_face: , if you want to checkout the Hyper-Parameters automation (Using Bayesian Optimization Hyper-Band) We have an example on the demo s...

clearml

4 years ago

0 Votes 2 Answers 1K Views

Hi ClearML v0.17.1 and ClearML-Agent v0.17.0 are now the official packages & repositories 🎉 🎊 👋 🛤️ This new name brings on many changes, mainly replace a...

clearml

3 years ago

0 Votes 0 Answers 1K Views

Hi Guys/Gals, if you want to check out the latest RC, we have 0.15.0rc0 out:
pip install trains==0.15.0rc0
pip install trains-agent==0.15.0rc0
Many of the impr...

clearml

4 years ago

0 Votes 1 Answers 1K Views

This is usually due to enterprise level issued https certificates not part of the local installation (basically any python generated SSL request will fail)

clearml

4 years ago

0 Votes 0 Answers 1K Views

We are at AAAI NY, come look us up :)

clearml

4 years ago

0 Votes 0 Answers 1K Views

New RC for trains-agent is out:
pip install trains-agent==0.13.2rc1

clearml

4 years ago

0 Votes 0 Answers 1K Views

Is your server using https?!

clearml

4 years ago

0 Votes 0 Answers 1K Views

you set it 🙂

clearml

4 years ago

0 Votes 0 Answers 1K Views

docs are up

clearml

4 years ago

0 Votes 0 Answers 992 Views

Hey, just a heads up: starting Jan 25th, the default ClearML demo server (http://demoapp.demo.clear.ml/) will move to a daily reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our ...

clearml

3 years ago

0 Votes 0 Answers 1K Views

Lol, I wonder what the adblock rule was ;)

clearml

4 years ago

0 Votes 3 Answers 976 Views

This will close it:
Task.current_task().close()
I think we should rename completed() because it just marks the Task as completed on the backend but does not ac...

clearml

3 years ago

0 Votes 2 Answers 958 Views

Hi ! trains 0.16.2 is finally out with the new pipelines interface! Check out the new example https://github.com/allegroai/trains/blob/master/examples/pipeli...

clearml

4 years ago

0 Votes 0 Answers 972 Views

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of Trains :smile_cat: ) https://twitter.com/PyTorch/status/1272919483980500999

clearml

4 years ago

0 Votes 0 Answers 975 Views

New video is out 🙂 Cloud Autoscalers are awesome https://www.youtube.com/watch?v=j4XVMAaUt3E

clearml

2 years ago

0 Hey All. I Need Some Help Debugging Some Errors. I Keep Getting An Error About Failing To Clone The Repository On The Remote Instance. What Could Be The Reason Of This? Are There Any Common Errors Related To This? I Suspect Permissions, But Not Entirely

Hi @RoundCat60, I just saw the message,

Just by chance I set the SSH deploy keys to write access and now we're able to clone the repo. Why would the SSH key need write access to the repo to be able to clone?

Let me explain: the default use case for the agent is to use user/pass (as configured in the clearml.conf file).
It will change any ssh links to https links and will add the credentials to clone the repository.
You can also provide SSH keys (basicall...

3 years ago

That is a bit odd, but SSH keys have to have specific chmod flags for them to work (security issues).
What was the error?

3 years ago
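
The point about SSH keys needing specific chmod flags can be checked locally before blaming the repository settings. A minimal sketch, assuming a POSIX system and the common OpenSSH expectation that a private key is accessible only by its owner (for example mode 600); the helper name `key_permissions_ok` is hypothetical and not part of ClearML:

```python
import os
import stat


def key_permissions_ok(key_path: str) -> bool:
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(key_path).st_mode)
    # Any group/other permission bit means the key file is too widely accessible.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

Calling it on the key the agent uses (the path depends on your setup) and getting False suggests running `chmod 600` on that file before retrying the clone.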

0 Is There Any Testing Suite That Ships With Clearml? If We'D Like To Make Some Unit Tests For Our Code?

Last but not least - can I cancel the offline zip creation if I'm not interested in it

you can override with OS environment, would that work?

Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling

task.close()

takes a long time

It actually zips the entire offline folder so you can later upload it. Maybe we can disable that part?!

` # generate the script section
script = (
"fr...

one year ago

0 Hi. I Get Some Problem With Clearml Agent. I Start Training On My Local Device, Clone Run, And Start This Run In Docker On Cluster. But, Seems Like Clearml Agent Caches Environment(Package Wheels, Python Version, Etc). Can I Config Clearml Agent To Not Cac

Hi StickyBlackbird93
Yes, this agent version is rather old ( clearml_agent v1.0.0 )
it had a bug where pytorch wheel aaarch broke the agent (by default the agent in docker mode, will use the latest stable version, but not in venv mode)
Basically, upgrade to the latest clearml-agent version; it should solve the issue:
pip3 install -U clearml-agent==1.2.3
BTW for future debugging, this is the interesting part of the log (Notice it is looking for the correct pytorch based on the auto de...

2 years ago

ClearML maintains a github action that sets up a dummy clearml-server,

You have one, it's the http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts ?

2 years ago

0 I'M On The Machine With Clearml Server Hosted. Is There Any Way To See Datasets Uploaded To Clearml Data Without Downloading Them Using Clearml Data?

Is there any way to see datasets uploaded to ClearML Data without downloading them using ClearML Data?

Hi VexedCat68
Currently when you create datasets with clearml-data it has to repackage your files, i.e. upload them. That said we have received numerous requests on "registering data", and we are looking into it.
Here are the main technical hurdles we are facing, and I would love to get your perspective:
If the data is not available locally, we cannot calculate the hash of the conten...

2 years ago

0 Another Question, I Have Written A Code That Includes A Task Scheduler That Calls A Function. That Function Watches A Folder And If There Are Sufficient Images, It Creates And Publishes The Dataset, After Which It Clears The Folder. Problem, For Some Rea

why are there indefinitely growing anonymous tasks, even after i've closed the main schedulers.

The anonymous Tasks are the Datasets you are creating (a Dataset version is also a Task of a certain type with artifacts; the idea is that Datasets are usually created from code, hence the need to combine the two).
Makes sense?

2 years ago

0 Hi Clm Community, I Am Having An Issue With A Private Package Install When Using Clearml-Agent By Env (Not Docker). Things I Have Tried: Adding The Package Repository To "Extra_Index_Url" Adding The Conda Env To "Python_Binary" Listing The Libraries To I

FlatStarfish45

In the parent task, the libs appear installed.

What do you mean by "parent Task"? Is this the base task we are optimizing (i.e. the experiment / model we are optimizing) ?
Or is it the "Optimization Task" itself?

2 years ago

0 I'M Trying To Run A Task On An Agent. I'Ve Passed The Requirements File But It Isn'T Able To Install It. The Error Is In The Reply. Help Would Be Appreciated.

Hi VexedCat68
Could it be the python version is not the same? (this is the only reason not to find a specific python package version)

2 years ago

0 Hi! Is There A Simple Way To Visualize Tensors In Clearml? Something Like Tensorboard'S Tsne Or Pca...

FrustratingWalrus87 Unfortunately TB's TSNE is not automatically captured by ClearML (Scalars, histograms etc. are)
That said, matplotlib will be automatically captured, so you can run your own PCA/tSNE and use matplotlib to visualize (ClearML will capture it).
The same applies for plotly.
What do you think?

3 years ago

0 Dear Community! I'M Trying Out A New Way To Make Clearml-Related Content. I'D Like Your Opinion On Whether This Is Something You Would Consider Watching (Provided Editing And Content Is A Little Bit Better

The Commodore 64 theme is hilarious

3 years ago

0 Hi, I Need Your Help Setting Up An Trains Agent Running In Docker. I Have An Python Script Calling Wget As System Command Which Runs Fine On My Dev Engine. When Cloning The Experiment And Scheduling It Into The Services Queue I Get An Error That The Call

Nice!!!

3 years ago

0 Hey Everyone! Is It Possible To Trigger A Pipeline Run Via Api? We Have A Repo That Builds An Image For Serving To Clearml Server But We'Ve Wrapped It Inside A Fastapi Application So It Can Be Called From Another Web Service.

Hi @ThoughtfulKitten41

Is it possible to trigger a pipeline run via API?

Yes! A pipeline is, in the end, a Task; you can take the pipeline ID, then clone and enqueue it

from clearml import Task

pipeline_task = Task.clone("pipeline_id_here")
Task.enqueue(pipeline_task, queue_name="services")

You can also monitor the pipeline with the same Task interface.
wdyt?

5 months ago

Is there any way to make that increment from last run?

pipeline_task = Task.clone("pipeline_id_here", name="new execution run here") 
Task.enqueue(pipeline_task, queue_name="services")

wdyt?

5 months ago
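
One way to get a run name that increments from the last run is to derive it from the names of earlier runs before cloning. A minimal sketch: `next_run_name`, `trigger_pipeline` and the `"<base> #<n>"` naming scheme are hypothetical, and the Task class is passed in as an argument so the naming logic can be exercised without a server. It relies only on the `Task.clone(..., name=...)` and `Task.enqueue(..., queue_name=...)` calls shown in the answer above.

```python
import re


def next_run_name(base, previous_names):
    """Return '<base> #<n+1>', where n is the highest run number seen so far."""
    pattern = re.compile(rf"^{re.escape(base)} #(\d+)$")
    matches = (pattern.match(name) for name in previous_names)
    numbers = [int(m.group(1)) for m in matches if m]
    return f"{base} #{max(numbers, default=0) + 1}"


def trigger_pipeline(task_cls, pipeline_id, run_name, queue_name="services"):
    """Clone the pipeline Task under a new name and enqueue the clone."""
    cloned = task_cls.clone(pipeline_id, name=run_name)
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned
```

With ClearML installed this would be called as `trigger_pipeline(Task, "pipeline_id_here", next_run_name("pipeline run", names))`, where `Task` comes from `from clearml import Task` and how `names` is collected is left to the caller.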

0 I Get These Warnings Whenever I Run Pipelines And I Have No Idea What It Means Or Where It Comes From:

Hmm yeah I think that makes sense. Can you post here the arguments?
I'm assuming you have something like '1.23a' in the arguments?

5 months ago

0 Hi All, It Seems After Sync Command, Finalize Is Not Working: Please Let Me Know If I Am Missing Anything.

@NonchalantSeaanemone34

dso = Dataset.create(
    dataset_project=project_name,
    dataset_name=dataset_name,
    parent_datasets=[parent_datasets_id],
)
dso = Dataset.get(
    dataset_project=project_name,
    dataset_name=dataset_name,
    only_completed=True,
    only_published=False,
    alias='latest',
)

why are you creating a dataset then getting a dataset on the same object?
it seems you are trying to upload...

2 months ago

0 Hey All, Quick Question About Pipeline Execution Queues. I Set The

This workflow however is the only way I have found to easily fix my previous ‘Module not found’ errors

Hmm okay, makes sense.
Did you try to set these?
Or even hack the sys.path with something like:
import sys, os
sys.path.insert(0, os.path.abspath(os.path.dirname(__file__) + "/../"))

one year ago
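
The sys.path hack above can be wrapped in a small helper so the parent directory is added only once, even if the module is imported several times. A minimal sketch; the function name `add_parent_to_sys_path` is hypothetical:

```python
import os
import sys


def add_parent_to_sys_path(file_path):
    """Put the parent of the directory containing file_path on sys.path.

    The path is inserted at the front, unless it is already listed somewhere
    in sys.path, in which case sys.path is left unchanged.
    """
    parent = os.path.abspath(os.path.join(os.path.dirname(file_path), os.pardir))
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent
```

At the top of a script this would be called as `add_parent_to_sys_path(__file__)` before importing the modules that were reported as not found.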

0 Hi, I'M Trying To Clone And Queue Experiments For Running Them On My Workers. I Am Able To Successfully Clone And Queue The Task, But Seems Like The Task Does Not Pass The Correct Parameters To My Python Script On The Worker. We Use Hydra For Configuring

Oh no, you are absolutely correct, it is broken (I mean I have no idea why it lists Hydra, or how it got there). I will let the guys know and fix it.
Bottom line, after you clone it, please edit the installed packages and remove the "Hydra" line and replace with just "hydra-core" (no need for version).
The format is the same as "requirements.txt" and will affect the venv created by the agent

2 years ago
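
The manual edit described above (replacing the "Hydra" line in the installed packages with just "hydra-core") amounts to a small text transformation on requirements.txt-style content. A minimal sketch that handles only simple lines such as `name` or `name==version`; the function name `replace_requirement` is hypothetical and this is not a ClearML API:

```python
import re


def replace_requirement(requirements, old_name, new_line):
    """Swap the line for package old_name with new_line in requirements-style text."""
    # The name must be followed by end of line or a specifier/extras/marker
    # character, so "Hydra" does not match "hydra-core".
    pattern = re.compile(
        rf"^\s*{re.escape(old_name)}\s*($|[=<>!~\[;@])", re.IGNORECASE
    )
    lines = [
        new_line if pattern.match(line) else line
        for line in requirements.splitlines()
    ]
    return "\n".join(lines)
```

For example, `replace_requirement(installed_packages_text, "Hydra", "hydra-core")` returns the text to put back into the cloned Task's installed packages section.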

0 We Have A Environment Variables Definitions.Py File Which Every User Configures On Their Local Machine. This File Includes Local Paths As Well As Aws/Api Credentials. This Is An Issue When Spinning Up Clearml Tasks Since It Is Not Included In The Git Repo

that's the downside

2 years ago

My pleasure

2 years ago

0 Any Info On The Lifecycle Of Datasets Downloaded To $Home/.Clearml/Cache/Storage_Manager/Datasets Via Get_Local_Copy I Have A Task Running And I Was Watching The Above Path And Datasets Were Being Downloaded And Then They Are All Removed And For A Partic

And I think the default is 100 entries, so it should not get cleaned.

and then they are all removed and for a particular task it even happens before my task is done

Is this reproducible ? Who is cleaning it and when?

3 years ago

0 Hi, I Am Trying To Setup Multi-Node Training With Pytorch Distributeddataparallel. Ddp Requres A Launch Script With A Set Of Parameters To Be Run On Each Node. One Of These Parameters Is Master Node Address. I Am Currently Using The Following Scheme:

looks like service-writing-time for me!

Nice!

persist/restore state so that tasks are restartable?

You mean if you write preemption-ready training code ?

2 years ago

0 Hi, I Am Running A Pipeline (Which Does Preprocessing And Training) ? Once Training Ends, I Want To Automatically Publish The Task (Model). Reading The Docs, I Tried This Approach Below. I Wrote A

DeliciousBluewhale87

node.base_task_id is the base task, which will always be in draft mode. Instead we should use node.executed, which references the current executed node.

YES, maybe we should add that into the example, so it is clearer ? WDYT?

3 years ago

0 Fyi: Conda Installation Of Pytorch Is Broken Again. My Old Tasks Which Worked Before Now Fail Since They Do Not Find Torch. However, I Can See In The Execution That Conda Had Errors. Most Probably It Happens Because Pytorch 1.8.1 Has Been Released, But I

Yea I know, I reported this

LOL, apologies, these days it's a miracle I still remember my login passwords 😉

3 years ago

0 Hi, We Are Having An Interesting Issue Here. We Serve Many Users And Each User Has Their Own Credentials In Accessing The Private Git Repo. We Can'T Seem To Find A Way For The End User To Pass In Their Git Credentials When They Run Their Codes In Both Age

Do we have it on the git issue ?

3 years ago

0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

So what you are saying is the workers randomly report on one another's experiments ?

4 years ago

0 Hi! I Have Local Minio Setup, Via Minio Browser I Can Upload 50-100 Mb Per Second As Its Local. But When I Try To Use Task.Upload_Artifact It Uploads 500 Kb Per Second. Does Anyone Have An Idea About This?

There is some overhead, but it should be negligible.

4 years ago

0 Hi Again, I Was Wondering What Would Be A Good Practice With Respect To Saving Different Datasets (While Preprocessing It In Several Steps/Stages). Mainly With The Use Of Remove_Files(). Is It Ok To Delete Raw Data After Preprocessing For Example? In That

Hi CostlyElephant1
What do you mean by "delete raw data"? Data is always fetched to cached folders, and clearml takes care of cache cleanup.
That said, notice that get mutable copy writes to a target you specify; in this case you should definitely delete it after usage. Wdyt?

one year ago

0 I Cannot Get The Configuration From A Task: I Run

Is mark_completed used to complete a task from a different process and close from the same process - is that the idea?

Yes

However, when I tried them out, mark_completed terminated the process that called mark_completed.

Yes if you are changing the state of the Task externally or internally the SDK will kill the process. If you are calling task.close() from the process that created the Task it will gra...

one year ago

0 Clearml (Remote Execution) Sometimes Doesn'T "Pick-Up" Gpu. After I Rerun The Task It Picks It Up. Seems Random, Doesn'T Happen Too Often (Maybe Once In 30-40 Times) And I Cannot Seem To Detect Any Pattern. Did Anyone Else Notice This? Agents Are Vms On G

Failed to initialize NVML: Unknown Error

yeah this is a driver issue. I think you need to check the VM image if the drivers match the GPU on that machine

2 months ago
