As I also noticed that uploads are sometimes slow, and I see here max_connections=2
Makes sense to me, please go ahead and add that as well (basically the same thing on _AzureBlobServiceStorageDriver.upload_object, plus an additional variable on the AzureContainerConfigurations class).
Could you PR a tested draft? We will be able to take it from there.
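Roughly something like this (just a sketch, not the actual ClearML internals — the max_connections plumbing here is an assumption; azure-storage-blob v12 exposes upload parallelism as max_concurrency):

```python
from azure.storage.blob import ContainerClient

def upload_object(container_client: ContainerClient, blob_name: str,
                  local_path: str, max_connections: int = 2) -> None:
    # max_connections would be read from the matching
    # AzureContainerConfigurations entry (hypothetical wiring)
    with open(local_path, "rb") as f:
        container_client.upload_blob(
            name=blob_name,
            data=f,
            overwrite=True,
            max_concurrency=max_connections,  # parallel block uploads
        )
```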
JuicyFox94
NICE!!! this is exactly what I had in mind.
BTW: you do not need to put the default values there; it reads the defaults from the package itself (trains-agent/trains) and uses the conf file as overrides, so this section only needs to contain the parts that matter to you (like cache location, credentials, etc.)
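For example, a minimal conf file could contain just the overrides (values here are placeholders):

```
api {
    credentials {
        access_key: "KEY"
        secret_key: "SECRET"
    }
}
sdk {
    storage.cache.default_base_path: "~/.trains/cache"
}
```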
Local changes are applied before installing requirements, right?
correct
Failed to initialize NVML: Unknown Error
Yeah, this is a driver issue. I think you need to check whether the VM image's drivers match the GPU on that machine.
(Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac
Where is the code (agent) running? A GCP instance? Your machine?
If this is how the repo links look like, do not set anything in the clearml.conf
It "should" use the ssh for the ssh links, and http for the http links.
Hi FunnyTurkey96
Which pip version are you using? Basically pip changed the dependency resolver after 20.1.
Change https://github.com/allegroai/clearml-agent/blob/aede6f4bac71c8fc56e7cf982318a48527953a3c/docs/clearml.conf#L57 to pip_version: "<20.2"
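i.e. in your clearml.conf:

```
agent {
    package_manager {
        # pin pip below 20.2 to keep the old dependency resolver
        pip_version: "<20.2"
    }
}
```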
See if that helps
Hi @<1547028074090991616:profile|ShaggySwan64>
I'm guessing just copying the data folder with rsync is not the most robust way to do that since there can be writes into mongodb etc.
Yep
Does anyone have experience with something like that?
Basically you should just back up the 3 DBs (MongoDB, Redis, Elasticsearch), each one based on its own backup workflow. Then just rsync the fileserver data & configuration.
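As a rough sketch (container names assume a default clearml-server docker-compose deployment, and the Elasticsearch snapshot repository "backups" must already be registered — adjust to your setup):

```python
import datetime
import subprocess

import requests

stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

# MongoDB: consistent dump from inside the mongo container
subprocess.run(
    ["docker", "exec", "clearml-mongo", "mongodump", "--out", f"/backup/mongo-{stamp}"],
    check=True,
)

# Redis: trigger a point-in-time snapshot (writes dump.rdb)
subprocess.run(["docker", "exec", "clearml-redis", "redis-cli", "BGSAVE"], check=True)

# Elasticsearch: use the snapshot API (repo must be registered beforehand)
requests.put(
    f"http://localhost:9200/_snapshot/backups/snap-{stamp}",
    params={"wait_for_completion": "true"},
).raise_for_status()

# Fileserver data + configuration: plain rsync is fine here
subprocess.run(
    ["rsync", "-a", "/opt/clearml/data/fileserver/", f"/backups/fileserver-{stamp}/"],
    check=True,
)
```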
It's the safest way to run multiple processes and make sure they are cleaned up afterwards ...
I think I'm missing the connection between the hash-ids and the txt file; in other words, why does the txt file contain full paths rather than relative paths?
Oh what if the script is in the container already?
Hmm, the idea of clearml is that the container is a "base environment" and the code is "injected"; this makes it easy to reuse the container.
The easiest way is to add an "entry point" script that just calls the existing script inside the container.
You can have this initial Python script on your local machine; when you call clearml-task
it will upload the local "entry point" script directly to the Task, and then on the remote machine...
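e.g. a tiny entry_point.py kept on your local machine (the in-container path /opt/app/train.py is just a placeholder for wherever your script lives in the image):

```python
import subprocess
import sys

# Call the script that is already baked into the container,
# forwarding any command-line arguments (path is a placeholder)
subprocess.run([sys.executable, "/opt/app/train.py", *sys.argv[1:]], check=True)
```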
@<1569858449813016576:profile|JumpyRaven4> fyi clearml-serving was synced 🤞
This smells like a driver/image issue on the instance VM
What are you getting if you add this inside your code?

```python
import os
os.system('nvidia-smi')
```
Hi @<1720249416255803392:profile|IdealMole15>
I'm assuming you mean on a remote machine with clearml-agent running ?
If you do, then you either use clearml-task to create a Task (Job) and specify the container and script, or click "Create New Experiment" in the UI, fill in the git repo / script, and specify the docker image.
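For example (the project / repo / image names are placeholders):

```
clearml-task --project examples --name my-job \
  --repo https://github.com/user/repo.git --branch main \
  --script train.py \
  --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04 \
  --queue default
```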
Make sense?
Is the fact that clearml-agent needs to be installed from the system Python mentioned anywhere in the docs? If not, I suggest it gets added.
You are right, I will check and fix if not 🙂
Thank you so much for helping.
My pleasure
BTW: how did it get there?
CooperativeSealion8 let me know if you managed to solve the issue; also feel free to send the entire trains-server log. I'm assuming one of the docker containers failed to boot...
ComfortableShark77 are you saying you need "transformers" in the serving container?

```
CLEARML_EXTRA_PYTHON_PACKAGES: "transformers==x.y"
```
https://github.com/allegroai/clearml-serving/blob/6005e238cac6f7fa7406d7276a5662791ccc6c55/docker/docker-compose.yml#L97
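i.e. in that docker-compose.yml, under the serving container's environment section (replace x.y with the version you need):

```yaml
environment:
  CLEARML_EXTRA_PYTHON_PACKAGES: "transformers==x.y"
```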
OddAlligator72 FYI, in your current code you can always do:

```python
if use_trains:
    from trains import Task
    Task.init()
```
Might be easier 😉
@<1657918706052763648:profile|SillyRobin38> out of curiosity did you compare performance of tensorrt-llm vs vllm ?
(the jury is still out on that, just wondered if you had a chance)
In the UI you can see all the agents and their IDs
Then you can run:
clearml-agent daemon --stop <agent id>
The easiest way possible would be if I could just somehow run the task and let the LSF manage the environment
You mean let the LSF set up the conda/venv? Or do you also mean to get the code base, changes, etc.?
You mean like for your internal support channel inside your company ?