I want to inject a bash command after the repo has been cloned (and maybe even after the venv has been installed).
LazyTurkey38 the created venv inherits from the system environment, so in theory you can do all the installation on the system python and the created venv will just inherit the packages, no?
(btw: just to clarify, there is only one entry point for the custom bash script and that is before everything, so users can configure the container before the agent starts)
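For reference, the relevant knobs in clearml.conf look something like this (values are just examples, and I'm assuming docker mode for the shell-script hook):
```
agent {
    # executed inside the container before the agent starts the experiment
    extra_docker_shell_script: ["apt-get update", "apt-get install -y vim"]

    package_manager {
        # let the per-experiment venv see the system site-packages
        system_site_packages: true
    }
}
```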
These paths are pathlib.Path. Would that be a problem?
No need to worry, it should work (I'm assuming "/src/clearml_evaluation/" actually exists on the remote machine, otherwise it's useless 🙂)
Hi EmbarrassedSpider34
Long story (see below) short, yes you can ignore this warning :)
Specifically, torch is spinning up processes and killing them; every process will have a reference to the parent semaphore (for internal clearml bookkeeping). Now, Python is not very good with this kind of thing (it is getting better in newer Python versions); bottom line, Python "thinks" someone lost a semaphore, but in reality the subprocess never created it in the first place. Does that make sen...
That makes total sense. The question was about the Mac users and the OS environment in the configuration file, and having that OS environment set in code (this is my assumption, as it seems that at import time it does not exist). What am I missing here?
Also btw, is this supposed to be a screenshot from the community version?
Hmm, seems like a screenshot from an enterprise version, I'll ask them to update 🙂
I am also not understanding how clearml-serving is doing the versioning for models in Triton.
Basically you have two Tasks: one is the "controller", checking for model changes and updating itself.
The other is the engine, checking on the "controller" Task for which models it needs to download/configure, and replacing them.
This way you can ha...
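A rough sketch of that polling pattern, just to illustrate the idea (this is not the actual clearml-serving code; the task id and configuration name are made up):
```python
import time
from clearml import Task

controller = Task.get_task(task_id="<controller-task-id>")  # placeholder id

while True:
    controller.reload()  # pick up whatever the controller updated on itself
    endpoints = controller.get_configuration_object("endpoints")  # name assumed
    if endpoints:
        pass  # download/configure the listed models, then swap them into the engine
    time.sleep(60)
```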
I think you are onto a good flow, quick iterations / discussions here, then if we need more support or an action-item then we can switch to GitHub. For example with feature requests we usually wait to see if different people find them useful, then we bump their priority internally, this is best done using GitHub Issues 🙂
GrievingTurkey78 I see,
Basically the arguments after the -m src.train in the remote execution should be ignored (they are not needed).
Change the m argument in the Args section under the configuration. Let me know if it solves it.
Sadly, I think we need to add another option like task_init_kwargs to the component decorator.
What do you think would make sense?
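Maybe something along these lines (task_init_kwargs is hypothetical, it doesn't exist yet):
```python
from clearml.automation.controller import PipelineDecorator

# hypothetical option -- the idea is it would be forwarded to the
# underlying Task.init() call of the component's Task
@PipelineDecorator.component(
    return_values=["model_path"],
    task_init_kwargs={"output_uri": "s3://my-bucket/models"},  # does not exist today
)
def train_step(dataset_id):
    model_path = "model.pkl"
    return model_path
```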
Can I run it on an agent that doesn't have a GPU?
Sure, this is fully supported
When I run clearml-serving it throws me an error: "please provide specific config.pbtxt definition"
Yes, this is a small file that tells the Triton server how to load the model:
Here is an example:
https://github.com/triton-inference-server/server/blob/main/docs/examples/model_repository/inception_graphdef/config.pbtxt
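Roughly what that example looks like (trimmed, and going from memory; see the link for the full version):
```
name: "inception_graphdef"
platform: "tensorflow_graphdef"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_FP32
    format: FORMAT_NHWC
    dims: [ 299, 299, 3 ]
  }
]
output [
  {
    name: "InceptionV3/Predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 1001 ]
    label_filename: "inception_labels.txt"
  }
]
```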
Hi MelancholyBeetle72, that's a very interesting case. I can totally understand how storing a model and then immediately renaming it breaks the upload. A few questions: is there a way for PyTorch Lightning not to rename the model? Also, I wonder if this scenario (storing a model and then changing it) happens a lot. I think the best solution is for Trains to create a copy of the file and upload it in the background. That said, the name will still end with .part. What do you think?
Hi PanickyMoth78
You mean like another Task? or maybe Slack message?
Hi FantasticSeaurchin8
You mean in the UI, or when reporting via the SDK?
the task is being Aborted rather than being in Draft. Am I missing something?
Yes, the reason is so you don't miss anything that you might have reported on it.
And usually execute_remotely will get the execution queue as a parameter (i.e. immediately launching the Task).
You can now (starting v1.0) enqueue an aborted Task, so it should not make a difference; you can also reset the Task and edit it in the UI
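For example (queue name is just a placeholder):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")
# everything above this line runs locally; this call enqueues the Task
# on the given queue and (with exit_process=True) terminates the local run
task.execute_remotely(queue_name="default", exit_process=True)
```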
What do you have here in your docker compose:
None
Anything that can be done?
Yes. Though again, just highlighting that the naming of foo-mod is arbitrary. The actual module simply has a folder structured with an implicit namespace:
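i.e. something along these lines (layout is illustrative):
```
foo-mod/
└── foo/
    └── mod/          # no __init__.py anywhere -> implicit namespace package (PEP 420)
        └── utils.py
```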
Yep I think this is exactly why it fails detecting it, let me check that
And it's failing on type hints for functions passed in pipe.add_function_step(…, helper_function=[…]) … I guess those aren't being removed like the wrapped function step?
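Minimal repro sketch, in case it helps (names are made up, and I'm going from memory on the parameter name):
```python
from clearml import PipelineController

def helper(x: int) -> int:  # the type hints here seem to be what trips it up
    return x * 2

def my_step(a: int = 1):
    return helper(a)

pipe = PipelineController(name="demo-pipeline", project="examples", version="1.0")
pipe.add_function_step(
    name="step_one",
    function=my_step,
    helper_functions=[helper],
)
```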
Can you provide the log? I think I'm missing what e...
BTW: the new documentation should contain a full search over the docstrings
BattyLion34 if everything is installed and used to work, what's the difference from the previous run that worked ?
(You can compare the working vs non-working runs in the UI and check the installed packages; it would highlight the diff, maybe the answer is there)
but the requirement was already satisfied.
I'm assuming it is satisfied in the host Python environment; do notice that the agent is creating a new clean venv for each experiment. If you are not running in docker mode, then you ca...
you should have a gpu argument there, set it to true
Are you running it in venv mode or docker mode?
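(For context, the mode is picked when launching the agent; the queue name is just an example:)
```
# venv mode -- the agent builds a fresh virtualenv per experiment
clearml-agent daemon --queue default

# docker mode -- each experiment runs inside its own container
clearml-agent daemon --queue default --docker
```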
Hi BroadMole98
What I think I am understanding about Trains so far is that it's great at tracking one-off script runs and storing artifacts and metadata about training jobs, but doesn't replace Kubeflow or Snakemake's DAG as a first-class citizen. How does Allegro handle DAGgy workflows?
Long story short, yes you are correct. Kubeflow, and Snakemake for that matter, are all about DAGs where each node runs a docker (bash) for you. The missing portions (for both) are:
How do I cr...
maybe I should use explicit reporting instead of Tensorboard
It will do just the same 🙂
There is no method for setting last iteration, which is used for reporting when continuing the same task. Maybe I could somehow change this value for the task?
Let me double check that...
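If it helps, something like this might do it (assuming set_initial_iteration is the relevant call, I'd verify first):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="training",
                 continue_last_task=True)  # keep reporting into the same task
# assuming this is the right knob: it sets the iteration offset used
# when reporting continues on the task
task.set_initial_iteration(0)
```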
overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...
That is a very good point
but for the metrics, I explicitly pass th...
Nice! I'll see if we can have better error handling for it, or solve it altogether 🙂
DepressedChimpanzee34
What's the Hydra version?
I tested with 1.1.0dev3 and it worked for me
Not really 🙂
Everyone can do everything; the idea is shareability and accessibility.
I do know that in the paid tier they have full access control, roles, SSO, etc., but unfortunately it's way too complicated for the open source version.
Basically what I'm saying is: trust your fellow colleagues 🙂
The quickest workaround would be, in your final code, to just do something like:
```
my_params_for_hpo = {'key': omegaconf.key}
task.connect(my_params_for_hpo, name='hpo_params')
call_training_with_value(my_params_for_hpo['key'])
```
This will initialize my_params_for_hpo with the values from OmegaConf, and allow you to override them in the hyperparameter section (task.connect is two-way: in manual mode it stores the data on the Task, in agent mode it takes the values from the Task and puts them ba...
Are hparams saved in the hyperparameter section superior to hparams saved in configuration objects?
Well, I'm not sure about "superior", but they are structured, as opposed to a configuration object, which is as generic as can be
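To make the distinction concrete, a minimal sketch (standard SDK calls; names are just examples):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="hparams vs config")

# hyperparameters: flat, structured key/value pairs, editable one by one in the UI
task.connect({"lr": 0.001, "batch_size": 32}, name="training")

# configuration object: a free-form blob stored as-is (dict, text, or file)
task.connect_configuration({"model": {"layers": [64, 64]}}, name="model_config")
```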
Can you provide some further explanation, please? Sorry, I am a beginner.
My bad, I was thinking out loud about improving the HPO process and allowing users to modify the configuration_object, not just the hyperparameters