After it finishes the 1st optimization task, what's the next job that will be pulled?
The one in the highest-priority queue (if you have multiple queues)
If you use fairness it will pull in round-robin order from all queues (obviously inside every queue it is based on the order of the jobs).
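To make the fairness behaviour concrete, here is a minimal pure-Python sketch of the idea (the queue names and the helper function are illustrative only, not the agent's actual implementation):

```python
from collections import deque
from itertools import cycle

def pull_jobs_fair(queues):
    """Round-robin across queues; FIFO order inside each queue."""
    pools = {name: deque(jobs) for name, jobs in queues.items()}
    order = []
    for name in cycle(list(pools)):
        if not any(pools.values()):
            break  # every queue is drained
        if pools[name]:
            order.append(pools[name].popleft())
    return order

# Jobs interleave across queues, but keep their order within each queue
print(pull_jobs_fair({"high": ["h1", "h2"], "low": ["l1", "l2", "l3"]}))
# → ['h1', 'l1', 'h2', 'l2', 'l3']
```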
FYI, you can reorder the jobs inside the queue from the UI 🙂
DeliciousBluewhale87 wdyt?
It is currently only enabled when using ports mode; it should be enabled by default, i.e. a new feature :)
Hi DeliciousBluewhale87
Hmm, good question.
Basically the idea is that if you have an ingestion service on the pods (i.e. as part of the yaml template used by the k8s glue), you can specify to the glue what the exposed ports are, so it knows (1) the maximum number of instances it can spin, e.g. one per port, and (2) it will set the external port number on the Task, so that the running agent/code will be aware of the exposed port.
A use case for it would be combining the clearml-session with the k8s gl...
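A rough pure-Python sketch of that port bookkeeping (illustrative only; the real k8s glue's API and field names differ):

```python
def schedule_with_ports(exposed_ports, pending_tasks):
    """One pod per exposed port: the number of ports caps concurrency (1),
    and each launched task is told its external port (2); the rest wait."""
    launched = dict(zip(pending_tasks, exposed_ports))
    still_queued = pending_tasks[len(exposed_ports):]
    return launched, still_queued

launched, queued = schedule_with_ports([30022, 30023], ["task-a", "task-b", "task-c"])
print(launched)  # {'task-a': 30022, 'task-b': 30023}
print(queued)    # ['task-c']
```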
It is not possible to specify the full output destination, right?
Correct 🙂
it certainly does not use tensorboard python lib
Hmm, yes I assume this is why the automagic is not working 🙂
Does it have a pythonic interface for the metrics?
UnevenDolphin73 FYI: clearml-data is documented, unfortunately only on GitHub:
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
yup, it's there in draft mode so I can get the latest git commit when it's used as a base task
Yes that seems to be the problem, if it is in draft mode, you have no outputs...
Sure, just set up clearml-agent
on any machine 🙂
(The app.community server is the control plane)
I think this is the issue: it was search-and-replaced. The thing is, I'm not sure the helm chart was updated to clearml. Let me check
That might be me, let me check...
MagnificentPig49 quick update: the front-end guys updated me that with the next trains-server update they will have the web client code available on the repository, ETA probably mid-May or so :)
This is the reason you are getting an error 🙂
Basically the session asks the agent to setup a new SSH server with credentials on the remote machine, this is not an issue inside a container, as this is an isolated environment, but when running in venv mode the User running the agent is not root, hence it cannot spin/configure an SSH server.
Make sense?
Well, I guess you can say this is definitely not a self-explanatory line 🙂
but it is actually asking whether we should extract the code; think of it as: `if extract_archive and cached_file: return cls._extract_to_cache(cached_file, name)`
Hi TrickyRaccoon92
BTW: check out the HP optimization example, it might make things even easier 🙂 https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
It fails during the `add_step` stage for the very first step, because `task_overrides` contains invalid keys
I see; yes, I guess it makes sense to mark the pipeline as Failed 🙂
Could you add a GitHub issue on this behavior, so we do not miss it ?
but this gives me an idea, I will try to check if the notebook is considered as trusted, perhaps it isn't and that causes issues?
This is exactly what I was thinking (communication with the jupyter service is done over http, to localhost, sometimes AV/Firewall software will block it, false-positive detection I assume)
Hi ArrogantBlackbird16
but it returns a task handle even after the Task has been closed.
It should not ... That is a good point!
Let's fix that 🙂
So assuming they are all on the same LB IP, you should do:
LB 8080 (https) -> instance 8080
LB 8008 (https) -> instance 8008
LB 8081 (https) -> instance 8081
It might also work with:
LB 443 (https) -> instance 8080
Hi FloppyDeer99
What is the meaning of "no real scheduling"?
I think the meaning is that from the moment a k8s job is created, k8s is in charge of actually spinning the container. Since k8s has no real priority/order, the scheduling order is not guaranteed from this point.
The idea of the clearml-k8s glue is that the glue will launch a job on the k8s cluster only if it is sure there are enough resources to actually spin the job now (as opposed to sometime in the future), this mea...
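The gating idea can be sketched like this (hypothetical numbers and function names, not the glue's actual code; the real check queries the cluster state):

```python
def can_launch_now(job_cpu, job_mem_gb, free_cpu, free_mem_gb):
    """Create the k8s job only when it can start immediately;
    otherwise leave it in the ClearML queue for the next poll."""
    return job_cpu <= free_cpu and job_mem_gb <= free_mem_gb

print(can_launch_now(4, 16, free_cpu=8, free_mem_gb=32))   # True → launch now
print(can_launch_now(16, 64, free_cpu=8, free_mem_gb=32))  # False → keep queued
```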
GiddyTurkey39
A flag would be really cool, just in case there's any problem with the package analysis.
Trying to think if this should be a system-wide flag (i.e. trains.conf) or a flag in Task.init.
What do you think?
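Whichever home the flag gets, the usual precedence would be a per-task argument overriding the config-file default. A hypothetical sketch (the flag name and helper are made up, not the actual trains API):

```python
CONF = {"package_analysis": True}  # stand-in for a trains.conf setting

def resolve_package_analysis(conf, task_init_arg=None):
    """A Task.init-style argument, if given, wins over the config file."""
    if task_init_arg is not None:
        return task_init_arg
    return conf.get("package_analysis", True)

print(resolve_package_analysis(CONF))                       # True (config default)
print(resolve_package_analysis(CONF, task_init_arg=False))  # False (per-task override)
```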
How can the first process corrupt the second?
I think that something went wrong and both Agents are using the same "temp" folder to setup the experiment.
why doesn't this occur if I run pipeline from command line?
The services queue is creating new dockers with everything in them, so they cannot step on each other's toes (so to speak)
I run all the processes as administrator. However, I've tested running the pipeline from the command line in non-administrator mode, and it works fine...
MagnificentSeaurchin79
Do notice that the pipeline controller assumes you have an agent running
@<1597762318140182528:profile|EnchantingPenguin77> can you provide the full log?
My question is, which version of docker-compose do you need?
Ohh sorry, there is no real restriction, we just wanted easy copy-paste for the installation process.
Thanks! A few thoughts below 🙂
- "not true — you can specify the image you want for each step": My apologies, looking at the release notes, it was added a while back and I had not noticed 🙂
- re: role-based access control, "see Outerbounds Platform that provides a layer of security and auth features required by enterprises": Role-based access here means limiting access in Metaflow, i.e. specific users/groups can only access specific projects, etc. ...
It will store the entire content of the file, then you can edit it in the UI, and when running remotely it will return a new local copy of the file (based on the data in the UI) for you to read.
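That round-trip can be mimicked with a small self-contained sketch (purely illustrative: `Task.connect_configuration` is the real clearml API, but the helper below is not its implementation):

```python
import os
import tempfile

def connect_configuration(local_path, server_store, running_remotely):
    """Locally: store the file content server-side (editable in the UI).
    Remotely: write the (possibly edited) UI content to a fresh local copy
    and return its path for the code to read."""
    if not running_remotely:
        with open(local_path) as f:
            server_store["config"] = f.read()
        return local_path
    fd, new_path = tempfile.mkstemp(suffix=".cfg")
    with os.fdopen(fd, "w") as f:
        f.write(server_store["config"])  # the UI version wins when remote
    return new_path

store = {}
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as src:
    src.write("lr=0.1")
# First (local) run: content is captured, original path is returned
assert connect_configuration(src.name, store, running_remotely=False) == src.name
store["config"] = "lr=0.01"  # pretend it was edited in the UI
# Remote run: a fresh local copy carrying the UI edits is returned
remote_copy = connect_configuration(src.name, store, running_remotely=True)
print(open(remote_copy).read())  # lr=0.01
```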
I mean the python package, not the trains-server version.