WickedGoat98 did you set up a machine with trains-agent pulling from the "default" queue ?
Hi VexedCat68
Could it be you are trying to update a committed dataset?
The experiment finished completely this time again
With the RC version or the latest ?
so 78000 entries ...
wow, a lot! would it make sense to do 1GB chunks? any reason for the initial 1MB chunk size?
Hi @<1692345677285167104:profile|ThoughtfulKitten41>
Is it possible to trigger a pipeline run via API?
Yes! A pipeline is, at the end of the day, a Task; you can take the pipeline ID, then clone and enqueue it:
from clearml import Task
# Clone the existing pipeline Task, then enqueue the clone for execution
pipeline_task = Task.clone(source_task="pipeline_id_here")
Task.enqueue(pipeline_task, queue_name="services")
You can also monitor the pipeline with the same Task interface.
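For example, a minimal sketch (the status-polling calls below are my assumption of what you might use here, not something specific to pipelines):
# Block until the cloned pipeline reaches a final state, then print it
pipeline_task.wait_for_status()
print(pipeline_task.get_status())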
wdyt?
Hi DrabOwl94
I think, if I understand you correctly, you have a lot of chunks (which translate to a lot of links to small 1MB files, because this is how you set up the chunk size). Now apparently you have reached the maximum number of chunks per specific Dataset version (at the end this meta-data is stored in a document with limited size, specifically 16MB).
How many chunks do you have there?
(In other words what's the size of the entire dataset in MBs)
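If it comes to recreating the dataset version, a minimal sketch of uploading with larger chunks (the names/paths here are hypothetical; chunk_size is in MB):
from clearml import Dataset
# Hypothetical example: new dataset version uploaded in ~1GB chunks
ds = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
ds.add_files("/path/to/data")
ds.upload(chunk_size=1024)  # chunk size in MB, so 1024 == ~1GB per chunk
ds.finalize()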
Hi VexedCat68
What type of data is it? And what type of annotations?
Streaming data into the training process is great, but is it post quality control?
Hi @<1543766544847212544:profile|SorePelican79>
You want the pipeline configuration itself, not the pipeline component, correct?
from clearml import Task
# From inside a pipeline step, fetch the parent Task (the pipeline itself)
pipeline = Task.get_task(Task.current_task().parent)
conf_text = pipeline.get_configuration_object(name="config name")
conf_dict = pipeline.get_configuration_object_as_dict(name="config name")
To be honest, I'm not sure I have a good explanation on why ... (unless on some scenarios an exception was thrown and caught silently and caused it)
Hi, I was expecting to see the container rather than the actual physical machine.
It is the container; it should tunnel directly into it (or that's how it should be).
SSH port 10022
According to you the VPN shouldn't be a problem right?
Correct, as long as all parties are on the same VPN it should work; all the connections are always HTTP, so basically trivial communication
Are you running the agent in docker mode ?
Is there a mount to the host machine ?
Hi QuaintPelican38
Assuming you have opened the default SSH port 10022 on the EC2 instance (and assuming the AWS permissions are set so that you can access it), you need to use the --public-ip flag when running clearml-session. Otherwise it "thinks" it is running on a local network and registers itself with the local IP. With the flag on, it gets the public IP of the machine, and then the clearml-session running on your machine can connect to it.
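Something like this (the queue name is just an example, and I'm assuming the flag takes true/false):
clearml-session --queue aws_ec2_queue --public-ip true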
Make sense ?
This means that you guys internally catch the argparser object somehow, right?
Correct 🙂 this is how you get the type checking / casting abilities, and a few other perks
but clearml-agent will still raise the same error
which one?
Yes, the agent's mode is global, i.e. all tasks are either inside docker or in venv. In theory you can have two agents on the same machine, one venv and one docker, listening to two different queues
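For example (queue names are hypothetical):
clearml-agent daemon --queue venv_queue --detached
clearml-agent daemon --queue docker_queue --docker --detached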
Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion
This is a good point! I'll make sure we stress it (BTW: it will work with elevated credentials, but probably not recommended)
This is the reason you are getting an error 🙂
Basically the session asks the agent to set up a new SSH server with credentials on the remote machine. This is not an issue inside a container, as it is an isolated environment, but when running in venv mode the user running the agent is not root, hence it cannot spin up/configure an SSH server.
Make sense ?
Sometimes it is working fine, but sometimes I get this error message
@<1523704461418041344:profile|EnormousCormorant39> can I assume there is a gateway at --remote-gateway <internal-ip> ?
Could it be that this gateway has some network firewall blocking some of the traffic ?
If this is all local network, why do you need to pass --remote-gateway ?
worker nodes are bare metal and they are not in k8s yet
By default the agent will use 10022 as an initial starting port for running the sshd that will be mapped into the container. This has nothing to do with the Host machine's sshd. (I'm assuming agent running in docker mode)
Btw it seems the docker runs in network=host
Yes, this is so if you have multiple agents running on the same machine they can find a new open port 🙂
I can telnet the port from my mac:
Okay this seems like it is working
It does not use key auth, instead sets up some weird password and then fails to auth:
AdventurousButterfly15 it SSHs into the container; inside the container it sets up a new daemon with a new random, very long password
It will Not ssh to the host machine (i.e. the agent needs to run in docker mode, not venv mode), make sense ?
I mean if I enter my host machine ssh password it works. But we will disable password auth in future, so it's not an option
To clarify, it should not allow users to ssh into the host machine (if you can do that this means you own it), it only allows users to SSH into the container the host machine spins, make sense ?
hmm can you share the log of the Task? (the clearml-session created Task)
BTW: the agent will resolve pytorch based on the installed CUDA version.
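If you need to pin it, I believe you can set it explicitly in the agent's clearml.conf (the version value here is just an example):
agent {
    # force the CUDA version the agent resolves packages against
    cuda_version: "11.2"
}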
it overwrites the previous run?
It will overwrite the previous run if:
- it is under 72h from the last execution
- no artifact/model was created
You can control it with reuse_last_task_id=False passed to Task.init
Task name itself is not unique in the system, think of it as a short description
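A minimal sketch (project/task names are placeholders):
from clearml import Task
# Force a brand-new Task instead of reusing the previous run
task = Task.init(project_name="examples", task_name="my_experiment", reuse_last_task_id=False)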
Make sense ?
Hi TrickySheep9
Long story short, clearml-session fully supports k8s (using k8s glue)
The --remote-gateway alongside ports mode will basically allow you to set up a k8s service so that every session registers with a specific port; k8s does the ingress for you and routes the SSH connection to the pod itself, and everything else is tunneled over the original SSH connection.
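Something like this on the client side (queue name and gateway address are hypothetical):
clearml-session --queue k8s_sessions --remote-gateway gateway.example.com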
Make sense ?