Thanks for the logs AdorableDeer85
Notice that the log you attached shows the preprocessing is executed and the GPU backend is returning an error.
Could you provide the log of the docker compose? Specifically, the interesting part is the Triton container; I want to verify it loads the model properly
Seems like the network inside the running code cannot access localhost (even though you have --network=host). Could you test it with the machine's IP?
(Actually the best practice is to add a name for the machine in your hosts file, e.g. a line like 10.0.0.12 clearml-server in /etc/hosts, so that if you later move the server all the links will still be valid)
You can switch to docker-mode for better control over CUDA drivers, or use conda and specify cudatoolkit (this feature will be part of the next RC; meanwhile it will install the cudatoolkit based on the global cuda_version).
Hi OddAlligator72
for instance - remove all the metrics from some step onward?
(I think that as long as the Task is not published you could do such a thing directly with the REST API (aka APIClient from Python); see the sketch below)
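Something along these lines (a hedged sketch; afaik events.delete_for_task clears all logged events for the task, I'm not sure a per-iteration filter is exposed):
```
from clearml.backend_api.session.client import APIClient

client = APIClient()
# clears the logged events (scalars / plots / console) of an unpublished Task;
# note this removes everything, a "from step N onward" filter may not be
# supported, in which case you would re-report the metrics you want to keep
client.events.delete_for_task(task="<task-id>")
```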
What's the use case?
as I also noticed that uploads are sometimes slow, and I see here max_connections=2
Makes sense to me, please go ahead and add that as well (basically the same thing on _AzureBlobServiceStorageDriver.upload_object and an additional variable on the AzureContainerConfigurations class), something like the sketch below.
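Roughly like this (only a sketch to illustrate the idea; the real field/method names in the codebase may differ, and depending on the Azure SDK version the parameter is max_concurrency or max_connections):
```
from dataclasses import dataclass

@dataclass
class AzureContainerConfigurations:
    account_name: str
    container_name: str
    # new: configurable upload concurrency instead of the hardcoded 2
    max_connections: int = 2

def upload_object(blob_client, file_path, config):
    # pass the configured value through to the Azure SDK upload call
    with open(file_path, "rb") as f:
        blob_client.upload_blob(
            f, overwrite=True, max_concurrency=config.max_connections
        )
```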
Could you PR a tested draft? We will be able to take it from there
(apologies I just got to it now)
First of all, kudos on the video, this is so nice!!!
And thanks to you I think I found it:
We have to call serialize before the execute_remotely call
(the reason why sometimes it works is that it syncs in the background, so sometimes it's just fast enough and you get the config object)
Let me check if we can push an RC with a ...
AdventurousRabbit79 are you passing cache_executed_step=False to the PipelineController?
https://github.com/allegroai/clearml/blob/332ceab3eadef4997e897d171957975a247a6dc1/clearml/automation/controller.py#L129
Could you send a usage example?
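For reference, passing it per step would look something like this (project/task names here are just placeholders):
```
from clearml.automation import PipelineController

pipe = PipelineController(name="pipeline demo", project="examples", version="1.0.0")
pipe.add_step(
    name="stage_train",
    base_task_project="examples",
    base_task_name="train task",
    # force the step to re-execute instead of reusing a cached run
    cache_executed_step=False,
)
pipe.start()
```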
my pipeline controller always updates to the latest git commit id
This will only happen if the Task the pipeline creates has no specific commit ID and instead just uses the latest from the git repo. Is this the case?
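If that's it, you can pin the commit on the Task, e.g. (assuming a recent clearml SDK that exposes set_script; the commit ID is a placeholder):
```
from clearml import Task

task = Task.create(project_name="examples", task_name="pipeline step")
# pin the exact commit the agent should check out, instead of the repo HEAD
task.set_script(commit="<commit-id>")
```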
Yes, I was referring to logging the "clearml-data" Dataset ID on the Task itself, not an external database.
Make sense?
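i.e. something like (a minimal sketch; project/dataset names are placeholders):
```
from clearml import Dataset, Task

task = Task.init(project_name="examples", task_name="train")
dataset = Dataset.get(dataset_project="examples", dataset_name="my-data")
# store the Dataset ID on the Task itself, so the run records exactly
# which data version it used
task.set_parameter("General/dataset_id", dataset.id)
```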
ReassuredTiger98 there is an open issue on supporting a bash script as a pre-run step inside a docker (which will be supported in the next major release)
BTW: if you already have a Dockerfile, the fastest way would be to build the docker image and push it once; then you just specify the docker image:tag. This can be done at a Task-specific level.
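i.e. build & push once, then per Task (the image name is a placeholder):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="train")
# the agent (in docker mode) will execute this Task inside the given image
task.set_base_docker("my-registry/my-image:latest")
```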
Great! btw: final v1.2.0 should be out after the weekend
it should all be logged at the end, as I understand
Hmm let me check the code for a minute
he said it was something in the nginx config though
That makes sense 🙂
Hi AgitatedTurtle16
You can find documentation here:
https://github.com/allegroai/clearml-session
Basically it uses the clearml-agents to launch a session on one of the machines in the cluster.
In the remote session itself it installs JupyterLab + vscode-server, then it connects to the remote session (running on the agent's machine) automatically over SSH and creates a tunnel to these services.
when I am running the pipeline remotely, is there a way the remote machine can access it?
Well, for the dataset to be accessible you need to upload it with the Dataset class; then the remote machine can call Dataset.get(...).get_local_copy() to get the actual data on the remote machine
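A minimal sketch (paths/names are placeholders):
```
from clearml import Dataset

# locally: create, upload and finalize the dataset once
ds = Dataset.create(dataset_project="examples", dataset_name="my-data")
ds.add_files("/path/to/data")
ds.upload()
ds.finalize()

# on the remote machine: fetch a local copy by project/name (or by id)
local_path = Dataset.get(
    dataset_project="examples", dataset_name="my-data"
).get_local_copy()
```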
Hi PungentLouse55
it depends on the trains-server version you are running.
If the trains-server >= 0.16 then you have to add the "Args/" prefix. If you are running an older version, then you should not add any prefix.
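For example (a sketch; the parameter name is a placeholder):
```
from trains import Task

task = Task.init(project_name="examples", task_name="example")
# trains-server >= 0.16: hyperparameters are namespaced, use the "Args/" prefix
task.set_parameter("Args/batch_size", 32)
# older servers: no prefix
# task.set_parameter("batch_size", 32)
```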
Hi SmallDeer34
Hmm I'm not sure you can; the code will by default use rglob with the last part of the path as the wildcard selection
😞
You can of course manually create a zip file...
How would you change the interface to support it?
upload_artifact will actually do two things:
1. upload the file to the trains-server
2. register it as an artifact on the experiment
What did you mean by "register the artifact manually"? You still need to upload the file to the trains-server (so it is later accessible)
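e.g. zipping manually and then uploading (a minimal sketch; file paths are placeholders):
```
import zipfile
from trains import Task

task = Task.init(project_name="examples", task_name="artifacts demo")

# manually zip exactly the files you want (instead of the rglob wildcard)
with zipfile.ZipFile("my_files.zip", "w") as zf:
    zf.write("data/a.csv")
    zf.write("data/b.csv")

# uploads the file to the trains-server (or configured storage) AND
# registers it as an artifact on the experiment, in one call
task.upload_artifact(name="my files", artifact_object="my_files.zip")
```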
Hi GloriousPenguin2
Had to do some linux updates and redeploy the clearml server; now I can access the web UI & the service only if I do port-forwarding to that remote machine
So you are saying that before you were able to directly browse to the server, but now you need a "jump box"?
IrritableJellyfish76 point taken, suggestions on improving the interface?
Hi ShinyWhale52
Luigi's approach is basically an extension of a functional DAG, where each node is a single function. Let's think of Kedro as an extension of this approach.
With both, the assumption is that a node is a single function (sometimes it really is) and we just want to create a meta execution path (i.e. the execution DAG, quite similar to TF v1).
ClearML pipelines are a different story (in a way).
The main difference is that with ClearML each node is a Task, not a function. That means...
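Concretely, a sketch of what that looks like (each step references an existing experiment; names are placeholders):
```
from clearml.automation import PipelineController

pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")
# each step is a full Task (its own repo, environment, artifacts),
# not just a function
pipe.add_step(name="prepare", base_task_project="examples",
              base_task_name="prepare data")
pipe.add_step(
    name="train",
    parents=["prepare"],
    base_task_project="examples",
    base_task_name="train model",
    # wire the previous step's Task ID in as a parameter
    parameter_override={"General/dataset_task_id": "${prepare.id}"},
)
pipe.start()
```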
Hmm, I'm assuming something is wrong here:
https://github.com/allegroai/clearml-server/blob/a64c4d264d00eadd2d11818b37151d3cc6266d99/docker/docker-compose.yml#L119
What's the host machine OS?
IrritableJellyfish76 hmm maybe we should add an extra argument partial_name_matching=False to maintain backwards compatibility?
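i.e. something like this (the new argument is only a proposal, not an existing API; today task_name is treated as a regex, so it matches partially by default):
```
from clearml import Task

# current behavior: task_name is a regex, so "train" also matches "train-v2"
tasks = Task.get_tasks(project_name="examples", task_name="train")

# proposed, backwards compatible:
# tasks = Task.get_tasks(project_name="examples", task_name="train",
#                        partial_name_matching=False)  # exact match only
```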
You mean the entire organization already has Kubeflow, or that it's to better organize something? (If it's the latter, what are we organizing, pipelines?)
IrritableJellyfish76 if this is the case, my question is what is the reason to use Kubeflow? (spinning up a JupyterLab server is a good answer, for example; pipelines, in my opinion, a lot less so)
Oh, then no, you should probably do the opposite 🙂
What is the flow like now? (meaning what are you using kubeflow for and how)
Not sure I follow; you mean to launch it on the Kubernetes cluster from the ClearML UI?
(like the clearml-k8s-glue ?)
Hi MelancholyElk85
However, when I clone the pipeline from the web UI and launch it once again, it works. Is there a way to bypass this?
In both cases, are you seeing a different behavior on the same machine running the agent (i.e. cloning from the UI vs. from code)?
Thanks BroadSeaturtle49
I think I was able to locate the issue: != breaks the pytorch lookup.
I will make sure we fix it asap and release an RC.
BTW: how come 0.13.x has no linux x64 support? And the same for 0.12.x:
https://download.pytorch.org/whl/cu111/torch_stable.html
BroadSeaturtle49 agent RC is out with a fix:
```
pip3 install clearml-agent==1.5.0rc0
```
Let me know if it solved the issue