Hi QuaintPelican38
Can you ssh to {instance_public_ip_address}:10022 (something like ssh -p 10022 user@IP_HERE)?
Basically just getting the password prompt means you are okay.
I suspect you have some AWS security definition (firewall) that prevents direct access to the instance, could that be?
MysteriousBee56 when you run the trains-agent with --foreground, before it starts the docker it prints the full command line, could you send it please?
I can't figure out where the extra ' came from...
Also could you send the trains.conf file?
(feel free to redact any confidential information)
So what is the difference?!
We should probably have a section on that (i.e. running two agents on the same GPU, then explaining how to use it)
Oh no 😞 I wonder if this is connected to:
Any chance the logger is running (or was created) from a subprocess?
VexedCat68
delete the uploaded file, or the artifact from the Task?
Hi @<1661542579272945664:profile|SaltySpider22> I'm not sure I understand the answer to my parallel question
Ok I did a pip install -r requirements.txt and NOW it picks them up correctly
So packages have to be installed and not just be mentioned in requirements / imported?
Yes, it looks for them locally so it has all the specific versions you need.
If the "installed packages" is totally empty the agent will revert to looking for requirements.txt inside the repository.
Hmm this is odd. When you click on the parent dataset in the UI, go to full details, and open the INFO tab, can you copy everything from there here?
I would like to force the usage of those requirements when running any script
How would you force it? Would you just ignore the "Installed Packages" section?
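One way you could nudge it (just a sketch, assuming a reasonably recent clearml version; project/task names and package versions are placeholders):
` from clearml import Task

# must be called *before* Task.init() so the requirement ends up in "Installed Packages"
Task.add_requirements("torch", "1.13.1")
Task.add_requirements("numpy")  # no version pin

task = Task.init(project_name="examples", task_name="forced requirements") `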
Hi @<1523701111020589056:profile|DefiantSpider5>
So there are two answers here, I'll start with the open-source version of both
Is there a way in clear ml to interactively view subsets of images based on a lasso of embedding plots
ClearML Datasets have no "query" capabilities for the data inside a dataset. That means you can see preview images and statistics, and download the datasets, but you cannot query their contents. On the other hand, there is no limitation on the type and format of me...
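For completeness, a minimal sketch of pulling a dataset locally so you can slice/visualize it with your own tooling (project/dataset names are placeholders):
` from clearml import Dataset

ds = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")
local_path = ds.get_local_copy()  # cached local copy of the dataset files
print("dataset copied to:", local_path) `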
The agent is using Bash (but when you add a command line to the docker run, .bashrc is not executed, hence no conda in PATH)
Maybe add the full path to the conda executable:
docker_setup_bash_script = [
    "export PATH=/workspace/miniconda/bin:$PATH",
    "export LOCAL_PYTHON=/workspace/miniconda/bin/python3",
    "/workspace/miniconda/bin/conda activate /PATH_GOES_HERE"
]
That's why I want to keep it as separate tasks under a single pipeline.
Hmm Yes, if this is the case then you definitely have to have two Tasks (with execution info on each one).
So you could just create a "draft" pipeline Task and report everything to it? Does that make sense?
(By design the pipeline is in charge of spinning up the Tasks and pulling the data/metrics from them if needed. In your case it sounds like you need the Tasks to push the data/metrics onto the pipeline Task, this is ...
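Rough sketch of what I mean by pushing a metric onto the pipeline Task (assuming the step gets the pipeline Task id somehow, e.g. as a parameter; names and values are placeholders):
` from clearml import Task

# hypothetical: the pipeline Task id is passed to the step as a parameter
pipeline_task_id = "<PIPELINE_TASK_ID>"

pipeline_task = Task.get_task(task_id=pipeline_task_id)
pipeline_task.get_logger().report_scalar(
    title="steps", series="accuracy", value=0.93, iteration=0
) `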
Hi JitteryCoyote63 ,
The easiest would probably be to list the experiment folder and delete its contents.
I might be missing a few things but the general gist should be:
from trains.storage import StorageHelper

h = StorageHelper('s3://my_bucket')
files = h.list(prefix='s3://my_bucket/task_project/task_name.task_id')
for f in files:
    h.delete(f)
Obviously you should have the right credentials 🙂
Hi ConvolutedChicken69
assuming you are running the agent in venv mode you can do something like:
$ CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 clearml-agent daemon --queue default
This will basically only clone the code and use the default python the clearml-agent itself is using.
Does that help?
BTW:
it gets an error as it can't find it with pip.
What's the error? How come the package cannot be installed?
The problem is that I currently don't have a way to get them "from outside".
Maybe as a hack (until we add the model object)
` # WeightsFileHandler comes from the framework bindings
# (from clearml.binding.frameworks import WeightsFileHandler, or trains.binding.frameworks on trains)
class MyModelCB:
    current_args = dict()

    @classmethod
    def callback(cls, load_save, model_info):
        if load_save != "save":
            return model_info
        # build a name from the stored args
        model_info.name = "my new name" + str(cls.current_args)
        return model_info

WeightsFileHandler.add_pre_callback(MyModelCB.callback)
MyModelCB.current_args = {"args": "value"} `
wdyt?
Anyhow, if StorageManager.upload was fast, upload_artifact is calling that exact function, so I don't think we actually have an issue here. What do you think?
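For reference, a minimal upload_artifact call (project/task/artifact names are just placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="artifact upload timing")

# upload_artifact goes through the same storage layer as StorageManager.upload
task.upload_artifact(name="stats", artifact_object={"rows": 1000}) `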
GreasyLeopard35 I think you are on to something, I think UniformParameterRange just misses a min value:
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/parameters.py#L168
Should be:
[self.min_value + v*step_size for v in range(0, int(steps))]
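To illustrate (a toy example, not the actual clearml code): if the min_value offset is missing, the sampled values start at 0 instead of at the lower bound:
` # numbers are made up for the example
min_value, step_size, steps = 0.1, 0.1, 4

without_offset = [v * step_size for v in range(0, int(steps))]           # ~0.0, 0.1, 0.2, 0.3
with_offset = [min_value + v * step_size for v in range(0, int(steps))]  # ~0.1, 0.2, 0.3, 0.4 `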
parser.add_argument( "--dataset_mean", type
=
float, nargs
=
"+", default
=
0.5)
I think providing nargs='+' assumes the type is a list. Nonetheless we should be able to support it. Could you please add a GitHub issue so we do not forget?
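Just to show what I mean about nargs='+': when the flag is passed you get a list of floats, but the default stays a plain scalar, so the stored type is ambiguous:
` import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_mean", type=float, nargs="+", default=0.5)

print(parser.parse_args([]))                                # Namespace(dataset_mean=0.5)
print(parser.parse_args(["--dataset_mean", "0.4", "0.5"]))  # Namespace(dataset_mean=[0.4, 0.5]) `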
on a side note, is there any way to automatically give more meaningful names to the running docker containers?
What do you mean by that? Running where? And where will you see them?
If possible, can we have a "only one experiment can be given a single tag"
You mean "moving a tag" automatically (i.e. if someone else had the same tag it is removed from it)?
Hi GrievingTurkey78
How can I check the server dashboard to make sure everything is working? I have tried to access the external ip through https but the browser is not able to connect.
What do you mean by the server dashboard?
regarding (2) see here: https://allegro.ai/docs/faq/faq/#web-auth
I think the reason is that the "original" task is already the right type. I'll make sure we fix it, and always set the system tag
can I mount the s3 bucket as file system on place where
you need to mount it where the file server is storing its files, correct (notice, not the DBs, just the file server)
Thanks SolidSealion72 !
Also, I found out that adding "pool.join()" after pool.close() seems to solve the issue in the minimal example.
This is interesting, I'm pretty sure it has something to do with the subprocess not "closing" properly (or too fast or something)
Let me see if I can reproduce
That works AND the feature works!
YEY
Quick follow up question, is there any way to abort a pipeline and all of the tasks it ran?
Hmm yes, currently if you abort the pipeline it has no "time" to abort the running Tasks (the DAG itself will stop, because the pipeline controller was aborted, but the running Tasks will continue).
In order to have better support, we need to add a previously requested feature for an "abort" callback. This is actually not as straightforward as it sound...
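In the meantime, a very rough (untested) workaround sketch for stopping the still-running steps yourself; the "parent"/"status" filter keys are my assumption about the backend filter syntax, so treat it only as a starting point:
` from clearml import Task

# pipeline Task id is a placeholder; the filter keys below are assumptions
pipeline_task_id = "<PIPELINE_TASK_ID>"

children = Task.get_tasks(task_filter={"parent": pipeline_task_id, "status": ["in_progress"]})
for child in children:
    child.mark_stopped() `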
I was having this confusion as well. Did the behavior of execute_remotely change, so that what used to end up as Draft is now Aborted?
Actually it was changed. It used to reset the Task (then push it into the execution queue if needed); with clearml v1.0 we now support pushing aborted Tasks back into queues, so execute_remotely aborts the Task (instead of resetting it)
(you can always manually reset it)
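For reference, a minimal execute_remotely call (project/task/queue names are placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="remote execution")

# aborts the local run (instead of resetting it) and enqueues the Task for an agent
task.execute_remotely(queue_name="default", exit_process=True)

# anything below this line only runs on the agent `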
Then as you suggested, I would just use sys.path. It is probably the easiest and actually very safe (because the subfolders are always next to the "main" source code)
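Something along these lines (folder/module names are just examples):
` import os
import sys

# add a subfolder that sits next to this script to the import path
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), "utils"))

import my_helper  # noqa: E402  -- example module living inside ./utils `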