This is already part of the docker-compose file,
https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml
MoodyCentipede68 is diagram 2 a batch processing workflow?
I want that last python program to be executed with the environment that was created by the agent for this specific task
Well basically they all inherit the Python environment that points to the venv they started from, so at least in theory it should be transparent when the agent is spinning up the initial process.
I eventually found a different way of achieving what I needed
Now I'm curious, what did you end up doing ?
Hi @<1523701079223570432:profile|ReassuredOwl55> let me try to add some color here:
Basically we have two parts: (1) pipeline logic, i.e. the code that drives the DAG, and (2) pipeline components, e.g. model verification
The pipeline logic (1), i.e. the code that creates the DAG and the Tasks and enqueues them, will be running in the GitHub Actions context, i.e. this is the automation code. The pipeline components themselves (2), e.g. model verification, training, etc., are running using the clearml agents...
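As a rough sketch of that split (project/step names here are placeholders, not from the thread), the pipeline logic side could be as simple as:
from clearml import PipelineController

# (1) pipeline logic: build the DAG and enqueue it, e.g. from the CI job
pipe = PipelineController(name='ci-pipeline', project='examples', version='1.0')
# (2) each step is a pipeline component, executed by a clearml agent
pipe.add_step(name='verify_model', base_task_project='examples', base_task_name='model verification')
pipe.start(queue='services')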
None of them is problematic, this is what I'm trying to say 🙂
I think the minio browser gets confused.
if you want to test the upload time on the client you can try:
from time import time
task.flush(wait_for_uploads=True)
tic = time()
task.upload_artifact('test', '/tmp/localfile')
task.flush(wait_for_uploads=True)
print(time() - tic)
MysteriousBee56 yes, please change the trains code!!! Yippee! If you think someone else can benefit, feel free to PR :)
Regarding the double entry, that seems like an odd bug, how can I reproduce it?
Could it be you have an old OS environment variable overriding the configuration file?
Can you change the IP of the server in the conf file, and make sure it has an effect (i.e. the error changed)?
yup! That's what I was hoping you'd help me find a way to change the timing of. Is there an option I can override to make the retries more aggressive?
you mean wait for less?
add to your clearml.conf:
api.http.retries.backoff_factor = 0.1
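If you prefer, the same setting can be written in nested HOCON form (equivalent to the dotted key above):
api {
    http {
        retries {
            backoff_factor: 0.1
        }
    }
}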
Feel free to add to the UI request list:
https://github.com/allegroai/trains/issues/81
ElegantKangaroo44 my bad 😞 I missed the nuance in the description
There seems to be an issue in the web ui -> viewing plots in "view in experiment table" doesn't respect the "scalars to display" one sets when viewing in "view in fullscreen".
Yes, the info-panel does not respect the fullscreen-view selection. It's on the to-do list to add this ability, but it is still not implemented...
Hi CleanPigeon16
I was wondering how (or if) you handle interruptions.
Good question, basically (and I might be missing a few details but I think that's the general gist).
A new instance will be spun up (spot/regular based on your "compute budget") as long as there is a job in the "monitored" queue. That means that if a worker was kicked by Amazon (i.e. a spot instance), another one will be spun up instead, as long as there is a job in the queue. That means that what is probably missing in you...
Hi EcstaticPelican93
Sure, the model deployment itself (i.e. the serving engine) can be executed on any private network (basically like any other agent)
Make sense?
Hi GiganticTurtle0
The main issue is the cache=True
it will cause the second call to the function to essentially reuse the Task, ending with the same result.
Can you test with cache=False in the decorator?
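For reference, a minimal sketch of a decorated component with caching disabled (the component body itself is hypothetical):
from clearml import PipelineDecorator

# cache=False forces the component to re-execute on every call,
# instead of reusing a previously completed Task with the same inputs
@PipelineDecorator.component(cache=False)
def step_one(data):
    return [x * 2 for x in data]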
Hi SmallDeer34
Generally any torch.save(...) is logged/uploaded by clearml automatically. Specifically in your case I think the only missing one is the trainer_state.json, which I assume is a plain JSON file, and I imagine is part of the huggingface framework. You can easily upload it as an additional artifact with Task.upload_artifact
wdyt?
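Something along these lines (the file path is an example, adjust it to your output directory):
from clearml import Task

task = Task.current_task()
# upload the huggingface trainer state file as an additional artifact
task.upload_artifact('trainer_state', artifact_object='./output/trainer_state.json')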
So two folders with artifacts per experiment. I was wondering if there was a more efficient solution and if it could be combined.
Not sure I follow, two subfolders for two different things, isn't that how it is supposed to be?
Why is it using an OutputModel and an InputModel?
So calling OutputModel will create the new Model entity and upload the data, while InputModel will store it as a required input Model.
Basically on the Task you have input & output sections; when you clone the Task you are copying the input section into the newly created Task, and the assumption is that when you execute it, your code will create the output section.
Here when you clone the Task you will be cloning the reference to the InputModel (i...
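A short sketch of both sides (the model ID and file names are placeholders):
from clearml import Task, InputModel, OutputModel

task = Task.init(project_name='examples', task_name='model io')

# register an existing model as a required input of this Task
input_model = InputModel(model_id='<model-id>')
task.connect(input_model)

# create a new Model entity in the output section and upload the weights
output_model = OutputModel(task=task, framework='PyTorch')
output_model.update_weights(weights_filename='model.pt')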
Hi MelancholyElk85
Can I manually delete .zip files with datasets in .clearml/cache/storage_manager/datasets directory?
Yes, you can. I "think" the .zip is stored for easier access, but you can delete it; as long as the "extracted" folder exists, it should be fine.
Hi LazyTurkey38
, is it possible to have the agents keep a local version and only download the diff of the job commit to speed things up?
This is what it does, it has a local cached copy and it only pulls the latest changes
I'm not sure about the intended use of connect_configuration now.
Basically here is the rationale behind it:
I have a config file that I want to log on the Task, and I also want to be able to change this configuration file externally when launching using an agent (i.e. edit the content). I have a nested dictionary that I do not want to flatten and push as hyper-parameters because it is not very readable, so I want to store it in a more human readable form and edit it a...
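Roughly, that flow looks like this (the file name is an example):
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')

# logs the file content on the Task; when running via an agent, the (possibly
# edited) content stored on the Task is written to a local file and its path returned
config_path = task.connect_configuration('config.yaml', name='my_config')
with open(config_path) as f:
    config = f.read()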
SoreDragonfly16 the torchvision warning has nothing to do with the Trains warning.
The Trains warning means that somehow someone changed the state of the Task from running (in_progress) to "stopped" (aborted). Could it be one of the subprocesses raised an exception?
files_server: ://genuin-ai/
should be:
files_server:
I always have my notebooks in a git repo but suddenly it's not running them correctly.
What do you mean?
Can I switch off git diff (change detection)?
Yes, Task.init(..., auto_connect_frameworks={"detect_repository": False})
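In a full call that would look something like this (project/task names are placeholders):
from clearml import Task

task = Task.init(
    project_name='examples',
    task_name='no repo detection',
    # disables repository detection, and with it the git diff
    auto_connect_frameworks={'detect_repository': False},
)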
Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However this will not delete the file itself.
To delete the file I would do:
from clearml.storage.helper import StorageHelper
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? what's the actual use case?
I am struggling with configuring ssh authentication in docker mode
GentleSwallow91 Basically the agent will automatically mount the .ssh folder into the container, just make sure you set the following in the clearml.conf:
force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L30
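In context, it sits under the agent section of clearml.conf, i.e.:
agent {
    # force git to clone over SSH, so the mounted .ssh keys are used
    force_git_ssh_protocol: true
}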
. So I'd like to use the command line argument in the first argparse, and then hide/delete/override it before running the second argparse.
Nice hack!
task.project is the project ID (not the name)
task.get_project_name() will return the project name
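For example:
from clearml import Task

task = Task.current_task()
print(task.project)             # project ID string
print(task.get_project_name())  # human-readable project name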
Why would you need to manually change the current run? You just provided the values via either the defaults or the command line, no?
what am I missing here?
ResponsiveHedgehong88 I'm not sure I stated it, but the argparser arguments and values are collected automatically from your current run and put on the Task; there is no need to manually set them if you have the argparser running on your machine. Basically it collects the current (i.e. the process running on your machine) settings, and "copies" them ...