or, if you want the steps to be run by the agent, set run_pipeline_steps_locally=False
@<1578555761724755968:profile|GrievingKoala83> did you call task.launch_multi_node(4) or task.launch_multi_node(2)? I think the right value is 4 in this case
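For reference, a minimal sketch of the call I mean (project/task names are placeholders, and I'm assuming the returned dict exposes the node rank as shown):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="multi-node")
# Request 4 nodes in total; the call returns this node's multi-node config
config = task.launch_multi_node(4)
print(config.get("node_rank"))  # rank of the current node
```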
Hi @<1555000563244994560:profile|OutrageousSealion55> ! How do you pass base_task_id in the HyperParameterOptimizer?
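For context, a rough sketch of how base_task_id is usually passed to the optimizer constructor (the parameter range and metric names below are placeholders):

```python
from clearml.automation import HyperParameterOptimizer, UniformParameterRange

optimizer = HyperParameterOptimizer(
    base_task_id="<template-task-id>",  # the task cloned for each trial
    hyper_parameters=[
        UniformParameterRange("General/lr", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="accuracy",
    objective_metric_series="validation",
    objective_metric_sign="max",
)
```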
Is there any way to look at all the tasks that used that version of the dataset?
Not easily. You could query the runtime properties of all tasks and check for datasets used.
But what I would do is tag the task that uses a certain dataset, and then you should be able to query by tags
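For example (the tag name is illustrative):

```python
from clearml import Task

# In the job that consumes the dataset: tag its task
task = Task.current_task()
task.add_tags(["uses-dataset-v1.2"])

# Later: find every task carrying that tag
tasks = Task.get_tasks(tags=["uses-dataset-v1.2"])
```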
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! This is actually very weird. Does your pipeline fail when running the first step? What if you run the pipeline via "raw" python (i.e. by doing python3 your_script.py)?
Hi FlutteringWorm14 ! Looks like we indeed don't wait for report_period_sec when reporting data. We will fix this in a future release. Thank you!
Can you please update it to the latest version? pip install -U jsonschema
MammothParrot39 try to set this https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/docs/clearml.conf#L82 in your clearml.conf to "22.3.1"
Oh I see what you mean. start will enqueue the pipeline, in order for it to be run remotely by an agent. I think that what you want to call is pipe.start_locally(run_pipeline_steps_locally=True) (and get rid of the wait).
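Something like this (pipeline name/project are placeholders):

```python
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples")
# ... add steps here ...

# Run the controller in this process and execute the steps locally too;
# no agent and no wait() needed
pipe.start_locally(run_pipeline_steps_locally=True)
```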
Hi @<1523715429694967808:profile|ThickCrow29> ! We identified the issue. We will soon release a fix for it
this is likely a UI bug. We should have a fix soon. In the meantime, yes, you can edit the configuration under the pipeline task to achieve the same effect
Hi @<1724235687256920064:profile|LonelyFly9> ! ClearML does not allow for those to be configured, but you might consider setting AWS_RETRY_MODE and AWS_MAX_ATTEMPTS env vars. Docs from boto3: None
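For example, setting them before the process creates its S3 client (values are illustrative):

```python
import os

# Standard AWS SDK environment variables, not ClearML-specific settings
os.environ["AWS_RETRY_MODE"] = "standard"  # "legacy", "standard" or "adaptive"
os.environ["AWS_MAX_ATTEMPTS"] = "5"       # total attempts per request
```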
Hi @<1570220858075516928:profile|SlipperySheep79> ! What happens if you do this:
import yaml
import argparse
from my_pipeline.pipeline import run_pipeline
from clearml import Task

parser = argparse.ArgumentParser()
parser.add_argument('--config', type=str, required=True)

if __name__ == '__main__':
    if not Task.current_task():
        args = parser.parse_args()
        with open(args.config) as f:
            config = yaml.load(f, yaml.FullLoader)
        run_pipeline(config)
basically, I think that the pipeline run starts from __main__ and not the pipeline function, which causes the file to be read
How about using if Task.running_locally(): instead?
it's the same file you added your s3 creds to
@<1654294828365647872:profile|GorgeousShrimp11> Any chance your queue is actually named megan-testing and not megan_testing?
Hi @<1719162259181146112:profile|ShakySnake40> ! It looks like you are trying to update an already finalized dataset. Finalized datasets cannot be updated. In general, you should create a new dataset that inherits from the dataset you want to update (via the parent_datasets argument in Dataset.create) and operate on that dataset instead
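A minimal sketch of that (dataset names and paths are placeholders):

```python
from clearml import Dataset

parent = Dataset.get(dataset_name="my_dataset")  # the finalized dataset
child = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="my_project",
    parent_datasets=[parent.id],
)
child.add_files("/path/to/changed_files")  # operate on the child instead
child.upload()
child.finalize()
```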
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! I have an idea.
The flow would be like this: you create a dataset whose parent is the previously created dataset. The version will auto-bump. Then, you sync this dataset with the folder. Note that sync will return the number of added/modified/removed files. If all of these are 0, then you use Dataset.delete on this dataset and break/continue; else you upload and finalize the dataset.
Something like:
parent =...
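Expanding that into a rough sketch (names and paths are placeholders, and I'm assuming sync_folder returns the added/modified/removed counts in this order):

```python
from clearml import Dataset

parent = Dataset.get(dataset_name="my_dataset")  # previously created dataset
dataset = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="my_project",
    parent_datasets=[parent.id],  # version auto-bumps
)
added, modified, removed = dataset.sync_folder("/path/to/folder")
if added == modified == removed == 0:
    # nothing changed: drop the empty version and break/continue
    Dataset.delete(dataset_id=dataset.id)
else:
    dataset.upload()
    dataset.finalize()
```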
Hi @<1578555761724755968:profile|GrievingKoala83> ! It looks like lightning uses the NODE_RANK env var to get the rank of a node, instead of NODE (which is used by pytorch). We don't set NODE_RANK yet, but you could set it yourself after launch_multi_node:
import os
current_conf = task.launch_multi_node(2)
os.environ["NODE_RANK"] = str(current_conf.get("node_rank", ""))
Hope this helps
@<1578555761724755968:profile|GrievingKoala83> does it work properly when gpus=1? Also, what are the values found under Initializing distributed: GLOBAL_RANK: , MEMBER: in the 2 scenarios, for each task?
Each step is a separate task, with its own separate logger. You will not be able to reuse the same logger. Instead, you should get the logger inside the step that needs it, by calling Logger.current_logger()
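For instance, inside a step function (metric names are placeholders):

```python
from clearml import Logger

def my_step():
    # the logger of the task backing this step
    logger = Logger.current_logger()
    logger.report_scalar(title="loss", series="train", value=0.1, iteration=0)
```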
Hi @<1546303293918023680:profile|MiniatureRobin9> The PipelineController has a property called id, so just doing something like pipeline.id should be enough
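E.g. (names are placeholders):

```python
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples")
print(pipe.id)  # the ID of the controller's task
```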
Could you try adding region under credentials as well?
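A sketch of the relevant clearml.conf section (bucket, keys and region are placeholders):

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    bucket: "my-bucket"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    region: "us-east-1"
                }
            ]
        }
    }
}
```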
Hi PanickyMoth78 ! This will likely not make it into 1.9.0 (this will be the next version we release, most likely before Christmas). We will try to get the fix out in 1.9.1
Hi @<1523701345993887744:profile|SillySealion58> ! We allow finer grained control over model uploads. Please refer to this GH thread for an example on how to achieve that: None
Hi @<1523701868901961728:profile|ReassuredTiger98> ! Looks like the task actually somehow gets run by both an agent and locally at the same time, so one of them is aborted. Any idea why this might happen?
There might be something wrong with the agent using ubuntu:22.04. Anyway, good to know everything works fine now