AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 I Am Running Trains=0.16.4 Python==3.7.5 , And Notice That The "Log" Page Sometimes Didn'T Capture The Console Log From My Program. Is This A Known Issue, Anyone Have Experienced Similar Behavior?

EnviousStarfish54 following up on this issue, the root cause is that dictConfig will clear all handlers if it is not passed "incremental": True:
conf_logging = { "incremental": True, ... }
Since you pointed out that Kedro internally calls logging.config.dictConfig(conf_logging), this seems like an issue with Kedro, as this call will remove all logging handlers, which seems problematic. wdyt?
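
The difference can be reproduced with the standard library alone. The sketch below is not Kedro's or ClearML's code; the helper name handler_survives and the NullHandler probe are made up for illustration. It attaches a handler to the root logger, applies a dictConfig with or without "incremental": True, and reports whether the handler is still attached.

```python
import logging
import logging.config


def handler_survives(incremental: bool) -> bool:
    """Return True if a pre-existing root handler is still attached after dictConfig."""
    root = logging.getLogger()
    saved_handlers = root.handlers[:]  # remember whatever the caller had installed
    saved_level = root.level
    probe = logging.NullHandler()  # stand-in for a handler installed earlier (assumption)
    root.addHandler(probe)
    try:
        config = {"version": 1, "root": {"level": "INFO"}}
        if incremental:
            config["incremental"] = True
        else:
            # keep other loggers enabled so the sketch has fewer side effects
            config["disable_existing_loggers"] = False
        logging.config.dictConfig(config)
        return probe in root.handlers
    finally:
        # put the root logger back the way we found it
        for handler in root.handlers[:]:
            root.removeHandler(handler)
        for handler in saved_handlers:
            root.addHandler(handler)
        root.setLevel(saved_level)
```

Calling handler_survives(True) and handler_survives(False) side by side shows the effect described above: only the incremental configuration leaves the existing root handler in place.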

3 years ago

0 I Am Running Trains=0.16.4 Python==3.7.5 , And Notice That The "Log" Page Sometimes Didn'T Capture The Console Log From My Program. Is This A Known Issue, Anyone Have Experienced Similar Behavior?

EnviousStarfish54 Yes, I'm not sure what happens there, we will have to dive deeper, but now that you got us a code snippet to reproduce the issue it should not be very complicated to fix (I hope 🤞)

3 years ago

0 I Am Running Trains=0.16.4 Python==3.7.5 , And Notice That The "Log" Page Sometimes Didn'T Capture The Console Log From My Program. Is This A Known Issue, Anyone Have Experienced Similar Behavior?

Hi EnviousStarfish54
You mean the console output? If that's the case, the Task.init call will monkey patch sys.stdout/sys.stderr to report to clearml as well as the console
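
As a rough illustration of what patching sys.stdout means here, this is a generic sketch and not ClearML's actual implementation; TeeStream and run_with_tee are hypothetical names. Every write is forwarded to the real console and also copied into a second sink, which is where a reporting backend would sit.

```python
import sys


class TeeStream:
    """Forward writes to the original stream and keep a copy in a sink list."""

    def __init__(self, original, sink):
        self._original = original
        self._sink = sink

    def write(self, text):
        self._sink.append(text)  # copy for the "reporter" (assumption: a plain list)
        return self._original.write(text)  # still show it on the console

    def flush(self):
        self._original.flush()


def run_with_tee(func):
    """Run func with sys.stdout patched, then restore it and return what was captured."""
    captured = []
    original = sys.stdout
    sys.stdout = TeeStream(original, captured)
    try:
        func()
    finally:
        sys.stdout = original  # always undo the patch
    return "".join(captured)


captured_text = run_with_tee(lambda: print("hello from the task"))
```

If something later replaces or resets the stream or the logging handlers, the copy stops arriving while the console output still looks normal, which matches the symptom in the question.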

3 years ago

0 I Am Running Trains=0.16.4 Python==3.7.5 , And Notice That The "Log" Page Sometimes Didn'T Capture The Console Log From My Program. Is This A Known Issue, Anyone Have Experienced Similar Behavior?

Thanks EnviousStarfish54
Let me check if I can reproduce it

3 years ago

0 What Is The Right Way To Increase Number Of Retries When Using

DilapidatedDucks58 I think they are used here:
https://github.com/allegroai/clearml/blob/3d3a835435cc2f01ff19fe0a58a8d7db10fd2de2/clearml/storage/helper.py#L1407

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.resource

2 years ago

0 Hello Everyone! I'M Using S3 For My Model Saving. During Hyperparameter Optimization My New Tasks Get Very Long Names Due To Override Parameters And Uploading Path Becomes Something Like This "/Traffic Lights Classification/

Maybe you should make naming_function a public variable in the SearchStrategy class, or allow changing it in the HyperParameterOptimizer class?

I like this idea, let's do that
Just making sure, you hit the 1024 character limit on S3 path?
If this is the case we should also fix the "artifact naming" to take that into account (it already does and has a limit, see here:
https://github.com/allegroai/clearml/blob/24464b7c1019f7a7b3149ecb80a379...
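
For illustration only, here is one way a custom naming helper could keep generated names short. This is a sketch under assumptions: shorten_name and its max_bytes default are hypothetical, and it does not reflect the signature ClearML's naming_function expects. S3 object keys are limited to 1024 bytes of UTF-8, so the idea is to cap each name component well below that and append a short hash so different parameter overrides still produce different names.

```python
import hashlib

S3_MAX_KEY_BYTES = 1024  # S3 object keys are limited to 1024 bytes of UTF-8


def shorten_name(name: str, max_bytes: int = 128) -> str:
    """Truncate a long name and append a short hash so distinct names stay distinct.

    Assumes max_bytes is comfortably larger than the 9 bytes used by the suffix.
    """
    encoded = name.encode("utf-8")
    if len(encoded) <= max_bytes:
        return name
    digest = hashlib.sha1(encoded).hexdigest()[:8]
    keep = max_bytes - len(digest) - 1  # leave room for "-" and the digest
    # errors="ignore" drops a multi-byte character that the slice cut in half
    prefix = encoded[:keep].decode("utf-8", errors="ignore")
    return f"{prefix}-{digest}"
```

The budget per component (128 bytes here) is arbitrary; what matters is that the project path, task name, and artifact name together stay under S3_MAX_KEY_BYTES.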

2 years ago

0 Hey Everyone

when u say use Task.current_task() you for logging? which i’m guessing that the fastai binding should do right?

Right, this is a fancy way to say: make sure the actual sub-process is initializing ClearML so all the automagic kicks in. Since this is not "forked" but a whole new process, calling Task.current_task is the equivalent of calling Task.init with the same arguments (which you can also do; I'm not sure which one is more straightforward, wdyt?)

2 years ago

0 Hi, How Can I Check If My Clearml-Agent Is Running Probably? I Setup A Local Server To Test, But Seems It Does Not Pick Up Any Job. In The Ui, I Saw The New Agent Was Registered (It Shown Up In The "Workers" Page) The Terminal Looks A Bit Weird, After S

Hi EnviousStarfish54
Docker on Windows with NVIDIA runtime support is only available with WSL (I think)
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wip
https://medium.com/@dalgibbard/docker-with-gpu-support-in-wsl2-ebbc94251cf5

3 years ago

0 I Am Trying To Run A Task That Is Completely Detached From Git - Remotely. The Script Uploads Fine But In The Ui, The Git Repo Appears As “Origin”. When The Agent Tries To Pick This Up, It Fails On Trying To Clone “Origin”. What Can I Do To Let The Agent

RoughTiger69 how did you end up with a Task with just "origin" in the repo field ?

2 years ago

0 Hey There, Happy New Year To All Of You

Did you experiment any drop of performances using forkserver?

No, seems to be working properly for me.

If yes, did you test the variant suggested in the pytorch issue? If yes, did it solve the speed issue?

I haven't tested it, that said it seems like a generic optimization of the DataLoader

3 years ago

0 Is It Possible To Upload A Hyperdataset? Or Can We Only Upload Datasts

I want to store only my raw data in my blob storage, and I want to create a Hyperdataset with all the artifacts, metrics, frames,

Yes that's exactly how it works.

This line adds a reference to raw file (local/remote)
[https://github.com/allegroai/clearml/blob/1b474dc0b057b69c76bc2daa9eb8be927cb25efa[…]es/hyperdatasets/data-registration/register_dataset_wit...

2 months ago

0 I Hit A Issue That I Cannot See My Matplotlib Plot, But It Was Shown In The Panel. Any Idea?

EnviousStarfish54

and the 8 charts are actually identical

Are you plotting the same plot 8 times?

4 years ago

0 Hi New With Clearml I Create Clearml Server On Gcp With Docker Now I’M Training Yolov5 And I Want To Save All The Info (Model And Metrics ) With Clearml To My Bucket.. (So I Can Have Small Server And No Memory Issue ) Where Should I Start? Its Should Be C

is there something else in the conf that i should change ?

I'm assuming the google credentials?
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/docs/clearml.conf#L113

one year ago

0 I Am Trying Pytorch Nightly Again With Python 3.10. Works Fine Locally, But Fails On Clearml-Agent In Docker Mode.

seems like pip 20.1.1 has the issue, but >= 22.2.2 do not.

Notice we changed the value there; it now has two versions, one for Python < 3.10 and one for Python >= 3.10.
The main reason is that pip changed their resolving algorithm, and the new one can break its own dependencies (i.e. pip freeze > requirements.txt -> pip install might not actually work)

one year ago

0 When I Tried To Create A Clearml Serving Inference Endpoint For Yolov8, I Received The Following Error:

This line 🙂
Notice that Triton (and therefore clearml-serving) needs the PyTorch model to be converted into TorchScript so that the Triton backend can load it

one year ago

0 Hi, Is Clearml Support Creating New Tasks While In Offline Mode? I'M Trying To Run The Following:

Yes, offline got broken in 1.3.0 😞, the RC fixed it:
pip install clearml==1.3.1rc0
Stable release later this week

2 years ago

0 Dear All, Great To Join Your Community. We Are Working On Plant Growth Stage Models At Basf For Farmers And I Was Wondering If Clearml Can Be Used Also For Data Versioning Of Tabular Data, Structured Data. I Would Like To Track If This And That Row Is Par

How can I track in clearML that this and that row was part of experiment x because it belonged to test/training data set y?

Hi @<1543766544847212544:profile|SorePelican79>
the experiments themselves will have a link to the Dataset they were using. From a dataset perspective, the idea is not to limit you, so essentially it will package all your files, and retrieve them when you fetch the dataset. In terms of specifying a row / sample, my suggestion is to mark those rows when training a...

one year ago

0 Hello, I’M Trying To Update Our Clearml Server Running On Kubernetes (1.6.0-213) But I Get This Error:

Hi @<1523706645840924672:profile|VirtuousFish83>
could it be you have some permission issues ?

: Forbidden: updates to statefulset spec for fields other than 'replicas',

It might be that you will need to take it down and restart it, rather than updating it while it is running.
(do make sure you backup your server 🙂 )

one year ago

0 Um, Is There A Way To Delete An Artifact From A Task That Is Running?

Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However, this will not delete the file itself.
To delete the file I would do:
from clearml.storage.helper import StorageHelper
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? What's the actual use case?

2 years ago

0 Security Question: In My Journey Of Running Clearml The "Hard Way" (Self-Hosted), One Problem I Haven'T Solved Is Security. Some Discussion Here...

Is there a way I could move the JWT authentication (not authorization) logic into an API Gateway or Load Balancer?

Hmm in theory, but not in practice 😞

if ClearML is following OAuth 2.0, t

This is for the SSO part, not for the API, API is only using JWT for verification, the login process itself is with external SSO (OAuth 2.0). But the open-source version does not support SSO 😞

Why are you trying to add another ELB with JWT verification on it ? ...

one year ago

0 [Pipeline] Hey, Is It Possible To Specify The Output Uri For Pipelines And Their Components Using Pipeline Decorators? I Would Like To Store Pipeline Artifacts And Component Artifacts On S3.

So the way it works: when you run a component, the return value of the entire function execution is cached. Basically:

this did NOT add the artifact to the pipeline via caching on subsequent runs ❌

you just need to do:

PipelineDecorator.upload_artifact(name='images', artifact_object=img_dir, wait_on_upload=True)
return Task.current_task().artifacts['images'].url

This will return the URL of the uploaded images (i.e. S3 bucket)
which means if this is cached you will get it...

one year ago

0 Clearml Version 1.8.1 Had "Fix" For The Deferred Init Which Introduces A Bug Btw, I'Ve Opened

Oh, this is so that internally the background thread can signal it is not deferred. Are you saying there is a bug, or that the code is odd?

one year ago

0 I Would Like To Use Clearml Together With Hydra Multirun Sweeps, But I’M Having Some Difficulties With The Configuration Of Tasks.

remote_execute kills the thread so the multirun stops at the first sub-task.

Hmm

task = Task.init(...)
# config some stuff
task.remote_execute(queue_name_here, exit_process=False)
# this means that the local execution will stop but when running on the remote agent it will be skipped
if Task.running_locally():
  return

one year ago

0 Hello, Does Anybody Here Have Much Experience In Creating Sub-Tasks Or Sub-Pipelines? I'M Not Sure The Concept Is Particularly Well Established But The Docs Mention:

the SDK is unable to see each of the nodes?

Exactly ! I mean I love the idea of "nested" component, but implementation wise this is not trivial, it will also hurt the ability of caching individual component. The workaround is to have all the "business logic" in the pipeline function itself, routing data between components is basically "free". The data does not actually go through the pipeline logic, it only passes reference (unless the pipeline logic actually tries to access the data o...

one year ago

0 Hello Everyone I Am Trying To Use Task Scheduler To Make A Cron Job. I Have Used S3 Bucket As My File Server But When This Cron Runs It Gives The Error Not Able To Connect To S3. What Should I Do?

files_server:

://genuin-ai/

should be:

files_server:

one year ago

0 Are There Any Particular System Dependencies Needed To Enable

Oh that is odd. Is this reproducible? @<1533620191232004096:profile|NuttyLobster9> what was the flow that required another task.init?

5 months ago

0 Hey, I'M Trying To Set Up A Clearml Server On Docker As Per Documentation. Everything Goes Well Until The Docker-Compose Up Step, That'S When I Get This Error; Error: Error Pulling Image Configuration: Download Failed After Attempts=6: X509: Certificate

I can't see any reason it should not work 😀

2 years ago

hurray 🎊

2 years ago

0 Running This Code From Inside A Docker Container Locally:

It seems to fail when trying to download the model
local_download = StorageManager.get_local_copy(uri, extract_archive=False)
File "/opt/venv/lib/python3.7/site-packages/clearml/storage/manager.py", line 47, in get_local_copy
cached_file = cache.get_local_copy(remote_url=remote_url, force_download=force_download)
File "/opt/venv/lib/python3.7/site-packages/clearml/storage/cache.py", line 55, in get_local_copy
if helper.base_url == "file://":
And based on the error I suspect the...

2 years ago

0 Hi All! I Have Methods Inside Notebooks That I Made Available To Clis Using Nbdev

In a notebook, create a method and decorate it by fastai.script’s @call_parse.

Any chance you have a very simple code/notebook to reference (this will really help in fixing the issue)?

one year ago
