would I have to execute each task in the pipeline locally (but still connected to trains),
Somehow you have to have the pipeline step Task in the system: you can import it from code, or you can run it once and then the pipeline will clone it and reuse it. Am I missing something?
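For example, something along these lines (just a sketch, the project/task names and queue are placeholders):
from clearml import PipelineController

# assuming a Task named "step1 base task" already exists in project "examples"
pipe = PipelineController(name="pipeline demo", project="examples", version="1.0")
pipe.add_step(
    name="step1",
    base_task_project="examples",        # project holding the Task to clone
    base_task_name="step1 base task",    # the Task the controller will clone and enqueue
    execution_queue="default",
)
pipe.start(queue="services")             # the controller itself runs on the services queue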
Hi GiddyPeacock64
If you already have K8s set up and are already using ClearML, then in your Kubeflow YAML:
trains-agent execute --id <task_id> --full-monitoring
This will install everything your Task needs inside the docker. Just make sure that you pass the env variables setting the ClearML configuration, see here:
https://github.com/allegroai/clearml-server/blob/6434f1028e6e7fd2479b22fe553f7bca3f8a716f/docker/docker-compose.yml#L127
orchestration module
When you previously mentioned cloning the Task in the UI and then running it, how do you actually run it?
regarding the exception stack
It's pointing to a stdout that was closed?! How could that be? Any chance you can provide a toy example for us to debug?
ElegantCoyote26
parser = get_parser()
args_ = vars(parser.parse_args())
task.connect(args_)
There is no need to connect args_, Task.init will automatically catch the argparser.
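i.e. something like this is enough (a minimal sketch, get_parser() stands in for your own parser factory):
from clearml import Task

task = Task.init(project_name="examples", task_name="argparse demo")

parser = get_parser()        # your own argparse.ArgumentParser
args = parser.parse_args()   # Task.init already hooked argparse, so the arguments
                             # are logged automatically, no task.connect() needed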
So far my local and remote GitLab repositories are synchronized. I suspect that the
Failed applying git diff, see diff above
error is caused by a cached repository from which ClearML tries to run the process. I've cleaned the cache, but it hasn't helped.
Hmm, can you test with empty "uncommitted changes"?
Just making sure: when you say it still doesn't work, you are not trying to run the Task with the git diff that includes the binary data, right?
Yes, but where I can fi...
I think you are correct 😞 Let me make sure we add that (docstring and documentation)
looks like at the end of the day we removed
proxy_set_header Host $host;
and used the FQDN for the proxy_pass line
And did that solve the issue?
I do not think this is the upload timeout, it makes no sense to me for the GCP package (we do not pass any timeout, it's their internal default for the argument) to include a 60sec timeout for upload...
I'm also not sure where is the origin of the timeout (I'm assuming the initial GCP handshake connection could not actually timeout, as the response should be relatively quick, so 60sec is more than enough)
So could it be that pip install --no-deps . is the missing issue?
What happens if you add "/opt/keras-hannd" to the installed packages?
Could you clarify the question for me, please?
...
Could you please point me to the piece of ClearML code related to the downloading process?
I think I mean this part:
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/datasets/dataset.py#L2134
Sounds good to me 🙂
Hi @<1726410010763726848:profile|DistinctToad76>
Why not just report scalars? You can use the x-axis as "iterations" if this is running in real time to collect the prompts.
If this is a summary, then just report a scatter plot (you can also specify the names of the axes and the series)
None
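Something along these lines (a sketch, titles/series names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="prompt logging")
logger = task.get_logger()

# real time: one scalar point per prompt, using the prompt index as the "iteration"
for i, score in enumerate([0.5, 0.7, 0.9]):
    logger.report_scalar(title="prompt stats", series="score", value=score, iteration=i)

# summary: a single scatter plot with named axes and series
logger.report_scatter2d(
    title="prompt summary",
    series="score vs. length",
    scatter=[[10, 0.5], [25, 0.7], [40, 0.9]],   # list of [x, y] pairs
    iteration=0,
    xaxis="prompt length",
    yaxis="score",
)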
SlipperyDove40 Yes there is: TRAINS_CONFIG_FILE
https://allegro.ai/docs/faq/faq/#trains-configuration
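For example (a sketch, the path is a placeholder; usually you would just export the variable in the shell before launching the script):
import os

# must be set before the trains SDK loads its configuration
os.environ["TRAINS_CONFIG_FILE"] = "/path/to/alternate/trains.conf"

from trains import Task
task = Task.init(project_name="examples", task_name="custom config demo")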
Good news a dedicated class for exactly that will be out in a few days 🙂
Basically a task scheduler and a task trigger scheduler, running as a service, cloning/launching tasks either based on time (cron-like) or based on a trigger.
wdyt?
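To give a rough idea of the intended usage (a sketch only, the class is not released yet so the final API may differ; the Task ID and queue names are placeholders):
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# clone + enqueue an existing Task every day at 07:30 (cron-like)
scheduler.add_task(
    schedule_task_id="aabbcc112233",   # placeholder ID of the Task to clone
    queue="default",
    minute=30,
    hour=7,
    day=1,
)
scheduler.start_remotely(queue="services")   # run the scheduler itself as a service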
With
pipe.start(queue='services')
, it still tries to run some docker for some reason
The services agent is always running with --docker:
https://github.com/allegroai/clearml-agent/blob/e416ab526ba9fe05daa977b34c9e46b50fb214a0/docker/services/entrypoint.sh#L16
Actually I think we should have it as an argument, so it is easier to control from docker-compose
I'll be waiting for the full log to check the "git clone" issue
SkinnyPanda43 issue verified, this seems to be related to python 3.9 and subprocesses.
Let me check what we can do
LOL totally 🙂
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
@<1562610699555835904:profile|VirtuousHedgehong97>
source_url="s3:...",
This means your data is already on an S3 bucket; it will not "upload" it, it will just register it.
If you want to upload files, they should be local; then, when you call upload, you can specify the target S3 bucket and the data will be stored in a unique folder in the bucket
Does that make sense ?
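A quick sketch of the second flow (bucket name and paths are placeholders):
from clearml import Dataset

# create a new dataset version and add local files to it
dataset = Dataset.create(dataset_name="my dataset", dataset_project="examples")
dataset.add_files(path="/local/data/folder")

# upload the actual file content to your bucket (stored under a unique folder)
dataset.upload(output_url="s3://my-bucket/datasets")
dataset.finalize()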
DilapidatedDucks58 I'm assuming clearml-server 1.7 ?
I think both are fixed in 1.8 (due to be released either next week or the one after)
Check the examples on the github page, I think this is what you are looking for 🙂
https://github.com/allegroai/trains-agent#running-the-trains-agent
Yes it should
here is a fastai example, just in case 🙂
https://github.com/allegroai/clearml/blob/master/examples/frameworks/fastai/fastai_with_tensorboard_example.py
Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of the trigger.
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler None
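Roughly along these lines (a sketch from memory, parameter names and IDs are placeholders, double check against the released API):
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler()
# when a Task in the monitored project completes, clone + enqueue the "reaction" Task
trigger.add_task_trigger(
    schedule_task_id="aabbcc112233",    # placeholder: the Task to clone and launch
    schedule_queue="default",
    trigger_project="examples",         # project to monitor
    trigger_on_status=["completed"],    # fire when a monitored Task reaches this status
)
trigger.start_remotely(queue="services")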
What am I missing here?
Hi @<1658281093108862976:profile|EncouragingPenguin15>
Should work. I'm assuming multiple nodes are running agents? Or are you saying Ray spins the jobs and clearml logs them?
Hi IrritableJellyfish76
If you are running code that uses clearml from Kubeflow, you have out-of-the-box integration between the two, what am I missing?
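i.e. the training script itself only needs the usual two lines (a minimal sketch, project/task names are placeholders):
from clearml import Task

# this is the entire integration inside the script; repo, packages and outputs are captured automatically
task = Task.init(project_name="examples", task_name="kubeflow step")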
Local changes are applied before installing requirements, right?
correct
Yes
Are you trying to upload_artifact to a Task that is already completed?
TrickyRaccoon92 Thanks you so much! 😊
The only weird thing to me is not getting any "connection warnings" if this is indeed a network issue ...
Hi ScaryLeopard77
I think the error message you are getting is actually "passed" from Triton. Basically someone needs to tell it what the model input/output look like (matrix size/type); this is essentially the content of the "config.pbtxt", and it has to be set when spinning up the model endpoint. Does that make sense to you?