Ohh, if this is the case and this is a constant stream of inference results, then yes, you should push it into some stream-supported DB.
Simple SQL tables would work, but for actual scale I would push into a Kafka stream, then pull it (serially) somewhere else and push into a DB.
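Something like this minimal sketch, assuming the kafka-python package and a broker on localhost; the topic name and the DB helper are placeholders, not from this thread:
```python
# producer side: push each inference result to the stream as it is produced
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("inference-results", {"request_id": "abc123", "score": 0.97})
producer.flush()

# consumer side (elsewhere): a single consumer pulls serially and writes into the DB
consumer = KafkaConsumer(
    "inference-results",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    insert_into_db(message.value)  # hypothetical DB insert helper
```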
p.s. you should remove this line 🙂 `extra_index_url: ["git@github.com:salimmj/xxxx"]`
Can you copy the "Installed Packages" here, and point to the package causing the issue?
I wonder if using our own containers, which should have most of the deps, will work better than a simpler container.
Why not, it's transparent, just run in `--docker` mode and provide a default docker image if the Task doesn't specify one.
Should I map the poetry cache volume to a location on the host?
Yes, this will solve it! (maybe we should have that automatically if using poetry as package manager)
Could you maybe add a GitHub issue, so we do not forget?
Meanwhile you can add the mapping here:
https://github.com/allegroai/clearml-agent/blob/bd411a19843fbb1e063b131e830a4515233bdf04/docs/clearml.conf#L137
`extra_docker_arguments: ["-v", "/mnt/cache/poetry:/root/poetry_cache_here"]`
Hi StrangePelican34, you mean poetry as the package manager of the agent? The venvs cache will only work for pip and conda; poetry handles everything internally :(
Hi JealousParrot68
no need for decorators, you can just pass the function via `schedule_function=<function goes here>` (see the sketch below the links)
🙂
See scheduler here
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/scheduler.py#L485
And triggers here:
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/trigger.py#L193
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clea...
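For reference, a minimal sketch of the `schedule_function` flow; parameter and method names are taken from the linked scheduler.py as I recall them, so verify against your clearml version:
```python
from clearml.automation import TaskScheduler

def on_schedule():
    # any plain callable works; no decorator needed
    print("periodic job running")

scheduler = TaskScheduler()
# run the function every day at 10:30 (assumed semantics of hour/minute here)
scheduler.add_task(schedule_function=on_schedule, hour=10, minute=30)
# launch the scheduler itself as a Task on the services queue
scheduler.start_remotely(queue="services")
```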
Hi GracefulDog98
Any guess why the password is "incorrect" for me?
Basically the clearml-session CLI needs to be able to access (SSH into) the host (clearml-agent) machine,
is that possible?
Hi JealousParrot68
You mean by artifact names?
`parser.add_argument("--dataset_mean", type=float, nargs="+", default=0.5)`
I think providing `nargs='+'` assumes the type is a list. Nonetheless, we should be able to support it. Could you please add a GitHub issue so we do not forget?
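To illustrate the ambiguity, a standalone argparse snippet (not ClearML-specific):
```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_mean", type=float, nargs="+", default=0.5)

# the default stays a scalar float...
print(parser.parse_args([]))                                # Namespace(dataset_mean=0.5)
# ...but any value passed on the command line becomes a list of floats
print(parser.parse_args(["--dataset_mean", "0.4", "0.5"]))  # Namespace(dataset_mean=[0.4, 0.5])
```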
on a side note, is there any way to automatically give more meaningful names to the running docker containers?
What do you mean by that? Running where? And where will you see them?
and sometimes there are hanging containers or containers that consume too much RAM.
Hmmm yes, but can't you see it in the ClearML dashboard?
unless I explicitly add a container name in the container arguments, it will have a random name,
it would be great if we could set default container name for each experiment (e.g., experiment id)
Sounds like a great feature, with little implementation work 🙂 Can you add a GitHub issue on clearml-agent?
Which means there will be at least multiple published model entries of the same model over time?
Only the specific one will be published (not all the Models the Task created)
I think you are correct, it seems like it is missing the requirements for boto/azure/google (I will make sure this is added). In the meantime, you can stop the "triton serving engine" Task, reset it, add boto3 to the installed packages, and relaunch it.
That said, your main issue might be packaging the python model. Basically you need to create a model from the entire folder (with whatever is inside the folder); then Triton should be able to run it (if the config.pbtxt is correct).
`m = OutputMo...`
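The snippet above is cut off; for the folder-packaging idea, a rough sketch (project/task names are placeholders, and the method names are from the clearml SDK as I recall them, worth double-checking):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="serving", task_name="package model")
model = OutputModel(task=task, framework="pytorch")
# package the entire model directory (config.pbtxt included) and upload it as one archive
model.update_weights_package(weights_path="./model_dir")
```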
The issue is upload reporting for HTTP uploads (object storage will report upload progress). Basically the HTTP upload is a POST with urllib, which does not support upload callbacks for progress reporting. If you have an idea here, we will gladly add it (as you mentioned, it can be quite annoying to have to open the network manager to verify the upload is progressing).
Thanks LethalCentipede31, I think (3) is the most stable solution (as it doesn't require adding another package, and should work on any python version / OS)
This is actually what we do for downloads.
Do you know if there is a minimum required python requests version?
RoundMosquito25 do notice the agent is pulling the code from the remote repo, so you do need to push the local commits; the uncommitted changes, though, clearml will apply for you. Make sense?
What does the folder structure look like, and where are the "package" and the entry script?
If the only issue is this line: `task.execute_remotely(..., exit_process=True)`
It has to finish the static analysis of the entire repository (which usually happens in the background, but now we have to wait for it). If the repo is large, this could actually take 20 sec (depending on the CPU/drive of the machine itself)
it will constantly try to resend logs
Notice this happens in the background, in theory you will just get stderr messages when it fails to send but the training should continue
How can I make it such that any update to the upstream database
What do you mean "upstream database"?
UnevenDolphin73 are you positive, is this reproducible? What are you getting?
Hmm I wonder, can you try with these lines before?
```python
Task._report_subprocess_enabled = False
frameworks = {'tensorboard': True, 'pytorch': False}
Task.init(...)
```
ScantMoth28 it should work. I think the default deployment also has an NGINX reverse proxy on it, switching from `http://clearml-server.domain.com/api` to `http://api.clearml-server.domain.com`
Hi DullCamel78
Hi everyone! Has anyone tried running `aws_autoscaler.py` without docker?
Well, generally, since this is a remote machine, the easiest way to control the environment is with containers, hence the default use case. In theory you can change it to use venv, but then of course you're somewhat limited with the different drivers/cuda/python environments.
performance under docker is 10% lower than on bare metal
add to your extra docker args:
`extra_docker_arguments: ["...`
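The snippet above is also cut off; as an illustration only (these particular flags are my suggestion, not from the original message), host networking/IPC are the usual candidates for closing the performance gap, in the same conf style as above:
```
extra_docker_arguments: ["--ipc=host", "--network=host"]
```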
Are you trying to upload an artifact post execution ?
I found "scheduler" on allegroai github, is it something related to the case I want to make?
MoodyCentipede68 it is exactly what you are looking for 🙂
Do notice that you need to make sure you have your services queue configured and running for that to work 🙂
If possible, I would like to altogether prevent the fileserver and write everything to S3 (without needing every user to change their config)
There is no current way to "globally" change the default files server (I think this is part of the enterprise version, alongside vault etc.).
What you can do is use an OS environment variable to override the conf file: `CLEARML_FILES_HOST="s3://..."`
PricklyRaven28 wdyt?
Now I suspect what happened is it stayed on another node, and your k8s never took care of that
the issue moving forward is if we restart the pod we will have to manually update that again.
Can't you map the nginx configuration file? (making the changes persistent across pods)
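A minimal sketch of one way to do that with a ConfigMap (all names and the mount path are assumptions, adapt to your clearml-server deployment):
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: clearml-nginx-conf
data:
  default.conf: |
    # customized nginx configuration goes here
---
# fragment to merge into the webserver pod spec:
#   volumes:
#     - name: nginx-conf
#       configMap:
#         name: clearml-nginx-conf
#   volumeMounts (on the nginx container):
#     - name: nginx-conf
#       mountPath: /etc/nginx/conf.d/default.conf
#       subPath: default.conf
```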