WittyOwl57 what about vm.max_map_count?
echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac
Seems like a credentials error.
Do you have everything set up correctly in your ~/clearml.conf?
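For reference, the api section of ~/clearml.conf usually looks something like this (the server URLs and keys below are placeholders, use the ones generated for your server/account):
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    credentials {
        "access_key" = "YOUR_ACCESS_KEY"
        "secret_key" = "YOUR_SECRET_KEY"
    }
}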
Hi WorriedParrot51 , what do you mean by "call get_parameters_as_dict() from agent" ?
Do you mean like change the trains-agent to run the task differently?
Or inside your code while the trains agent runs it?
From the code itself (regardless of how you run it) you can always call task.get_parameters_as_dict() and get the current state of the parameters (i.e. from the backend if running with trains-agent, or copied from the code, if running manually).
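For example (a minimal sketch, assuming the trains package; project/task names are placeholders):
from trains import Task

task = Task.init(project_name='examples', task_name='params demo')
# when executed by trains-agent this returns the parameters stored in the backend,
# when executed manually it returns the parameters connected from the code
params = task.get_parameters_as_dict()
print(params)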
Yeah @<1689446563463565312:profile|SmallTurkey79> is right, reverting to image is the safest way to get exactly the same...
btw, @<1791277437087125504:profile|BrightDog7> if you can produce a standalone example of reporting the data, we can probably fix whatever is broken in the auto convert, or at least revert to image based automatically (basically if the plot is simple enough it will try to convert it, otherwise it will automatically revert to image internally)
Hi @<1590514584836378624:profile|AmiableSeaturtle81>
I think you should use add_external_files instead of add_files (which is for local files).
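A minimal sketch of the difference (bucket/paths are placeholders):
from clearml import Dataset

ds = Dataset.create(dataset_project='examples', dataset_name='my dataset')
ds.add_files(path='/local/data')                          # local files, uploaded to the dataset storage
ds.add_external_files(source_url='s3://my-bucket/data/')  # remote files, only the links are registered
ds.upload()
ds.finalize()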
SoreDragonfly16 could you reproduce the issue?
What's your OS? trains versions?
Hmm so the Task.init should be called on the main process, this way the subprocess knows the Task is already created (you can call Task.init twice to get the task object). I wonder if we somehow can communicate between the sub processes without initializing in the main one...
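To illustrate the first part (a minimal sketch; project/task names are placeholders):
from multiprocessing import Process
from clearml import Task

def worker():
    # per the note above: calling Task.init a second time (here, in the subprocess)
    # returns the Task object that was already created in the main process
    task = Task.init(project_name='examples', task_name='subprocess demo')
    task.get_logger().report_text('hello from the subprocess')

if __name__ == '__main__':
    Task.init(project_name='examples', task_name='subprocess demo')
    p = Process(target=worker)
    p.start()
    p.join()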
Right! I just noticed that! This is odd... and yes, it definitely has something to do with the multi pipeline executed on the agent, I think I know what to look for ...
(just making sure, again: running_locally produced exactly what we were expecting, is that correct?)
I'm using the default operation mode which uses kubectl run. Should I use templates and specify a service in there to be able to connect to the pods?
Ohh, the default "kubectl run" does not support the "ports-mode" 🙂
There's a static number of pods which services are created for…
You got it! 🙂
i.e. run:
pip install --upgrade trains
PompousBeetle71 a few questions:
is this like using PyTorch distributed, only manually? Why don't you call trains.init in all the sub processes? We had a few threads on that, it seems like a recurring question, I'll make sure we have an example on GitHub. Basically trains will take care of passing the arg-parser commands to the sub processes, and also of the torch node settings. It will also make sure they all report to the same experiment. What do you think?
The main reason to add the timeout is because the warning was annoying to users 🙂
The secondary reason was that clearml will start reporting based on seconds from start, then once iterations begin it reverts back to iterations. But if the iterations are actually "epochs", the numbers are lower, so you end up with a graph that does not match the expected "iterations" x-axis. Make sense?
WorriedParrot51 trains should support subparsers etc.
Even if your code calls the parsing before trains is initialized.
The only thing you need is to have the package imported by the time argparse parsing is called (not to initialize it, that can happen later).
It should (hopefully) solve the issue.
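A minimal sketch of that ordering (the subparser and arguments are just placeholders):
import argparse
import trains  # imported before the arguments are parsed; initialization happens later

parser = argparse.ArgumentParser()
sub = parser.add_subparsers(dest='command')
train = sub.add_parser('train')
train.add_argument('--lr', type=float, default=0.001)
args = parser.parse_args()

# Task.init can be called after the parsing
task = trains.Task.init(project_name='examples', task_name='subparser demo')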
AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION
Hi DisgustedDove53
Now for the clearml-session tasks, a port-forward should be done each time if I need to access the Jupyter notebook UI for example.
So basically this is why the k8s glue has --ports-mode.
Essentially you set up a k8s service (handling the incoming TCP ports), then the template.yaml used by the k8s glue should specify said service. Then clearml-session knows how to access the actual pod via the parameters the k8s glue sets on the Task.
Make sense?
Hi SpotlessFish46 ,
Is the artifact already in S3 ?
Is S3 configured as the default files_server in the trains.conf?
You can always use the StorageManager to upload to wherever you want and register the URL on the artifacts.
You can also programmatically change the artifact destination server to S3, then upload the artifact as usual.
What would be the best match for you?
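For reference, the two options look roughly like this (shown with current clearml naming; bucket paths are placeholders):
from clearml import Task, StorageManager

# option 2: change the artifact destination to S3, then upload as usual
task = Task.init(project_name='examples', task_name='artifact demo',
                 output_uri='s3://my-bucket/artifacts')
task.upload_artifact('my_data', artifact_object='/tmp/data.csv')

# option 1: upload manually with StorageManager and register the returned URL
url = StorageManager.upload_file(local_file='/tmp/data.csv',
                                 remote_url='s3://my-bucket/artifacts/data.csv')
task.upload_artifact('my_data_link', artifact_object=url)  # exact link handling may depend on the SDK version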
I guess we should have obfuscated the name better 🙂
Hi LazyFox65
So the idea is that you add two lines of code to your codebase:
from clearml import Task
task = Task.init(project_name='examples', task_name='change me')
And you run it once, then it will create the experiment, environment arguments etc.
Now that you have it in the UI you can clone / change all the fields and send for execution.
That said you can also create an experiment from CLI (basically pointing to a repo and entry point)
You can read here:
https://github.com/allegroa...
BTW: I think we had a better example, I'll try to look for one
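For the CLI route mentioned above, it looks roughly like this (repo/script/queue are placeholders):
clearml-task --project examples --name remote-run --repo https://github.com/user/repo.git --branch main --script train.py --queue default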
When are those keys used?
They are the default keys for internal access, basically just make up something, otherwise someone could access the server with the default keys
BattyLion34 I have a theory, I think that any Task on the "default" queue will fail if a Task is running on the "services" queue.
Could you create a toy Task that just prints "." and sleeps for 5 seconds and then prints again?
Then while that Task is running, from the UI launch the Task that passed on the "default" queue. If my theory holds it should fail, then we will be getting somewhere 🙂
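Something along these lines would do as the toy Task (a minimal sketch; project/task names are placeholders):
import time
from clearml import Task

task = Task.init(project_name='examples', task_name='toy sleep task')
for _ in range(60):
    print('.')
    time.sleep(5)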
Looking at this example here, it looks like it only works with tasks:
Aha! Pipeline is a Task 🙂 (a specific type of Task, nonetheless a Task)
Just use the pipeline ID, and make sure you push it into the services queue, voila
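i.e. something along these lines (the ID is a placeholder):
from clearml import Task

# enqueue the pipeline (controller) Task into the services queue
Task.enqueue(task='<pipeline_task_id>', queue_name='services')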
No worries 🙂 glad to hear it worked out 🙂
a bit sad that there is no working integration with one of the leading time series framework...
You mean a series Darts reports? If it does report it, where does it do so? Are you suggesting we add a Darts integration (which sounds like a good idea)?
Hi UnevenHorse85
As far as I understand, users use logins and passwords specified in config/apiserver.conf to access webserver UI and key/secret key from their local ~/clearml.conf to access apiserver.
Correct 🙂
access apiserver. What is the use of all other security keys
To be able to configure the SDK client (i.e. clearml package) from OS environment and not clearml.conf file
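For example, instead of ~/clearml.conf you could export (values are placeholders):
export CLEARML_API_HOST=https://api.clear.ml
export CLEARML_WEB_HOST=https://app.clear.ml
export CLEARML_FILES_HOST=https://files.clear.ml
export CLEARML_API_ACCESS_KEY=<access_key>
export CLEARML_API_SECRET_KEY=<secret_key>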
CrookedWalrus33
Force SSH git authentication; it will auto-mount the .ssh from the host into the docker container
https://github.com/allegroai/clearml-agent/blob/6c5087e425bcc9911c78751e2a6ae3e1c0640180/docs/clearml.conf#L25
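If I'm pointing at the right line, that's the agent's force_git_ssh_protocol setting, i.e. something like:
agent {
    # force SSH for git cloning; the agent will also mount ~/.ssh from the host into the docker
    force_git_ssh_protocol: true
}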
If you set the TMP env variable you can control the tmp folder. Would that work?
Hi VexedCat68
Are we talking YouTube videos? Docs? Courses?