Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 6 months ago

Reputation

0

Badges 1

25 × Eureka!
0 Votes
0 Answers
984 Views
0 Votes 0 Answers 984 Views
New video is out ๐Ÿ™‚ Cloud Autoscalers are awesome https://www.youtube.com/watch?v=j4XVMAaUt3E
2 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
YEY!!!! Download as CSV ๐Ÿคฏ
2 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
This is usually due to enterprise level issued https certificates not part of the local installation (basically any python generated SSL request will fail)
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
New RC for trains-agent is out pip install trains-agent==0.13.2rc1
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
Is it a one time thing? or recurring?
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
you set it ๐Ÿ™‚
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
Slack security ... Go figure ๐Ÿ˜‰
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
Is you server using https ?!
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
apparently everyone can ...
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
docs are up
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
Finally
4 years ago
0 Votes
0 Answers
953 Views
0 Votes 0 Answers 953 Views
4 years ago
0 Votes
0 Answers
975 Views
0 Votes 0 Answers 975 Views
Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of...
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
New releases: pip install trains==0.13.3https://github.com/allegroai/trains/releases/tag/0.13.3 pip install trains-agent==0.13.2https://github.com/allegroai/...
4 years ago
0 Votes
1 Answers
956 Views
0 Votes 1 Answers 956 Views
Quick note: v1.3.1 caused PipelineDecorator Tasks to by default disable the automagic frameworks connection, this bug is solved in the latest RC pip install ...
2 years ago
0 Votes
9 Answers
972 Views
0 Votes 9 Answers 972 Views
Hi
Hi https://github.com/allegroai/trains/releases/tag/0.15.1 / https://github.com/allegroai/trains-server/releases/tag/0.15.1 / https://github.com/allegroai/tr...
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
We are at AAAI NY, come look us up :)
4 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
https://m.facebook.com/story.php?story_fbid=2484620658505570&id=1620822758218702&refid=52&tn=-R
4 years ago
Show more results questions
0 Hi Everyone And Thanks Again For The Help, I Still Have No Success In Running Clearml Agent, It Just Gets Stuck Without Any Output, On Debug Mode For

Okay found the issue, to disable SSL verification global add the following env variable:
CLEARML_API_HOST_VERIFY_CERT=0(I will make sure we fix the actual issue with the config file)

2 years ago
0 Hello Community! Is There An Option To Only Download A Part Of A Dataset With .Get_Local_Copy()? I Imagine Something Like This, But I Can'T Find The Right Way To Do It.

I see...
Current (and this will change soon) the entire delta is stored in a single file, so there is no real way to download a "subset" of the data, only a parent version ๐Ÿ˜ž

Lets say that this small dataset has a ID ....

Yes this would be exactly the way to do so:

` param ={'dataset': small_train_dataset_id_here}
task.connect(param)

dataset_folder = Dataset.get(param['dataset']).get_local_copy()
... Locally it will use the small_train_dataset_id_here ` , then whe...

3 years ago
0 I Am Not Familiar With Pytorch, But Is It Expected That So Many “Models” Are Created? These Are Being Repeated As Well For A Single Task (This Is Training A T5_Model With Transformers):

Do people use ClearML with huggingface transformers? The code is std transformers code.

I believe they do ๐Ÿ™‚
There is no real way to differentiate between, "storing model" using torch.save and storing configuration ...

3 years ago
3 years ago
0 Hello! Question About

Notice this is per frame (single) not per 8

one year ago
0 Hello! Since Today I Get

Yes that is exactly what I will make sure we change :)

3 years ago
0 Hi All, Are There Any Alternatives To Storing User Credentials In

Hi @<1687653458951278592:profile|StrangeStork48>
I have good news, v1.0 is out with hashed passwords support.

3 years ago
0 Hey Everyone

Yes that should work, only thing is you need to call Task init on the master process (and make sure you call Task.current_task() on the subprocesses, if you want to automagic to kick in, that said, usually there is no need, they are supposed to report everything back to the main one anyhow
basically
` @call_parse
def main(
ย  ย gpus:Param("The GPUs to use for distributed training", str)='all',
ย  ย script:Param("Script to run", str, opt=False)='',
ย  ย args:Param("Args to pass to script", nargs=...

2 years ago
0 Hi All, I'M Trying To Deploy Trains On Rancher (Nice Kubernetes Cluster Orchestration Project) Where I'M Quite New To Rancher And Kubernetes. I Have Been Able To Install Trains Using Helm

Hi WickedGoat98
Regardless on the ingress configuration (which seems like you have the hang of), the API instance itself needs to be configured with persistent volume (the web / file server do not need direct access to the API server).
Can you get the API to run properly ?

Regrading the trains-agent once you have the API/Web/File server configured, you can configure it like the trains-agent-services is configured inside the docker-compose (e.g. set the environment variable with the c...

3 years ago
0 Hi, Is There Any Document About Migration Clearml-Server. Currently, I Have Clearml-Server Running On Servera But I Want To Move All Data (Including Artifacts, Task, Dataset) From Servera To Serverb.

For example, ServerA stores file at /opt/clearml but ServeB stores at /some_path/clearml

As long as you adjust your docker-compose yaml file, should be just fine

2 years ago
one year ago
0 Hey! Is There Way To Get Latest/Best Checkpoint From Another Task (I Know Task Id)? I Know How To Get Data From Artifacts:

Hi FlatOctopus65
You are almost there
prev_task: Task = Task.get_task(task_id=<prev_task_id_here>) model = prev_task.models['output'][-1] my_check_point = model.get_local_copy()

one year ago
0 Hi I'M Trying To Clearml-Agent In My Dockerfile, But Even After Copying The Clearml.Conf To My Dockerfile Working Dir, The Clearlml Agent Does Not Start, Throwing Error Couldn'T Find ~/Clearlml.Conf How Do I Resolve This?

Hi TenderCoyote78

I'm trying to clearml-agent in my dockerfile,

I'm not sure I'm following, Are you traying to create a docker container containing the agent inside? for what purpose ?
(notice that the agent can spin any off the shelf container, there is no need to add the agent into the container it will take of itself when it is running it)

Specifically to your docker file:

RUN curl -sSL

| sh

No need for this line

COPY clearml.conf ~/clearml.conf

Try the ab...

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

So that agent on different nodes will probably require different cuda-version images.

That makes sense SarcasticSquirrel56
I would edit the helm chart (or deploy manually) based on a selector that will select the different nodes/gpus and assign the correct containers (i.e. matching CUDA versions to the diff GPUs / drivers)
BTW: you can also playaround with k8s glue, which would dynamically spin pods based on clearml Tasks.
wdyt?

2 years ago
0 Hi, V1 Of Agent Seems To Have Removed Agent.Package_Manager.Force_Repo_Requirements_Txt. Is This Still Available In Other Forms?

SubstantialElk6 This seems to be the issue
cp: failed to access '/root/default_clearml.conf': Permission denied clearml_agent: ERROR: Could not find task id=024a421c0e174650a1c7ff64af756c26 (for host: )Notice it seems it just cannot read the clearml.conf , wdyt?

3 years ago
0 Hi, I Have A Question About How To Get Notified When A Task Finishes Running. We Are Creating Training Tasks Programatically On Clearml Server. It Is Relatively Easy To Do Through Directly Accessing The Clearml Server Api. However, We'D Like Some Way T

Hi JoyousElephant80

Another possibility would be to run a process somewhere that periodically polls ClearML Server for tasks that have recently finished

this is the easiest way to implement what you are after, and have full control over the logic itself.
Basically you inherit from the Monitor class
And implement the callback function:
https://github.com/allegroa...

3 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

This is the reason you are getting an error ๐Ÿ™‚
Basically the session asks the agent to setup a new SSH server with credentials on the remote machine, this is not an issue inside a container, as this is an isolated environment, but when running in venv mode the User running the agent is not root, hence it cannot spin/configure an SSH server.
Make sense ?

2 years ago
0 Hi. After Upgrading Clearml To Latest Version, Got This Error From My Pipeline (Windows10, Configured And Running Tensorflowod For Tf 2.3.):

BattyLion34 is this running with an agent ?
What's the comparison with a previously working Task (in terms of python packages) ?

3 years ago
0 I Have An On-Prem/Free Clearml-Server Setup With Custom S3 Back-End Storage. I'M Trying Out The Clearml-Serving Capability And Not Sure What'S Failing. When I Start The Serving Containers It Can'T Retrieve The Model:

Okay that makes sense, if this is the case I'm assuming you have set the files server to point to your S3 bucket is that correct ?
could it be you are missing the credentials for that (it is trying to upload the preprocessing code there, so the clearml-serving container would be able to pull it later)

2 years ago
0 Hello! I Was Hoping I Could Get Some Debug Help. I'Ve Set Up A Clearml Pipeline Using The Pipelinecontroller, And When Running Through

sets up the venv correctly, prints

Starting Task Execution:

then does nothing

Can you provide a log?
Do you see the code/git reference in the Pipeline Task details - Execution Tab ?

one year ago
0 Hi Everyone, Is There A Way To Increase The Cache Size Of Each Clearml Task? I'M Running An Experiment And Many Artifacts Are Downloaded. My Dataloader Fails To Load Some Of The File Since They Are Missing, Although They Were Downloaded. I Guess There Is

Hi ScaryKoala63
Sure, add the following to your clearml.conf:
sdk.storage.cache.default_cache_manager_size = 400I think you are correct, it seems like for some reason you hit the cache limit, and a previous entry was deleted

2 years ago
3 years ago
0 Is There Any Reason Why Doing The Following Is Not Possible? Am I Doing It Right? I Want To Run A Pipeline With Different Parameters But I Get The Following Error?

After testing the code again, I see the task parameter dictionary has been removed properly

Great!

However, I still have the same problem with duplicate tasks, as you can see in the image.

Any chance the pipeline script Itself is running from the agent (as opposed to running the pipeline code locally, then the pipelines are executed on the agent)?

2 years ago
Show more results compactanswers