Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes
4 Answers
961 Views
0 Votes 4 Answers 961 Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
3 years ago
0 Votes
5 Answers
955 Views
0 Votes 5 Answers 955 Views
Hello, I have a small question regarding UI: Currently, in the artifacts section of a task, the FILE PATH displayed for artifacts stored in s3 are displayed ...
4 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi, in one of my agents with CUDA Version: 11.1 (from nvidia-smi), clearml agent 0.17.1 detects version 100 (I can see from experiments logs: agent.cuda_vers...
3 years ago
0 Votes
7 Answers
1K Views
0 Votes 7 Answers 1K Views
Hi, one more question: When creating a task with Task.init(), we can specify the task_type . Now when using Task.clone(), we cannot specify the task_type (is...
4 years ago
0 Votes
1 Answers
956 Views
0 Votes 1 Answers 956 Views
The Markdown editor for documenting projects is so great, I love it 🤩
3 years ago
0 Votes
11 Answers
1K Views
0 Votes 11 Answers 1K Views
Hey, I moved my trains-server to another machine, zipping the /opt/trains/data folder as described in the docs https://allegro.ai/docs/deploying_trains/train...
4 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi there, any plan/benefit to support virtualenv= 20 ?
4 years ago
0 Votes
5 Answers
917 Views
0 Votes 5 Answers 917 Views
Hey again 😁 I am migrating my trains-server to AWS and I would like now to have secure accounts (with password). But I don't want to loose the current users...
4 years ago
0 Votes
16 Answers
1K Views
0 Votes 16 Answers 1K Views
Hi guys, coming this time to share an idea of a killer feature for ClearML 🚀 I am pretty sure you guys already heard of https://www.streamlit.io/ , which is...
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
2 years ago
0 Votes
28 Answers
1K Views
0 Votes 28 Answers 1K Views
Hi, I am trying to use omegaconf with task.connect_configuration and I get the following error: >>> OmegaConf.create(task.connect_configuration(config_dict))...
2 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi, I think I found a small bug: Clone an experiment Enqueue it on a queue with no workers Delete the queue Try to Dequeue the experimentThe last operation w...
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hi, I have an error with clearml-agent 1.5.1 when importing tensorflow 2.10 from tensorflow.python.client._pywrap_tf_session import * File "/root/.clearml/ve...
one year ago
0 Votes
22 Answers
1K Views
0 Votes 22 Answers 1K Views
Hi, I would like to switch from the elastic-search service in the docker-compose of the clearml-server to an externally managed, scalable elastic-search clus...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi there 🙂 Task.get_parameters() returns an empty dict from within a trains-agent task being executed. When I execute it outside, it works properly. Is it i...
4 years ago
0 Votes
1 Answers
934 Views
0 Votes 1 Answers 934 Views
Small error in doc: https://allegro.ai/docs/references/trains_agent_ref/#daemon The detach parameter is shown in the command as --detached while it is listed...
4 years ago
0 Votes
30 Answers
998 Views
0 Votes 30 Answers 998 Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi, I just updated clearml-server to 1.1.0 and got the following error when starting it with docker-compose: clearml-apiserver | [2021-08-02 13:37:09,852] [8...
3 years ago
0 Votes
4 Answers
994 Views
0 Votes 4 Answers 994 Views
Hi, what happens exactly when I execute the following command: trains-agent daemon --gpus 0 --queue default &In my code, how to know which GPU to choose insi...
4 years ago
0 Votes
8 Answers
976 Views
0 Votes 8 Answers 976 Views
Hi, is it possible to pass temporary IAM role to the web app could access?
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Does trains 0.16 supports pip >=20.2?
4 years ago
0 Votes
30 Answers
940 Views
0 Votes 30 Answers 940 Views
Hi, if I am starting my training with the following command: python -u -m torch.distributed.launch --nproc_per_node=2 --use_env train.py --config configs/tra...
3 years ago
0 Votes
26 Answers
1K Views
0 Votes 26 Answers 1K Views
Hi, I would like to follow-up in this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...
2 years ago
0 Votes
19 Answers
1K Views
0 Votes 19 Answers 1K Views
one year ago
0 Votes
8 Answers
1K Views
0 Votes 8 Answers 1K Views
Hi guys, is a Task updating its status to 'Complete' before finishing to upload its artifacts/metrics in the background?
4 years ago
0 Votes
3 Answers
938 Views
0 Votes 3 Answers 938 Views
Hey there, I see that in the autoscaler configuration, the queues param accept dictionaries with values of type list of lists (see eg below.) What does it me...
3 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi, how can I easily start a shell script from within an experiment and have its logs (stdin/err) logged in clearml?
2 years ago
0 Votes
17 Answers
970 Views
0 Votes 17 Answers 970 Views
Hi there, I have a problem with PyJWT: I am using trains==0.16.4 and trains-agent==0.16.3 in my agents. I installed PyJWT==1.7.1 in the agent (through extra_...
3 years ago
0 Votes
7 Answers
1K Views
0 Votes 7 Answers 1K Views
2 years ago
0 Votes
19 Answers
1K Views
0 Votes 19 Answers 1K Views
Hi again, I am trying to make the aws autoscaler work with ec2 instances, but it fails to setup the agent in the machine: the logs of the user-data script sh...
3 years ago
Show more results questions
0 Hi, Is It Possible To Pass Environment Variables To Agents Created By The Aws Autoscaler Service?

SO I updated the config with:
resource_configurations { A100 { instance_type = "p3.2xlarge" is_spot = false availability_zone = "us-east-1b" ami_id = "ami-04c0416d6bd8e4b1f" ebs_device_name = "/dev/xvda" ebs_volume_size = 100 ebs_volume_type = "gp3" key_name = "<my-key-name>" security_group_ids = ["<my-sg-id>"] subnet_id = "<my-subnet-id>" } }
but I get in the logs of the autoscaler:
` Warning! exception occurred: An error occurred (InvalidParam...

3 years ago
0 Hi, Is It Possible To Pass Environment Variables To Agents Created By The Aws Autoscaler Service?

For some reason the configuration object gets updated at runtime to
resource_configurations = null queues = null extra_trains_conf = "" extra_vm_bash_script = ""

3 years ago
0 Hi Guys, I Had Several Times Now The Following Errors Poping In Agents While Executing A Task:

yes, here is the error (the space at the end of the line is there)
` Applying uncommitted changes
Executing: ('git', 'apply'): b'error: corrupt patch at line 13\n'
Failed applying diff
trains_agent: ERROR: Failed applying git diff:
diff --git a/configs/2.2.2_from_scratch.yaml b/configs/2.2.2_from_scratch.yaml
index 9fece48..5816f78 100644
--- a/configs/2.2.2_from_scratch.yaml
+++ b/configs/2.2.2_from_scratch.yaml
@@ -136,7 +136,7 @@ data_processing:
optimizer:
type: 'RMSprop'
args:

  • lr: 2.5e...
4 years ago
0 Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

In all the steps I want to store them as artifacts to s3 because it’s very convenient.
The last step should merge them all, ie. it needs to know all the other artifacts of the previous steps

2 years ago
0 Hi, Is It Possible To Pass Environment Variables To Agents Created By The Aws Autoscaler Service?

(Btw the instance listed in the console has no name, it it normal?)

3 years ago
3 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

` # Set the python version to use when creating the virtual environment and launching the experiment
# Example values: "/usr/bin/python3" or "/usr/local/bin/python3.6"
# The default is the python executing the clearml_agent
python_binary: ""
# ignore any requested python version (Default: False, if a Task was using a
# specific python version and the system supports multiple python the agent will use the requested python version)
# ignore_requested_python_version: ...

one year ago
0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

That would be awesome, yes, only from my side I have 0 knowledge of the pip codebase 😄

3 years ago
0 Hi, I Would Like To Switch From The Elastic-Search Service In The Docker-Compose Of The Clearml-Server To An Externally Managed, Scalable Elastic-Search Cluster. I Have Two Questions:

You mean it will resolve by itself in the following days or should I do something? Or there is nothing to do and it will stay this way?

3 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

If the reporting is done on a subprocess, I can imagine that the task.set_initial_iteration(0) call is only effective in the main process, not in the subprocess used for reporting. Could it be the case?

3 years ago
0 Hi Again, My Clearml Api-Server Is Having A Memory Leak. Each Time I Restart It, Its Ram Consumption Grows Until Getting Oom, Is Not Killed And Make The Ec2 Instance Crash

Something like that?
` curl "localhost:9200/events-training_stats_scalar-adx3r00cad1bdfvsw2a3b0sa5b1e52b/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{
"match": {
"variant": "loss_model"
}
},
{
"match": {
"task": "8f88e4b8cff84f23bde74ed4b7213ec6"
}
}
]
}
},
"aggs": {
"series": {
"terms": { "field": "iter" }
}
}
}...

3 years ago
2 years ago
4 years ago
0 Hi

And if you need a very small change, you can also simply https://www.geeksforgeeks.org/monkey-patching-in-python-dynamic-behavior/ it

3 years ago
0 Hi

MagnificentSeaurchin79 You could also just fork the tensorflow repo, make changes in a specific branch and specify your forked repo with your custom branch in the install_requires of your setup.py

3 years ago
4 years ago
0 Hi, I Encounter A Weird Behavior: I Have A Task A That Schedules A Task B. Task B Is Executed On An Agent, But With An Old Commit

for some reason when cloning task A, trains sets an old commit in task B. I tried to recreate task A to enforce a new task id and new commit id, but still the same issue

4 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Not of the ES cluster, I only created a backup of the clearml-server instance disk, I didn’t think there could be a problem with ES…

3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Now I am trying to restart the cluster with docker-compose and specifying the last volume, how can I do that?

3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

So it can be that when restarting the docker-compose, it used another volume, hence the loss of data

3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

I’ve set dynamic: “strict” in the template of the logs index and I was able to keep the same mapping after doing the reindex

3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

I see that I have several volumes:
` $ docker volume ls
DRIVER VOLUME NAME
local 5b0bfe5ab1a3d645bd635b2fb6f2aefd2b657d566019343c8305959903996c67
local 43b60287d60db798dc9d1defe1d7d861334c9c8299aefad6da2f20db278cfc5b
local 1406d50aa65ab55d323500d1fb23f19adfc6e721261ab6103a59d20e82146099
local 7367a215bd42a4e888e5d88ce708bf74aedc48a6e9417c72a19739cb80f25e6d
local 7413c39f5e4b6568304832d9d2e925ebdbf47ad31ad22d77830d3618af79237b
local a55cb71edff48c2138a5da9d8d1e26df3b...

3 years ago
Show more results compactanswers