JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
30 Answers
2K Views
Hi, Together with ElegantKangaroo44 we found two unexpected behaviors in task.models['output'] : The input model of the task is included in the list The best...
5 years ago
0 Votes
3 Answers
2K Views
Hey guys, quick question: is there a tool function to know if a task id is valid? Not verifying that the task itself exists, just that the task id is the cor...
5 years ago
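
A format-only check of the kind asked about above could be sketched as follows. This is only a sketch: it assumes that a task id is a UUID4 written as 32 lowercase hexadecimal characters, which is not confirmed anywhere on this page, and `looks_like_task_id` is a hypothetical helper name, not a ClearML function.

```python
import re
import uuid

# Assumption: a task id is a UUID4 rendered as 32 lowercase hex characters.
_TASK_ID_RE = re.compile(r"[0-9a-f]{32}")


def looks_like_task_id(value: object) -> bool:
    """Return True if value matches the assumed id format.

    This only checks the shape of the string; it never contacts a server,
    so it says nothing about whether the task exists.
    """
    return isinstance(value, str) and _TASK_ID_RE.fullmatch(value) is not None


if __name__ == "__main__":
    print(looks_like_task_id(uuid.uuid4().hex))
    print(looks_like_task_id("not-a-task-id"))
```

If the real id format differs (for example uppercase or dashed ids), only the regular expression needs to change.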
0 Votes
3 Answers
432 Views
3 months ago
0 Votes
3 Answers
2K Views
⚠️ Hi there, I recently updated clearml server to 1.7.0, and found the following critical regression: When I reset an experiment, it is actually deleted 😵 ,...
2 years ago
0 Votes
1 Answer
2K Views
Hi, I have a clearml-agent (1.1.2) in a g4dn.4xlarge AWS instance (with one T4 GPU), that reports agent.cuda_version = 0 agent.cudnn_version = 0 and does not ...
3 years ago
0 Votes
5 Answers
2K Views
3 years ago
0 Votes
2 Answers
2K Views
How can I filter out archived tasks with Task.get_tasks?
4 years ago
0 Votes
7 Answers
2K Views
Hi, one more question: When creating a task with Task.init(), we can specify the task_type . Now when using Task.clone(), we cannot specify the task_type (is...
5 years ago
0 Votes
1 Answer
2K Views
Hi, I have a question about https://clear.ml/docs/latest/docs/references/sdk/logger#report_scatter3d : Would it be possible to pass a matplotlib figure in 3d...
3 years ago
0 Votes
30 Answers
3K Views
Hello, I am getting ValueError: Could not get access credentials for ' s3://my-bucket ' , check configuration file ~/trains.conf but I did specify them in my...
5 years ago
0 Votes
3 Answers
2K Views
Hi ClearML team members! Is there any progress made on the clearml-serving repo? I’d love to start using it but I lack a straightforward get started example....
3 years ago
0 Votes
30 Answers
2K Views
Hi, I just updated clearml-server to 1.1.0 and got the following error when starting it with docker-compose: clearml-apiserver | [2021-08-02 13:37:09,852] [8...
4 years ago
0 Votes
1 Answer
1K Views
Hi there, would it be possible to add some Neural Architecture Search example, as for the HyperParameter Optimizer examples?
4 years ago
0 Votes
15 Answers
2K Views
Hi, I restarted my clearml-server (1.1.0) and the login page always redirects me to the login page. I am using fixed users in config files. In the logs of th...
4 years ago
0 Votes
14 Answers
2K Views
4 years ago
0 Votes
6 Answers
2K Views
3 years ago
0 Votes
4 Answers
2K Views
Hey again 😁 Is it possible to run multiple agents on the same machine? And with some in services mode?
5 years ago
0 Votes
5 Answers
2K Views
Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...
3 years ago
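
For the cleanup-on-abort question above, one generic approach is to register signal handlers, sketched below. This is an illustration under stated assumptions: the page does not say which signal, if any, the process receives when an experiment is aborted, so handling `SIGINT` and `SIGTERM` is a guess, and `install_abort_handlers` is a hypothetical helper name rather than part of ClearML.

```python
import signal
import sys


def install_abort_handlers(cleanup, signals=(signal.SIGINT, signal.SIGTERM)):
    """Register a handler that runs cleanup() once, then exits.

    Assumption: the abort arrives as one of `signals`. Must be called from
    the main thread, as required by signal.signal().
    """
    state = {"done": False}

    def _handler(signum, frame):
        # Guard against running the cleanup twice if several signals arrive.
        if not state["done"]:
            state["done"] = True
            cleanup()
        # 128 + signal number is the conventional exit code for a signal.
        sys.exit(128 + signum)

    for sig in signals:
        signal.signal(sig, _handler)
    return _handler


if __name__ == "__main__":
    install_abort_handlers(lambda: print("cleaning up"))
    signal.raise_signal(signal.SIGTERM)  # simulate an abort (Python 3.8+)
```

If the process is stopped with a signal that cannot be caught (such as `SIGKILL`), no handler of this kind will run.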
0 Votes
1 Answer
2K Views
Hi there, I moved my ClearML server from US to EU and now I am trying to setup the AWS autoscaler with the different architecture that I have now. So far I u...
4 years ago
0 Votes
7 Answers
2K Views
3 years ago
0 Votes
4 Answers
2K Views
Hey there, is there a way to access the trains configuration programmatically at runtime in a task (the configuration that is dumped by the agent in the logs...
5 years ago
0 Votes
5 Answers
2K Views
Hey there, since which version does clearml stop connecting to the demo server by default?
4 years ago
0 Votes
2 Answers
2K Views
Hi, is there a way to control after how much time an agent that went down is removed from the web-ui? I find the current value too high for my needs
2 years ago
0 Votes
3 Answers
2K Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? Or should I make sure that only the first process calls it first, then the others?
3 years ago
0 Votes
12 Answers
2K Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
2 years ago
0 Votes
11 Answers
2K Views
Are the various task types available in 0.15? I am getting > 2020-06-09 12:58:53,287 - trains.Task - WARNING - Retrying, previous request failed : 'custom' i...
5 years ago
0 Votes
5 Answers
2K Views
Quick question: How can I clone a task and change the cloned task type? I see no Task.set_type() function
5 years ago
0 Votes
5 Answers
2K Views
Hi, It seems that the package_manager.pip_version has been removed from the https://allegro.ai/docs/references/trains_ref/#agent , although still being shown...
5 years ago
0 Votes
18 Answers
2K Views
Hello there, I would like to run cleanup code in case the user aborts one task from the dashboard (the agent is not using the task in docker). What signal...
4 years ago
0 Votes
10 Answers
2K Views
Hey guys, I am setting up a new machine with two rtx 3070 GPUs where I created two agents (one for each GPU). On both agents, my experiments fail with error:...
4 years ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

File "devops/valid.py", line 80, in <module>
  valid(parse_args)
File "devops/valid.py", line 41, in valid
  valid_task.output_uri = args.artifacts
File "/data/.trains/venvs-builds/3.6/lib/python3.6/site-packages/trains/task.py", line 695, in output_uri
  ", check configuration file ~/trains.conf".format(value))
ValueError: Could not get access credentials for 's3://ml-artefacts' , check configuration file ~/trains.conf

5 years ago
0 Hi, I Would Like To Bring Awareness

oh seems like it is not synced, thank you for noticing (it will be taken care of immediately)

Thank you!

does not contain a specific wheel for cuda117 to x86, they use the pip default one

Yes so indeed they don't provide support for earlier cuda versions on latest torch versions. But I should still be able to install torch==1.11.0+cu115 even if I have cu117. Before that is what the clearml-agent was doing

2 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

` # Set the python version to use when creating the virtual environment and launching the experiment
# Example values: "/usr/bin/python3" or "/usr/local/bin/python3.6"
# The default is the python executing the clearml_agent
python_binary: ""
# ignore any requested python version (Default: False, if a Task was using a
# specific python version and the system supports multiple python the agent will use the requested python version)
# ignore_requested_python_version: ...

2 years ago
0 Hi There, It Seems Like There Is A Bug With The Visualization Of Debug Samples On The Ui (Server V1.2.0, Self-Hosted): When Clicking On A Debug Sample Then On The Download Button, If The Sample Is Stored In S3, The Download Button Opens A Blank Page With

Sure, it’s because of a very annoying bug that I shared in this https://clearml.slack.com/archives/CTK20V944/p1648647503942759 , that I couldn’t solve so far.

I’m not sure you can downgrade that easily ...

Yea that’s what I thought, that’s a bit of pain for me now, I hope I can find a way to fix the bug somehow

3 years ago
5 years ago
0 Hi, I Would Like To Switch From The Elastic-Search Service In The Docker-Compose Of The Clearml-Server To An Externally Managed, Scalable Elastic-Search Cluster. I Have Two Questions:

SuccessfulKoala55 I was able to recreate the indices in the new ES cluster. I specified number_of_shards: 4 for the events-log-d1bd92a3b039400cbafc60a7a5b1e52b index. I then copied the documents from the old ES using the _reindex API. The index is 7.5Gb on one shard.
Now I see that this index on the new ES cluster is ~19.4Gb 🤔 The index is divided into the 4 shards, but each shard is between 4.7Gb and 5Gb!
I was expecting to have the same index size as in the previous e...

4 years ago
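
The migration steps described in the answer above (create the index with an explicit shard count, then copy documents with `_reindex`) might correspond to request bodies like the ones in this sketch. It only builds the JSON payloads and sends nothing. The index name is taken from the post; the use of reindex-from-remote and the host `http://old-es:9200` are assumptions for illustration.

```python
import json

# Index name as quoted in the post above.
INDEX = "events-log-d1bd92a3b039400cbafc60a7a5b1e52b"


def create_index_body(number_of_shards: int) -> dict:
    """Body for `PUT /<index>` on the new cluster.

    The primary shard count is set when the index is created.
    """
    return {"settings": {"index": {"number_of_shards": number_of_shards}}}


def reindex_body(index: str, old_host: str) -> dict:
    """Body for `POST /_reindex` on the new cluster.

    Assumption: documents are pulled from the old cluster via
    reindex-from-remote; `old_host` is a hypothetical address.
    """
    return {
        "source": {"remote": {"host": old_host}, "index": index},
        "dest": {"index": index},
    }


if __name__ == "__main__":
    print(json.dumps(create_index_body(4), indent=2))
    print(json.dumps(reindex_body(INDEX, "http://old-es:9200"), indent=2))
```

Note that `_reindex` copies documents only; index settings and mappings on the destination have to be created separately, which is why the index is created first.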
0 Are The Various Task Types Available In 0.15? I Am Getting

Would you like me to open an issue for that or will you fix it?

5 years ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

And I can verify that ~/trains.conf exists in the su home folder

5 years ago
3 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

Doing it the other way around works:
` cfg = OmegaConf.create(read_yaml(conf_yaml_path))
config = task.connect(cfg)
type(config)

<class 'omegaconf.dictconfig.DictConfig'> `

3 years ago
0 Hi Again, My Clearml Api-Server Is Having A Memory Leak. Each Time I Restart It, Its Ram Consumption Grows Until Getting Oom, Is Not Killed And Make The Ec2 Instance Crash

well I still see some ES errors in the logs
` clearml-apiserver | [2021-07-07 14:02:17,009] [9] [ERROR] [clearml.service_repo] Returned 500 for events.add_batch in 65750ms, msg=General data error: err=('500 document(s) failed to index.', [{'index': {'_index': 'events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b', '_type': '_doc', '_id': 'c2068648d2fe5da975665985f44c20b6', 'status':..., extra_info=[events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b][0] primary shard is not...

4 years ago
0 Hi, I Would Like To Follow-Up In This

That said, v1.3.1 is already out, with what seems like a fix:

So you mean 1.3.1 should fix this bug?

3 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

erf, I have the same problem with ProxyDictPreWrite 😄 What is the use case of this one ?

3 years ago
0 Hi, Where Can I Find The Logs Of Trains-Agent By Default?

Thanks, the message is not logged in GCloud instances logs when using startup scripts, this is why I did not see it. 👍

5 years ago
0 Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

CostlyOstrich36 super thanks for confirming! I have then the follow-up question: are the artifacts duplicated (copied)? or just referenced?

3 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

Not really because this is difficult to control: I use the AWS autoscaler with ubuntu AMI and when an instance is created, packages are updated, and I don't know which python version I get, + changing the python version of the OS is not really recommended

2 years ago
0 Hey, I Have A Problem With The Following Task:

I mean that I have a taskA (controller) that is in charge of creating a taskB with the same argv parameters (I just change the entry point of taskB)

5 years ago
3 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

my docker-compose for the master node of the ES cluster is the following:
` version: "3.6"
services:

  elasticsearch:
    container_name: clearml-elastic
    environment:
      ES_JAVA_OPTS: -Xms2g -Xmx2g
      bootstrap.memory_lock: "true"
      cluster.name: clearml-es
      cluster.initial_master_nodes: clearml-es-n1, clearml-es-n2, clearml-es-n3
      cluster.routing.allocation.node_initial_primaries_recoveries: "500"
      cluster.routing.allocation.disk.watermark.low: 500mb
      clust...

4 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

Does the agent install the nvidia-container toolkit, so that GPUs of the instance can be accessed from inside the docker running jupyterlab?

3 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

sorry, the clearml-session. The error is the one I shared at the beginning of this thread

3 years ago
0 Hi, I Cannot Manage To Start Trains-Server 0.16 With The Docker-Compose File, The Trains-Elastic Container Fails With The Following Error:

Yes I did, I found the problem: docker-compose was using trains-server 0.15 because it didn't see the new version of trains-server. Hence I had trains-server 0.15 running with ES7.
-> I deleted all the containers and it successfully pulled trains-server 0.16. Now everything is running properly 🙂

5 years ago