JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
4 Answers
2K Views
Hi, are the experiment logs stored in s3 or in the trains-server? (When using s3 as artifact storage)
4 years ago
0 Votes
2 Answers
2K Views
Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...
4 years ago
0 Votes
23 Answers
2K Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
5 years ago
0 Votes
2 Answers
2K Views
Hey there 🙂 Still on my journey to deploy the aws-autoscaler with spot instances, I have another question: I would like to limit the amount of time spent setti...
4 years ago
0 Votes
3 Answers
2K Views
hi guys, is it possible to spin up two agents on one GPU? Something like trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 0 --queue ...
4 years ago
0 Votes
5 Answers
2K Views
Hi again, it seems like the aws autoscaler is not spinning instances with the EBS configuration I configured. Here is the configuration: resource_configurati...
4 years ago
0 Votes
2 Answers
2K Views
Hi, where can I find the logs of trains-agent by default?
5 years ago
0 Votes
30 Answers
2K Views
Hi, is it possible to pass environment variables to agents created by the AWS AutoScaler service?
4 years ago
0 Votes
3 Answers
2K Views
⚠️ Hi there, I recently updated clearml server to 1.7.0, and found the following critical regression: When I reset an experiment, it is actually deleted 😵 ,...
2 years ago
0 Votes
3 Answers
2K Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? Or should I make sure that only the first process calls it first, then the others?
3 years ago
0 Votes
5 Answers
2K Views
Hey there, since which version does clearml stop connecting to the demo server by default?
4 years ago
0 Votes
4 Answers
2K Views
Hi guys, I got a very unexpected error today in one of my agents: ... Collecting tqdm Using cached tqdm-4.48.2-py2.py3-none-any.whl (68 kB) Processing /ro...
5 years ago
0 Votes
23 Answers
2K Views
Hi, I would like to raise awareness of this issue, which impacts my work as I cannot install the older version of torch (1.11.0)
2 years ago
0 Votes
0 Answers
2K Views
Hi, I encountered a bug on clearml-server 1.0.1: in a project page, I tried to add a custom column in +HYPER PARAMETERS > Args > queue and got an error pop up...
4 years ago
0 Votes
0 Answers
2K Views
Hi all, Would it be possible to make the aws autoscaler log each scale in/out operation in the console to help debugging/understanding the course of events?
4 years ago
0 Votes
25 Answers
2K Views
Hi, I have another problem 😅 On one of my agents, one experiment started without torch using the GPU. In the logs of the experiment shared below, we can see that...
5 years ago
0 Votes
16 Answers
2K Views
Got some errors while running migration script from ES5 to ES7: 2020-08-11 15:21:50,130 Running on: Linux 2020-08-11 15:21:50,227 Docker allocated memory: 16...
5 years ago
0 Votes
5 Answers
2K Views
Hi there, I would like to report a bug with the resizing of the columns in the projects view: it doesn’t work as expected. Please look at the behavior of the...
4 years ago
0 Votes
26 Answers
2K Views
Hi, I would like to follow up on this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...
3 years ago
0 Votes
4 Answers
2K Views
The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)
4 years ago
0 Votes
11 Answers
2K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
4 years ago
0 Votes
10 Answers
2K Views
Hi, another bug to report with the aws_auto_scaler using 1.1.2: Traceback (most recent call last): File "aws_autoscaler.py", line 297, in main() File "aws_au...
4 years ago
0 Votes
30 Answers
2K Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
4 years ago
0 Votes
4 Answers
2K Views
Hi, in the Metric Snapshot section of the Overview tab of a project page, would it be possible to: Show running experiments Have the legend clickable, to hid...
3 years ago
0 Votes
1 Answer
2K Views
Small error in doc: https://allegro.ai/docs/references/trains_agent_ref/#daemon The detach parameter is shown in the command as --detached while it is listed...
5 years ago
0 Votes
30 Answers
2K Views
Could you please explain a bit more how trains adapts the torch version depending on the installed cuda version? Here is my setup: cuda 102 installed and corr...
5 years ago
0 Votes
1 Answers
2K Views
4 years ago
0 Votes
12 Answers
2K Views
Hey, would it be possible to add an option to make task.upload_artifact() blocking? (Not running in background)
5 years ago
0 Votes
18 Answers
2K Views
Hi guys, the following error has now popped up several times in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
4 years ago
0 Votes
2 Answers
2K Views
Another one: What is the difference between Task.connect() and Task.set_parameter?
5 years ago
0 Hey, I Have A Problem With The Following Task:

AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, this is the one I use

5 years ago
0 I Guess One Experiment Is Running Backwards In Time

Just caught another star 😄

3 years ago
0 Hey There, I Would Like To Increase The

because at some point it introduces too much overhead I guess

4 years ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

AgitatedDove14 This seems to be consistent even if I specify the absolute path to /home/user/trains.conf

5 years ago
0 Is It Possible To Run An Agent, Listen To The Services Queue Without Using Docker?

btw, I tried with alpine instead of ubuntu:18.04 and got:

Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
df20fa9351a1: Pulling fs layer
df20fa9351a1: Verifying Checksum
df20fa9351a1: Download complete
df20fa9351a1: Pull complete
Digest: sha256:185518070891758909c9f839cf4ca393ee977ac378609f700f60a771a2dfe321
Status: Downloaded newer image for alpine:latest
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting containe...

5 years ago
0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

Some more context: the second experiment finished and now, in the UI, in workers&queues tab, I see randomly
trains-agent-1 | - | - | - | ... (refresh page) trains-agent-1 | long-experiment | 12h | 72000 |

5 years ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

Nope, I’d like to wait and see how the different tools improve over this year before picking THE one 😄

4 years ago
0 Hi There, I Have A Problem With Pyjwt: I Am Using

Hi SuccessfulKoala55 , yes indeed

4 years ago
0 Hi, In The Clearml-Server Web-Ui, Under Debug Sample, Would It Be Possible To Improve The Logic For Fetching The Images? If I Have Say 200 Iteration, It Will The Last By Default. If I Want To See Iteration 50, I Will Have To Manually Click On The Arrow Un

Hi CostlyOstrich36 , most of the time I want to compare two experiments in the DEBUG SAMPLE, so if I click on one sample to enlarge it I cannot see the others. Also once I closed the panel, the iteration number is not updated

2 years ago
0 Hi, What Happens Exactly When I Execute The Following Command:

Thanks AgitatedDove14 !
What would be the exact content of NVIDIA_VISIBLE_DEVICES if I run the following command?
trains-agent daemon --gpus 0,1 --queue default &

5 years ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

After some investigation, I think it could come from the way you catch errors when checking the creds in trains.conf: when I passed the aws creds using env vars, another error popped up, linked to boto3: https://github.com/boto/botocore/issues/2187

5 years ago
0 Hi, Is It Possible To Pass Temporary Iam Role To The Web App Could Access?

They are, but this doesn’t work - I guess it’s because temp IAM accesses have an extra token, that should be passed as well, but there is no such option on the web UI, right?

3 years ago
0 Hi, Is It Possible To Pass Environment Variables To Agents Created By The Aws Autoscaler Service?

resource_configurations {
    A100 {
        instance_type = "p3.2xlarge"
        is_spot = false
        availability_zone = "us-east-1b"
        ami_id = "ami-04c0416d6bd8e4b1f"
        ebs_device_name = "/dev/xvda"
        ebs_volume_size = 100
        ebs_volume_type = "gp3"
    }
}

queues {
    aws_a100 = [["A100", 15]]
}

extra_trains_conf = """
agent.package_manager.system_site_packages = true
agent.package_manager.pip_version = "==20.2.3"
"""

extra_vm_bash_script = """

sudo apt-get install -y libsm6 libxext6 libx...

4 years ago
5 years ago
4 years ago
0 Hi There, I Used

AgitatedDove14 So I’ll just replace task = clearml.Task.get_task(clearml.config.get_remote_task_id()) with Task.init() and wait for your fix 🙂

3 years ago
0 Hi, It Seems That The

Ok so it seems that the single quote is the reason, using double quotes works

5 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

AppetizingMouse58 the events_plot.json template is missing the plot_len declaration, could you please give me the definition of this field? (Reindexing with dynamic: strict fails with: "mapping set to strict, dynamic introduction of [plot_len] within [_doc] is not allowed")

4 years ago
0 Could You Please Explain A Bit More How Trains Adapt The Torch Version Depending On The Installed Cuda Version? Here Is My Setup:

Oh I see, I think we are now touching a very important point:
I thought that torch wheels already included cuda/cudnn libraries, so you don't need to care about the system cuda/cudnn version because eventually only the cuda/cudnn libraries extracted from the torch wheels were used. Is this correct? If not, then does that mean that one should use conda to install the correct cuda/cudnn cudatoolkit?

5 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

Yes, that’s what I did initially, but eventually I decided that it adds too much complexity for nothing really. I’d rather drop omegaconf and, if one day clearml supports it out of the box, take advantage of it

3 years ago