AgitatedDove14
Moderator
49 Questions, 8126 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

25 × Eureka!
0 I Updated Trains-Server Today, And Now It'S Very Unstable, Web Interface Randomly Stops Working. Anyone Had The Same Problem? I'Ve Never Had Any Problems With Updating The Server Before

web-server seems okay, could you send the logs from the api-server?
Also if you can, the console logs from your browser, when you get the blank screen. Thanks.

5 years ago
0 Regarding The “Classic” Datasets (Not Hyper Datasets): Is There An Option To Do Something Equivalent To Dvc’S “

you can run md5 on the file as stored in the remote storage (nfs or s3)

s3 is implementation specific (i.e. MinIO, Weka, Wasabi, etc. might not support it) and I'm actually not sure regarding nfs (I mean you can run it, but it actually means you are reading the data; that said, nfs by definition is, I'm assuming, relatively fast access)
wdyt?
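
For reference, a minimal sketch of what "run md5 on the file" could look like, assuming the remote storage is mounted as a regular path (e.g. an nfs mount). The function name `file_md5` and the chunk size are just illustrative choices, not part of clearml:

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Note this still reads the whole file once, which is exactly the cost mentioned above.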

3 years ago
0 So I Bumped Onto This Comparison Shared By Dagshub. It Kinda Placed Clearml Is A Rather Bad Position Compared To Everything Else In The Industry.

Please feel free to do so (always better to get it from a user, not the team behind the product 😉)

4 years ago
0 Hi There! Some Background Info Before I Put Forward My Question: I'M Writing-Up A Small Script To Help Me Manage My Tasks. Specifically I Often Need To Abort (And Archive) A

Hi StickyMonkey98

a very large number of running and pending tasks, and doing that kind of thing via the web-interface by clicking away one-by-one is not a viable solution.

Bulk operations are now supported, upgrade the clearml-server to 1.0.2 🙂

Is it possible to fetch a list of tasks via Task.get_tasks,

Sure:
Task.get_tasks(project_name='example', task_filter=dict(system_tags=['-archived']))
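
A rough sketch of how that call could feed a small clean-up script. The helper names (`select_active`, `abort_active`) are made up for illustration, the status string `"in_progress"` is my assumption of how running tasks are reported, and queued tasks may need to be dequeued separately, so treat this as a starting point rather than a tested recipe:

```python
def select_active(tasks, active_statuses=("in_progress",)):
    """Return the tasks whose `.status` is one of `active_statuses`."""
    return [t for t in tasks if str(t.status) in active_statuses]


def abort_active(project_name):
    """Fetch non-archived tasks in a project and stop the running ones."""
    # Imported here so the pure helper above works without clearml installed.
    from clearml import Task

    tasks = Task.get_tasks(
        project_name=project_name,
        task_filter=dict(system_tags=["-archived"]),
    )
    for task in select_active(tasks):
        # Marks the task as stopped on the server (assumed sufficient to abort it).
        task.mark_stopped()
```

Calling `abort_active("example")` needs a configured clearml-server; `select_active` can be tried on any objects that have a `status` attribute.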

4 years ago
0 Hi Everyone, Does Anybody Now If The Latest Release 1.15 Is Still Vulnerable To

Hi Martin, of course not,

Smart!

I was just wondering if it has been patched yet and if not what is the expected timeline for patching it

Yes, I believe the target is a patch version 1.15.1 to be released in a couple of weeks. This is not a major issue but it's always better to have it fixed. (btw: the enterprise version never had this issue to begin with, because it is of course authenticated, and it also has an additional RBAC layer on top.)

one year ago
0 Hi, I Have Another Problem

What do you see in the console when you start the trains-agent? It should detect the cuda version.

5 years ago
0 I Have The Slack Server Running At Localhost:8080 When Trying To Access It From A Remote Computer, I Am Getting A Screen Like So: How Can I See The Dashboard From Another Computer?

WobblyCrab70 sure, put a load-balancer in between. AWS has a solution for that: basically use the AMI from the GitHub and ask IT to add https on the 8080/8008/8081 ports.

5 years ago
0 Hi, I Am New Here, Can I Ask Question On Trains-Server Also?

OHH nice, I thought that it just some kind of job queue on up and running machines

It's much more than that, it's a way of life 🙂
But seriously now, it allows you to use any machine as part of your cluster, and send jobs for execution from the web UI (any machine, even just a standalone GPU machine under your desk, or any cloud GPU instance, or any mix of the two :)

Maybe I need to change something here:

apiserver.conf

Not sure, I'm still waiting on answer... It...

5 years ago
0 Hi, I Am New Here, Can I Ask Question On Trains-Server Also?

It manages the scheduling process, so no need to package your code, or worry about building dockers etc. It also has an AWS autoscaler, that spins ec2 instances based on the amount of jobs you have in the execution queue, and the limit of your budget (obviously spinning down machines that are idle)
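
To make the autoscaler idea a bit more concrete, here is a toy sketch of a single scaling decision. It is not the actual autoscaler code: the function `plan_scaling` and its parameters are hypothetical, and the budget limit is simplified to a cap on the number of instances:

```python
def plan_scaling(queued_jobs, busy, idle, max_instances):
    """Return (to_start, to_stop) instance counts for one autoscaler tick."""
    # Idle machines can pick up queued jobs before any new instance is started.
    uncovered = max(0, queued_jobs - idle)
    # Never exceed the budget cap on the total number of instances.
    headroom = max(0, max_instances - (busy + idle))
    to_start = min(uncovered, headroom)
    # Idle machines with no queued job left to pick up are spun down.
    to_stop = max(0, idle - queued_jobs)
    return to_start, to_stop
```

For example, with 5 queued jobs, 2 busy machines, 1 idle machine and a cap of 6, the sketch starts 3 instances and stops none.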

5 years ago
0 Hi, I Am New Here, Can I Ask Question On Trains-Server Also?

CooperativeFox72 btw, are you guys running those 20 experiments manually or through trains-agent ?

5 years ago
0 Hi, I Am New Here, Can I Ask Question On Trains-Server Also?

CooperativeFox72 yes, 20 experiments in parallel means that you always have at least 20 connections coming from different machines, and then you have the UI adding on top of it. I'm assuming the sluggishness you feel is the requests being delayed.
You can configure the API server to have more process workers, you just need to make sure the machine has enough memory to support it.

5 years ago
0 Hi, I Am New Here, Can I Ask Question On Trains-Server Also?

Let me check... I think you might need to docker exec
Anyhow, I would start by upgrading the server itself.
Sounds good?

5 years ago
0 Quick Question On The

GrievingTurkey78 short answer no 😞
Long answer: the files are stored as differential sets (think change sets relative to the previous version(s)). The collection of files is then compressed and stored as a single zip. The zip itself can be stored on Google, but on their object storage (not GDrive). Notice that the default storage for clearml-data is the clearml-server; that said, you can always mix and match (even between versions).
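
To illustrate what a change set between two versions means, a small sketch that compares two file listings by content hash. This is only a conceptual illustration, not how clearml-data is implemented; `change_set` and the `{path: hash}` layout are assumptions made for the example:

```python
def change_set(previous, current):
    """Compare two {path: content_hash} mappings and return what changed."""
    added = sorted(p for p in current if p not in previous)
    removed = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return {"added": added, "modified": modified, "removed": removed}
```

Only the added and modified files would need to go into the new version's zip; everything else is inherited from the previous version(s).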

4 years ago
0 Hi - Quick Question. I Am Using The Pipelinecontroller With Abort_On_Failure Set To False. I Have A Pipe With A First Task That Branch Out In 3 Branches.

if the first task failed - then the remaining tasks are not scheduled for execution, which is what I expect.

agreed

I'm just surprised that if the first task is

aborted

instead by the user,

How is that different from failed? The assumption is that if a component depends on another one, it needs its output; if it does not, then they can run in parallel. What am I missing?

one year ago
0 Hi, I Upgraded The Clearml Client To

Hi CooperativeFox72
I think the upload reporting (files over 5mb) was added post 0.17 version, hence the log.
The default upload chunk reporting is 5MB, but it is not configurable; maybe we should add it to the clearml.conf? wdyt?

4 years ago
0 Hi, I Upgraded The Clearml Client To

CooperativeFox72 I would think the easiest would be to configure it globally in the clearml.conf (rather than add more arguments to the already packed Task.init) 🙂
I'm with you on 60 messages being way too much.
Could you open a GitHub Issue on it, so we do not forget?

4 years ago
0 Hi All

The main reason to add the timeout is because the warning was annoying to users 🙂
The secondary reason was that clearml will start reporting based on seconds from start, then when iterations start it will revert back to iterations. But if the iterations are "epochs" the numbers are lower, so you end up with a graph that does not match the expected "iterations" x-axis. Make sense?

4 years ago
0 Hi All

This will set more time before the timeout right?

Correct.

task.freeze_monitor()
download()
task.defrost_monitor()

Currently there isn't, but that's a good idea.
What would be the argument for using it vs increasing the timeout?
btw: setting the resource timeout to 99999 will basically mean that it will wait until the first reported iteration, not that it will just sleep for 99999sec 🙂

4 years ago
5 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

Hi CooperativeFox72

But my docker image has all my code and all the packages it needs, I don't understand why the agent needs to install all of those again?

So based on the docker file you previously posted, I think all your python packages are actually installed on the "appuser" and not as system packages.
Basically remove the "add user" part and the --user from the pip install.
For example:
FROM nvidia/cuda:10.1-cudnn7-devel

ENV DEBIAN_FRONTEND noninteractive
RUN ...

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

CooperativeFox72
Could you try to run the docker and then inside the docker try to do:
su root whoami

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

Okay we have something 🙂
To your clearml.conf add:
agent.docker_preprocess_bash_script = [ "su root", "cp -f /root/*.conf ~/", ]
Let's see if that works

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

I am creating this user

Please explain, I think this is the culprit ...

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

but I am think they done it for a reason no?

Not a very good one, they just installed everything under the user and used --user for the pip.
It really does not matter inside a docker, the only reason one might want to do that is if you are mounting other drives and you want to make sure they are not accessed with "root" user, but with 1000 user id.

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

Let me check if we can hack something...

4 years ago
0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

Yes, this is definitely the issue: the agent assumes the docker user is "root".
Let me check something

4 years ago