Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 5 months ago

Reputation: 0

Badges: 25 × Eureka!
0 Hi, I Recently Started Evaluating Trains. Given That Tensorboard Is Much More Mature, And Our Team Is Used To It, I Think It Is Likely We Won’T Want To Stop Using Tensorboard Completely And Just Switch To Trains. But I Am Thinking It Could Be Pretty Use

Hi LivelyLion31, I missed your S3 question, apologies. What did you guys end up doing?
BTW you could always upload the entire TB log folder as an artifact; it's a simple task.upload_artifact('tensorboard', './tblogsfolder')
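As a rough sketch of that one-liner in context (the project/task names and the local log folder path are placeholders, not something from the original thread):
` from clearml import Task

# placeholder project/task names; any existing Task works just as well
task = Task.init(project_name="examples", task_name="upload tensorboard logs")

# upload the entire TensorBoard log directory as a single artifact
task.upload_artifact(name='tensorboard', artifact_object='./tblogsfolder') `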

4 years ago
0 Gm Folks, Really Liking Clearml So Far As My Top Choice (After Looking At Dvc, Mlflow), And Thank You For Your Help Here! I Had Another Q: Is There A Recommended Workflow To Be Able To “Drop Into” The

gm folks, really liking ClearML so far as my top choice (after looking at dvc, mlflow), and thank you for your help here!

Thanks HurtWoodpecker30!

Is there a recommended workflow to be able to “drop into” the exact env (code, venv, data) of a previous experiment (which may have been several commits ago), to reproduce that experiment?

You can use clearml-agent on your local machine to build the environment of any Task:
` clearml-agent build --id <ta...
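The command above is cut off in the original; a typical full invocation (the task ID and target folder are placeholders, and exact flags may vary between clearml-agent versions) looks something like:
` # rebuild the Task's environment into a local folder
clearml-agent build --id <task_id> --target ~/task_env `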

2 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

oh...so is this a bug?

It was always a bug, only an elusive one 😉
Anyhow, I'll make sure we push a fix to GitHub; an RC is planned for later this week and it will contain the fix.

3 years ago
0 Is There Any Reason Why Doing The Following Is Not Possible? Am I Doing It Right? I Want To Run A Pipeline With Different Parameters But I Get The Following Error?

Right! I just noticed that! This is odd... and yes, it definitely has something to do with the multi pipeline executed on the agent; I think I know what to look for...
(Just making sure (again): running_locally produced exactly what we were expecting, is that correct?)

2 years ago
0 Hi Everyone, How Do I Integrate Sagemaker With Clearml , Currently I Only See Wandb Integrated With The Hugging Face And Don'T See Any Tutorials On Clearml , I Am Fine Tuning A Llama Model And Following This

Hi @<1549202366266347520:profile|GorgeousMonkey78>

how do I integrate sagemaker with clearml ,

you mean to launch an experiment, or just to log it?

11 months ago
0 I Have Set
  • try with the latest RC 1.8.1rc2

it feels like after git clone, it spends minutes without outputting anything

Yeah, that is odd. Can you run the agent with --debug (add it before the daemon command), and then at the end of the command add --foreground?
Now launch the same task on that queue, you will have a verbose log in the console.
Let us know what you see
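A sketch of the suggested invocation (the queue name is a placeholder; flag placement may differ slightly between clearml-agent versions):
` clearml-agent --debug daemon --queue <queue_name> --foreground `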

4 months ago
0 Hello

Sorry I need the full log ... feel free to DM it to me

one year ago
0 One More Thing, I'M Trying To Take Full Advantage Of The Controller, But I Run Into A Problem In My Use Case. The Controller Is Super Useful For Creating A Dag Of Tasks Which Is A Behaviour Of Interest. But Issues Rise When The Tasks Are Changing. Not On

SmarmySeaurchin8 I might be missing something in your description. The way the pipeline works,
the Tasks in the DAG are pre-executed (either with "execute_remotely" or actually fully executed once).
The DAG nodes themselves are executed on the trains-agent, which means it reproduces the code / env for every cloned Task in the DAG (not on the original Tasks).
WDYT?
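For context, a minimal sketch of building such a DAG from pre-existing Tasks with the current clearml API (the pipeline/project names, queue, and task IDs below are all placeholders):
` from clearml.automation import PipelineController

# each step references an existing Task; the controller clones it and the agent executes the clone
pipe = PipelineController(name="example pipeline", project="examples", version="1.0")
pipe.set_default_execution_queue("default")

pipe.add_step(name="stage_data", base_task_id="<task_id_1>")
pipe.add_step(name="stage_train", base_task_id="<task_id_2>", parents=["stage_data"])

pipe.start() `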

3 years ago
0 Hi, I Was Some How Able To Get A Project Running Yesturday, However Now I Am Unable To Get It Running, I Keep Getting An Failed Getting Token Error

i keep getting an failed getting token error

MiniatureCrocodile39 what's the server you are using?

3 years ago
0 Hi! Trying To Run The Following Very Basic Code. The First Few Parts Works As They Should:

Hi FunnyTurkey96
Any chance you can try to run with the latest from GitHub? (I just tested your code and it seemed to work on my machine.)
pip install git+

3 years ago
0 Hi Everybody, I'M Running Experiments Inside A Docker Which Includes Multiple Python Instances, Some Of Them Are Inside Conda Environments. How Can I Specify The Agent To Use A Specific Conda Environment Inside The Docker?

Hi CrookedWalrus33

docker_setup_bash_script= ["export PATH=""/workspace/miniconda/bin:$PATH"])

Oh I think you are correct, this should do the trick:
docker_setup_bash_script= ["export PATH=/workspace/miniconda/bin:$PATH", "export LOCAL_PYTHON=/workspace/miniconda/bin/python3"]This will make sure both agent and script execute on the same python

but to run a script inside a docker which already has the environment built in.

If this is already activated, the latest agent w...
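For reference, a hedged sketch of where that setup script would typically be passed (assuming the docker_image / docker_setup_bash_script arguments of Task.set_base_docker; the image name and paths are placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="conda inside docker")

# these lines run inside the container before the task starts,
# so the agent and the task both use the container's conda python
task.set_base_docker(
    docker_image="my_registry/my_image:latest",
    docker_setup_bash_script=[
        "export PATH=/workspace/miniconda/bin:$PATH",
        "export LOCAL_PYTHON=/workspace/miniconda/bin/python3",
    ],
) `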

2 years ago
0 Hello, I Am Using Clearml In Docker Mode. I Have A Simple Script That Runs Locally, Runs On The Target Machine Running The Same Tensorflow Container, But Doesn'T Run When I Deploy It Using Clearml. Here'S The Log Of The Error:

TroubledHedgehog16

but doesn't run when I deploy it using clearml. Here's the log of the error:

...

My guess is that clearml is reimporting keras somewhere, leading to circular dependencies.

It might not be circular, but I would guess it does have something to do with the order of imports. I'm trying to figure out what would be the difference between a local run and using an agent.
Is it the exact same TF version?

one year ago
0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

this is not the case as all the scalars report the same iterations

MassiveHippopotamus56 could it be the machine statistics? (i.e. cpu/gpu etc.; these are considered scalars as well...)

2 years ago
0 When My Remote Task Is Installing The Python Dependencies

Could it be that something else is missing and hence the import fails?

one year ago
0 Hi!

Ohh I see now the force SSH did not replace the user in the SSH link (only if the original was http), right?

3 years ago
0 Hi! I Am Having Some Problems With A Loss After A Good Amount Of Training, What Would Be The Best Way To Log A Value To Have A Better Idea Of What Is Happening?

I would do something like:

` from clearml import Logger

def forward(...):
    self.iteration += 1
    weights = self.compute_weights(...)
    m = (weights * (target - preds)).mean()
    # report the intermediate value as a scalar so it shows up in the ClearML UI
    Logger.current_logger().report_scalar(title="debug", series="mean_weight", value=m, iteration=self.iteration)
    return m `
2 years ago
0 Any Idea Why I Get This Error In All My Agents

Is this still an issue? (If you provide a queue name, the default tag is not used, so no error should be printed.)
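For reference, starting the agent with an explicit queue name looks something like this (the queue name is a placeholder):
` clearml-agent daemon --queue <queue_name> `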

3 years ago
0 Please Tell Me What Ram Metric Is Tracked By Clearml? What I See In Htop And On The Board Don'T Match Even Though It'S The Same Server 20 Gb Vs 70Gb

Hi @<1523702932069945344:profile|CheerfulGorilla72>

Please tell me what RAM metric is tracked by ClearML?

Free RAM is the entire machine's free RAM.
Yeah, htop shows odd numbers as it doesn't "count" allocated buffers.
Specifically, you can see the code here:
None
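As a rough illustration of that distinction (this is not the ClearML monitoring code the link refers to), psutil separates "free" from "available" memory, where the latter adds back reclaimable buffers/caches:
` import psutil

vm = psutil.virtual_memory()
# "free" excludes buffers/caches; "available" adds back memory the kernel can reclaim.
# This gap is why htop-style numbers can differ a lot from a plain "free RAM" metric.
print(f"total     : {vm.total / 1e9:.1f} GB")
print(f"free      : {vm.free / 1e9:.1f} GB")
print(f"available : {vm.available / 1e9:.1f} GB") `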

one year ago
0 Hi Fam! Sorry For The Potential Dumb Question, But I Couldn’T Find Anything On The Interwebs About It. I’M Hosting A Clearml Server On Aws, Using S3 As A Backend For Artifact Storage. I Find That Whenever I Delete Archived Artifacts In The Web App, I Get

I get a popup saying that the actual files weren’t deleted from S3 (so presumably only the metadata on the server gets deleted).

Hi QuaintPelican38
The browser client actually issues the delete "command" (the idea is separation of the meta-data and the data, e.g. artifacts). That means you have to provide the key/secret to the UI (see the profile page).

2 years ago
0 Hi Guys, Just Wondering If Anyone Encountered This Error When Using The Pipeline Controller Object. I Simply Added A Step With The Step-Name And Base_Task_Id As Flags.

The release was supposed to be out this week but got delayed by a py2 support issue. Anyhow, the release will be almost exactly like the latest we now have on the GitHub repo (and I'm assuming it will be out just after the weekend).

3 years ago
0 Is There A Way To Set Precedence On Package Managers? If We Set An Agent To Use

Hmmm maybe 

I thought that was expected behavior from the poetry side, actually

I think this is the expected behavior, hence bug?!

2 years ago