DefeatedOstrich93 many thanks, I was able to reproduce it (basically, newly added files caused git apply to fail)
Fix will be part of the next clearml-agent RC
UptightMouse31 You can add any metric (KPI) with "manual" logging: Logger.current_logger().report_scalar("KPI", "metric", iteration=0, value=1.1)
This means you can later add a column KPI/metric to your experiment table.
Will this do the trick ?
Local changes are applied before installing requirements, right?
correct
Hi VexedCat68
(sorry I just saw the message)
I wanted to ask, how to run pipeline steps conditionally? E.g if step returns a specific value, exit the pipeline or run another step instead of the sequential step
To do so you can do:
` def pre_execute_callback_example(a_pipeline, a_node, current_param_override):
    # if we want to skip this node (and subtree of this node) we return False
    ...
    # we decided to skip so we return False
    return False
pipe.add_step(name='...
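To make the skip semantics concrete, here is a minimal pure-Python simulation. It is not ClearML code: `run_pipeline`, the step names, and the rule that a step is skipped when any of its parents was skipped are all assumptions made for illustration. It only shows the idea that a pre-execute callback returning False skips the node and its subtree.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a pipeline: step name -> list of parent step names.
# The dict's insertion order is assumed to be a valid execution order.
Steps = Dict[str, List[str]]
PreCallback = Callable[[str], bool]


def run_pipeline(
    steps: Steps, pre_callbacks: Optional[Dict[str, PreCallback]] = None
) -> Dict[str, str]:
    """Return a status ("executed" or "skipped") for every step."""
    pre_callbacks = pre_callbacks or {}
    status: Dict[str, str] = {}
    for name, parents in steps.items():
        # Assumption: a skipped parent means the whole subtree is skipped.
        if any(status[parent] == "skipped" for parent in parents):
            status[name] = "skipped"
            continue
        callback = pre_callbacks.get(name)
        # A callback that returns False asks the runner to skip this step.
        if callback is not None and callback(name) is False:
            status[name] = "skipped"
            continue
        status[name] = "executed"
    return status


steps = {
    "prepare": [],
    "train": ["prepare"],
    "evaluate": ["train"],
    "report": ["prepare"],
}
# Skip "train"; "evaluate" depends on it, "report" does not.
result = run_pipeline(steps, {"train": lambda step_name: False})
```

In the real pipeline the decision inside the callback would typically depend on the parameters or on the result of an earlier step.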
An upload of 11GB took around 20 hours which cannot be right.
That is very, very slow; this is about 152 KB/s (roughly 1.2 Mbit/s) ...
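The rate can be checked with quick arithmetic. This sketch assumes 11 GB means 11 × 10^9 bytes and that the upload took exactly 20 hours; the variable names are just for illustration.

```python
# Assumption: decimal gigabytes (1 GB = 10**9 bytes) and exactly 20 hours.
size_bytes = 11 * 10**9
duration_seconds = 20 * 3600

bytes_per_second = size_bytes / duration_seconds
kilobytes_per_second = bytes_per_second / 1000      # kilobytes per second
megabits_per_second = bytes_per_second * 8 / 10**6  # megabits per second

print(f"{kilobytes_per_second:.0f} KB/s, {megabits_per_second:.2f} Mbit/s")
```

So the figure of about 152 is in kilobytes per second, which is a little over 1.2 megabits per second.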
Hi BeefyHippopotamus73
I checked the template task and the list of "Installed Packages" indeed does not have one of my required packages in the list.
Basically the "installed packages" section is auto-populated based on the packages directly imported in your code base.
Could it be that you do not import snowflake-connector-python directly, and it is a derivative package (i.e. required by a different package)?
BTW: when you clone your Task in the UI you can edit and add the missing packages,...
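To illustrate why a package that is never imported directly can be missed, here is a toy sketch that collects top-level imports with the standard `ast` module. This is not how ClearML actually analyzes a code base; `directly_imported_packages` and the example source (pandas, sqlalchemy) are assumptions made for illustration.

```python
import ast
from typing import Set


def directly_imported_packages(source: str) -> Set[str]:
    """Collect the top-level names of modules imported in the given source."""
    packages: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import, not a relative one
            packages.add(node.module.split(".")[0])
    return packages


# Hypothetical script: the snowflake connector is only used indirectly
# (e.g. through another library), so it never appears in an import statement.
example_source = """
import pandas as pd
from sqlalchemy import create_engine
"""
found = directly_imported_packages(example_source)
```

Anything that only arrives as a dependency of another package would not show up in such a scan, which is why adding it by hand (or importing it explicitly) may be needed.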
Yes that makes total sense to me. How about a GitHub issue on the clearml-docs ?
ScantMoth28 where are you seeing this warning ?
it will constantly try to resend logs
Notice this happens in the background, in theory you will just get stderr messages when it fails to send but the training should continue
Hover over the border (I would suggest to use the full screen, i.e. maximize)
My only point is, if we have no force_git_ssh_port or force_git_ssh_user, we should not touch the SSH link (i.e. less chance of us messing with the original URL if no one asked us to)
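A minimal sketch of that rule, assuming a hypothetical helper `maybe_rewrite_ssh_url` (this is not the agent's actual code): the URL is returned untouched unless a user or port override was explicitly requested.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# scp-like syntax, e.g. git@host:group/repo.git (but not scheme://...)
_SCP_LIKE = re.compile(r"^(?:(?P<user>[^@/]+)@)?(?P<host>[^:/]+):(?P<path>(?!//).+)$")


def maybe_rewrite_ssh_url(
    url: str, force_user: Optional[str] = None, force_port: Optional[int] = None
) -> str:
    """Rewrite an SSH git URL only if an override was requested."""
    if force_user is None and force_port is None:
        return url  # nobody asked for a change, keep the original link as-is
    if url.startswith("ssh://"):
        parts = urlsplit(url)
        user, host, port, path = parts.username, parts.hostname, parts.port, parts.path
    else:
        match = _SCP_LIKE.match(url)
        if match is None:
            return url  # not an SSH link (e.g. https), leave it alone
        user, host, port = match["user"], match["host"], None
        path = "/" + match["path"].lstrip("/")
    user = force_user or user
    port = force_port or port
    netloc = host
    if user:
        netloc = f"{user}@{netloc}"
    if port:
        netloc = f"{netloc}:{port}"
    return urlunsplit(("ssh", netloc, path, "", ""))


original = "git@github.com:allegroai/clearml.git"
untouched = maybe_rewrite_ssh_url(original)
with_port = maybe_rewrite_ssh_url(original, force_port=2222)
```

The early return is the point: without an explicit override the function never parses or rebuilds the URL, so it cannot accidentally change it.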
With the warning ?
I was able to reproduce it on the old versions, but it seems fixed on the latest from GitHub.
There are also "completed, aborted, queued" .
Archived is actually a tag (system tag, not user tag). There is a "state machine" for moving from one state to the other. The special case is "published", which we probably should have called "locked". The idea is that if a Task/Model is published, you cannot reset it (and even deleting requires a force flag).
I would use additional user tags (or even system-tags) to mark "deployed" state, wdyt?
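As a rough illustration only, here is a toy model of the "published means locked" idea with a user tag marking deployment. `ToyTask`, its method names, and the "created" status are assumptions for this sketch, not the server's actual state machine.

```python
class ToyTask:
    """Toy model: a published task cannot be reset, and deleting it needs force."""

    def __init__(self) -> None:
        self.status = "created"  # assumed initial status, for illustration
        self.user_tags = set()
        self.deleted = False

    def publish(self) -> None:
        self.status = "published"

    def reset(self) -> None:
        if self.status == "published":
            raise RuntimeError("published tasks are locked and cannot be reset")
        self.status = "created"

    def delete(self, force: bool = False) -> None:
        if self.status == "published" and not force:
            raise RuntimeError("deleting a published task requires force=True")
        self.deleted = True


task = ToyTask()
task.publish()
# "deployed" is tracked as a user tag, not as a separate status.
task.user_tags.add("deployed")

try:
    task.reset()
    reset_blocked = False
except RuntimeError:
    reset_blocked = True
```

Keeping "deployed" as a tag leaves the status transitions alone, so the same task can be published and marked deployed at the same time.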
VexedCat68 both are valid. In case the step was cached (i.e. already executed) the node.job will be None, so it is probably safer to get the Task based on the "executed" field which stores the Task ID used.
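A small sketch of the safer lookup, using stand-in objects instead of real pipeline nodes. The helper name `executed_task_id` and the placeholder ID are assumptions; only the `job` and `executed` fields come from the discussion above.

```python
from types import SimpleNamespace
from typing import Any, Optional


def executed_task_id(node: Any) -> Optional[str]:
    """Return the Task ID stored in the node's "executed" field, if any."""
    # Works for cached steps too, where node.job is None.
    return getattr(node, "executed", None)


# Stand-ins for pipeline nodes (hypothetical values, for illustration only).
cached_node = SimpleNamespace(job=None, executed="abc123")
pending_node = SimpleNamespace(job=None, executed=None)

cached_id = executed_task_id(cached_node)
pending_id = executed_task_id(pending_node)
```

The returned ID can then be passed to Task.get_task(task_id=...) to fetch the Task itself, after checking that it is not None.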
Hi GrittyCormorant73
When I archive the pipeline and go into the archive and delete the pipeline, the artifacts are not deleted.
Which clearml-server version are you using? The artifact delete was only recently added
My bad, I worded my question wrong I see,
LOL no worries
Any chance you have some "debug" leftover in the Pipeline code:
https://github.com/allegroai/clearml/blob/7016138c849a4f8d0b4d296b319e0b23a1b7bd9e/examples/pipeline/pipeline_from_decorator.py#L113
Maybe we should show a warning when it is being called, or ignore it when running via an agent ...
OK, I got it by modifying the .conf file and putting the credentials on node
Nice!
In the "installed packages" section you should have "nvidia-dali-cuda110". In the agent's clearml.conf you should add: extra_index_url: ["
", ]
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L78
Should solve the issue
IrateBee40
Check the first steps here:
https://clear.ml/docs/latest/docs/getting_started/ds/ds_first_steps
(Basically you have to generate credentials / configure your machine so it knows where the server is and how to access it)
Make sense ?
So far my local and remote gitlab repositories are synchronized. I suspect that the "Failed applying git diff, see diff above" error is caused by the cached repository from which clearml tries to run the process. I've cleaned the cache, but it hasn't helped.
Hmm can you test with empty "uncommitted changes" ?
Just making sure: when you say it still doesn't work, you are not trying to run the Task with the git diff that includes the binary data, right?
Yes, but where I can fi...
So clearml-init can be skipped, and I provide the users with a template and ask them to append the credentials at the top, is that right?
Correct
What about the "Credential verification" step in clearml-init command, that won't take place in this pipeline right, will that be a problem?
The verification test is basically making sure the credentials were copy pasted correctly.
You can achieve the same by just running the following in your python console:
` from clearml import Ta...
No, I mean actually compare using the UI, maybe the arguments are different or the "installed packages"
Yep, that would do it ...
You can disable it with: Task.init(..., auto_connect_frameworks={'scikit': False})
Could you manually configure the ~/trains.conf ?
(Just copy paste the section from the UI)
then try to run: trains-agent list
Hmm TrickyRaccoon92 take a look at the cleanup service, I think you can hack it so instead of deleting the artifacts, it will archive them somewhere (also you can change the filter, maybe only perform on experiments with specific user tag)
What do you think?
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
Hi ColossalAnt7
Try ctrl-F5 and refresh the page?!
It seems you are missing a few buttons
JitteryCoyote63
Picks a new experiment on top of the long one running
This is very very strange. Is the long running experiment being logged (i.e. do you still see console output in the UI)?