EnviousStarfish54 a fix is already available in the latest RC
Could you verify it solves your issue as well?
pip install trains==0.16.2rc0
I just set the git credentials in the clearml.conf and it works out of the box
git has issues with passing the user/token from the main repo to the submodules, hence my surprise that it is working out-of-the-box.
Do notice that if you are using an ssh-key this is a non-issue.
Nope, no .netrc defined anywhere, ...
If this is the case, can you try to add the following to your "extra_vm_bash_script"
echo machine example.com > ~/.netrc && echo log...
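For reference, a minimal sketch of what such a .netrc setup could look like (the host, user, and token below are placeholders, not values from this thread):

```shell
# Placeholder values -- substitute your actual git host, user, and token.
# Writes ~/.netrc so git over HTTPS can authenticate non-interactively.
echo "machine git.example.com" >  "$HOME/.netrc"
echo "login my-git-user"       >> "$HOME/.netrc"
echo "password my-git-token"   >> "$HOME/.netrc"
chmod 600 "$HOME/.netrc"   # netrc files should not be world-readable
```

git (and pip) will then pick up the credentials automatically for that machine entry.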
Hi BoredPigeon26
what do you mean by "reuse the task" ? is this manual execution (i.e. from code)?
How about archiving the old version?
You can also force Task.init to always create a new Task (which preserves the previous run alongside the execution tab)
Basically what's the specific use case ?
Hi, I was expecting to see the container rather than the actual physical machine.
It is the container, it should tunnel directly into it (or that's how it should be).
SSH port 10022
However, I have not yet found a flexible solution other than ssh-agent forwarding.
And is it working?
For example HPO, early stopping. It would mark the Task as aborted. Make sense ?
Hi ConfusedPig65
Any keras model will be automatically uploaded if you pass an upload URI to the Task init:
task = Task.init('examples', 'keras upload test', output_uri=" ")
(You can also pass output_uri=s3://bucket/folder or change the default output_uri in the clearml.conf file)
After this line any keras model will be automatically uploaded (you will see it under the Artifacts tab)
Accessing models from executed tasks:
` trains_task = Task.get_task('task_uid_here')
last_check...
Well it should work, make sure you see the Task "holds" all the information needed (under the execution tab). repo / uncommitted changes / python packages etc.
Then configure your agent (choose pip/conda/poetry as package manager), and spin it up (by default in venv/conda mode, or in docker mode)
Should work 🙂
Thanks @<1569496075083976704:profile|SweetShells3> ! let me see if I can reproduce the issue
Okay I found it, this is due to the fact that the newer versions are sending the events/images from a subprocess (it used to be a thread).
The object is created in the main process, which updates the file index (in a round-robin manner), but the check itself happens in the subprocess, which is not "aware" of the used indexes (i.e. it is always 0, hence when exceeding the history size, it skips it)
in the UI, find the task (just search for the Task ID, it will find it), then right-click it, and select "reset"
The one it is trying to execute, i.e. on the Task it shows as Script Path
BTW: if you make the right column the baseline (i.e. move it to the left), you will get what you probably expected
In any case, do you have any suggestion of how I could at least hack tqdm to make it behave? Thanks
I think I know what the issue is: it seems tqdm is using an escape sequence for the CR; this is the 1b 5b 41 sequence I see in the binary log.
Let me see if I can hack something for you to test 🙂
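For reference, the byte sequence 1b 5b 41 from the log is the ANSI "cursor up" control sequence (ESC [ A), which terminal progress bars use to redraw a line in place. A quick way to inspect it:

```python
# Decode the bytes seen in the binary log: 0x1b = ESC, 0x5b = '[', 0x41 = 'A'.
seq = bytes([0x1B, 0x5B, 0x41])
print(repr(seq))            # b'\x1b[A'
print(seq == b"\x1b[A")     # True: the ANSI "cursor up" escape sequence
```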
Hi FierceFly22
You called execute_remotely a bit too soon. If you have any manual configuration, it has to be done before the call, so it is stored in the Task. This includes task.connect and task.connect_configuration.
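A sketch of the required call order, using a stub rather than the real clearml SDK (the queue name and parameters are illustrative). The point is simply that all connect calls must precede execute_remotely, since execute_remotely stores the Task state and stops local execution:

```python
# Stub Task that records the order of calls -- NOT the clearml SDK.
calls = []


class StubTask:
    def connect(self, obj):
        calls.append("connect")

    def connect_configuration(self, path):
        calls.append("connect_configuration")

    def execute_remotely(self, queue_name):
        calls.append("execute_remotely")


task = StubTask()
task.connect({"lr": 0.001})                  # 1) manual configuration first
task.connect_configuration("config.yaml")    # 2) ... including config files
task.execute_remotely(queue_name="default")  # 3) only then hand off
print(calls)
```

Anything connected after step 3 never reaches the stored Task, which is the symptom described above.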
CloudyHamster42 FYI the warning will not be shown in the next Trains version, the issue is now fixed, thank you 🙂
Regarding the double axes, see if adding plt.clf() helps. It seems the axes are leftovers from the previous figure that somehow are still there...
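A minimal sketch of the suggested fix (assumes matplotlib is installed; the non-interactive Agg backend is used so it runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

plt.plot([1, 2, 3])    # first figure
plt.clf()              # clear leftover axes before drawing the next figure
plt.plot([4, 5, 6])    # second figure starts clean
n_axes = len(plt.gcf().axes)
print(n_axes)          # the current figure holds a single fresh axes
```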
It fails during the add_step stage for the very first step, because task_overrides contains invalid keys
I see, yes I guess it makes sense to mark the pipeline as Failed 🙂
Could you add a GitHub issue on this behavior, so we do not miss it ?
Hi ArrogantBlackbird16
but it returns a task handle even after the Task has been closed.
It should not ... That is a good point!
Let's fix that 🙂
Hmm so is the problem having the gituser inside the code? or the k8s_glue print ?
How can I make it show progress less often/rewrite?
I'm not sure this is configurable ... you mean like reports on the uploads right? (i.e. report every 5mb I think is the default)
while we are at it, maybe we should use tqdm if it is installed
wdyt?
FYI all the git pulls are cached even in docker mode so there is no "tax" to pay for pulling the sub-modules (only the first time of course)
In that case, no the helm chart does not spin a default agent (You should however spin a service mode agent for running pipelines logic)
Which would mean the error is because of a company firewall/self-signed certificate.
The easiest solution: disable the SSL certificate check for ClearML.
Create the ~/clearml.conf manually:
# disable SSL certificate check
api.verify_certificate: False
copy paste the credentials section from the UI
it should look something like:
api {
    # web_server on port 8080
    web_server: " "
    # Notice: 'api_server' is the api server (default port 8008), not the web server.
    api_server: ...
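For orientation, a filled-in api section could look something like the sketch below. All URLs and keys here are hypothetical placeholders (a local default-port deployment), not values from this thread; use the credentials section copied from your own UI.

```
api {
    web_server: "http://localhost:8080"
    api_server: "http://localhost:8008"
    files_server: "http://localhost:8081"
    # disable SSL certificate check (self-signed / firewall certificates)
    verify_certificate: False
    credentials {
        access_key: "PLACEHOLDER_ACCESS_KEY"
        secret_key: "PLACEHOLDER_SECRET_KEY"
    }
}
```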
GiganticTurtle0 found it, fix will be pushed tomorrow 🙂
Hi SmugTurtle78
Unfortunately there is no actual filtering for these logs, because they are so important for debugging and visibility. I have to ask, what's the use case to remove some of them ?
Thanks JumpyPig73
Yeah this would explain it ... (if hydra is setting something else we can tap into that as well)