Hi EnviousStarfish54
The Enterprise edition extends Trains functionality.
It adds security, scale and full data management (data management and versioning being the key difference)
You can get it as a SaaS solution or on-prem.
If you need more information, you can leave your contact details on the website; I'm sure sales will be happy to help :)
Hi EnviousStarfish54
You mean the console output? If that's the case, the Task.init call will monkey-patch sys.stdout/sys.stderr to report to clearml as well as the console.
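To illustrate the general idea of patching sys.stdout so output goes to two places, here is a minimal sketch. It is not ClearML's actual implementation; `TeeStream`, `capture_console` and the in-memory `sink` are hypothetical names used only for this example, with the sink standing in for whatever reports to the server.

```python
import io
import sys


class TeeStream:
    """Forward every write to the original stream and to a second sink."""

    def __init__(self, original, sink):
        self._original = original
        self._sink = sink

    def write(self, text):
        self._sink.write(text)  # hypothetical "report to the server" step
        return self._original.write(text)  # still print to the console

    def flush(self):
        self._original.flush()
        self._sink.flush()


def capture_console(sink):
    """Patch sys.stdout so prints reach both the console and `sink`.

    Returns a function that restores the original stream.
    """
    original = sys.stdout
    sys.stdout = TeeStream(original, sink)

    def restore():
        sys.stdout = original

    return restore


sink = io.StringIO()  # stands in for the remote log collector
restore = capture_console(sink)
print("epoch 1 loss 0.5")
restore()
captured = sink.getvalue()
```

After `restore()` runs, `captured` holds everything that was printed while the patch was active, and the console still received the same text.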
Also, in the same open docker session, can you try: $LOCAL_PYTHON -m clearml_agent execute --disable-monitoring --id <task_id_here>
Where the Task ID is one of the failed executions (just reset it first)
GrievingTurkey78 I'm not sure I follow, are you asking how to add additional scalars ?
how would I get an agent to launch in the same instance of my clearml server
Actually that is my point: you do not have to spin the agent on the clearml-server instance. We added the services agent as part of the docker-compose for easier deployment. That said, you can always manually SSH to the server, or run it on any other machine, like you would spin any other clearml-agent.
Does that make sense ?
agree, but setting the agent's env variable TMPDIR
I think this needs to be passed to the docker with -e TMPDIR=/new/tmp
as additional container args:
see example
wdyt?
Thanks! Let me check if we can reproduce it. BTW what's your clearml package version?
Hi JitteryCoyote63
Just making sure, the package itself is installed as part of the "Installed packages", and it also installs a command line utility ?
So assuming they are all on the same LB IP, you should do:
LB 8080 (https) -> instance 8080
LB 8008 (https) -> instance 8008
LB 8081 (https) -> instance 8081
It might also work with:
LB 443 (https) -> instance 8080
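The port mapping above can be written down as a small lookup, which is handy when filling in the client configuration. This is only a sketch: the service names follow the default ClearML server ports (web UI on 8080, API on 8008, file server on 8081), while `lb_endpoints` and the host `clearml.example.com` are hypothetical names made up for the example.

```python
# Default ClearML server ports, each forwarded 1:1 by the load balancer.
SERVICE_PORTS = {"web": 8080, "api": 8008, "files": 8081}


def lb_endpoints(lb_host, scheme="https"):
    """Build the URL of each service as seen through the load balancer."""
    return {
        name: f"{scheme}://{lb_host}:{port}"
        for name, port in SERVICE_PORTS.items()
    }


# Hypothetical load balancer host name, for illustration only.
endpoints = lb_endpoints("clearml.example.com")
for name, url in endpoints.items():
    print(f"{name}: {url}")
```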
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?
A true mystery
That said, I hardly think it is directly related to the trains-agent
...
Do you have any more insights on when / how it happens ?
Yep, I think you are correct. You should have had the same output as in a local Jupyter notebook, and it seems that in SageMaker Studio it is not working.
Let me check something
In order for the sample to work you have to run the template experiment once. Then the HP optimizer will find the best HP for it.
I would like to be able to send a request to unload the model (because I cannot load all the models in gpu, only 7-8)
Hmm, is this part of the gRPC interface of Triton? If it is, we should be able to add that quite easily.
Hi CostlyOctopus40
is opensearch supported in ClearML instead of Elasticsearch ? please shed some light on that
Long story short, maybe?! But this is not officially supported.
We only support Elasticsearch. The OpenSearch fork is not officially supported, and since we continue to use more advanced features of Elastic, the API may not be compatible in the future.
Out of curiosity, why are you using opensearch?
This is what I just used:
import os
from argparse import ArgumentParser
from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from clearml import Task
parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args =...
(just using local server not connected to Internet), am I right?
You can if you host your own git server, or if your code is a single file / jupyter notebook, then the entire code is stored on the Task.
btw: what is the exact setup, how come there is no git repo?
AgitatedTurtle16 from the screenshot, it seems the Task is stuck in the queue, which means there is no agent running to actually run the interactive session.
Basic setup:
A machine running clearml-agent (this is the "remote machine")
A machine running clearml-session (let's call it laptop)
You need to first start the agent on the "remote machine" (basically call clearml-agent daemon --docker --queue default). Once the agent is running on the remote machine, from your laptop ru...
The 'on-premise' server fails to connect to the ClearML server because of the VPN I think
I think you are correct.
You can quickly test it: try to run curl http://local-server:8008 and see if that works
Hi JumpyDragonfly13 , just making sure, do you have an agent running on a remote machine ?
Can you open a direct TCP connection to the remote machine? (the default port it will use is 10022)
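If it helps, a quick way to check the TCP connection from the laptop is a few lines of standard-library Python. This is a minimal sketch: `can_connect` is a hypothetical helper name, and the host in the usage comment is a placeholder you would replace with the remote machine's address.

```python
import socket


def can_connect(host, port=10022, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


# Example usage (placeholder address, replace with the remote machine):
# print(can_connect("remote-machine", 10022))
```

A False result usually points to a firewall, VPN or routing issue between the two machines rather than to the session itself.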
Sure, run: clearml-agent init
It is a CLI wizard to configure the initial configuration file.
TenseOstrich47 makes sense