Sorry @<1524922424720625664:profile|TartLeopard58> 😞 we probably missed it
clearml-session is still being developed 🙂
Which issue are you referring to ?
Are they mission critical, or are they just in the clearml cache folder?
hmmm... they are important, but only when starting the process. any specific suggestion ?
(and they are deleted after the Task is done, so they are temp)
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now
Are you running in docker mode ?
If so you can actually delete mapped files (they will still be available inside the docker), just make sure you delete them X hours after they were created, and you should be fine.
wdyt?
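If it helps, a minimal cleanup sketch (the path and the 24h threshold are assumptions - adjust to whatever folder you actually map into the containers):
```python
import time
from pathlib import Path

# assumed mapped cache folder; change to wherever you map it in docker mode
cache_dir = Path.home() / ".clearml" / "cache"
cutoff = time.time() - 24 * 3600  # assumed "X hours" = 24

for f in cache_dir.rglob("*"):
    if f.is_file() and f.stat().st_mtime < cutoff:
        f.unlink()  # still available inside already-running containers
```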
This is by design: they cannot use the exact same venv, because if the code starts creating or changing files, that happens inside the venv and might cause them to crash.
That said if you are running with venv cache, the first one will create the venv and the second one will create a copy from the cache.
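For reference, venv caching is enabled in the agent section of clearml.conf; a snippet along these lines (values are illustrative):
```
agent {
    venvs_cache: {
        # maximum number of cached venvs
        max_entries: 10
        # minimum free space on the cache drive
        free_space_threshold_gb: 2.0
        # setting the path enables the cache
        path: ~/.clearml/venvs-cache
    }
}
```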
and then?
The thing is, programmatically this is not easy to expose as an API, because in the end the "function" (i.e. the CLI) never returns - it connects over SSH and stays there
But you can query the Task it creates: the project is known, the user is known, and it has a special type/tag
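For example, a hedged sketch of querying those Tasks (the project name and system tag here are assumptions - check what clearml-session actually uses on your server):
```python
from clearml import Task

# find the tasks created by clearml-session (assumed project + tag)
tasks = Task.get_tasks(
    project_name="DevOps",                         # assumption
    task_filter={"system_tags": ["interactive"]},  # assumption
)
for t in tasks:
    print(t.id, t.name, t.status)
```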
Did you set `force_git_ssh_protocol: true`?
https://github.com/allegroai/clearml-agent/blob/249b51a31bee97d63f41c6d5542e657962008b68/docs/clearml.conf#L39
Let's try:
` echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; for i in {10..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && b...
What's the host you have in the clearml.conf ?
is it something like `http://localhost:8008`?
GreasyPenguin14 what's the clearml version you are using, and which OS & Python?
Notice this happens on the `connect_configuration` call, which seems to happen after the Task was closed - could that be the case?
Hi SucculentBeetle7
The parameters passed to add_step need to contain the section name (maybe we should warn if it is not there, I'll see if we can add it).
So maybe something like `{'Args/param1': 1}` or `{'General/param1': 1}`. Can you verify it solves the issue?
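A minimal sketch of the idea (project/task names are placeholders):
```python
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")
pipe.add_step(
    name="step1",
    base_task_project="examples",           # placeholder
    base_task_name="base task",             # placeholder
    parameter_override={"Args/param1": 1},  # note the "Args/" section prefix
)
```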
Can the clearml server act as a worker I can serve models on?
The serving is done by one of the clearml-agents.
Basically you spin an agent, then this agent is spinning the model serving engine container (fully managed).
(1) install and run clearml-agent (2) run the clearml-serving CLI to configure and spin up the serving engine
I see, when you run it manually (i.e. not via an agent) what do you have under the configuration tab in the UI (meaning do you see both argparser arguments there)?
StorageHelper is used internally.
I'll make sure we remove it from the examples/docs
Are hparams saved in the hyperparameters section superior to hparams saved in configuration objects?
well I'm not sure about "superior", but they are structured, as opposed to a configuration object, which is as generic as can be
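To illustrate the difference, a minimal sketch (names/values are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="hparams vs config")

# structured hyperparameters: show up field-by-field in the Hyperparameters section
params = task.connect({"lr": 0.001, "batch_size": 32})

# generic configuration object: stored as a free-form blob under Configuration Objects
config = task.connect_configuration({"any": {"nested": "structure"}}, name="my_config")
```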
Can you provide some further explanation, please? Sorry, I am a beginner.
My bad, I was thinking out loud on improving the HPO process and allowing users to modify the configuration_object, not just the hyperparameters
pass: `task_filter=dict(system_tags=['-archived'])`
So I'm guessing the CLI will be in the Python folder: `import sys; from pathlib2 import Path; (Path(sys.executable).parent / 'cli-util-here').as_posix()`
My question was about the automatically uploaded models. Those that were uploaded by clearml client.
So there is a way to add a callback - would that work?
https://github.com/allegroai/clearml/blob/cf7361e134554f4effd939ca67e8ecb2345bebff/clearml/binding/frameworks/__init__.py#L137
`def callback(_, model_info): model_info.name = "my new name"; return model_info`
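A hedged sketch of wiring that callback up, assuming the WeightsFileHandler pre-callback API at the link above:
```python
from clearml.binding.frameworks import WeightsFileHandler

def callback(_, model_info):
    # rename the model before it is registered/uploaded
    model_info.name = "my new name"
    return model_info

# register the pre-callback (assumed API, see the linked source)
WeightsFileHandler.add_pre_callback(callback)
```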
I think the real issue is that I am not able to specify a platform for the model,
there is no need to specify it - remove it from the config.pbtxt, clearml-serving will add it automatically in the background
Hi JitteryCoyote63
If you want to refresh the task object, call task.reload() It will also refresh the artifacts.
The reason for not always doing so when accessing the .artifacts object is speed optimization (reloading might be slow compared to dict access, and we assume users expect it to behave like a dict)
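A minimal sketch (the task ID is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="<task-id>")  # placeholder ID
task.reload()                              # refresh the task object, artifacts included
print(list(task.artifacts.keys()))
```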
An easier fix for now will probably be some kind of warning to the user that a task is created but not connected
That is a good point, maybe if you do not have a "main" Task, then we print the warning (with some flag to disable the warning) ?
This is sitting on top of the serving engine itself, acting as a control plane.
Integration with GKE is being worked on (basically KFServing as the serving engine)
Hi UnsightlyLion90
from my understanding agents do the job of SLURM,
That is kind of correct (they overlap in some ways 🙂 )
Any guide on how to integrate the two?
The easiest way is to just add the Task.init() call to your code, and use SLURM to schedule the job. This will make sure all jobs are fully logged (this also includes automatically uploading the models, artifact support, etc.)
Full SLURM support (i.e. similar to the k8s glue support), is currently ou...
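A minimal sketch of the Task.init approach (project/task names are placeholders):
```python
from clearml import Task

# add this at the top of the script that SLURM schedules
task = Task.init(project_name="slurm-jobs", task_name="my-experiment")

# ... the rest of your training code runs unchanged ...
```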
and then in Preprocess:
`self.model = get_model(task_id=os.environ['TASK_ID'], model_name=os.environ['MODEL_NAME'])`
That's the part I do not get: Models have their own entity (with a UID), in contrast to artifacts, which are only stored on Tasks.
The idea is that when you register a model with clearml-serving you can specify the model ID; this should replace the need for TASK_ID+model_name in your code, and clearml-serving will basically bring it to you
Basically this fun...
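For reference, a hedged sketch of fetching a model entity directly by its ID (the ID is a placeholder):
```python
from clearml import InputModel

model = InputModel(model_id="<model-id>")  # placeholder ID
local_path = model.get_local_copy()        # download the model file locally
```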
Hi WickedElephant66
So I'm trying to upload an artefact to clearml's fileserver (I have a self-hosted clearml server running),
Are you trying to upload an artifact? If so I would do:
`task.upload_artifact('local file', artifact_object="/path/to/file")`
Or is it about Model files?
You can also check how to upload artifacts / models here:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts.py
https://github.com/allegroai/clearml/blob/master/examples/reporti...
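And if it is about Model files, a hedged sketch using OutputModel (names and paths are placeholders):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="upload model")  # placeholders
output_model = OutputModel(task=task)
# uploads the weights file to the configured files server
output_model.update_weights(weights_filename="/path/to/model.pkl")
```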
Hmm, so the concept of "company"-wide configuration is supported in the enterprise version.
I'm trying to think of a "hack" to just pass these env/conf ...
How are you spinning the agent machines?
WackyRabbit7 How do I reproduce it ?
LudicrousParrot69 you mean post execution or while you are executing the hyperparameter optimizer ?
Is there a reason `clearml` will use the demo server when there is no `~/clearml.conf`?
It's the default server for an easy getting-started journey, e.g. you run some sample code and it just works, with zero configuration.
That said, you can set an environment flag to disable the default server behavior: `CLEARML_NO_DEFAULT_SERVER=1`
ReassuredTiger98
wdyt?
BTW:
it will push potentially proprietary data to the public demo server.
The server if su...
Hi IrritableOwl63
Yes this seems like a docker setup issue 🙂
either run the agent with sudo (not really recommended 😉 ) or add your user to the docker group:
https://docs.docker.com/engine/install/linux-postinstall/
If I have to choose between skipping the value or logging it as NaN, it's a tough call - logging seems better than skipping, but it needs some thought.
So I "think" the issue is plotly (UI), doesn't like NaN and also elastic (storing the scalar) is not a NaN fan. We need to check if they both agree on the representation, that it should be easy to fix...
Maybe you could open a github issue, so we do not forget?