and you have clearml v0.17.2 installed at the "system" packages level, and 0.17.5rc6 installed inside the pyenv venv?
It seems stuck somewhere in the python path... Can you check at runtime what os.environ['PYTHONPATH'] is?
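For example, something quick you can drop into the task code to check (plain Python, nothing clearml-specific):
import os
import sys

# print what the process actually sees at runtime
print(os.environ.get('PYTHONPATH'))
print(sys.path)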
Yes that would work 🙂
You can also put it in the docker compose, see TRAINS_AGENT_DEFAULT_BASE_DOCKER
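Roughly, in the agent service's environment section of the docker-compose (the service name and image here are just placeholders):
services:
  agent-services:
    environment:
      TRAINS_AGENT_DEFAULT_BASE_DOCKER: "nvidia/cuda:10.1-runtime-ubuntu18.04"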
RoundMole15 what does the Task.init call look like?
it does appear on the task in the UI, just somehow not repopulated in the remote run if it’s not a part of the default empty dict…
Hmm that is the odd thing... what's the missing field? Could it be that it is failing to cast to a specific type because the default value is missing?
(also, is issue present in the latest clearml RC? It seems like a task.connect issue)
EnviousPanda91
in your clearml.conf I think you are missing a section:
agent.git_user=""
agent.git_pass=""
agent.git_host=""
agent.force_git_ssh_protocol: true
ReassuredTiger98 you mean when calling clearml-init? Or the default value?
Sounds like something very similar, I'll try to use it,
You can set it per container with -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1
Or add it here:
https://github.com/allegroai/clearml-agent/blob/51eb0a713cc78bd35ca15ed9440ddc92ffe7f37c/docs/clearml.conf#L149
extra_docker_arguments: ["-e", "CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1"]
Are they ephemeral, or later used by other Tasks, executions etc.?
For example: configuration files are specific to an execution, and someone will edit them.
Initial weights files are something that multiple executions might need, and they will be used to restore an execution. Data, even if changing, is usually used by multiple executions/tasks etc.
It seems like you treat these files as "configurations", is that right ?
the optimizer such that the study object of the optimizer keeps track of the results and the next sample will be aware of all previous studies
This is done from the optimizer side, by sampling the scalars reported by any experiment the optimizer created.
I am looking for a way to manually sample and report from and to the optimizer...
.. I can avoid running unnecessary common heavy setup for a lightweight experiment
Maybe it makes sense to inherit from the Optimizer and add ...
Hi WorriedParrot51
Assuming you run the code "manually" once (i.e. without the agent), then when you call Task.init it will register the argparser.
When running with the agent, the first time you call parse_args it will automatically override the argparse defaults with the values stored in the Task.
Make sense?
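A minimal sketch of the flow (project/argument names and values are placeholders):
from argparse import ArgumentParser
from clearml import Task

# Task.init registers the argparser once parse_args() is called
task = Task.init(project_name='examples', task_name='argparse demo')
parser = ArgumentParser()
parser.add_argument('--lr', type=float, default=0.001)
# when executed by the agent, these defaults are overridden
# with the values stored on the Task
args = parser.parse_args()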
am getting None for Task.current_task() at the beginning of my script.
Task.init() is doing the magic, only after this call will you have current_task (either running manua...
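i.e. a minimal sketch of that ordering (project/task names are placeholders):
from clearml import Task

print(Task.current_task())          # None, Task.init() was not called yet
task = Task.init(project_name='examples', task_name='demo')
print(Task.current_task() is task)  # True from here on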
Hi UnsightlyBeetle11
Is it possible to report the model's architecture (PyTorch model) automatically on ClearML, as we do it via Netron or other neural network visualisation tools?
You mean like the actual network layout? Unfortunately, there is currently no option to do that, you can however manually store a plot/image that represents it
BTW: I think that at the beginning Netron was somehow integrated, but it was rarely used and support for it was not trivial so it was phased out. You can ho...
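For example, a minimal sketch of manually storing the printable PyTorch layout on the task (the model here is just a placeholder):
from clearml import Task
import torch.nn as nn

task = Task.init(project_name='examples', task_name='model layout')
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))  # placeholder model
# store the textual architecture as an artifact on the task
task.upload_artifact('model architecture', artifact_object=str(model))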
However, I have not yet found a flexible solution other than ssh-agent forwarding.
And is it working?
To store all the debug samples, and it can also store all the models (if you configure output_uri='http://file_server_here:8081')
Yes: instead of the file server use 's3://<ip_of_minio>:9000/bucket', just make sure you add the credentials for the minio in the trains.conf
Yes, basically once you have the credentials in the trains.conf, you could do StorageManager.get_local_copy('s3://<minio>:9000/bucket/file') (also upload of course 🙂 )
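For reference, the minio credentials section in trains.conf would look roughly like this (host/keys are placeholders):
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "<ip_of_minio>:9000"
                    key: "minio_access_key"
                    secret: "minio_secret_key"
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}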
Thanks FiercePenguin76 , I can totally understand your point on running proper tests, and reluctance to break other things.
I suggest adding a comment with the temp fix that solved the problem for you, and we will make sure the team takes it from there. wdyt?
EnviousPanda91 this seems like a specific issue with the clearml-task CLI, could that be?
Can you send a full clearml-task command line to test?
GiganticTurtle0 quick update, a fix will be pushed, so that casting is based on the actual value passed, not the type hints 🙂
(this is only in case there is no default value, otherwise the default value type is used for casting)
I was thinking such limitations would exist only for published
A published Task cannot be "marked started", even with the force flag
I should manually copy it to the remote services agents?
The code itself needs to run somewhere; currently this has to be your machine, either you manually run the AWS autoscaler or an agent runs it for you. Make sense?
Hi SteepCockroach81
CLEARML_CONFIG_FILE points to the configuration file being used
See here:
https://clear.ml/docs/latest/docs/configs/env_vars#server-connection
UnevenDolphin73 I have a suspicion we have a few terms mixed:
hyperparameters:
These are essentially key/value.
when you call task.connect(dict_with_params), clearml will flatten the dict and you end up with key/value pairs
configuration objects:
These are actually blobs of text, the UI will show them as is
When you call my_local_file = task.connect_configuration("path/to/config/file", name=name)
the entire content of the config file is stored on the Task object itself.
Back to the use case, instead ...
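To make the two concrete, a minimal sketch (project/file names are placeholders):
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')

# hyperparameters: the dict is flattened into key/value entries in the UI
params = {'optimizer': {'lr': 0.001, 'momentum': 0.9}}
task.connect(params)

# configuration object: the file content is stored as a text blob on the Task
my_local_file = task.connect_configuration('path/to/config/file', name='my config')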
100% of things with task_overrides would be the most convenient way
I think the issue is that you have to pass the project ID, not the project name (the project unique ID is the property that is actually stored on the Task)
@<1523707653782507520:profile|MelancholyElk85> can you check the following works:
pipe.add_task(..., task_overrides={'project': Task.get_project_id(project_name='examples')})
Hi GrievingTurkey78 ,
Yes, this is a per-file download, but I think you can list the bucket and download everything
Try:
from trains import StorageManager
from trains.storage.helper import StorageHelper

helper = StorageHelper.get('gs://bucket/folder')
remote_files = helper.list('*')
for f in remote_files:
    StorageManager.get_local_copy(f)
You might need to play around a bit; it might be that you need StorageHelper.get('gs://bucket') and then helper.list('folder/*')
Let me know what worked 🙂