Reputation
Badges 1
25 × Eureka!Yes, but does add_external_files makes chunked zips as add_files do?
No it references them, (i.e. meta-data not actually doing something with the files themselves)
I need the zipping, chunking to manage millions of files
That makes sens, if that's the case you will have to download those files anyway, and then add them with add_files
you can use the StoargeManager to download them, and then add them from the local copy (this will zip/chunk them)
[None](https://clear.ml/docs/la...
Hi @<1590514584836378624:profile|AmiableSeaturtle81>
I think you should use add_external_files
, instead of add_files
(which is for local files)
None
the parameter datatypes are not being changed when loading them up.
These are the auto logged parameters , inside YOLO, correct?
Just to make sure, you can actually see the value None
in the UI, is that correct? (if everything works as expected, you should see empty string there)
I think I was not able to fully express my point. Let me try again.
When you are running the pipeline Fully locally (both logic and components) the assumption is this is for debugging purposes.
This means that the code of each component is locally available, could that be a reason?
Hi @<1734020162731905024:profile|RattyBluewhale45>
What's the clearml agent version? And could you verify with the latest RC?
Lastly how are you running the agent, docker mode? What's the bade container?
"what's the trains/trains-agent/trains-server versions ?" how can I check it?
trains/trains-agent are pip packages os,pip freeze | grep trains
trains-server you can check in the /profile page top left corner
help_models is a dir in the git
And the git is registered on the experiment correctly ?
HI PlainSquid19 could you add a bit more information? Are you running trains-agent ? is it in docker/venv mode ? what's the trains/trains-agent/trains-server versions ?
I thought this is the issue on the thread you linked, did I miss something ?
Yes, let's assume we have a task with id aabbcc
On two different machines you can do the following:trains-agent execute --docker --id aabbcc
This means you manually spin two simultaneous copies of the same experiment, once they are up and running, will your code be able to make the connection between them? (i.e. openmpi torch distribute etc?)
CurvedHedgehog15 there is not need for :task.connect_configuration( configuration=normalize_and_flat_config(hparams), name="Hyperparameters", )
Hydra is automatically logged for you, no?!
You mean like for your internal support channel inside your company ?
@<1569496075083976704:profile|SweetShells3> remove these from your pbtext:
name: "conformer_encoder"
platform: "onnxruntime_onnx"
default_model_filename: "model.bin"
Second, what do you have in your preprocess_encoder.py
?
And where are you getting the Error? (is it from the triton container? or from the Rest request?
I execute theÂ
clearml-session
 withÂ
--docker
 flag.
This is to control the docker image the agent will spin for you (think dev enviroment you want to work in, like nvidia pytorch container already having everything you need)
How do you run theÂ
clearml-agent
 in docker mode
clearml-agent --docker
See here:
https://clear.ml/docs/latest/docs/clearml_agent#docker-mode
This is assuming you can just run two copies of your code, and they will become aware of one another.
so other process can use it
This is why there is a model repository, so you can query the last model created, or by name or tag or query the Task that created it and then via the Task the model and it's location.
This is a stable way to make sure your application code (the one using the model) will get to use stable models regardless of the training processes.
I would add a Tag to the model and then search based on the project and the tag, wdyt?
Hi @<1564785037834981376:profile|FrustratingBee69>
It's the previous container I've used for the task.
Notice that what you are configuring is the Default container, i.e. if the Task does not "request" a specific container, then this is what the agent will use.
On the Task itself (see Execution Tab, down below Container Image) you set the specific container for the Task. After you execute the Task on an Agent, the agent will put there the container it ended up using. This means that ...
Hi @<1624579015031394304:profile|JitterySeal56>
... and credentials in clearml.conf file on client side, but I have restrictions of aws keys expiring each hour
This means that you need to configure IAM role on your client machine, the data never goes through the server it is uploaded directly from the dev machine to the S3 bucket.
You can however just store the data on your clearml-files server ...
can I mount the s3 bucket as file system on place where
you need to mount it where the file server is storing it's files, correct (notice, not the DBs, just the files server)
But I have no idea what will be input of step2.
What do you mean by that? the assumption is that somehow the output of step 1 will be passed (a string reference) to step 2, what am I missing ?
NVIDIA_VISIBLE_DEVICES=0,1
Basically it is uses "as is" and Nvidia drivers do the rest
Same goes for all
or 0-3
etc.
I have no idea what string reference could be used when steps come from Task?
Oh I see, you are correct, when it comes to Tasks the assumption is your are passing strings (with selectors on the strings, i.e. the curly brackets) but there is no fancy serialization/deserialization as you have with pipelines from decorators / functions. The reason for that is that the Task itslef is a standalone, there is no way for the pipeline logic to actually "pull data" from it and "pass" it to the o...