
(for example, if I somehow start the execution of an agent task in a specific Docker container?)
You mean to specify the container from code? Or to make sure the agent can access a private Docker container registry? Or is it for a private PyPI package repository?
Thanks SmallDeer34!
Would you like us to? How about a footnote/acknowledgement?
How about a reference / footnote?
` @misc{clearml,
  title  = {ClearML - Your entire MLOps stack in one open-source tool},
  year   = {2019},
  note   = {Software available from ...},
  url    = {...},
  author = {allegro.ai},
} `
Thank you JuicyOtter4!
Is there a way to programmatically set that in the code?
Something like?
` task = Task.init(...)
task.set_comment("best thing ever") `
(probably we should change that to "description"?!)
Hi VexedCat68
Could it be you are trying to update a committed dataset?
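(For context: a committed / finalized dataset is immutable. A minimal sketch of the usual workaround, creating a child dataset version, assuming the standard Dataset API; project/dataset names are illustrative:)
` from clearml import Dataset

# a committed (finalized) dataset cannot be changed; create a new version instead
parent = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
child = Dataset.create(
    dataset_project="examples",
    dataset_name="my_dataset",
    parent_datasets=[parent.id],
)
child.add_files("path/to/new_files")
child.upload()
child.finalize() `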
I mean to reduce the API calls without reducing the scalars that are logged, e.g. by sending less frequent batched updates.
Understood,
In my current trials I am using up the API calls very quickly though.
Why would that happen?
The logging is already batched (meaning one API call for a bunch of reports).
Could it be lots of console lines?
BTW you can set the flush period to 30 sec, which would automatically collect and batch API calls:
https://github.com/allegroai/clearml/blob/25df5efe7...
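(A minimal sketch of setting the flush period from code, assuming the Logger.set_flush_period call; project/task names are illustrative:)
` from clearml import Task

task = Task.init(project_name="examples", task_name="batched_reporting")
logger = task.get_logger()
# queue the reports and send them as batched API calls every 30 seconds
logger.set_flush_period(30.0) `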
Nice SoreHorse95!
BTW: you can edit the entire OmegaConf YAML externally with the set/get configuration object calls (name = OmegaConf). Do notice you will need to change Hydra/allow_omegaconf_edit to true.
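(A sketch of that flow, assuming the Task.get_configuration_object / set_configuration_object API; the task id and the edited value are illustrative:)
` from clearml import Task

task = Task.get_task(task_id="aabbcc")  # illustrative task id
# fetch the stored OmegaConf YAML as plain text, edit it, and store it back
yaml_text = task.get_configuration_object(name="OmegaConf")
yaml_text = yaml_text.replace("lr: 0.1", "lr: 0.01")  # any external edit
task.set_configuration_object(name="OmegaConf", config_text=yaml_text)
# remember: Hydra/allow_omegaconf_edit must be set to true for this to take effect `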
PompousBeetle71 I think what you saw as tags in previous versions was actually system tags; now we also have user tags (i.e. .tags). If you still want to access the system tags, can you try:
` InputModel('aabbcc')._get_base_model().data.system_tags `
Hi JitteryCoyote63
Somehow I thought it was solved
1) Yes, please add a GitHub issue so we can keep track
2)
` Task.current_task().get_logger().flush(wait=True)  # <-- WILL HANG HERE `
Is this the main issue?
Hmm I see, add this for example:
` extra_docker_shell_script: ["rm ~/.bashrc", "echo removed bashrc"] `
because it should have detected it...
Did you see "Repository and package analysis timed out ..."?
As I installed ClearML using pip,
Where does the clearml-serving run? Usually your configuration file is in ~/clearml.conf.
Notice that if it is not there, it means it is using the defaults, so just create a new one and add that line.
Using the docker-compose file for the clearml-serving pipeline, do we also have to mount it somehow?
Oh yes, you are correct, the values are passed using environment variables (easier when using docker compose).
You can in addition add a mount from the host machine to a conf file:
` volumes:
  - ${PWD}/clearml.conf:/root/clearml.conf `
wdyt?
this is not the case as all the scalars report the same iterations
MassiveHippopotamus56 could it be the machine statistics? (i.e. CPU/GPU etc.; these are considered scalars as well...)
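(If the machine statistics do turn out to be the extra scalars, they can be switched off at init; a sketch, assuming the auto_resource_monitoring flag of Task.init; names are illustrative:)
` from clearml import Task

# disable the CPU/GPU/memory statistics that are reported as scalars
task = Task.init(
    project_name="examples",
    task_name="no_machine_stats",
    auto_resource_monitoring=False,
) `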
I get a popup saying that the actual files weren't deleted from S3 (so presumably only the metadata on the server gets deleted).
Hi QuaintPelican38
The browser client actually issues the delete "command" (the idea is separation of the metadata and the data, e.g. artifacts). That means you have to provide the key/secret to the UI (see the profile page).
LovelyHamster1 what do you mean by "assume the permissions of a specific IAM Role"?
In order to spin up an EC2 instance (AWS autoscaler) you have to have correct credentials; to pass those credentials you must create a key/secret pair for the autoscaler. There is no direct support for IAM Roles. Makes sense?
Hi @<1593413673383104512:profile|MiniatureDragonfly17>
These are the specific model input/output layers name.
The way Triton analyzes a PyTorch model, the layers are usually named input__0, input__1, and so on for the inputs, and output__0 etc. for the results.
You can see an example here:
--input-size 1 28 28 --input-name "INPUT__0" --input-type float32 --output-size -1 10 --output-name "OUTPUT__0" --outpu...
GiganticTurtle0 I think I located the issue:
it seems the change is in "config" (and for some reason it stores the entire dict), but the split values are not changed.
Is this it?
Just dropping this here but I've had some funky compressions with very small datasets!
Odd deflate behavior ...?!
I think your "files_server" is misconfigured somewhere; I cannot explain how you ended up with this broken link...
Check the clearml.conf on the machines, or the env vars?
That being said, it returns None for me when I reload a task, but it's probably something on my side.
MistakenDragonfly51 just making sure, you did call Task.init, correct?
What does
` from clearml import Task
task = Task.current_task() `
return?
Notice that you need to create the Task before actually calling Logger.current_logger() or Task.current_task().
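(A minimal sketch of that ordering; project/task names are illustrative:)
` from clearml import Task, Logger

# Task.init must run first; before it, current_task() / current_logger() return None
task = Task.init(project_name="examples", task_name="ordering_demo")
logger = Logger.current_logger()  # now returns the task's logger `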
I ended up using ` task = Task.init(continue_last_task=task_id) ` to reload a specific task and it seems to work well so far.
Exactly, this will initialize and auto-log the current process into the existing task (task_id). Without the continue_last_task argument it will just create a new Task and auto-log everything to it.
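(Both behaviors side by side, as a sketch; the project/task names and the task id are illustrative:)
` from clearml import Task

# append to an existing task: scalars and console output keep accumulating on task_id
task = Task.init(
    project_name="examples",
    task_name="my_experiment",
    continue_last_task="aabbcc112233",  # illustrative task id
)
# omitting continue_last_task would create a brand-new Task instead `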
SubstantialElk6 feel free to tweet them on their very inaccurate comparison table
PlainSquid19 Trains will analyze the entire repository if this is git-repo code, and a single script file if no repository is found.
It will not analyze an entire folder if it is not in a git repository, because it would not be able to recreate this folder anyhow. Makes sense?
Hi SmallDeer34
Can you try with the latest RC? I think we fixed something with the Jupyter/Colab/VSCode support! ` pip install clearml==1.0.3rc1 `
instead of the one that I want, or the one of the env it is started from.
The default is the python that is used to run the agent.
` agent.ignore_requested_python_version = true
agent.python_binary = /my/selected/python3.8 `
LovelyHamster1 Now I see... Interesting credentials ability. Specifically, all the S3 access in Trains is derived from the ~/clearml.conf credentials section:
https://github.com/allegroai/clearml/blob/ebc0733357ac9ead044d0ed32d41447763f5797e/docs/clearml.conf#L73
( or the AWS S3 environment variables )
I'm not sure how this AWS feature works; I suspect it is changing the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables on the EC2 instance. If this is the case, it should work out of...