Hi SubstantialElk6 ,
Please feel free to ask anything 🙂
If nothing is reported, then nothing will be displayed.
For example, if you are reporting to Tensorboard, it will be automatically sent to the UI - https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_tensorboard.py (run this example and view the scalars section 🙂 )
Regarding this one, as mentioned, you can have a full IAM role (without any credentials) in the higher tiers. In the regular tier you'll need the credentials for authentication, since boto3 commands are used for spin up, spin down, tags and similar API calls. The app is currently hosted by us, so your IAM role won't really be available.
With it, the newly created instance will have the IAM role associated with it too.
How do you load the file? Can you find this file manually?
Is it possible to override it if trains.conf already exists?
Yes, you can choose a specific configuration file with the TRAINS_CONFIG_FILE environment variable.
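A minimal sketch of the environment variable approach described above. The path and the helper name `use_config_file` are hypothetical, and the sketch assumes the variable should be set before `trains` is imported so that it is picked up when the configuration is loaded:

```python
import os
from pathlib import Path


def use_config_file(path: str) -> str:
    """Point Trains at a specific configuration file via TRAINS_CONFIG_FILE."""
    resolved = str(Path(path).expanduser())
    os.environ["TRAINS_CONFIG_FILE"] = resolved
    return resolved


# Hypothetical location; replace with your own configuration file.
config_path = use_config_file("~/configs/my_trains.conf")

# Import trains and call Task.init(...) only after this point, so the
# environment variable is already in place when the configuration is read.
```

The same effect can be had by exporting the variable in the shell before launching the script.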
pip install clearml works for me now, if you'd like to try…
Simply changing to show doesn't work in my case as I am displaying a CM. What about if I use matshow?
Can you share with me some code you have (just the matplotlib part)? What about the example? If you run it, do you get some plots in the plots section and some in debug?
Hi SteepDeer88 ,
You can use https://clear.ml/docs/latest/docs/apps/clearml_task for this, what do you think?
Hi GleamingGiraffe20 , still getting those errors?
👍 can you try with secure set to true?
Trains auto-magically reports plots from many frameworks: matplotlib, Tensorboard and more. Maybe this is the issue? Was the report_media / report_image call double reporting?
You can try set_base_docker:
from clearml import Task

t = Task.init(project_name="examples", task_name="set docker parames")
# Base image, extra docker arguments, and a setup script to run at container start
t.set_base_docker(
    docker_cmd="nvidia/cuda:11.1",
    docker_arguments="-e ENV=1",
    docker_setup_bash_script=['apt update', 'apt-get install -y gcc']
)
But, if you like, you can connect a remote interpreter and debug locally with PyCharm, without clearml-agent.
Hi ItchyHippopotamus18, can you try with torch.save(model_jit, os.path.join(checkpoint_path, f'{epoch_num}_{round(acc_full, 4)}.pt'))?
Hi PanickyMoth78, thanks for the logs. I think I know the issue; I'm trying to reproduce it on my side and will keep you updated.
Thanks ImpressionableAlligator9 and MagnificentWorm7 for reporting this, I will double check it
Hi ThickDove42 ,
The SETUP SHELL SCRIPT is the bash script to run at the beginning of the docker before launching the Task itself.
You can just try editing it, for example:
apt update
apt-get install -y gcc
Where did you add the task.execute_remotely command? Do you have sample code I can run?
Hi MammothGoat53 ,
Which clearml version are you using? I ran the same code and all worked as expected (I changed the project_name and the task_name to be 4 characters long).
agree
E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
Another process is using the lock. Can you specify the AMI (and region) so I can try to reproduce it?
Hi PompousHawk82. Are you running several instances of the same code in parallel on the same task?
Hi VictoriousPenguin97
sdk.storage.direct_access is part of the extended support in the paid version.
But I think it's not required, since ClearML will simply try to access the path directly as it is, and you don't need to configure it.
Hi MysteriousBee56 ,
The https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py is an example of how you can add services to manage your experiments.
You can change the criteria for fetching the tasks in this script (in the https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py#L72 call) to something like a specific tag you add to the experiments (a delete tag? You can add a tag to multiple tasks) and it should...
and using https://github.com/allegroai/clearml-agent/blob/21c4857795e6392a848b296ceb5480aca5f98e4b/docs/clearml.conf#L140 for running scripts at docker startup
Hi ArrogantBlackbird16 ,
How do you generate and run your tasks? Do you use the same flow as in the https://clear.ml/docs/latest/docs/fundamentals/agents_and_queues#agent-and-queue-workflow ? Some other automation?
Hi LazyTurkey38 ,
Yes, it will create a virtual env for the task
You can add a limitation to the query page size: task_filter = {"page_size": <your-limit>, "page": 0}
what do you think?
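As a rough sketch of the page-size limit above: the helper names `build_task_filter` and `fetch_tasks` are hypothetical, and the example assumes the filter is passed to `Task.get_tasks` through its `task_filter` argument. The `clearml` import is kept inside the function so the filter can be built and inspected without a configured server.

```python
def build_task_filter(limit: int, page: int = 0) -> dict:
    """Build a task_filter dict that caps the number of tasks per query page."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return {"page_size": limit, "page": page}


def fetch_tasks(project_name: str, limit: int):
    """Query tasks in a project, returning at most `limit` results (first page)."""
    from clearml import Task  # imported lazily; needs a configured ClearML setup

    return Task.get_tasks(
        project_name=project_name,
        task_filter=build_task_filter(limit),
    )


# Hypothetical limit of 50 tasks per page.
task_filter = build_task_filter(50)
```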
Hi ImmensePenguin78 ,
You can get all the console outputs using task.get_reported_console_output(). Can this do the trick?
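A small sketch of how that call might be used. The helper names and the task ID argument are hypothetical, and the example assumes `get_reported_console_output` returns a list of text chunks and accepts a `number_of_reports` argument:

```python
from typing import List


def collect_console_lines(reports: List[str]) -> List[str]:
    """Flatten console report chunks into individual non-empty lines."""
    lines: List[str] = []
    for chunk in reports:
        lines.extend(line for line in chunk.splitlines() if line.strip())
    return lines


def fetch_console_lines(task_id: str, number_of_reports: int = 5) -> List[str]:
    """Fetch the latest console reports of an existing task as a list of lines."""
    from clearml import Task  # imported lazily; needs a configured ClearML setup

    task = Task.get_task(task_id=task_id)
    reports = task.get_reported_console_output(number_of_reports=number_of_reports)
    return collect_console_lines(reports)
```

Calling `fetch_console_lines("<your-task-id>")` would then give the most recent console output split into lines.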
Hi GleamingGiraffe20 ,
Without adding Task.init, I'm getting some OSError: [Errno 9] Bad file descriptor errors. Do you get those too?
Do you run your script from CLI or IDE (pycharm maybe?)?
FierceFly22 like Elior wrote, you can use Task.execute_remotely, you just need to supply the queue name 🙂
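One possible way to wire this up, as a sketch only: the `--queue` flag, the project and task names, and the function names are all hypothetical. When a queue name is given, the script hands itself over to that queue via `execute_remotely`; otherwise it keeps running locally.

```python
import argparse


def parse_args(argv=None):
    """Parse an optional --queue flag naming the queue to execute on."""
    parser = argparse.ArgumentParser(description="Remote execution sketch")
    parser.add_argument("--queue", default=None,
                        help="queue name to enqueue this task on (run locally if omitted)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)

    from clearml import Task  # imported lazily; needs a configured ClearML setup

    task = Task.init(project_name="examples", task_name="remote execution sketch")
    if args.queue:
        # Stops the local run and enqueues the task on the given queue.
        task.execute_remotely(queue_name=args.queue)

    # ... the actual training code would go here ...
```

Running `main(["--queue", "default"])` would enqueue the task on a queue named `default`, assuming an agent is listening on it.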