
I hope it can run in the same day too.
Fix should be in the next RC 🙂
GreasyPenguin66 you can pass: AZURE_STORAGE_ACCOUNT AZURE_STORAGE_KEY
as the default Azure access/secret 🙂
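A minimal sketch of checking that both variables are set before launching a run. The two variable names come from the message above; the helper name `missing_azure_credentials` is hypothetical and not part of ClearML.

```python
import os

# Variable names are taken from the message above; the rest is illustrative.
AZURE_ENV_VARS = ("AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY")


def missing_azure_credentials(env=None):
    """Return the names of the Azure variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in AZURE_ENV_VARS if not env.get(name)]
```

Calling `missing_azure_credentials()` with no arguments inspects the current process environment; an empty list means both values are present.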
it is shown in the recording above
It was so odd, I had to ask 🙂 Okay, let me see if we can reproduce
I don't have any error message in the browser console, just an empty array returned on events.get_task_logs. This bug didn't exist on version 1.1.0 and is quite annoying…
Meaning the REST API returns nothing, is that correct?
Hi SmarmyDolphin68
You have two options:
1. Automatically upload the models when training: pass output_uri to Task.init. For example, output_uri=True will upload to the clearml-server, output_uri='s3://bucket/folder' will upload to S3, etc.
2. Manually upload a model that you have locally: https://github.com/allegroai/clearml/blob/9ff52a8699266fec1cca486b239efa5ff1f681bc/examples/reporting/model_config.py#L37
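The first option could look roughly like this. It is a sketch that assumes the `clearml` package is installed and configured; `resolve_output_uri` and `start_task` are hypothetical helper names, and the project and task names are whatever you pass in.

```python
def resolve_output_uri(destination=None):
    """Pick the value to pass as output_uri.

    None -> True (upload to the clearml-server); otherwise the
    destination string with stray whitespace removed.
    """
    if destination is None:
        return True
    return destination.strip()


def start_task(project_name, task_name, destination=None):
    """Create a Task whose models are uploaded automatically."""
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=resolve_output_uri(destination),
    )
```

For example, `start_task("examples", "train", "s3://bucket/folder")` would send model snapshots to that bucket, while omitting the destination uploads to the server.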
Hi AdventurousWalrus90
Thank you for the kind words! 🙂
/home/usr_338436_ulta_com/.clearml/venvs-builds/3.7/.gitignore
so this is the error on the agent ?
Ohh, hmm, that is odd, there should not be a limit there. Let me check ....
Verified, you are correct: "." in label enumeration will break the clone.
I'll make sure this bug is passed to the backend guys to fix. Thanks TenseOstrich47!
Meanwhile, maybe use "_" instead? 🙂
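As a stopgap, the labels can be renamed before they are reported. This is a plain-Python sketch that assumes the label enumeration is a dict mapping label name to integer id; `sanitize_label_enumeration` is a hypothetical helper, not a ClearML function.

```python
def sanitize_label_enumeration(labels, replacement="_"):
    """Return a copy of the label dict with "." replaced in every label name."""
    sanitized = {}
    for name, index in labels.items():
        new_name = name.replace(".", replacement)
        if new_name in sanitized:
            # Two labels collapsed to the same name; refuse to guess.
            raise ValueError(f"label collision after renaming: {new_name!r}")
        sanitized[new_name] = index
    return sanitized
```

The collision check matters because "a.b" and "a_b" would otherwise silently merge into one label.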
Hi TenseOstrich47
Thanks for following up!
Should be solved in the upcoming release (I think ETA is next week) 🙂
Hi ShallowArcticwolf27
However, the AMI for version 0.16.1 has the following docker-compose file
I think we moved the docker-compose yaml when we upgraded from trains to clearml. Any reason you are installing the old docker-compose?
Not intentional! When I launched the AMI it was running an older version
I think this is exactly the reason they decided to change the location 🙂 so you will have to manually upgrade. The reasoning is that we changed directory names (maybe a few more things)
Yes: shut down the current docker-compose, curl the new docker-compose, rename the folder, and spin it up again. Full instructions here:
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_aws_ec2_ami.html#upgrading
GreasyPenguin14
Is it possible in ClearML to have a main task (the complete cross validation) and subtasks (one for each fold)?
You mean to see it as nested in the UI, or auto-logged by the code?
Nested in the UI is not possible I think?
Yes, but the next version will have nested projects, that's something 🙂
I mean that it is possible to start the subtask while the main task is still active.
You cannot call another Task.init while a main one is running.
But you can call Task.create and log into it. That said, autologging is not supported on the newly created Task.
Maybe the easiest solution is just to do the "sub-tasks" and close them. That means the main Task i...
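The "do the sub-tasks and close them" approach might be sketched like this: one Task per fold, each closed before the next one starts. It assumes `clearml` is installed; `run_folds`, `train_fold`, and the project and task names are hypothetical, and `init_task` exists only so the loop can be exercised without a server.

```python
def run_folds(n_folds, train_fold, init_task=None):
    """Run train_fold(fold) once per fold, each inside its own Task."""
    if init_task is None:
        # Imported lazily; requires a configured clearml installation.
        from clearml import Task

        init_task = Task.init

    results = []
    for fold in range(n_folds):
        task = init_task(project_name="cross-validation", task_name=f"fold {fold}")
        try:
            results.append(train_fold(fold))
        finally:
            # Close the fold's Task so the next Task.init call is allowed.
            task.close()
    return results
```

Because each fold's Task comes from `Task.init`, autologging stays available per fold, which is not the case for a Task made with Task.create.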
CooperativeFox72 could you expand on "not working"?
If you have a yaml file, I would do:
` # local_path = './my_config.yaml'
path = task.connect_configuration(local_path, name=name)
if task.running_locally():
    with open(local_path, "r") as config_file:
        my_params_dict = yaml.load(config_file, Loader=yaml.FullLoader)
    my_params_dict['change_me'] = 'new value'
    my_params_text = yaml.dump(my_params_dict)
    # store back the change, my_params assumed to be the content of the param file (tex...
CheerfulGorilla72 sounds like a great idea, I'll pass it along to the documentation people 🙂
GrievingTurkey78 please feel free to send me code snippets to test 🙂
Hi RipeGoose2
Could you expand on "inconsistency in the iteration reporting"? Also, with "calling trainer.fit multiple", would you expect it to show as a single experiment, or is it a kind of param search?
I assume every fit starts reporting from step 0, so they override one another. Could it be?
but the debug samples and monitored performance metric show a different count
Hmm could you expand on what you are getting, and what you are expecting to get
when you are running the n+1 epoch you get the 2*n+1 reported
RipeGoose2 like twice the gap, i.e. internally it adds an offset of the last iteration... is this easily reproducible?
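The arithmetic being described can be written out as a toy model. This is not ClearML's code, only an illustration of the hypothesis that the last reported iteration is added as an offset; both function names are made up.

```python
def expected_iteration(step):
    """What the user expects to see: the step itself."""
    return step


def reported_iteration(step, last_iteration):
    """Toy model of the suspected behavior: last iteration added as an offset."""
    return last_iteration + step
```

With the previous run ending at iteration n, epoch n+1 is reported as n + (n+1) = 2*n+1, which matches the numbers quoted above.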
I also found that you should have a deterministic ordering before you apply a fixed seed
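One way to read that point: a fixed seed only reproduces a shuffle if the input order is itself reproducible. A small standard-library sketch, where `seeded_shuffle` is a hypothetical helper:

```python
import random


def seeded_shuffle(items, seed=0):
    """Sort first so the input order is deterministic, then shuffle with a fixed seed."""
    ordered = sorted(items)
    # A private Random instance avoids touching the global random state.
    random.Random(seed).shuffle(ordered)
    return ordered
```

Without the `sorted` call, inputs such as a directory listing or a set could arrive in a different order on each run, and the same seed would produce a different permutation.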
Not sure I follow ?
I am creating this user
Please explain, I think this is the culprit ...
Yes, this is definitely the issue: the agent assumes the docker user is "root".
Let me check something
The issue itself is changing the default user.
USER appuser
WORKDIR /home/appuser
Any reason for it ?
CooperativeFox72
Could you try to run the docker and then inside the docker try to do:
su root
whoami
GrotesqueDog77 one issue with this design, in order to run a sub-component, the call must be done from the parent component, does that make sense?
` def step_one(data):
    return data

def step_two(path):
    model = path  # placeholder for a trained model
    return model

def both_steps():
    path = step_one("stuff")
    return step_two(path)

def pipeline():
    both_steps() `
Which would make both_steps a component and step_one and step_two sub-components
wdyt?
Regarding the demoapp, this is just a default server that allows you to start playing around with ClearML without needing to set up any of your own servers or sign up
That said, I would recommend to sign up (totally free) on the community server
https://app.community.clear.ml/
Did you set 'force_git_ssh_protocol: true'?
https://github.com/allegroai/clearml-agent/blob/249b51a31bee97d63f41c6d5542e657962008b68/docs/clearml.conf#L39
Draft created successfully, but it doesn't contain property with docker command.
Could you help me?
ApprehensiveFox95 could you test with the latest RC, I think there was a fix
pip install clearml==0.17.5rc5
WittyOwl57 what about vm.max_map_count?
` echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart `
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac
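To confirm the setting took effect, the current value can be read back on Linux. A sketch: the 262144 threshold comes from the commands above, `/proc/sys/vm/max_map_count` is the standard Linux location for this kernel parameter, and `max_map_count_ok` is a hypothetical helper.

```python
def max_map_count_ok(path="/proc/sys/vm/max_map_count", required=262144):
    """Return True if the kernel's vm.max_map_count is at least `required`."""
    with open(path) as handle:
        current = int(handle.read().strip())
    return current >= required
```

The `path` argument is only there so the check can be pointed at a test file; on a real host the default is what you want.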