
Hi @<1538330703932952576:profile|ThickSeaurchin47>
Specifically I'm getting the error "could not access credentials"
Put your minio credentials here:
None
Hi GentleSwallow91
I think this would be a good start:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
wdyt?
Hi @<1523701949617147904:profile|PricklyRaven28>
Sorry, we missed that one
we need to invoke it with
accelerate launch
so we use
subprocess.run
So you have two options: either you change the script entry of the Task from "script.py" to "-m accelerate launch script.py",
or you manually do that inside your entry point (i.e. call accelerate launch)
BTW, I "think" we added an "auto detect" for it, so that if you launched it manually this wa...
If nothing specific comes to mind i can try to create some reproducible demo code (after holiday vacation)
Yes please!
In the meantime, see if the workaround is a valid one
We used subprocess for it, ...
Popen? os.system? fork?
How does this work in the context of a pipeline?
Is your pipeline from functions / decorators ? or is it from Tasks ?
(if it is from Tasks, then just change the entry point in the overrides)
In case of functions or decorators, you have to do that manually (i.e. your function needs to do "accelerate launch"):
from accelerate.commands.launch import launch_command, launch_command_parser
parser = launch_command_parser()
args = parser.parse_args("-command -here".split())
launch_command(arg...
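For the subprocess route mentioned above, a minimal sketch could look like the following. It assumes accelerate is installed in the same environment as the running interpreter and is invoked through its module entry point; the helper names `build_accelerate_command` and `launch_with_accelerate` are hypothetical and not part of clearml or accelerate.

```python
import subprocess
import sys
from typing import List, Sequence


def build_accelerate_command(script: str, script_args: Sequence[str] = ()) -> List[str]:
    """Build the argv that runs `script` through accelerate's launcher.

    Hypothetical helper: it goes through the current interpreter so the
    same virtual environment is used for the launched processes.
    """
    return [sys.executable, "-m", "accelerate.commands.launch", script, *script_args]


def launch_with_accelerate(script: str, script_args: Sequence[str] = ()) -> int:
    """Run the script under accelerate and return the process exit code."""
    completed = subprocess.run(build_accelerate_command(script, script_args), check=False)
    return completed.returncode
```

Inside a pipeline function or decorated step, the body would then call something like `launch_with_accelerate("script.py")` instead of importing the training code directly.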
Hi DepressedChimpanzee34
How do I reproduce the issue ?
What are we expecting to get there ?
Is that a Colab issue or hyper-parameter encoding issue ?
Hmm DepressedChimpanzee34 my bad, it seems the loading is done via a YAML loader, but the dumping is straightforward str casting...
https://github.com/allegroai/clearml/blob/6e6271fb91f2aeb2aa7a13c6d07d4e635baaa670/clearml/backend_interface/task/task.py#L934
What would you expect to get? (BTW "value\blah" is not a safe string assignment in Python, because \b inside a regular string literal is parsed as an escape sequence; it should be "value\\blah", which translates into the text "value\blah")
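A quick way to see the difference between the two literals, as a small standalone sketch (the variable names are only for illustration):

```python
# In a regular string literal, "\b" is read as one control character
# (backspace), so no backslash survives in the resulting text.
unescaped = "value\blah"

# Doubling the backslash, or using a raw string, keeps a literal backslash.
escaped = "value\\blah"
raw = r"value\blah"

print(repr(unescaped))
print(repr(escaped))
print(escaped == raw)
```

Comparing the `repr()` output of the stored and the reloaded value is usually the easiest way to spot where a backslash got lost in a round trip.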
DepressedChimpanzee34
so parsing back is done via a yaml reader:
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/args.py#L506
We could add an extra test here, checking for \ in the string; that should solve it and will be backwards compatible (I think)
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/task.py#L935
I want to build a real time data streaming anomaly detection service with clearml-serving
Oh, so the way it currently works, clearml-serving will push the data in real-time into Prometheus (you can control the stats/input/out), then you can build the anomaly detection in Grafana (for example, alerts on histograms over time are available out-of-the-box, and clearml creates the histograms over time).
Would you also need access to the stats data in Prometheus ? or are you saying you need to process it ...
WittyOwl57 what about vm.max_map_count?
echo "vm.max_map_count=262144" > /tmp/99-clearml.conf
sudo mv /tmp/99-clearml.conf /etc/sysctl.d/99-clearml.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac (5)
Hi SmarmySeaurchin8
StorageManager docs are broken in the example notebook here:
Thanks, I'll make sure we fix it
The image I want to display is already stored locally
Sure you can:
Logger.current_logger().report_image('title', 'series', iteration=0, local_path='/my_file/is_here.jpg')
SmarmySeaurchin8 just so that I don't miss anything.
One machine, two trains-agents each one connected to a different trains-server, correct ?
from the trains-agent --help
trains-agent --config-file /home/user/my_trains_server1.conf daemon
trains-agent --config-file /home/user/my_trains_server2.conf daemon
There is no way to create an artifact/model/dataset without a task, right?
Models are an entity of their own, and you can actually create one without a Task.
(just for my own interest: how much does the enterprise version diverge from the open source version? Is it just extended or are there core changes to the enterprise version)
It adds a few security layers on top, and adds a few features that are just not part of the open source (RBAC, hyper-datasets, advanced scheduling, cu...
Wtf? can you try with = (notice single not double)?
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- cudatoolkit=11.1.1
- pytorch=1.8.0
Actually with base-task-id it uses the cached venv, thanks for this suggestion! Seems like this is equivalent to cloning via UI.
exactly !
But "cloning" via UI runs an exact copy of the code/config, not a variant,
You can override the commit/branch and get the latest ...
run exp, tweak code/configs in IDE, or tweak configs via CLI, then have it re-run in the exact same venv (with no install overhead etc.)
So you can actually launch it remotely directly from the code:
...
Hi UnevenDolphin73
You mean this part?
https://github.com/allegroai/clearml-agent/blob/5afb604e3d53d3f09dd6de81fe0a494dacb2e94d/docs/clearml.conf#L212
(In other words, the "the Task's Environment section" is a bit unclear)
Yes we should expand, but generally you are correct, it should work as you described
SmarmySeaurchin8
When running in "dev" mode (i.e. writing the code) only packages imported directly are registered under "installed packages" , then when the agent is executing the experiment, it will update back the entire environment (including derivative packages etc.)
That said, you can set detect_with_pip_freeze to true (in trains.conf) and it will basically store the entire pip freeze.
https://github.com/allegroai/trains/blob/f8ba0495fb3af1f99732fdffbbccd2fa992934a4/docs/trains.c...
or point to the self-signed certificate:
export REQUESTS_CA_BUNDLE=/path/to/your/certificate.pem
MagnificentSeaurchin79
Can this be solved by using a docker image with the preinstalled packages at a user level?
Yes
BTW: I think I missed how you managed to install the object_detection API in the first place?
Is it the git repo of the Task? did you fork it? is it a submodule of your git repo?
p.s.
Yes, Slack is quite good at reminding you, but generally speaking always prefer @ , it will send me an email if I miss the message :)
By default the pl Trainer will output everything to TB, which we automatically store. But verify that TB is installed
can you see these metrics on TB ?
Why? The task should have completed successfully, how is this aborting?
Early stopping by the HPO process, like hyper-band, e.g. this training model is going nowhere let's stop it.
Are tagging / archiving available in the API for a task?
Everything that the UI can do you can do programmatically
Tags:
task.add_tags / set_tags / get_tags
Archive:
task.set_system_tags(task.get_system_tags() + ['archived'])
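One small caveat with the one-liner above is that running it twice would append 'archived' a second time. A hedged sketch of a guard around it, where `with_archived` is a hypothetical helper and not part of the clearml API:

```python
from typing import Iterable, List, Optional

ARCHIVED_TAG = "archived"


def with_archived(system_tags: Optional[Iterable[str]]) -> List[str]:
    """Return a copy of the system tags that contains 'archived' exactly once."""
    tags = list(system_tags or [])
    if ARCHIVED_TAG not in tags:
        tags.append(ARCHIVED_TAG)
    return tags


# Intended usage with a clearml Task object (not executed here):
#   task.set_system_tags(with_archived(task.get_system_tags()))
```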
Also, finally the columns will be movable and resizable, I can't wait for the next version ;)
Do you think this is better ? (the API documentation is coming directly from the python doc-string, so the code will always have the latest documentation)
https://github.com/allegroai/clearml/blob/c58e8a4c6a1294f8acec6ed9cba81c3b91aa2abd/clearml/datasets/dataset.py#L633
I appended python path with /code/app/flair in my base image and execute
the python path is changing since it installs a new venv into the system.
Let me check what's going on with the PYTHONPATH, because it definitely is changed when running the code (the code base root folder is added to it). Maybe we need to make sure that if you had PYTHONPATH pre-defined we restore it.
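To make the "restore it" idea concrete, here is a rough sketch of what preserving a pre-defined PYTHONPATH could look like. This is an assumption about the desired behavior, not what the agent currently does, and `merged_pythonpath` is a hypothetical helper:

```python
import os
from typing import Optional


def merged_pythonpath(code_root: str, predefined: Optional[str]) -> str:
    """Put the code root first, then keep any pre-defined PYTHONPATH entries."""
    entries = [code_root]
    for entry in (predefined or "").split(os.pathsep):
        # Skip empty segments and avoid listing the same folder twice.
        if entry and entry not in entries:
            entries.append(entry)
    return os.pathsep.join(entries)


# Example: keep /code/app/flair from the base image next to the repo root.
print(merged_pythonpath("/repo", os.environ.get("PYTHONPATH")))
```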
it is shown in the recording above
It was so odd, I had to ask. Okay, let me see if we can reproduce
I don't have any error message in the browser console - Just an empty array returned on events.get_task_logs. This bug didn't exist on version 1.1.0 and is quite annoying...
meaning the RestAPI returns nothing, is that correct ?