Reputation
Badges 1
979 × Eureka!Note: I can verify that post_packages is well picked up by the trains-agent, since in the experiment log I see:agent.package_manager.type = pip agent.package_manager.pip_version = \=\=20.2.3 agent.package_manager.system_site_packages = true agent.package_manager.force_upgrade = false agent.package_manager.post_packages.0 = PyJWT\=\=1.7.1
I think this is because this API is not available in elastic 5.6
They are, but this doesn’t work - I guess it’s because temp IAM accesses have an extra token, that should be passed as well, but there is no such option on the web UI, right?
I added the pass_hashed and restarted the server, still get the same problem
as it's also based on pytorch-ignite!
I am not sure to understand, what is the link with pytorch-ignite?
We're in the brainstorming phase of what are the best approaches to integrate, we might pick your brain later on
Awesome, I'd be happy to help!
AgitatedDove14 So I copied pasted locally the https://github.com/pytorch-ignite/examples/blob/main/tutorials/intermediate/cifar10-distributed.py from the examples of pytorch-ignite. Then I added a requirements.txt and called clearml-task
to run it on one of my agents. I adapted a bit the script (removed python-fire since it’s not yet supported by clearml).
to pass secrets to each experiment
I am not sure what you mean by unless the domain is different
? Personal Access Token are designed such that to allow cloning a private repo, the user has to give the PAT full access to repos, including public repos. So it should also work with all other git repos
ok, so there is no way to cache it and detect when the ref changes?
yes but they are in plain text and I would like to avoid that
Hi CostlyOstrich36 ! no I am running on venv mode
I was able to fix by applying for a license and registering it
SuccessfulKoala55 Here is the trains-elastic error
what would be the name of these vars?
Oof now I cannot start the second controller in the services queue on the same second machine, it fails with
` Processing /tmp/build/80754af9/cffi_1605538068321/work
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/cffi_1605538068321/work'
clearml_agent: ERROR: Could not install task requirements!
Command '['/home/machine/.clearml/venvs-builds.1.3/3.6/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r'...
I think clearml-agent tries to execute /usr/bon/python3.6 to start the task, instead of using the python used to start clearml-agent
The file /tmp/.clearml_agent_out.j7wo7ltp.txt
 does not exist
I execute the clearml-agent this way:/home/machine/miniconda3/envs/py36/bin/python3 /home/machine/miniconda3/envs/py36/bin/clearml-agent daemon --services-mode --cpu-only --queue services --create-queue --log-level DEBUG --detached
in clearml.conf:agent.package_manager.system_site_packages = true agent.package_manager.pip_version = "==20.2.3"
this is the last line, same a before
AgitatedDove14 So I’ll just replace task = clearml.Task.get_task(clearml.config.get_remote_task_id())
with Task.init()
and wait for your fix 🙂
Yes, not sure it is connected either actually - To make it work, I had to disable both venv caching and set use_system_packages to off, so that it reinstalls the full env. I remember that we discussed this problem already but I don't remember what was the outcome, I never was able to make it update the private dependencies based on the version. But this is most likely a problem from pip that is not clever enough to parse the tag as a semantic version and check whether the installed package ma...
AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, this is the one I use
yes, in setup.py I have:..., install_requires= [ "my-private-dep @ git+
", ... ], ...
AppetizingMouse58 the events_plot.json template misses the plot_len
declaration, could you please give me the definition of this field? (reindexing with dynamic: strict
fails with: "mapping set to strict, dynamic introduction of [plot_len] within [_doc] is not allowed
)
Sorry both of you, my problem was actually lying somewhere else (both buckets are in the same region) - thanks for you time!
trains-elastic container fails with the following error:
in the controller, I want to upload an artifact and start a task that will query that artifact and I want to make sure that the artifact exists when the task will try to retrieve it