AbruptWorm50 can you send the full image? (the X axis is missing from the graph)
So I might be a bit out of sync, but I think there should be Triton serving and OpenVINO serving built into it (or at least in progress).
Hi CleanPigeon16
can I make the steps in the pipeline use the latest commit in the branch?
Yes:
Manually clone the step's Task (in the UI), then in the UI edit the Execution section, change it to "last commit on branch", and specify the branch name. Or do the same programmatically (clone + edit, as above).
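For the programmatic route, here is a minimal sketch (assuming a recent clearml SDK where Task.clone and Task.set_script are available; the task ID and branch name are placeholders):
```
from clearml import Task

# Clone the pipeline step's Task (placeholder ID)
cloned = Task.clone(source_task="<step_task_id>", name="my step (latest branch)")

# Point the clone at the branch head instead of a pinned commit:
# an empty commit string should make the agent take the latest commit on that branch
cloned.set_script(branch="my-branch", commit="")
```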
ValueError: Could not parse reference '${run_experiment.models.output.-1.url}', step run_experiment could not be found
Seems like the "run_experiment" step is not defined. Could that be ...
Hi SuperiorDucks36
you have such a great and clear GUI
😊
I personally would love to do it with a CLI
Actually a lot of things are harder to get from the UI (like the current state of your local repository, etc.), but I think your point stands 🙂 We will start with the CLI, because it is faster to deploy/iterate, then once you guys say it's a winner we will add a wizard in the UI.
What do you think?
So, if I understand correctly, we need to write our own custom HyperParameterOptimizer, am I right?
Yes, exactly! It should be very easy.
Just inherit from RandomSearch and override create_job:
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/clearml/automation/optimization.py#L1043
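For reference, a minimal sketch of such a subclass (the class name and the extra filtering logic are just illustrative placeholders):
```
from clearml.automation import RandomSearch

class MyRandomSearch(RandomSearch):
    def create_job(self):
        # Let the base strategy draw the next random parameter combination
        # and build the corresponding job
        job = super(MyRandomSearch, self).create_job()
        if job is None:
            return None
        # Your custom logic goes here, e.g. skip or adjust combinations
        # before the job gets enqueued
        return job
```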
ElegantKangaroo44 I tried to reproduce the "services mode" issue with no success. If it happens again, let me know; maybe then we'll better understand how it happened (i.e. why the "master" trains-agent gets stuck).
What's the python, torch, clearml version?
Any chance this is reproducible?
What's the full error trace/stack you are getting?
Can you try to debug it to where exactly it fails here?
https://github.com/allegroai/clearml/blob/86586fbf35d6bdfbf96b6ee3e0068eac3e6c0979/clearml/binding/import_bind.py#L48
RoughTiger69 wdyt?
Hi CooperativeFox72
But my docker image has all my code and all the packages it needs. I don't understand why the agent needs to install all of those again?
So based on the Dockerfile you previously posted, I think all your python packages are actually installed under the "appuser" and not as system packages.
Basically, remove the "add user" part and the --user flag from the pip install.
For example:
```
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN ...
```
Ohh I see. Okay, the next pipeline version (coming very very soon 😉) will have the option of a function as a Task, would that be better for your use case?
(Also, in the case of local execution, and I can totally see why this is important, how would you specify where the current code base is? Are you expecting it to be local?)
I'm all for trying to help with debugging pipelines, because this is really challenging.
BTW: you can run your code as if it is executed from an agent (including the param ove...
Hi SubstantialBaldeagle49
Yes, you can back up the entire trains-server (see the GitHub docs on how). You mean upgrading the server? Yes, you can change the name or add comments (Info tab / Description), and you can add key/value descriptions (under the Configuration tab, see User Properties).
additionally, what I found is that the clearml==1.0.5 package is able to find these partial changes, while newer versions find nothing at all; maybe it's because it's always comparing against the remote
Hmm it was always from remote...
it is actually doing the following:
git rev-parse --abbrev-ref --symbolic-full-name @{u}
Then, with the branch name output:
git diff --submodule=diff <add_branch_name_here>
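In plain Python, that detection boils down to roughly the following (just an illustration of the two git calls above, not the actual clearml internals):
```
import subprocess

# Resolve the upstream tracking branch (e.g. "origin/main")
upstream = subprocess.check_output(
    ["git", "rev-parse", "--abbrev-ref", "--symbolic-full-name", "@{u}"],
    text=True,
).strip()

# Diff the working tree (including submodules) against that upstream branch
diff = subprocess.check_output(
    ["git", "diff", "--submodule=diff", upstream],
    text=True,
)
print(diff or "no changes detected")
```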
Also, could you explain the difference between trigger.start() and trigger.start_remotely()?
Start will start the trigger process (the one "watching the changes") locally (this makes sense for debugging etc.)
start_remotely will launch the trigger process on the "services" queue, where it should live forever 🙂
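Something along these lines (a sketch only; I'm assuming the TriggerScheduler from clearml.automation, and the queue/project/tag values are placeholders, so double-check the parameter names against your clearml version):
```
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)

# Launch a copy of a given task whenever a Task in the project gets the "ready" tag
trigger.add_task_trigger(
    schedule_task_id="<task_to_launch_id>",
    schedule_queue="default",
    trigger_project="my_project",
    trigger_on_tags=["ready"],
)

# Debugging: run the watcher process locally
trigger.start()
# Production: push the watcher itself onto the services queue
# trigger.start_remotely(queue="services")
```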
Okay so when I add trigger_on_tags, the repetition issue is resolved.
Nice!
This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue ...
There is a version coming out next week, the one after it (probably 2/3 weeks later) will have this feature
Okay that might explain the issue...
MysteriousBee56 so what you are saying is:
python3 -m trains-agent --help
does NOT work, but
trains-agent --help
does work?
Task deletion failed: unhashable type: 'dict'
Hi FlutteringWorm14, trying to figure out where this is coming from, give me a sec
VexedCat68 are you manually creating the OutputModel object?
yes they do 🙂
SubstantialElk6 I just executed it, and everything seems okay on my machine.
Could you pull the latest clearml-agent from GitHub and try again?
EDIT:
just try to run:
git clone
cd clearml-agent
python examples/k8s_glue_example.py
it fails because pip cannot install my_package... so I have to manually edit the section and remove "my_package"
MagnificentSeaurchin79 did you manually add both "." and my_package ?
If so, what was the reasoning to add my_package if pip cannot install it ?
Just making sure, the machine you were running "trains-init" on can access the API server?
Hmm, makes sense. Then I would call export_task once (kind of the easiest way to get the entire Task object description pre-filled for you); with that, you can create as many as needed by calling import_task.
Would that help?
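A rough sketch of that flow (assuming a recent clearml SDK; the template task ID is a placeholder):
```
from clearml import Task

# Export the full definition of an existing "template" Task once
template = Task.get_task(task_id="<template_task_id>")
task_data = template.export_task()

# ...tweak the exported dict as needed (name, parameters, etc.)...
task_data["name"] = "copy #1"

# Create as many new Tasks as needed from the edited definition
new_task = Task.import_task(task_data)
```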
This really makes little sense to me...
Can you send the full clearml-session --verbose console output ?
Something is not working as it should obviously, console output will be a good starting point
So you want to have two Tasks and connect the two?
Maybe the best approach is to have the current_task be the parent of the Dataset Task?
dataset._task.set_parent(Task.current_task())
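In context, that would look roughly like this (the dataset name/project are placeholders, and note that _task is an internal attribute, same as in the one-liner above):
```
from clearml import Task, Dataset

dataset = Dataset.create(dataset_name="my_dataset", dataset_project="datasets")

# Make the currently running Task the parent of the Dataset's backing Task
dataset._task.set_parent(Task.current_task())
```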
maybe this can cause the issue?
Not likely.
In the original pipeline (the one executed from the Pycharm) do you see the "Pipeline" section under Configuration -> "Config objects" in the UI?