The reason is that it is logged as an image, not a plot 🙂
One last thing: make sure you spin the pod container in privileged mode, because the trains-agent docker will spin a sibling docker for your actual experiment.
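For reference, this is roughly what that would look like in the pod spec (a minimal sketch; the pod/container names and image are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: trains-agent
spec:
  containers:
    - name: trains-agent
      image: allegroai/trains-agent  # placeholder image name
      securityContext:
        privileged: true  # lets the agent spin sibling dockers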
SmarmySeaurchin8
When running in "dev" mode (i.e. writing the code), only packages imported directly are registered under "installed packages". Then, when the agent executes the experiment, it will update back the entire environment (including derivative packages etc.)
That said, you can set detect_with_pip_freeze to true (in trains.conf) and it will basically store the entire pip freeze.
https://github.com/allegroai/trains/blob/f8ba0495fb3af1f99732fdffbbccd2fa992934a4/docs/trains.c...
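For reference, a minimal sketch of the relevant trains.conf section (this assumes the key sits under sdk.development; check the linked file for the exact location):
sdk {
    development {
        # store the full `pip freeze` output instead of only directly imported packages
        detect_with_pip_freeze: true
    }
}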
I had again the same problem but within a remote pipeline setup.
Are you saying the issue is not fixed? Can you verify the pipeline & pipeline components are using at least the 1.104rc0 version?
(I suspect you are correct, but I'm missing some information in order to understand where the problem is)
WackyRabbit7 can you send mock code that explains how you create the pipeline ?
GrievingTurkey78 Actually it is in progress, see the GitHub issue for details:
https://github.com/allegroai/trains/issues/219
Okay, I'm pretty sure there is a hack, let me see if there is something "nicer"
However, that would mean passing back the hostname to the Autoscaler class.
Sorry my bad, the agent does that automatically in real-time when it starts, no need to pass the hostname it takes it from the VM (usually they have some random number/id)
Ok, so it doesn't follow the exact same rules as Task.init ?
Correct
I was afraid all the logs and outputs of a hyperparameter optimization task would be deleted just because no artifacts were created.
Should not happen 🙂
GiganticTurtle0
If there are several tasks running concurrently, which task should Task.current_task() return?
How could you have that ?
Per process, there is one Main current Task (until you close it).
Are you referring to a pipeline with multiple steps ?
If this is the case, Task.current_task() will return the Task of the component (if executed from the component) and the pipeline's Task (if called from the pipeline logic function).
Notice we added the ability to s...
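To make that concrete, a minimal sketch using the pipeline-from-decorators interface (the project/step names are placeholders):
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['result'])
def step_one():
    # called from inside a component: this is the component's own Task
    print(Task.current_task().name)
    return 1

@PipelineDecorator.pipeline(name='demo pipeline', project='examples', version='1.0')
def pipeline_logic():
    # called from the pipeline logic function: this is the pipeline controller's Task
    print(Task.current_task().name)
    step_one()

if __name__ == '__main__':
    pipeline_logic()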
I see, by default it will look for requirements.txt in the root of the repo (the actual repo).
That said, in code you can specify the requirements.txt:
Task.force_requirements_env_freeze(requirements_file='repo/project-a/requirements.txt')
task = Task.init(...)
Notice, you need to call it prior to the Task.init call
Hi ClumsyElephant70
So do you need both requirements.txt files combined ?
How will the agent be able to reproduce both repos on the remote machine ?
DeliciousBluewhale87 fyi, the new version of the pipeline (hopefully pushed towards the end of this week) will allow you to more easily write steps as functions (not only as Tasks, as in the current implementation)
Also check the new Trigger and Scheduler, both intended to trigger these pipelines:
https://github.com/allegroai/clearml/blob/fe3c481c37e70881c44d67c1cf9bbce00a84747e/clearml/automation/scheduler.py#L457
https://github.com/allegroai/clearml/blob/fe3c481c37e70881c44d67c1cf9bbce00a8...
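For example, a minimal sketch of the scheduler (argument names follow the linked scheduler.py; the task ID and queue are placeholders):
from clearml.automation.scheduler import TaskScheduler

scheduler = TaskScheduler()
# clone & enqueue an existing (template) Task every day at 07:30 on the 'services' queue
scheduler.add_task(
    schedule_task_id='aabbcc112233',  # placeholder: ID of the Task to re-launch
    queue='services',
    hour=7, minute=30,
)
scheduler.start()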
we can add non-clearml code as a step in the pipeline controller.
Yes 🙂 btw, you can kind of already do that with pre/post function callbacks (notice they run from the same scope as the actual pipeline controller); see the sketch below.
What exactly did you have in mind to put there ?
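Something like this (a sketch of the callback hooks on PipelineController.add_step; the callback signatures here are assumptions, double-check them against your clearml version):
from clearml.automation.controller import PipelineController

def pre_cb(pipeline, node, param_override):
    # runs in the controller's scope, just before the step Task is launched
    print('about to launch', node.name)

def post_cb(pipeline, node):
    # runs in the controller's scope, right after the step Task completes
    print('finished', node.name)

pipe = PipelineController(name='demo', project='examples', version='1.0')
pipe.add_step(
    name='stage_one',
    base_task_project='examples',          # placeholder template Task
    base_task_name='stage one template',
    pre_execute_callback=pre_cb,
    post_execute_callback=post_cb,
)
pipe.start()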
ResponsiveCamel97
could you attach the full log?
Error 101 : Inconsistent data encountered in document: document=Output, field=model
Okay, this points to a migration issue from 0.17 to 1.0
First try to upgrade to 1.0, then to 1.0.2
(I would also upgrade a single apiserver instance, once it is done, then you can spin the rest)
Make sense ?
Hi ResponsiveCamel97
What's the clearml-server version? How do you spin the server on your k8s cluster, helm ?
Hi SmallDeer34
ClearML automagical logging works on the current python process. But in your example, your bash script is running another python script (that has nothing to do with the original notebook), hence the clearml automagic is not aware of it (i.e. it cannot "patch" the tensorboard calls).
In order to make it work, you should do something like:
from joeynmt import train
train.main(...)
Or something similar 🙂
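i.e. a sketch of the full flow (joeynmt's actual entry-point arguments may differ, check its docs):
from clearml import Task
from joeynmt import train

# init clearml in the same process first, so the tensorboard calls get patched
task = Task.init(project_name='examples', task_name='joeynmt training')
train.main(...)  # placeholder: pass whatever arguments joeynmt's train entry point expects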
Make sense ?
Hi GiddyTurkey39
First, yes you can just edit the "installed packages" section and add any missing package (this is equal to requirements.txt)
I wonder why trains failed to detect the "bigquery" package in the first place... Any thoughts ?
pytorch DDP
with what backend ? gloo ? nccl ? openmpi ?
This is odd... can you post the entire trigger code ?
also what's the clearml version?
What do you mean? every Model has a unique ID, what do you consider a version?
MagnificentSeaurchin79 no need for the detection api (yes definitely a mess to setup), it will be more helpful to get a toy example.
The only important thing for me is to know if there is any way to get more information in the apiserver log
what do you mean by that ?
SmarmySeaurchin8 regarding the original question:
task.set_project(project_id)
Task.get_projects() to get all the project names/ids
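Putting the two together, something like (a sketch; the task ID and project name are placeholders, and this assumes the objects returned by get_projects() expose name / id):
from clearml import Task

task = Task.get_task(task_id='aabbcc112233')  # placeholder task ID
projects = Task.get_projects()
target = next(p for p in projects if p.name == 'My Project')
task.set_project(target.id)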
ClearML maintains a github action that sets up a dummy clearml-server,
You have one: http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts ?
Hi ReassuredTiger98
An agent's queue priority translates to the order in which the agent pulls jobs from its queues.
Now let's assume we have two agents with priorities A,B for one and B,A for the other. If we only push a Task to queue A, and both agents are idle (implying queue B is empty), there is no guarantee which one will pull the job.
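Concretely, the queue order on the daemon command line is what sets the priority (queue names here are placeholders):
clearml-agent daemon --queue queue_a queue_b   # this agent tries queue_a first
clearml-agent daemon --queue queue_b queue_a   # this agent tries queue_b first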
Does that make sense ?
What is the use-case you are trying to solve/optimize for ?