Yes, thanks AgitatedDove14 ! It's just that the configuration
object passed onwards was a bit confusing.
Is there a planned documentation overhaul? 🤔
For example, the
Task
object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
This is a very good point (the current documentation is basically docstring, but we should create a structured one)
... but some visualization/inline code with explanation is also very much welcome.
I'm assuming this connected with the previous point?
I'm not sure about the intended use of connect_configuration
now.
I was under the assumption that in connect_configuration(configuration, name=None, description=None)
, the configuration
is only used in local execution.
But when I run config = task.connect_configuration({}, name='General')
(in remote execution), the configuration is set to the empty dictionary
There used to be a good example but it's now missing. I'm not sure what does Use only for automation (externally), otherwise use Task.connect_configuration
mean when e.g. looking at Task.set_configuration_object
, etc.
Could you clarify a bit, CostlyOstrich36 or AgitatedDove14 ?
Generally, really. I've struggled recently (and in the past), because the documentation seems:
Very complete wrt available SDK (though the formatting is sometimes off) Very lacking wrt to how things interact with one anotherA lot of what I need I actually find from pluging into the source code.
I think ClearML would benefit itself a lot if it adopted a documentation structure similar to numpy ecosystem (numpy, pandas, scipy, scikit-image, scikit-bio, scikit-learn, etc)
Very lacking wrt to how things interact with one another
If I'm reading it correctly, what you are saying is that some of the "big picture" / holistic approach on how different parts interact with one another is missing, is that correct?
I think ClearML would benefit itself a lot if it adopted a documentation structure similar to numpy ecosystem
Interesting thought, what exactly would you suggest we "borrow" in terms of approach?
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
Yeah, and just thinking out loud what I like about the numpy/pandas documentation
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
Correct 🙂
Is there a planned documentation overhaul?
you mean specifically for the connect_configuration ? or in general on the connect
approach rationale ?
e.g. a separate structured user guide with common tips, usability, best practices - https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html
vs the doc, where each function is its own page, e.g.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Hi UnevenDolphin73 , yes this is correct. Artifacts are outputs so they are all removed during a clone or reset
First bullet point - yes, exactly
Second bullet point - all of it, really. The SDK documentation and the examples.
For example, the Task
object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
Any linked example to github is welcome, but some visualization/inline code with explanation is also very much welcome.
Yes that makes total sense to me. How about a GitHub issue on the clearml-docs ?
Yes that's what I thought, thanks for confirming.
I'm not sure about the intended use of
connect_configuration
now.
Basically here is the rationale behind it:
I have a config file that I want to log on the Task, and I Also want to be able to change this configuration file externally when launching using an agent (i.e. edit the content) I have a nested dictionary that I do not want to flatten and push as hyper-parameters because it is not very readble, so I want to store it in a more human readable form and edit it as config file
...
Task.set_configuration_object
, etc.
Basically do not call set_configuration_object directly, instead use connect_configuration. The main difference is that "connect" as a concept is a two way conneciton, when runnign manually log on the Task/backend, when running via agent, get the values from the Task/backend into the code. set_xyz calls are always setting the Task's/backend value
Make sense ?