
@<1523701205467926528:profile|AgitatedDove14> : I am writing quite a bit of documentation on the topic of pipelines. I am happy to share the article here, once my questions are answered, and we can make a pull request for the official documentation out of it.
I now think that the documentation here refers to the usage of connect.
Ah... if I run the same script not from PyCharm but from the terminal, then it gets completed... phew...
: What does
- The component code still needs to be self-contained (or, a function component can also be quite complex)
mean? Is it not possible that I call code that is somewhere else on my local computer and/or in my code base? That makes thi...
Well, it can address the additional repo (it will be automatically added to the PYTHONPATH), and you can add auxiliary functions (as long as they are part of the initial pipeline script) by passing them to helper_functions.
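To make that concrete, here is a minimal sketch of passing an auxiliary function from the pipeline script to a component via helper_functions (the function and names are hypothetical, not from the thread):
from clearml.automation.controller import PipelineDecorator

def normalize(values):
    # auxiliary function defined in the same pipeline script
    lo, hi = min(values), max(values)
    return [(v - lo) / ((hi - lo) or 1) for v in values]

# helper_functions ships `normalize` along with the component, so the
# step can still call it when it runs as standalone code on an agent
@PipelineDecorator.component(return_values=['scaled'], helper_functions=[normalize])
def preprocess(values):
    return normalize(values)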
@<1523701083040387072:profile|UnevenDolphin73> : I am not sure who you mean by "user"? I am not aware that we are building an app... 😄 Do you mean a person that reruns the entire pipeline but with different parameters from the Web UI? But here, we are not able to let the "user" configure all those things.
Is there some other way - one that does not require any coding - to build pipelines (I am not aware of one)?
Also, when I build pipelines via tasks, the (same) imports had to be done in each...
@<1523701205467926528:profile|AgitatedDove14> : In general: if I do not build a package out of my local repository/project, I cannot reference anything from the local project/repository directly, right? I must make a package out of it, or I must reference it with the repo argument, or I must reference the respective functions using the helper_functions argument. Did I get this right?
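As a sketch of the repo-argument route (the repository URL and all names here are hypothetical):
from clearml.automation.controller import PipelineController

def my_step():
    # at runtime the cloned repo is on the PYTHONPATH, so imports from it work
    ...

pipe = PipelineController(name='demo pipeline', project='examples', version='1.0.0')
pipe.add_function_step(
    name='step_one',
    function=my_step,
    repo='https://github.com/your-org/your-repo.git',  # hypothetical repository
    repo_branch='main',
)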
@<1523701205467926528:profile|AgitatedDove14> : Wait, so, if a task is initialized in process A and I call mark_completed in process B, which process is terminated? A or B?
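For context, a minimal sketch of the two-process setup being asked about (ids and names are hypothetical):
# process A - creates and owns the task
from clearml import Task
task = Task.init(project_name='examples', task_name='long job')
print(task.id)  # pass this id to process B

# process B - fetches the same task by id and marks it completed
from clearml import Task
other = Task.get_task(task_id='<id printed by process A>')
other.mark_completed()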
@<1523701205467926528:profile|AgitatedDove14> : "does that make sense ?" Not really.
"you do not need to automatically Add/Log/Track things into the Task in the current process." - I do not need to automatically do [...]? You mean I can do it automatically, but alternatively I can do it manually? Do you mean I use close
within a process to prevent automatic logging/adding/tracking? But, as far as I know, after I used close
I am not able to log etc. manually either. So...
"Mark...
"using your method you may not reach the best set of hyperparameters."
Of course you are right. It is an efficiency trade-off of speed vs. effectiveness, and whether it is worth it depends on the use case. Here it is worth it, because the performance of the modelling is not sensitive to the parameter we search for first; being in the ballpark is enough. And for the second set of parameters we need to do a full grid search (the parameters are booleans and strings); thus, this wo...
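As an illustration of that two-stage trade-off, a generic sketch (the score function is a hypothetical stand-in for train-plus-validate, not anything from this thread):
from itertools import product

def score(lr, flag, mode):
    # hypothetical stand-in for training + validation; lower is better
    return abs(lr - 0.03) + (0.0 if flag else 0.1) + (0.0 if mode == 'a' else 0.05)

# stage 1: coarse search over the insensitive parameter; ballpark is enough
coarse_lrs = [0.001, 0.01, 0.1, 1.0]
best_lr = min(coarse_lrs, key=lambda lr: score(lr, flag=True, mode='a'))

# stage 2: full grid over the boolean/string parameters, stage-1 value fixed
best_flag, best_mode = min(product([True, False], ['a', 'b']),
                           key=lambda fm: score(best_lr, *fm))
print(best_lr, best_flag, best_mode)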
@<1523701083040387072:profile|UnevenDolphin73> : From which URL is your most recent screenshot?
I just see the website that I linked to. I am not sure what is meant by "python environment". I cannot take a screenshot, because I do not know where to look for this in the first place.
No, these are 3 different ways of building pipelines.
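For readers following along, a sketch contrasting the three routes in one place (all task/project names are hypothetical):
from clearml.automation.controller import PipelineController

def process(x):
    return x

pipe = PipelineController(name='demo', project='examples', version='1.0.0')
# 1) from an existing task that is already in the system
pipe.add_step(name='stage_data', base_task_project='examples', base_task_name='data task')
# 2) from a function defined in the current script
pipe.add_function_step(name='stage_process', function=process,
                       function_kwargs=dict(x='${stage_data.id}'))
# 3) from decorators: @PipelineDecorator.component + @PipelineDecorator.pipeline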
That is what I meant to say 🙂 , sorry for the confusion, @<1523701205467926528:profile|AgitatedDove14> .
@<1523701083040387072:profile|UnevenDolphin73> , your point is a strong one. What are clear situations in which pipelines can only be built from tasks, and not in one of the other ways? An idea would be if the tasks are created from all kinds of - kind of - unrelated projects where the code that describes the pipeline does not ...
KindChimpanzee37 : Thank you so much! I asked follow-up questions 🙂 .
@<1523701083040387072:profile|UnevenDolphin73> : No, I love it ❤ . Now, I just have to read everything 😄 .
@<1523701070390366208:profile|CostlyOstrich36> : After more playing around, it seems that the ClearML Server does not store the models or artifacts itself. These are stored somewhere else (e.g., an AWS S3 bucket) or on my local machine, and the ClearML Server only stores configuration parameters and previews (e.g., when the artifact is a pandas DataFrame). Is that right? Is there a way to save the models completely on the ClearML Server?
@<1523701070390366208:profile|CostlyOstrich36>
My training outputs a model as a zip file. The way I save and load the zip file to make up my model is custom-made (no library is directly used), because we invented the entire modelling approach ourselves. What I did so far:
from clearml import OutputModel
output_model = OutputModel(task=..., config_dict={...}, name=f"...")
output_model.update_weights(r"C:\io__path\...", is_package=True)  # raw string keeps the Windows path's backslashes literal
and I am trying to load the model in a different Python process with
mymodel =...
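For reference, a minimal sketch of one way this can work in another process, assuming the model id is copied from the web UI (the id is hypothetical; the unzip/parse step stays the custom code):
from clearml import InputModel

model = InputModel(model_id='<model id from the web UI>')
local_path = model.get_local_copy()  # downloads the stored zip to a local cache
# from here, open the zip with the custom loading code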
I mean those that you see in the screenshot. The difference in code is - at least for me - to write
- parameters_data = {'custom1': 'no', 'custom2': False}; parameters_data = task.connect(parameters_data, name='data')
- task.set_user_properties(custom1='no', custom2=False)
@<1523701087100473344:profile|SuccessfulKoala55> : That is the link I posted as well. But this should also be mentioned at the places that discuss external vs. non-external storage, and everywhere we talk about models, artifacts, etc. - not necessarily in detail, but at least with a sentence and a link.
@<1523701083040387072:profile|UnevenDolphin73> : I see. I did not make the connection that output_uri=True is what I was missing. I thought this was the default. But the default is actually None, which is different from True.
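A minimal sketch of the missing piece, assuming the uploads should go to the default ClearML file server (project/task names are hypothetical):
from clearml import Task

task = Task.init(
    project_name='examples',
    task_name='store outputs remotely',
    output_uri=True,  # True uploads models/artifacts to the ClearML file server; the default None keeps them local
)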
I have already been trying to contribute (I have three pull requests), but honestly it feels a bit weird that I need to update documentation about something I do not understand, while I am actually trying to evaluate whether ClearML is the right tool for our company...
Here is my code:
from clearml import Task, TaskTypes
from clearml.task_parameters import TaskParameters, param, percent_param

class MyParams(TaskParameters):
    iterations = param(
        type=int,
        desc="Number of iterations to run",
        range=(0, 100000),
    )
    target_accuracy = percent_param(
        desc="The target accuracy of the model",
    )

myPar = MyParams(iterations=1000, target_accuracy=0.95)
parameters_to_track1 = {'var1': 'a', 'hyper_par': 1}
parameter...
Thank you, I found the error. myPar = task.connect(myPar, name='from TaskParameters') is required.
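Putting that fix together, a condensed version of the working snippet (project/task names are hypothetical):
from clearml import Task
from clearml.task_parameters import TaskParameters, param

task = Task.init(project_name='examples', task_name='taskparameters demo')

class MyParams(TaskParameters):
    iterations = param(type=int, desc="Number of iterations to run", range=(0, 100000))

myPar = MyParams(iterations=1000)
# the missing piece: the TaskParameters object must be connected to the task
myPar = task.connect(myPar, name='from TaskParameters')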