Eureka! Exactly, that's what I was looking for: creating the pipeline without launching it. Thanks again AgitatedDove14
Hi AgitatedDove14 , just one last thing before closing the thread. I was wondering what the use of PipelineController.create_draft is if you can't use it to clone and run tasks, as we have seen
Brilliant, that worked like a charm!
Yes, before removing the 'default' queue I was able to shut down agents without specifying further options after the --stop command. I just had to run clearml-agent daemon --stop as many times as there were agents. Of course, I will open the issue as soon as possible :D
Ok! I'll try to spin up an agent with the --service-mode option and I will give you feedback
Hi TimelyPenguin76
No errors with this new version!
Of course it's always a good idea to have that extra option just in case 🙂
Nevermind, I've already found a cleaner way to address this problem. I really appreciate your help!
I can't figure out what might be going on
BTW, let's say I accidentally removed the 'default' queue from the queue list. As a result, when I try to stop an agent using clearml-agent daemon --stop , I get the following error:
clearml_agent: ERROR: APIError: code 400/707: No queue is tagged as the default queue for this company
I have already created another queue, also called 'default', but it had no effect :/
By the way, where can I change the default artifacts location ( output_uri ) if I have a script similar to this example (I mean, from the code, not the agent's config)?
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
I totally agree with the pipelinecontroller/decorator part. Regarding the proposal for the component parameter, I also think it would be a good feature, although it might be misleading: there will be times when the pipeline fails at an intrinsically crucial step, so it won't matter whether 'continue_pipeline_on_failure' is set to True or False. Anyway, I can't think of a better way to deal with that right now.
Sure, here is a trivial example:
` from clearml import Dataset

dataset = Dataset.create(dataset_name="Dataset_v1.1.3", dataset_project="Mocks")
dataset.finalize()
loaded_dataset = Dataset.get(dataset_id=dataset.id) `
Well, just as you can pass the 'task_type' argument in PipelineDecorator.component , it might be a good option to pass the rest of the 'Task.init' arguments as they are passed in the original method (without using a dictionary)
I'm getting a NameError because 'Optional' type hint is not defined in the global scope
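For reference, a minimal sketch of the kind of fix that usually resolves this NameError: making sure Optional is imported into the module's global scope before any annotation that uses it is evaluated (the load_config function below is a hypothetical example, not from the original code):

```python
from typing import Optional

# With the import above, 'Optional' is defined in the global scope,
# so evaluating the annotation no longer raises a NameError.
def load_config(path: Optional[str] = None) -> dict:
    # Hypothetical helper used only to illustrate the type hint
    return {"path": path} if path is not None else {}

print(load_config())          # {}
print(load_config("a.yml"))   # {'path': 'a.yml'}
```

The error typically appears when annotations are introspected at runtime in a module where the import is missing.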
Oh, I see. In the meantime I will duplicate the function and rename it so I can work with a different configuration. I really appreciate your effort, as well as the continuous feedback to keep improving this wonderful library!
So ClearML will scan all the repository code searching for package dependencies? Is that right?
Perfect, that's exactly what I was looking for 🙂 Thanks!
Beautiful. I have tested the new functionality with several use cases and it works just as I expected. Excellent work, as usual :D
Hey CostlyOstrich36 AgitatedDove14 ! Any news on this? Should I open an issue?
AgitatedDove14 Oops, something still seems to be wrong. When trying to retrieve the dataset using get_local_copy() I get the following error:
` Traceback (most recent call last):
File "/home/user/myproject/lab.py", line 27, in <module>
print(dataset.get_local_copy())
File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/datasets/dataset.py", line 554, in get_local_copy
target_folder = self._merge_datasets(
File "/home/user/.conda/envs/myenv/lib/python3.9/site-p...
They share the same code (i.e. the same decorated functions), but using a different configuration.
Oh, I see. This explains the surprising behavior. But what if the Task.init code is created automatically by PipelineDecorator.component ? How can I pass arguments to the init method in that case?
I'm using the last commit. I'm just fitting a scikit-learn MinMaxScaler object to a dataset of type tf.data.Dataset inside a function (which represents the model training step) decorated with PipelineDecorator.component . The function does not even return the scaler object as an artifact. However, the scaler object is logged as an artifact of the task, as shown in the image below.
Great, thank you very much TimelyPenguin76
Yes, although I use both terms interchangeably. The information will actually be contained in JSON files.
Thanks for the background. I now have a big picture of the process ClearML goes through. It was helpful in clarifying some of the questions that I didn't know how to ask properly. So, the idea is that a base task is already stored on the ClearML server for later use in a production environment. This is because such a task will always be created during the model development process.
Going back to my initial question, as far as I understood, if the environment caching option is ena...
By adding the slash I have been able to see that indeed the dataset is stored in output_url . However, when calling finalize , I get the same error. And yes, I have installed the version corresponding to the last commit :/
Yes, when the parameters that are connected do not have nested dictionaries, everything works fine. The problem comes when I try to do something like this:
` from clearml import Task
task = Task.init(project_name="Examples", task_name="task with connected dict")
args = {}
args["period"] = {"start": "2020-01-01 00:00", "end": "2020-12-31 23:00"}
task.connect(args) `
and the clone task is like this:
` from clearml import Task
template_task = Task.get_task(task_id="<Your template task id>"...
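For context, a minimal sketch of the kind of flattening that nested dictionaries typically undergo when connected as hyperparameters. The '/'-separated key scheme and the flatten helper below are assumptions for illustration, not necessarily ClearML's exact internal behavior:

```python
def flatten(d, prefix=""):
    # Recursively flatten a nested dict into 'parent/child' string keys,
    # mimicking how nested parameters may end up stored as flat key/value pairs.
    flat = {}
    for key, value in d.items():
        full_key = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat

args = {"period": {"start": "2020-01-01 00:00", "end": "2020-12-31 23:00"}}
print(flatten(args))
# {'period/start': '2020-01-01 00:00', 'period/end': '2020-12-31 23:00'}
```

If the server stores parameters this way, the cloned task would see flat keys rather than the original nested structure, which would explain the mismatch when reading them back.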
I see the point. The reason I'm using PipelineController now is that I've realised that in the code I only send IDs from one step of the pipeline to another, and not artefacts as such. So I think it makes more sense in this case to work with the former.