I am writing quite a bit of documentation on the topic of pipelines. I am happy to share the article here, once my questions are answered and we can make a pull request for the official documentation out of it.
Amazing, please share once done, I will make sure we merge it into the docs!
Does this mean that within component or add_function_step I cannot use any code from my current directory's code base, only code from external packages that are imported - unless I add my code with ...
- At its simplest, this could just mean checking that all of the steps and the pipeline itself have completed successfully (by checking their "Task status"). If a pipeline step ends with a "failed" status, an exception will be raised inside the pipeline execution function; if the exception is not caught, the pipeline itself will also fail.
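As a minimal sketch of that behavior (decorator-based pipeline; the step, project, and error names are made up for illustration), catching the exception inside the pipeline logic keeps the pipeline itself from failing:

```python
from clearml.automation.controller import PipelineDecorator


@PipelineDecorator.component(return_values=["result"])
def risky_step():
    # simulate a step that will end with "failed" status
    raise RuntimeError("something went wrong")


@PipelineDecorator.pipeline(name="error handling example", project="examples", version="0.1")
def pipeline_logic():
    try:
        result = risky_step()
        print(result)
    except Exception as ex:
        # the failed step surfaces here; handling it lets the pipeline complete anyway
        print(f"step failed: {ex}")


if __name__ == "__main__":
    PipelineDecorator.run_locally()
    pipeline_logic()
```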
run pipeline_script.py which contains the pipeline code as decorators.
So in theory the following should actually work.
Let's assume you ...
All of that makes sense, basically here is what you should do:
Task.init(... output_uri='...')
output_model.update_weights(register_uri=model_path)
It will automatically create a unique target folder / file under None to store your model
(btw: passing the register_uri basically says: "I already uploaded the model there, just store the link" - i.e. it does not upload the model)
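A minimal sketch putting the two calls together (the Azure container URL and file names are placeholders, not from this thread):

```python
from clearml import Task, OutputModel

task = Task.init(
    project_name="examples",
    task_name="upload model",
    output_uri="azure://myaccount.blob.core.windows.net/mycontainer",  # placeholder destination
)

# upload a local weights file; it will be stored under a unique folder in output_uri
output_model = OutputModel(task=task)
output_model.update_weights(weights_filename="model.pt")

# if the file was already uploaded elsewhere, only register the link (nothing is uploaded):
# output_model.update_weights(register_uri="azure://myaccount.blob.core.windows.net/mycontainer/model.pt")
```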
Do you have to have a value there?
no, I just commented it out and it worked fine
Yeah, we should add a comment saying "optional" because it looks as if you need to have it there if you are using Azure.
Ok so I accidentally (probably with luck) noticed the max_connection: 2 in the azure.storage config.
NICE!!!!
But wait where is that set?
None
Should we change the default or add a comment ?
When looking at the worker details, it says "No queues currently assigned to this worker"
Yes, I think we should have better information there, the "AWS service" is not directly pulling jobs from any specific queue, hence nothing there. It is "listening" to queues and launching machines, those machines will be listening to the queue. I wonder if it is just easier to also make sure it is listed as "assigned" to those queues. wdyt?
Hi @<1551376687504035840:profile|StraightSealion9>
AWS Autoscaler to create a new instance when you enqueue a task to the relevant queue.
Does that mean that you were able to enqueue a Task and have it launch on the remote EC2 machine?
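If it helps, a minimal sketch of enqueuing an existing Task from code (the task and queue names are placeholders):

```python
from clearml import Task

task = Task.get_task(project_name="examples", task_name="train model")
# push it to the queue the autoscaler is listening to; a new instance should be spun up for it
Task.enqueue(task, queue_name="aws_gpu_queue")
```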
can you see these metrics on TB?
Hi @<1547028052028952576:profile|ExuberantBat52>
task = Task.get_task(...)
print(task.data)
wdyt?
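For example (the task ID is a placeholder):

```python
from clearml import Task

task = Task.get_task(task_id="aabbccdd11223344")
# task.data is the raw task object as stored on the server (status, script info, last metrics, etc.)
print(task.data)
```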
basically @<1554638166823014400:profile|ExuberantBat24> you can think of hyper-datasets as a "feature-store for unstructured data"
where the ui merges the plots just as we want and I was wondering if there is some simple way to do it in the case of all plots.
we can do it for scalars (this is trivial)
We can merge specific plots when they are simple, I think basic histograms.
But for any generic plots we fear the merge will just fail, and this is why it defaults to side by side.
how can I combine two plots in the ui as you mentioned?
The easiest solution is to use "report_scatter2d", these are specific pl...
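A minimal sketch of what that could look like (series names and values are made up): reporting two series under the same plot title puts them on a single plot.

```python
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="merged scatter")
logger = task.get_logger()

run_a = np.column_stack((np.arange(10), np.random.rand(10)))
run_b = np.column_stack((np.arange(10), np.random.rand(10)))

# same title, different series -> both series appear on the same plot in the UI
logger.report_scatter2d(title="comparison", series="run A", iteration=0, scatter=run_a, mode="lines+markers")
logger.report_scatter2d(title="comparison", series="run B", iteration=0, scatter=run_b, mode="lines+markers")
```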
Hi @<1546303293918023680:profile|MiniatureRobin9> could it be the pipeline logic is created via the clearml-task CLI? If this is the case, I think this is an edge case we should fix. Basically it creates a Task instead of a pipeline, which in essence only affects the UI. To solve it, just run the pipeline locally, notice that by default when you start it, it will actually stop the local run and relaunch itself on an agent.
Also, could you open a GitHub issue so we add a flag for it?
GreasyPenguin66 you can pass: AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
As the default azure access/secret
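For example, a sketch of setting them from Python before anything touches Azure storage (values are placeholders; exporting them in the shell works just as well):

```python
import os

# placeholder credentials, used as the default Azure access/secret per the note above
os.environ["AZURE_STORAGE_ACCOUNT"] = "my_account_name"
os.environ["AZURE_STORAGE_KEY"] = "my_account_key"

from clearml import Task

task = Task.init(project_name="examples", task_name="azure creds via env vars")
```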
Hi ColossalAnt7
Following on SuccessfulKoala55 answer
I saw that there is a config file where you can specify specific users and passwords, but it currently requires
- mounting the configuration file (the one holding the user/pass) into the pod from a persistent volume.
I think the k8s way to do this would be to use mounted config maps and secrets.
You can use ConfigMaps to make sure the routing is always correct, then add a load-balancer (a.k.a a fixed IP) for the users a...
Is there a way to move existing pipelines between projects?
You should be able to, go to your settings page and turn on "show hidden folders"
Then go to your project, you should see a ".pipeline" sub project there, right click it and move it to another folder.
I don't know whether you have access to the backend,
Creepy, no I do not
I can't make anything appear in the console part of the ui
clearml_task.logger.report_text("some text")
should work
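A minimal sketch, assuming the task was created with Task.init in the same process:

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="console test")
# this line should show up under the task's Console tab in the UI
task.get_logger().report_text("some text")
```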
Hi @<1544853721739956224:profile|QuizzicalFox36>
Sure just change the ports on the docker compose
JitteryCoyote63 virtualenv v20 is supported, pip v21 needs the latest trains/trains-agent RC,
Hi @<1523701132025663488:profile|SlimyElephant79>
I would like to save only the last & best checkpoints and not all of them if possible.
Basically it will mimic the local file system, so if you overwrite the local files it will overwrite the remote model.
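As a sketch of that behavior with PyTorch auto-logging (the file names and toy model are made up): keep writing to the same file names and only the "last"/"best" checkpoints remain registered.

```python
import torch
import torch.nn as nn
from clearml import Task

task = Task.init(project_name="examples", task_name="keep only last and best")

model = nn.Linear(4, 2)
best_loss = float("inf")

for epoch in range(5):
    loss = float(torch.rand(1))  # stand-in for a real training step
    # overwriting the same local file means the auto-logged remote model is overwritten too
    torch.save(model.state_dict(), "last.pt")
    if loss < best_loss:
        best_loss = loss
        torch.save(model.state_dict(), "best.pt")
```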
You can also disable auto logging, and manually upload the models
In Task.init pass auto_connect_frameworks False for the specific framework, see:
[None](https://clear.ml/docs/latest/docs/clearml_sdk/task_sdk/#automatic-lo...
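A sketch of that combination (the framework key and file names are examples): disable the PyTorch auto-logging only, then upload the checkpoint explicitly.

```python
import torch
import torch.nn as nn
from clearml import Task, OutputModel

task = Task.init(
    project_name="examples",
    task_name="manual model upload",
    auto_connect_frameworks={"pytorch": False},  # only PyTorch auto-logging is disabled
)

model = nn.Linear(4, 2)
torch.save(model.state_dict(), "model.pt")  # no longer picked up automatically

# upload the checkpoint manually
OutputModel(task=task).update_weights(weights_filename="model.pt")
```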
BTW, this one seems to work ....
```python
from time import sleep
from clearml import Task

Task.set_offline(True)
task = Task.init(project_name="debug", task_name="offline test")
print("starting")
for i in range(300):
    print(f"{i}")
    sleep(1)
print("done")
```
does this work for multiple levels?
Yep
ReassuredTiger98 oh wow I did not realize you actually call importlib to import your libraries (any reason not to call import?)
And yes, I think we will miss it as the package analysis is actually static text analysis of the code
If you use this one for example, will the component have pandas as part of the requirements?
None
def step_two(...):
    import pandas as pd
    # do stuff
If so (and it should), what's the difference, where is "internal.repo" different from pandas?
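For reference, a sketch of the situation being described (component and argument names are examples): the only hint that pandas is needed is the import inside the function body, which the static analysis is expected to pick up; packages can also be pinned explicitly.

```python
from clearml.automation.controller import PipelineDecorator


@PipelineDecorator.component(return_values=["n_rows"])
def step_two(csv_path: str):
    # imported inside the step -> static analysis should add pandas to the component requirements
    import pandas as pd
    df = pd.read_csv(csv_path)
    return len(df)


# if the analysis ever misses something, packages can be listed explicitly, e.g.:
# @PipelineDecorator.component(return_values=["n_rows"], packages=["pandas"])
```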