Oh, so the pipeline basically makes itself their parent, which means you can get their IDs:
steps_ids = Task.query_tasks(task_filter=dict(parent="<pipeline_id_here>"))
for task_id in steps_ids:
    task = Task.get_task(task_id)
Verified @<1643060801088524288:profile|HarebrainedOstrich43>, an RC will be out soon for you to test. Thank you again for catching it, not sure how the internal tests missed it (btw the pipeline is created, it's just not shown in the right place due to some internal typo)
Thus, the return data from step 2 needs to be available somewhere to be used in step 3.
Yep :)
It will serialize the data on the dict?
I thought it will just point to a local file location where you have the data :)
I didn't know that each step runs in a different process
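To illustrate why the return value has to be serialized: steps that run in separate processes do not share memory, so the output of one step must be written somewhere the next step can read it. A minimal sketch using only the standard library; `pickle`, the temp folder, and the function names are stand-ins chosen for illustration, not necessarily how the pipeline stores step outputs.

```python
import pickle
import tempfile
from pathlib import Path


def step_two() -> dict:
    """Pretend step 2: produce some data."""
    return {"rows": [1, 2, 3], "source": "step_two"}


def store_result(result: dict, folder: Path) -> Path:
    """Serialize a step's return value so another process could load it."""
    path = folder / "step_two_result.pkl"
    path.write_bytes(pickle.dumps(result))
    return path


def step_three(result_path: Path) -> int:
    """Pretend step 3: load the stored result and use it."""
    data = pickle.loads(result_path.read_bytes())
    return sum(data["rows"])


with tempfile.TemporaryDirectory() as tmp:
    stored = store_result(step_two(), Path(tmp))
    total = step_three(stored)
```

Here both steps run in one process for simplicity, but `step_three` only ever sees the file path, which is the point: nothing is passed by reference.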
Actually! You can run them as functions as well, try:
if __name__ == '__main__':
    PipelineDecorator.debug_pipeline()
    # call pipeline function here
It will just run them as functions (ret...
Okay, this is odd, the request returned exactly 100 out of 100.
It seems not all of them were reported?!
Could you post the toy code, I'll check what's going on.
HappyDove3 where are you running the code?
(the upload is done in the background, but it seems the python interpreter closed?!)
You can also wait for the upload:
task.upload_artifact(name="my artifact", artifact_object=np.eye(3,3), wait_on_upload=True)
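As a generic illustration of why waiting matters: work started in a background thread can be lost if the script ends before that thread finishes. The sketch below uses a daemon thread and a fake upload function; both are assumptions made for the example and say nothing about how the SDK implements its uploads.

```python
import threading
import time

uploaded = []


def fake_upload(name: str) -> None:
    """Stand-in for a slow background upload."""
    time.sleep(0.1)
    uploaded.append(name)


# Daemon threads are abandoned when the interpreter exits.
worker = threading.Thread(target=fake_upload, args=("my artifact",), daemon=True)
worker.start()

# Without this join, a script ending here could exit before the upload completes.
worker.join()
```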
Thanks EnviousStarfish54 !
Cloud Access section is in the Profile page.
Any storage credentials (S3 for example) are only stored on the client side (never on the trains-server), which is why we need to configure them in the trains.conf. When the browser needs to access those URLs (downloading an artifact) it also needs the secret/key, so it automatically displays a popup requesting them, and will store them in this section. Notice they are stored in the browser session (as a cookie).
Looks great, let me see if I can understand what's missing, because it should have worked ...
How can I add additional information, e.g. debug samples, or scalar to the data to be shown in the UI? Logger.current_logger() is not working
Yes :)
dataset.get_logger() to the rescue
Hi @<1566596960691949568:profile|UpsetWalrus59>
you should call it before initializing the Task
Task.ignore_requirements("pywin32")
task = Task.init(...)
Happy new year @<1618780810947596288:profile|ExuberantLion50>
- Is this the right place to mention such bugs?
Definitely the right place to discuss them, usually if verified we ask to also add them in GitHub for easier traceability / visibility
m (i.e. there's two plots shown side-by-side but they're actually both just the first experiment that was selected). This is happening across all experiments, all my workspaces, and all the browsers I've tried.
Can you share a screenshot? is this r...
Does the clearml module parse the python packages?
Yes, it analyzes the installed packages based on the actual imports you have in the code.
If I'm using a private pypi artifact server, would I set the PIP_INDEX_URL on the workers so they could retrieve those packages when that experiment is cloned and re-ran?
Correct :) the agent basically calls pip install on those packages, so if you configure it with PIP_INDEX_URL it should just work like any other pip install
Yes, I think we just found out it breaks clearml :)
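A small sketch of the "like any other pip install" point: pip reads `PIP_INDEX_URL` from the environment of the process that runs it. The index URL and package name below are hypothetical placeholders, and the actual install call is left commented out.

```python
import os
import sys

# Hypothetical private index URL, replace with your own server.
PRIVATE_INDEX = "https://pypi.example.com/simple"


def pip_install_command(package: str) -> list:
    """Build the command for installing one package with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


def env_with_private_index(index_url: str) -> dict:
    """Copy the current environment and point pip at the private index."""
    env = dict(os.environ)
    env["PIP_INDEX_URL"] = index_url
    return env


cmd = pip_install_command("my-private-package")  # hypothetical package name
env = env_with_private_index(PRIVATE_INDEX)
# import subprocess; subprocess.run(cmd, env=env, check=True)  # uncomment to install
```

On a worker, the same effect would come from exporting the variable in the environment the agent is started from.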
could you test with the latest stable, just in case ?
(I'll make sure we have an RC that supports the hydra dev version)
Hi FunnyTurkey96
Which pip are you using, basically pip changed the dependency resolver after 20.1
Change: https://github.com/allegroai/clearml-agent/blob/aede6f4bac71c8fc56e7cf982318a48527953a3c/docs/clearml.conf#L57
pip_version: "<20.2"
See if that helps
SillyPuppy19 yes you are correct, actually I can promise you the callback will be called from a different thread (basically the monitoring thread), so it's on the user to make sure the callback can handle it.
How about we move this discussion to GitHub?
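On the callback point above, one common way to make a callback safe to call from another thread is to guard its shared state with a lock. A minimal sketch; `CountingCallback` and `fire_many` are hypothetical names for illustration, and the second thread only stands in for a monitoring thread.

```python
import threading


class CountingCallback:
    """Callback whose shared state is protected by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.calls = 0

    def __call__(self, *args, **kwargs) -> None:
        # Only one thread at a time may update the counter.
        with self._lock:
            self.calls += 1


def fire_many(callback, times: int) -> None:
    """Invoke the callback repeatedly, as a caller on any thread might."""
    for _ in range(times):
        callback()


callback = CountingCallback()
monitor = threading.Thread(target=fire_many, args=(callback, 1000))
monitor.start()
fire_many(callback, 1000)  # main thread calls it concurrently
monitor.join()
```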
Hi @<1643423185791619072:profile|DashingCentipede5>
Notice that you called "start_locally", it tries to run the code locally inside your Jupyter notebook and assumes everything, including the code, already exists. Is that your case?
I think latest:
clearml==1.17.0
matplotlib==3.6.2
shap==0.46.0
Python 3.10
Hi @<1556812486840160256:profile|SuccessfulRaven86>
Every clearml-serving session (you can have multiple different "sessions") is assumed to be homogeneous, this would mean it will serve the same models on as many nodes as possible supporting multiple models per pod.
In your example I think the easiest is to create two serving sessions one with a node selector for the 24GB node and another for the 16GB node, wdyt?
Yes, or at least credentials and API...
Maybe inside your code you can later copy the model into fixed location ?
This way you have the model in the model repository and a copy in a fixed location (StorageManager can upload to a specific bucket/folder with the same credentials you already have)
Would that work?
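A rough sketch of the "copy to a fixed location" idea. `StorageManager.upload_file` is the call assumed here; the helper name, the `upload` parameter (added so the function can be tried without credentials), and the bucket path are hypothetical.

```python
def copy_model_to_fixed_location(local_model_path, remote_url, upload=None):
    """Upload a copy of the model file to a fixed, known destination."""
    if upload is None:
        # Imported lazily so the sketch can be read and tested without clearml.
        from clearml import StorageManager
        upload = StorageManager.upload_file
    return upload(local_file=local_model_path, remote_url=remote_url)


# Dry run with a stand-in uploader instead of real storage.
recorded = []


def fake_upload(local_file, remote_url):
    recorded.append((local_file, remote_url))
    return remote_url


result = copy_model_to_fixed_location(
    "model.pkl",
    "s3://my-bucket/models/latest/model.pkl",  # hypothetical fixed location
    upload=fake_upload,
)
```

Calling it without the `upload` argument would use the real storage credentials from the configuration file.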
@<1699955693882183680:profile|UpsetSeaturtle37> can you try with the latest clearml-session (0.14.0) I remember a few improvements there
The remote machine is in Azure behind the load-balancer, we are using docker images, so directly connecting to pods.
Yeah, an LB in the middle might be introducing SSH hiccups. First upgrade to the latest clearml-session, it better configures the SSH client/server to support longer timeout connections; if that does not work try --keepalive=true
Le...
Hi @<1797800418953138176:profile|ScrawnyCrocodile51>
Will the docker container / disk space (really I am more interested in the dataset that is downloaded by the task) get automatically cleaned up?
Yes, the agent is running the container with --rm :)
Using agent v1.01r1 in k8s glue.
I think a fix was recently committed, let me check it
Great!
BTW: you can take some inspiration from here:
https://github.com/allegroai/trains/blob/master/examples/automation/task_piping_example.py
Or from the full pipeline:
https://github.com/allegroai/trains/blob/master/examples/pipeline/pipeline_controller.py
SmarmyDolphin68 , All looks okay to me...
Could you verify you still get the plot on debug samples as image with the latest trains RC:
pip install trains==0.16.4rc0
Oh I see, yes the "metrics" include both scalars / plots & console outputs,
I also think they are updated only once a day (or maybe twice a day?) so even if you delete them it will take some time to update
(archive is not delete, you then need to go to the archived view and delete it from there)
Hi @<1552101458927685632:profile|FreshGoldfish34>
Self-hosted, you mean the open source? If so, then yes, totally free :)
That said I would recommend to have the server inside your VPN, just in case from a security perspective
TrickyFox41 are you saying that if you add Task.init in the code it works, but when you are calling "clearml-task" it does not work? (In both cases editing the Args/overrides?)
RoughTiger69 I think this could work, a pseudo example:
` from time import sleep
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(...)
def the_last_step_before_external_stuff():
    print("doing some stuff")

@PipelineDecorator.pipeline()
def logic():
    the_last_step_before_external_stuff()
    # placeholder: replace with the result of your own ingestion check
    if not check_if_data_was_ingested_to_the_system:
        print("aborting ourselves")
        Task.current_task().abort()
        # we will not get here, the agent will make sure we are stopped
        sleep(60)
        # better safe than sorry
        exit(0) `
wdyt? (the...
Programmatically, before importing the package, set os.environ['TRAINS_CONFIG_FILE']='~/my_new_trains.conf'
BTW: What's the use case for doing so?
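The ordering is the important part of the suggestion above: the variable has to be set before the package is imported. A minimal sketch, with the import guarded so it also runs where the package is not installed; the path is the example value from the message.

```python
import os

# Must happen before the package is imported for the first time.
CONFIG_PATH = "~/my_new_trains.conf"
os.environ["TRAINS_CONFIG_FILE"] = CONFIG_PATH

try:
    # Import only after the environment variable is in place.
    from trains import Task
except ImportError:
    Task = None  # package not installed in this environment
```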
thanks for helping again
My pleasure :)