I want pipeline / task dispatch to be reported and monitored outside of clearml. For example, I might want to log the dispatch event in some non-clearml system and then monitor the health of the pipeline and alert if it is pending for too long.
Hmm interesting, so like a callback?!
I'm thinking a callback is executed after the pipeline is sent, but once the callback is done, the pipeline process exits?
Does that make sense ?
I might want to dispatch other jobs from within the same p...
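A minimal sketch of the "alert if it is pending for too long" idea, assuming you have some way to query the pipeline status from outside. The get_status and on_timeout callables and the PENDING_STATES names are hypothetical placeholders, not ClearML APIs:

```python
import time
from typing import Callable

# Assumption: status strings that mean "dispatched but not yet running".
PENDING_STATES = {"created", "queued", "pending"}


def wait_until_started(
    get_status: Callable[[], str],
    on_timeout: Callable[[float], None],
    max_pending_sec: float,
    poll_sec: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until the job leaves the pending states.

    Returns True once the job has started. If it stays pending longer than
    max_pending_sec, calls on_timeout(seconds_waited) and returns False.
    """
    start = clock()
    while get_status() in PENDING_STATES:
        waited = clock() - start
        if waited > max_pending_sec:
            on_timeout(waited)  # e.g. send an alert to an external system
            return False
        sleep(poll_sec)
    return True
```

The clock and sleep parameters are injectable only so the loop can be exercised without real waiting; in practice get_status would wrap whatever status query your monitoring system uses.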
maybe worth updating the main README.md on GitHub.. if someone tries to follow the instructions there it breaks
Hmm I thought we already did. Yes, you are absolutely correct, I'll make sure we do
Hi ShinyWhale52
Every execution of the pipeline (by definition) will create a new job based on the pipeline steps
This is the reason you see all the steps twice (the default assumption is that you wish to re-run the step, as this is part of the processing workflow, e.g. training a model).
the model has been overwritten. I guess this is due to this instruction:
This is because you are storing it locally to the same path, it just reflects the fact you just overwrote your model.
To create a...
FiercePenguin76 in the Tasks execution tab, under "script path", change to "-m filprofiler run catboost_train.py".
It should work (assuming the "catboost_train.py" is in the working directory).
GiganticTurtle0 what's the Dataset Task status?
I did change the
instead of 8080?
So this is the issue
SubstantialElk6
Regarding cloning the executed Task:
In the pip requirements syntax, "@" is a hint that tells pip where to find the package if it is not preinstalled.
Usually when you find "@ /tmp/folder" it means the package was preinstalled (usually preinstalled in the docker).
What is the exact scenario that caused it to appear? (This was always the case, before v1 as well.)
For example, the zipp package is installed from PyPI by default and not from a local temp file.
Your fix b...
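To make the "@" hint concrete, here is a small sketch that splits a requirement line into the package spec and the location after "@", and flags entries that point at a local path. The sample lines in the usage note are made up for illustration, and this is plain string handling, not how pip or ClearML parse requirements:

```python
from typing import NamedTuple, Optional


class Requirement(NamedTuple):
    spec: str                 # e.g. "zipp" or "zipp==3.4.0"
    location: Optional[str]   # text after " @ ", or None if absent


def split_direct_reference(line: str) -> Requirement:
    """Split 'name @ location' into its two parts."""
    spec, sep, location = line.partition(" @ ")
    if not sep:
        return Requirement(line.strip(), None)
    return Requirement(spec.strip(), location.strip())


def is_local_reference(req: Requirement) -> bool:
    """True when the location is a local file URL or an absolute path."""
    if req.location is None:
        return False
    return req.location.startswith("file://") or req.location.startswith("/")
```

For instance, split_direct_reference("zipp @ file:///tmp/build/zipp") yields a requirement that is_local_reference reports as local, while a plain "zipp==3.4.0" line has no location at all.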
Thanks, yes you are correct the color is derived from the series name, so I guess the issue is the name+Id is not kept in full screen
So why is it trying to upload to "//:8081/files_server:" ?
What do you have in the trains.conf on the machine running the experiment ?
where people can do @'s for experiments/projects/tasks and even comparisons ...
ohhh I like that! For me this leads directly to Slack integration.
I think my main question is, "is the discussion ephemeral?" In other words, is this an ongoing discussion that later no one will care about, or are we creating some "knowledge base" that we want to later share?
Also, by "address bar at the top", I assume you mean the address URL, right?
yes... apologies for the phrasing, it was w...
I think I found something relating to the issue on the subprocess not logging. Let me check if we can share something quickly
In fact, as I assume, we need to write our custom HyperParameterOptimizer, am I right?
Yes exactly! It should be very easy.
Just inherit from RandomSearch and override create_job:
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/clearml/automation/optimization.py#L1043
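The inherit-and-override pattern could look roughly like this. To keep the sketch runnable without a ClearML installation, RandomSearchStandIn is a hypothetical stand-in and not the real clearml.automation.RandomSearch; it assumes create_job takes no arguments and returns a job or None, so check the linked source for the actual signature and return type:

```python
import random
from typing import Dict, List, Optional


class RandomSearchStandIn:
    """Hypothetical stand-in for RandomSearch (NOT the real ClearML class)."""

    def __init__(self, space: Dict[str, List[float]], seed: int = 0) -> None:
        self._space = space
        self._rng = random.Random(seed)

    def create_job(self) -> Optional[Dict[str, float]]:
        # Sample one value per hyperparameter. The real class would return
        # a job object rather than a plain dict.
        return {name: self._rng.choice(values) for name, values in self._space.items()}


class CappedSearch(RandomSearchStandIn):
    """Custom strategy: reuse the parent's sampling but cap the job count."""

    def __init__(self, space: Dict[str, List[float]], max_jobs: int, seed: int = 0) -> None:
        super().__init__(space, seed)
        self._max_jobs = max_jobs
        self._created = 0

    def create_job(self) -> Optional[Dict[str, float]]:
        # Assumption: returning None means "no more jobs to create".
        if self._created >= self._max_jobs:
            return None
        self._created += 1
        return super().create_job()
```

With the real class, the subclass would then be passed to the optimizer in place of RandomSearch; the custom logic lives entirely in the overridden create_job.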
AttractiveCockroach17 could it be Hydra actually kills these processes?
(I'm trying to figure out if we can fix something with the hydra integration so that it marks them as aborted)
DeliciousBluewhale87
node.base_task_id is the base task, which will always be in draft mode. Instead we should use node.executed, which references the currently executed node.
YES, maybe we should add that into the example, so it is clearer ? WDYT?
load_model will get a link to a previously registered URL (i.e. it searches for a model pointing to the specific URL, and if it finds one, it will return the Model object)
HealthyStarfish45
No, it should work 🙂
Hi VexedCat68
Are we talking YouTube videos? Docs? Courses?
what format should I specify it
requirements.txt format e.g. ["package >= 1.2.3"]
Would this enforce that package on various components
This is a per-component control, so you can have different packages / containers based on the component
Would it then no longer capture import statements?
This is replacing the auto detected packages, but obviously this fails to detect your internal repo package, which is the main issue here.
How is "internal package" installed, in o...
Sorry @<1798525199860109312:profile|IntriguedGoldfish14> just noticed your reply
Yes two inference container, running simultaneously on the cluster. As you said, each one with its own environment (assuming here that the requirements of the models collide)
Makes sense
Hi @<1538330703932952576:profile|ThickSeaurchin47>
Specifically I’m getting the error “could not access credentials”
Put your minio credentials here:
None
Yeah the hack would work but I’m trying to use it from the command line to put in Airflow. I’ll post on GH
Oh, then set TMP/TMPDIR environment variable, it should have the same effect
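As an illustration of why this can work: Python's standard tempfile module picks its default directory from the TMPDIR, TEMP and TMP environment variables, in that order. The sketch below shows the effect from inside Python; use_temp_dir is a hypothetical helper name, and from the command line you would simply export the variable before launching the tool:

```python
import os
import tempfile


def use_temp_dir(path: str) -> str:
    """Point Python's default temp directory at `path` via TMPDIR."""
    os.environ["TMPDIR"] = path
    # tempfile caches its choice; clearing the cache forces it to
    # re-read the environment on the next call.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

The directory must already exist and be writable, otherwise tempfile falls back to the next candidate location.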
Can you try to run the example code, see if that works for you?
If the problem persists (i.e. trains failing to detect packages), please open a GitHub Issue so the bug will not get lost 🙂
create a new file, copy paste to the new file these lines, and run it inside vscode, what are you getting in the console?
from clearml import Task

Task.add_requirements("tensorflow")
task = Task.init(project_name="debug", task_name="requirements")
print("done")
Hi CrookedAlligator14
Hi, I just started using clearml, and it is amazing!
Thank you! 😍
When I enqueue the task, the venv is set up and starts to install all the packages from the requirements.txt file, but at the end I get the following in the console:
Can you try with the latest agent? We improved the support for pytorch (they now have a proper PyPI-compatible repo). Can you see if that solves it?
pip3 install clearml-agent==1.5.0rc0
Ohh then you do docker sibling:
Basically you map the docker socket into the agent's docker, which lets the agent launch another docker on the host machine.
You can see an example here:
https://github.com/allegroai/clearml-server/blob/6434f1028e6e7fd2479b22fe553f7bca3f8a716f/docker/docker-compose.yml#L144
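The socket mapping described above can be sketched as a docker run command line, built here in Python. The socket path is Docker's usual default on Linux; sibling_docker_cmd and the image name in the usage note are hypothetical and not taken from the linked compose file:

```python
from typing import List

# Default Docker daemon socket on Linux hosts.
DOCKER_SOCKET = "/var/run/docker.sock"


def sibling_docker_cmd(image: str, *args: str) -> List[str]:
    """Build a `docker run` command that mounts the host's docker socket.

    A container started this way talks to the host's Docker daemon, so any
    container it launches runs next to it on the host (a "sibling"),
    not nested inside it.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{DOCKER_SOCKET}:{DOCKER_SOCKET}",
        image,
        *args,
    ]
```

For example, sibling_docker_cmd("my-agent-image") returns the argument list you could hand to subprocess.run; the same volume line is what a compose file would declare under volumes.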
@<1523701868901961728:profile|ReassuredTiger98> thank you so much for testing it!