Hmm, can you test with the latest RC?
`pip install clearml==0.17.6rc1`
Also, how would one ensure immutability ?
I guess this is the big question, assuming we "know" a file was changed, this will invalidate all versions using it, this is exactly why the current implementation stores an immutable copy. Or are you suggesting a smarter "sync" function ?
Hi ElegantCoyote26
If there is, it will have to be using the docker-mode, but I do not think this is actually possible because this is not a feature of docker. It is possible to do on k8s, but that's a diff level of integration 🙂
EDIT:
FYI we do support k8s integration
Regarding the missing packages, you might want to test with: `force_analyze_entire_repo: false`
https://github.com/allegroai/trains/blob/c3fd3ed7c681e92e2fb2c3f6fd3493854803d781/docs/trains.conf#L162
Or if you have a full venv you like to store instead:
https://github.com/allegroai/trains/blob/c3fd3ed7c681e92e2fb2c3f6fd3493854803d781/docs/trains.conf#L169
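For reference, a rough sketch of what that part of `trains.conf` looks like (section layout per the linked file; the exact defaults there are authoritative, this is only an illustration):

```
sdk {
    development {
        # when false, only directly imported packages are listed
        # as requirements (instead of scanning the whole repo)
        force_analyze_entire_repo: false

        # alternatively, store a freeze of the full local venv
        # detect_with_pip_freeze: true
    }
}
```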
BTW:
Which package is missing?
Hi MortifiedCrow63 , thank you for pinging! (seriously greatly appreciated!)
See here:
https://github.com/googleapis/python-storage/releases/tag/v1.36.0
https://github.com/googleapis/python-storage/pull/374
Can you test with the latest release, see if the issue was fixed?
https://github.com/googleapis/python-storage/releases/tag/v1.41.0
Hi DeliciousBluewhale87
My theory is that the clearml-agent is configured correctly (which means you see it in the clearml-server). The issue (I think) is that the Task itself (running inside the docker) is missing the configuration. The way the agent passes the configuration into the docker is by mapping a temporary configuration file into the docker itself. If the agent is running bare-metal, this is quite straightforward. If the agent is running on k8s (or basically inside a docker) th...
SmarmyDolphin68 okay, what's happening is that the process exits before the actual data is sent (report_matplotlib_figure is an async call, and data is sent in the background).
Basically you should just wait for all the events to be flushed:
`task.flush(wait_for_uploads=True)`
That said, quickly testing it, it seems it does not wait properly (again, I think this is because we do not have a main Task here, I'll continue debugging)
In the meantime you can just do:
`sleep(3.0)`
And it wil...
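To make the workaround concrete, here's a minimal sketch (the helper name is made up; `task` is assumed to be a clearml/trains Task object):

```python
def finish_reporting(task):
    """Block until background report events (e.g. matplotlib figures) are uploaded.

    report_matplotlib_figure() is asynchronous, so without an explicit flush
    the process may exit before the data is actually sent.
    """
    task.flush(wait_for_uploads=True)
```

Call this right before the process exits, instead of relying on a `sleep()`.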
In the Task log itself it will list the versions of all the packages; basically I wonder if maybe it is using an older clearml version, and this is why I cannot reproduce it..
Thanks PompousBaldeagle18 !
Which software did you use to create the graphics?
Our designer, should I send your compliments 😉 ?
You should add which tech is being replaced by each product.
Good point! we are also missing a few products from the website, they will be there soon, hence the "soft launch"
Hi HarebrainedBear62
What's the type of data ?
This is odd... can you post the entire trigger code ?
also what's the clearml version?
Just to clarify, where do I run the second command?
Anywhere, just open a python console and import the offline task:

```python
from trains import Task
Task.import_offline_session('./my_task_aaa.zip')
```
Related, how to I specify in my code the cache_dir where the zip is saved?
This is the Trains cache folder, you can set it in the trains.conf file:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/docs/trains.conf#L24
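For reference, the relevant fragment of `trains.conf` looks roughly like this (the linked line is authoritative; the path value below is just the usual default):

```
sdk {
    storage {
        cache {
            # where the offline zip and other cached files are stored
            default_base_dir: "~/.trains/cache"
        }
    }
}
```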
I think so, when you are saying "clearml (bash script..." you basically mean, "put my code + packages + and run it" , correct ?
Hi EnchantingWorm39
Great question!
Regarding data management, I know the enterprise edition has full support for unstructured data, and we plan to soon have a solution for structured data as part of the open source (soon = hopefully in a month's time)
Regarding model serving, I know you can integrate with TFServing or Seldon with very little effort (usually the challenge is creating triggers etc., but in most cases this is custom code anyhow 🙂 )
I do not have experience with Cortex/B...
Hi, Is there a way to stop a clearml-agent from within an experiment?
It is possible but only in the paid tier (it needs backend support for that) 😞
My use case is: I'm in a spot instance marked for termination within 2 minutes by AWS
Basically what you are saying is you want the instance to spin down after the job is completed, correct?
Hi JollyChimpanzee19
I found this one:
https://clearml.slack.com/archives/CTK20V944/p1622134271306500
Just this week I have met at least two people combining ClearML with other tools (one with Kedro and the other with Luigi)
I would love to hear how/what is the use case 🙂
If I run the pipeline twice, changing only parameters or code of taskB, ...
I'll start at the end, yes you can clone a pipeline in the UI (or from code) and instruct it to reuse previous runs.
Let's take our A+B example, Let's say I have a pipeline P, and it executed A and then B (which relies on A's output...
actually no
hmm, are those packages correct ?
Can you send the console output of this entire session please ?
Hi @<1556450111259676672:profile|PlainSeaurchin97>
Is there any simple way to use `argparse` to pass a clearml task name? Do I need to call `args = task.connect(args)`?
noooo 🙂 there is no need to do that, the arguments are automatically detected
see for yourself:

```python
args = parse_args()
task = Task.init(task_name=args.task_name)
```
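A runnable sketch of that pattern (plain `argparse`; the `Task.init` call is shown as a comment since it needs a clearml setup, and the argument name is illustrative):

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--task-name", default="my experiment")
    return parser.parse_args(argv)


args = parse_args(["--task-name", "lr sweep"])
# task = Task.init(project_name="examples", task_name=args.task_name)
# arguments parsed via argparse are detected and logged automatically;
# no explicit task.connect(args) call is needed
print(args.task_name)  # → lr sweep
```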
Can you see the repo itself ? the commit id ?
Thanks ShakyJellyfish91 ! please let me know what you come up with, I would love for us to fix this issue.
(torchvision vs. cuda compatibility, will work on that),
The agent will pull the correct torch based on the cuda version that is available at runtime (or configured via the clearml.conf)
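The override lives in the agent section of `clearml.conf`; a sketch (the values below are examples, not recommendations):

```
agent {
    # leave unset/0 to auto-detect the CUDA version at runtime;
    # set explicitly to pin which torch build the agent resolves
    cuda_version: 10.1
    cudnn_version: 7.6
}
```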
WackyRabbit7 interesting! Are those "local" pipelines all part of the same code repository? do they need their own environment ?
What would be the easiest pipeline interface to run them locally? (I would if we could support this workflow, it seems you are not alone in this approach, and of course that you can always use them remotely, i.e. clone the pipeline and launch it on an agent)
Hi CooperativeFox72
But my docker image has all my code and all the packages it needs, so I don't understand why the agent needs to install all of those again? (edited)
So based on the docker file you previously posted, I think all your python packages are actually installed on the "appuser" and not as system packages.
Basically remove the "add user" part and the `--user` flag from the pip install.
For example:
```
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN ...
```
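The example above is truncated; a hedged sketch of what the full change looks like (the package list is illustrative):

```
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y python3-pip
# install as system packages: no "add user" step and no --user flag,
# so the agent can see and reuse them
RUN python3 -m pip install --no-cache-dir torch torchvision
```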
yes 🙂
But I think that when you get the internal_task_representation.execution.script you are basically already getting the API object (obviously with the correct version) so you can edit it in place and pass it too