No, Task.create is for creating an external Task, not for logging your own process.
That said you can probably override the git repo with env vars:
PompousParrot44 That should be very easy to do, basically a service-mode script that clones a base task and puts it into a queue:
This should more or less do what you need :)
```python
from trains import Task

task = Task.init('devops', 'daily train', task_type='controller')
# stop the local execution of this code and put it into the services queue,
# so we have a remote machine running it
task.execute_remotely('services')

while True:
    # clone the base task and enqueue the new copy for execution
    a_task = Task.clone(source_task='aaabb111')
    Task.enqueue(a_task, queue_name='default')  # queue name here is a placeholder
    # ... add scheduling / sleep logic here
```
the error for uploading is weird
wait, are you still getting this error?
when I run it on my laptop...
Then yes, you need to set the default_output_uri
in your laptop's clearml.conf (just like you set it on the k8s glue)
Make sense?
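For reference, a minimal sketch of the relevant clearml.conf entry (the azure path below is just a placeholder, adjust it to your own account/container):
```
sdk {
    development {
        # default location for uploading models / artifacts
        default_output_uri: "azure://<account-name>.blob.core.windows.net/<container-name>"
    }
}
```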
I assume the account name and key refer to the storage account credentials that you can get from Azure Storage Explorer?
correct
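For reference, they go under the azure storage section of clearml.conf, roughly like this (a sketch based on the default config template; all values are placeholders):
```
sdk {
    azure.storage {
        containers: [
            {
                account_name: "<storage-account-name>"
                account_key: "<storage-account-key>"
                container_name: "<container-name>"
            }
        ]
    }
}
```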
Wait, so the pipeline step only runs if the pre_execute callback returns True? It'll stop if it doesn't run?
Only if you have a callback function and that callback returns False will it skip the step (otherwise it will process it)
Another question, in the parents sequence in pipe.add_step, we have to pass in the name of the step right?
Correct, the step name is a unique identifier for the pipeline
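A rough sketch of both points, i.e. a pre_execute_callback and parents referenced by step name (the project / task names here are made up):
```python
from clearml import PipelineController

def pre_execute(pipeline, node, param_override):
    # return False to skip launching this step; returning True lets it run
    print('about to launch step:', node.name)
    return True

pipe = PipelineController(name='example pipeline', project='examples', version='1.0.0')
pipe.add_step(
    name='stage_data',
    base_task_project='examples',
    base_task_name='data task',
)
pipe.add_step(
    name='stage_train',
    base_task_project='examples',
    base_task_name='training task',
    parents=['stage_data'],  # parents are referenced by step name
    pre_execute_callback=pre_execute,
)
pipe.start()
```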
how would I access the artifact of a previous step within the pre ...
Try: task.update_requirements('\n'.join([".", ]))
Hi LovelyHamster1
Could you think of a toy code that reproduces this issue ?
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength 🙂
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?
Notice that we are using the same version:
https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2
The reason was that the previous version did not support TorchScript (similar to the error you reported)
My question is, why don't you use the "allegroai/clearml-serving-triton:latest" container ?
What we would like ideally, is a system where development, training, and deployment are almost one and the same thing, to reduce the lead time from development code to production models.
This is very aligned with the goals of ClearML 🙂
I would like to understand more about what is currently missing in ClearML, so we can better support this approach
my inexperience, since I have not used them a lot until recently. I can see how that is a better solution
I think I failed in explaining myself, I me...
Could it be that this is the callback that causes it?
ThickDove42 Windows conda python3.6 was exactly what I was using,
started the jupyter with:
"python -m jupyter notebook"
Then opened / created a new notebook, everything worked.
Tested on latest clearml 0.17.2
Maybe it's something with the path to the repo that breaks it? Because obviously the issue is it is looking in the wrong folder.
You can try calling task._update_repository()
I'm still trying to figure out how to reproduce it...
What's the output_uri you are passing?
And the OS / Python version?
And if I create myself a Pro account
Then you have the UI and implementation of both AWS & GCP autoscalers, am I missing something?
in my repo I maintain a bash script to setup a separate python env.
Hmm interesting, now I have to wonder what the difference is? Meaning, why doesn't the agent build a similar one based on the requirements?
However, can you see the inconsistency between the key and the name there:
Yes that was my point on "uniqueness" ... 😞
The model key must be unique, and it is based on the filename itself (the context is known, since it is inside the Task). The Model Name, however, is an entity, so it should have the Task name as part of the entity name. Does that make sense?
@<1533619716533260288:profile|SmallPigeon24> , a failed task should not actually be reused (i.e. cached). Are you saying a failed Task is being reused? Or are you saying that you want to "invalidate" the cache in the execution, but still leave the Task as completed?
It will always set its own environment, either with static analysis or with "pip freeze" / "conda freeze"
It needs to log the exact setup that was actually installed.
When you later launch it on a remote machine, it can either use this to recreate the environment (using pip or conda), or you can clear the entire section, where it will fall back to "requirements.txt"
Any reason for specifically using the "environment.yaml" ?
Nice! I'll see if we can have better error handling for it, or solve it altogether 🙂
python version to be used and conda will install it
clearml does that automatically (albeit it is not shown in the UI, which should be fixed)
But I am considering just failing the task.
This will of course work, just raise exception in the Task itself, and protect the call from the pipeline logic function with try/except
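Something along these lines, assuming a decorator-based pipeline (the step logic and names here are made up):
```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['result'])
def train_step(data):
    if not data:
        # raising here fails the component Task itself
        raise ValueError("no data, failing this step")
    return len(data)

@PipelineDecorator.pipeline(name='example', project='examples', version='1.0.0')
def pipeline_logic():
    try:
        # protect the call so the pipeline logic keeps running even if the step fails
        result = train_step([])
    except Exception:
        result = None
    return result
```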
Regarding the second option, try to nullify the hash on the Component Task:
```python
# running inside the Task component here;
# if we do not want anyone to reuse (cache) this run, clear the pipeline job hash
Task.current_task()._set_runtime_properties({"pipeline_job_hash": None})
```
ThickDove42 looking at the code, I suspect it fails interacting with the actual jupyter server (that is running on the same machine, but still).
Any chance you have a firewall on the Windows machine ?
but this gives me an idea, I will try to check if the notebook is considered as trusted, perhaps it isn't and that causes issues?
This is exactly what I was thinking (communication with the jupyter service is done over http, to localhost, sometimes AV/Firewall software will block it, false-positive detection I assume)
Oh I see, the pipeline controller itself (not the components) is the one with the repo
To fix that, add the following at the top of the script:
```python
from clearml import Task
from clearml.automation.controller import PipelineDecorator

# force storing the standalone script instead of linking the repository
Task.force_store_standalone_script()

@PipelineDecorator.pipeline(...)
```
That should do the trick
Hi SkinnyPanda43
Do you mean the clearml-agent or the clearml python package (a.k.a. the auto package detection)?
Need - in my CI, the URL used is https but I need the ssh URL to be used. I see that we can pass repo to Task.create but not to Task.init
Are you cloning an existing Task, or creating a new one ?
The agent works when I am running it from a virtual environment, but it gets stuck in the same place every time when I use Docker
Can you please provide a log? I'm not sure what you mean by "stuck"