That's the theory, I still see it is not there
Hi SubstantialElk6
Generically, we would 'export' the preprocessing steps, setup an inference server, and then pipe data through the above to get results. How should we achieve this with ClearML?
We are working on integrating the OpenVINO serving and Nvidia Triton serving engines into ClearML (they will both be available soon)
Automated retraining
In cases of data drift, retraining of models would be necessary. Generically, we pass newly labelled data to fine...
Hi RipeGoose2
Can you try with the latest from git?
pip install -U git+
there is probably some way to make an S3 path open up in the browser by default
You should have a pop-up asking for credentials ...
Could you check that if you add the credentials in the profile page it works ?
RipeGoose2 yes, the UI cannot embed the html yet, but if you go click on the link itself it will open the html in a new tab.
Could you verify it works ?
RipeGoose2 you mean to have the preview html on S3 work as expected (i.e. click on it add credentials , open in a new tab) ?
Oh I do not think this is possible, this is really deep in a background thread.
That said, we can sample the artifacts and re-register the html as a debug media:
url = Task.current_task().artifacts['notebook preview'].url
Task.current_task().get_logger().report_media('notebook', 'notebook', iteration=0, url=url)
Once the html is uploaded, it will keep updating on the same link so no need to keep registering the "debug media". wdyt?
RipeGoose2 yes that will work
That said, we should probably fix the S3 credentials popup
SweetGiraffe8
That might be it, could you test with the Demo server ?
Should work in all cases: plotly/matplotlib/scalar reports
The bug was fixed
RipeGoose2
The HTML file is not standalone and has some dependencies that require networking..
Really? I thought that when jupyter converts its own notebook it packages everything into a single html, no?
Hi RipeGoose2
Just to clarify, the issue with the html stuck in cache is a UI thing: basically the webapp needs to tell the browser not to cache the artifacts, it has nothing to do with how the artifacts are created.
Regardless, we love improvements, so feel free to mess around with the code and PR once you get something useful
Specifically this is where the html conversion happens
https://github.com/allegroai/clearml/blob/9d108d855f784e1fe7f5691d3b7bf3be64576218/clearml/backend_in...
BoredHedgehog47 if you are running it on K8s, then the setup script is running before everything else, even before an agent appears on the machine, unfortunately this means the output is not logged yet, hence the missing console lines (I think the next version of the glue will fix that)
In order to test you can do:
export TEST_ME
then inside your code you will be able to see it:
os.environ['TEST_ME']
Make sense?
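A minimal sketch of checking that variable from inside the task code (the variable name `TEST_ME` comes from the messages above; the fallback string is mine):

```python
import os

# The setup script / agent environment would have run: export TEST_ME=<something>
# Inside the task code the variable is then visible through os.environ:
value = os.environ.get('TEST_ME', '<not set>')
print('TEST_ME =', value)
```

If the setup script ran before the agent started, the variable should already be present in the task process environment.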
The fact the html file does not refresh in the browser even though there is a new copy of it uploaded.
The problem comes from ClearML, which thinks it starts from iteration 420 and then adds the iteration number (421) again, so it starts logging from 420+421=841
JitteryCoyote63 Is this the issue ?
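To illustrate the double-counting described above (numbers taken from the message; the variable names are mine):

```python
# The task resumes with the last iteration already stored on the server,
# then the local counter is added on top of it instead of replacing it.
last_reported = 420      # iteration value stored from the previous run
local_iteration = 421    # iteration counter inside the resumed run
double_counted = last_reported + local_iteration
print(double_counted)    # 841
```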
Hi CooperativeFox72 trains 0.16 is out, did it solve this issue? (btw: you can upgrade trains to 0.16 without upgrading the trains-server)
Seems like everything is in order. Can you curl to the API/web/files server?
I don't have the compose file, or at least can't seem to find it in /opt
you can manually take down all dockers with:
docker ps
then docker stop <container id>
for each container id
Hi @<1739818374189289472:profile|SourSpider22>
could you send the entire console log? maybe there is a hint somewhere there?
(basically what happens after that is the agent is supposed to be running from inside the container, but maybe it cannot access the clearml-server for some reason)
One more question, in the second log, trains agent is configured with Conda, on the first it is configured with pip, or at least this is what it looks like, can you confirm?
No, an old experiment changed, nothing was rerun
ohh, that is odd. I think the max iteration value is stored in the DB, which makes it odd that it changed after an update.
BTW: just making sure, could it be these Tasks were imported ? (i.e. offline execution + import)
You can however pass a specific Task ID and it will reuse it "reuse_last_task_id=aabb11", would that help?
Hmm I'm sorry, it might be "continue_last_task", can you try:
Task.init(..., continue_last_task="aabb11")
SmarmyDolphin68 , All looks okay to me...
Could you verify you still get the plot on debug samples as image with the latest trains RC:
pip install trains==0.16.4rc0
could one also limit the number of CPU cores available?
If you are running in docker mode you can add:
--cpus=<value>
see ref here: https://docs.docker.com/config/containers/resource_constraints/
Just add it to extra_docker_arguments:
https://github.com/allegroai/clearml-agent/blob/2cb452b1c21191f17635bcb6222fa8bfd82afe29/docs/clearml.conf#L142
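For reference, a minimal sketch of what that could look like in clearml.conf (the `--cpus=2` value here is just an example, not a recommendation):

```
agent {
    # passed verbatim to docker run, e.g. to limit the container to 2 CPU cores
    extra_docker_arguments: ["--cpus=2"]
}
```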
@<1547390422483996672:profile|StaleElk72> when you go to the dataset in the UI, and press on "Full Details" then go to the Artifacts tab, what is the link you see there?
Hi RoughTiger69
I'm actually not sure about DVC support either; see these links: syncing and registering create a link, not an immutable copy.
And the sync between local and remote seems to download the remote copy and compare it to the local one.
Basically, adding a remote source does not mean DVC will create an immutable copy of the content; it's just a pointer to a bucket (feel free to correct me if I misunderstood their capability)
https://dvc.org/doc/command-reference/...
How so? Installing a local package should work, what am I missing?