Hi @<1524922424720625664:profile|TartLeopard58>
Yes, this is the default; it is designed to serve multiple models and scale horizontally.
Hi @<1547028116780617728:profile|TimelyRabbit96>
Start with the simple scikit learn example
https://github.com/allegroai/clearml-serving/tree/main/examples/sklearn
The pipeline example is more complicated; it needs the base endpoints, so start simple 🙂
It should print to the console:
print(task.get_output_log_web_page())
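For reference, a minimal sketch of where that call fits (project/task names below are just placeholders):

from clearml import Task

# initialize (or attach to) a task, then print the direct link to its page in the web UI
task = Task.init(project_name="examples", task_name="my experiment")
print(task.get_output_log_web_page())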
The use case I have is to allow people from my team to run their workloads on a set of servers without stepping over each other.
So does that mean CPU-only workloads?
Also, are we worried about fairness? (i.e. someone "taking" all the CPU for themselves)
ElegantCoyote26 point me to where Keras stores the data 🙂
If in the process of integration you had to add a logger/callback to your Keras code, that is the equivalent of using the TB.
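A minimal sketch of what that looks like, assuming a standard tf.keras setup (the model and data here are toy placeholders): the TensorBoard callback is the kind of logger/callback meant above, and ClearML hooks the TB writers so the reported scalars show up on the task.

import numpy as np
import tensorflow as tf
from clearml import Task

task = Task.init(project_name="examples", task_name="keras with TB")

# toy model and data, just so there is something to fit
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")
x, y = np.random.rand(64, 4), np.random.rand(64, 1)

# the TensorBoard callback is the "logger/callback" in question;
# ClearML patches the TB writers, so these scalars appear under the task's Scalars tab
model.fit(x, y, epochs=3, callbacks=[tf.keras.callbacks.TensorBoard(log_dir="./tb_logs")])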
Funny enough, I'm running into a new issue now.
Sorry, my bad, I should have known 🙂 yes, it probably should be packages=["clearml==1.1.6"]
BTW: do you have any imports inside the pipeline function itself? If you do not, then there is no need to pass "packages" at all; it will just add clearml.
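A short sketch of how that looks with the decorator-based pipeline (the names and versions here are illustrative, not taken from the thread):

from clearml import PipelineDecorator

# packages listed here are installed for the step; if the step body has no imports
# of its own, the argument can be omitted and only clearml is added
@PipelineDecorator.component(return_values=["result"], packages=["clearml==1.1.6"])
def step_one(x):
    import numpy as np  # an import used inside the step body
    return float(np.sqrt(x))

@PipelineDecorator.pipeline(name="toy pipeline", project="examples", version="0.1")
def pipeline_logic():
    print(step_one(16))

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    pipeline_logic()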
For now I have come to the conclusion that keeping a requirements.txt and making clearml parse it is the better approach.
Maybe we could just have that as another option?
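If it helps, one way to point clearml at an explicit requirements file is Task.add_requirements; this is only a sketch and assumes the call accepts a path to a requirements.txt:

from clearml import Task

# must be called before Task.init; the listed packages end up in the task's
# "Installed Packages" section instead of relying purely on auto-detection
Task.add_requirements("requirements.txt")
task = Task.init(project_name="examples", task_name="requirements from file")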
Could it be that it checks the root target folder and you do not have permissions there, only on the subfolders?
I guess it won't, due to the nature of services?
Correct, the k8s glue works differently. That said, I would actually use the helm chart to spin up a pod with the agent in services mode and venv mode.
Hi ConvolutedBee40
If we deploy a task to clearml-server, will it automatically scale?
The way it works is with agents and the agent glue, basically using k8s as a resource allocator and the clearml agent as the orchestrator. Did that answer the question?
Before exposing our IP to the world, I suggest going over the security advisory in the docs: None
As a general note, do not expose your server; the open source version is not designed for it. Just put it inside your VPN and it will be fine.
No -- that section is blank.
This is the main issue; it should be filled with the requirements being auto-detected.
The entire script was executed from within VSCode, and the Task was created, but it was not prefilled with anything?
Just making sure, you called Task.init inside your code?
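For reference, a minimal sketch of the call in question (project/task names are placeholders):

from clearml import Task

# Task.init near the top of the executed script; once it runs, the Task is created
# and its "Installed Packages" section should be auto-filled from the detected imports
task = Task.init(project_name="examples", task_name="vscode run")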
It will store the entire content of the file, then you can edit it in the UI, and when running remotely it will return a new local copy of the file (based on the data in the UI) for you to read.
(with older clearml versions though…).
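The behavior described matches task.connect_configuration; a minimal sketch, assuming a local config file (the file name is a placeholder):

from clearml import Task

task = Task.init(project_name="examples", task_name="config example")

# locally: stores the file's full content on the task (editable in the UI);
# remotely: returns a path to a fresh local copy generated from whatever is in the UI
config_path = task.connect_configuration("config.yaml", name="my config")
with open(config_path) as f:
    config_text = f.read()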
Yes, we added a content-type header for the files when uploading to S3 (so it is easier for users to serve them back). But it seems the Python 3.5 casting from Path to str breaks the mimetype call...
It seems the code is trying to access an S3 bucket, could that be the case? PanickyMoth78 any chance you can post the full execution log? (Feel free to DM so it won't end up being public)
StorageManager
Oh, it has no remove 🙂 StorageHelper.delete is the only way
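A rough sketch of what that could look like; StorageHelper is an internal helper, so treat the exact import path and calls here as assumptions:

from clearml.storage.helper import StorageHelper

# get a helper for the destination URI and delete the remote object
# (StorageManager itself exposes no remove/delete)
uri = "s3://my-bucket/path/to/artifact.zip"  # illustrative URI
helper = StorageHelper.get(uri)
helper.delete(uri)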
Using the dataset.create command and the subsequent add_files and upload commands, I can see the upload action as an experiment, but the data is not shown on the Datasets web page.
ScantCrab97 it might be that you need the latest clearml package installed on the client end (as well as the new server with the UI)
What is your clearml package version?
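For completeness, the flow being described is roughly this (project and paths are placeholders; note the finalize step, which closes the dataset version so it shows up as completed):

from clearml import Dataset

ds = Dataset.create(dataset_name="my dataset", dataset_project="examples/datasets")
ds.add_files(path="./data")   # stage the local files
ds.upload()                   # push them to the storage target
ds.finalize()                 # close this dataset version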
I mean just add the toy tqdm loop somewhere just before starting the Lightning train function. I just want to verify that it works, or maybe there is something in the specific setup happening in real-time that changes it.
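Something like this is all that is meant, a throwaway loop right before the real training starts (the trainer call at the end is an assumed name):

import time
from tqdm import tqdm

# toy progress bar, only to verify tqdm renders correctly in the console/captured log
for _ in tqdm(range(20), desc="sanity check"):
    time.sleep(0.1)

# trainer.fit(model, datamodule)  # <- the actual Lightning training would start here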
My question is: what happens if I launch multiple doit commands in parallel that create new Tasks?
Should work out of the box.
I would like to confirm that current_task ...
Correct.
Hi LivelyLion31 I missed your S3 question, apologies. What did you guys end up doing?
BTW, you could always upload the entire TB log folder as an artifact, it's simple:
task.upload_artifact('tensorboard', './tblogsfolder')
HandsomeCrow5 if you want to edit the Task object you can just use:
internal_task_representation = task.data
internal_task_representation.execution.script = ...
task._edit(execution=internal_task_representation.execution)
This will make sure you do not need to worry about the API version etc.; the Task object will take care of it.
BTW: it seems a few more people wanted this ability; maybe we should add a proper .edit method to Task. Thoughts?
I can read them programmatically using tensorboard and then log them using the clearml logger,
StaleButterfly40 this will be a great script to put somewhere (I'm sure you are not the only one with this problem). Maybe put it as a GitHub issue? wdyt?
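A sketch of what such a script could look like, reading existing TB event files with the EventAccumulator and re-reporting the scalars through the ClearML logger (paths and titles are placeholders):

from clearml import Task
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

task = Task.init(project_name="examples", task_name="import old TB scalars")
logger = task.get_logger()

# load an existing TensorBoard log directory and replay every scalar series
ea = EventAccumulator("./old_tb_logs")
ea.Reload()
for tag in ea.Tags().get("scalars", []):
    for event in ea.Scalars(tag):
        logger.report_scalar(title=tag, series=tag, value=event.value, iteration=event.step)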
In any case, do you have any suggestions on how I could at least hack tqdm to make it behave? Thanks
I think I know what the issue is: it seems tqdm is using an ANSI escape sequence rather than a plain CR; that is the 1b 5b 41 (ESC [ A, cursor-up) sequence I see in the binary log.
Let me see if I can hack something for you to test 🙂
Actually, if you can send the full log of the Task, that would be great.
One additional question: if you import clearml after you import torch, does it work?