PompousParrot44 with pleasure. If during your search for a solution you come across something that solves it, and might integrate into the agent, do not hesitate to suggest it :)
EnviousPanda91 so which frameworks are being missed? Is this a request to support a new framework, or are you saying there is a bug somewhere?
No worries, just found it. Thanks!
I'll make sure to follow up on the GitHub issue for better visibility :)
Hmm, that is odd; it seems to have missed the fact that this is a Jupyter notebook.
What's the clearml version you are using?
Hi @<1671689437261598720:profile|FranticWhale40>
Are you positive the Triton container finished syncing?
Could you provide the docker logs (both the serving and the triton containers)?
What is the clearml-serving version you are using?
Could you add a print in the "preprocess" function, just to validate you are getting the correct model version?
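A minimal sketch of what that debug print could look like, assuming the clearml-serving example layout where preprocess.py defines a Preprocess class (the exact method signature may differ between clearml-serving versions):

```python
from typing import Any, Callable, Optional


class Preprocess(object):
    """Sketch of a clearml-serving preprocess.py with a debug print."""

    def __init__(self):
        # clearml-serving populates model/endpoint details on the instance;
        # left as None here since this is only an illustration
        self.model_endpoint = None

    def preprocess(self, body: dict, state: dict,
                   collect_custom_statistics_fn: Optional[Callable] = None) -> Any:
        # Debug print: confirm which model endpoint/version handles the request
        print("preprocess called, endpoint:", self.model_endpoint)
        return body
```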
Give me a minute, I'll check something
CurvedHedgehog15 the agent has two modes of operation:
- single script file (or Jupyter notebook), where the Task stores the entire file on the Task itself
- multiple files, which is only supported if you are working inside a git repository (basically the Task stores a reference to the git repository and the agent pulls it from the git repo)
Seems you are missing the git repo, could that be?
I also have task_override that adds a version which changes each run
It's just a tag, so no real difference
HugeArcticwolf77 oh no, I think you are correct :(
Do you want to quickly PR a fix?
The image is
allegroai/clearml:1.0.2-108
Yep, that makes sense, seems like a backwards compatibility issue
Regarding this, does this work if the task is not running locally and is being executed by the trains agent?
This line, "if task.running_locally():", makes sure that when the code is executed by the agent it will not reset its own requirements (the agent updates the requirements/installed_packages after it installs them from the requirements.txt, so that later you know exactly which packages/versions were used).
Sounds great! I really like that approach, thanks GrotesqueDog77 !
But I believe it would be harder for our team to detect and respond to failures in the event handler functions if they were placed there, because it is unclear how we could use our existing systems and practices to do that.
Okay, I think this is the issue: handler functions are not "supposed" to fail; they are supposed to trigger Tasks, and those Tasks can fail.
e.g.:
Model Tag Trigger -> handler function creates a Task -> Task does something, like build container, trigger CI/CD etc -> Task...
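The flow above could be sketched roughly like this, assuming the clearml.automation.TriggerScheduler API (method and argument names should be verified against your clearml version; retrain_task_name, the project, queue, and tag names are hypothetical):

```python
def retrain_task_name(model_id: str) -> str:
    # Hypothetical helper: readable name for the Task the handler creates
    return "retrain-{}".format(model_id)


def on_model_tagged(model_id):
    # The handler does no heavy work itself -- it only creates/enqueues a Task;
    # that Task (build container, trigger CI/CD, etc.) is the thing that may fail
    from clearml import Task
    task = Task.create(project_name="retraining",
                       task_name=retrain_task_name(model_id))
    Task.enqueue(task, queue_name="default")


if __name__ == "__main__":
    from clearml.automation import TriggerScheduler

    trigger = TriggerScheduler(pooling_frequency_minutes=3)
    trigger.add_model_trigger(schedule_function=on_model_tagged,
                              trigger_on_tags=["released"])
    trigger.start()  # blocks; start_remotely() runs it as a service instead
```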
It's more or less here:
https://github.com/allegroai/clearml-session/blob/0dc094c03dabc64b28dcc672b24644ec4151b64b/clearml_session/interactive_session_task.py#L431
I think that just replacing the package would be enough (I mean you could choose hub/lab, which makes sense to me)
Is it ClearML best practice to create a draft pipeline, so that the task is on the server and can be cloned, modified and executed at any time?
Well it is, we just assume that you executed the pipeline somewhere (i.e. made sure it works) :)
Correction:
What you are actually looking for (and I will make sure we have it in the docs) is: pipeline.start(queue=None)
It will just leave it as a draft, so you can manually enqueue / clone it :)
RipeGoose2 models are automatically registered, i.e. added to the model artifactory, but the entry only points to where the files are stored.
Only if you are passing the output_uri argument to Task.init will the files actually be uploaded.
If you want to disable this behavior you can pass: Task.init(..., auto_connect_frameworks={'pytorch': False})
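A rough sketch of the two options side by side; the project name and s3 bucket are placeholders, and the init_kwargs helper is just for illustration:

```python
def init_kwargs(upload_models: bool) -> dict:
    # With output_uri set, auto-registered model files are uploaded there;
    # without it, the model entry only points at the local file path.
    kwargs = {"project_name": "examples", "task_name": "train"}
    if upload_models:
        kwargs["output_uri"] = "s3://my-bucket/models"  # placeholder destination
    else:
        # disable automatic pytorch model registration entirely
        kwargs["auto_connect_frameworks"] = {"pytorch": False}
    return kwargs


if __name__ == "__main__":
    from clearml import Task
    task = Task.init(**init_kwargs(upload_models=True))
```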
It's a good abstraction for monitoring the state of the platform and for callbacks, if that is what you are after.
If you just need "simple" cron behavior, then you can always just loop/sleep :)
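For the loop/sleep option, a minimal sketch (the interval and helper names are made up here):

```python
import time


def seconds_until_next(interval_sec: int, now: float) -> float:
    # Time remaining until the next interval boundary
    # (e.g. interval_sec=900 fires on every 15-minute mark)
    return interval_sec - (now % interval_sec)


def cron_loop(job, interval_sec=900, iterations=None):
    # Minimal "cron": sleep until the next boundary, run the job, repeat.
    # iterations=None loops forever; pass a number to bound it (e.g. for tests).
    done = 0
    while iterations is None or done < iterations:
        time.sleep(seconds_until_next(interval_sec, time.time()))
        job()
        done += 1
```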
Hi LazyLeopard18 ,
See details below, are you using the win10 docker-compose yaml?
https://github.com/allegroai/trains-server/blob/master/docs/install_win.md
When we enqueue the task using the web-ui we get the above error
ShallowGoldfish8 I think I understand the issue.
Basically the problem is: task.connect(model_params, 'model_params')
Since this is a nested dict:
model_params = {
    "loss_function": "Logloss",
    "eval_metric": "AUC",
    "class_weights": {0: 1, 1: 60},
    "learning_rate": 0.1
}
The class_weights keys are stored as String keys, but catboost expects "int" keys, hence it fails.
One op...
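One possible workaround until then: convert the string keys back to int after the connect round-trip before handing the dict to catboost (the restore_int_keys helper is hypothetical, not part of ClearML):

```python
def restore_int_keys(weights):
    # task.connect() round-trips nested dict keys as strings ("0", "1");
    # catboost's class_weights expects int keys, so convert them back.
    return {int(k): v for k, v in weights.items()}


model_params = {
    "loss_function": "Logloss",
    "eval_metric": "AUC",
    "class_weights": {0: 1, 1: 60},
    "learning_rate": 0.1,
}
# after task.connect(model_params, 'model_params') runs remotely, apply:
# model_params["class_weights"] = restore_int_keys(model_params["class_weights"])
```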
EnviousStarfish54 we just fixed an issue that relates to "installed packages" on Windows.
An RC is due to be released in the upcoming days; I'll keep you posted
Yes, the webserver doesn't know where the API server is; it will access /api, and then the nginx running the webapp will do the routing (reverse proxy).
I think that for some reason it is failing to do that (actually similarly to the stackoverflow answer you linked)
The agent installs the "Installed Packages" section of the Task (think of it as a requirements.txt).
And again, what do you have there? Is it the outcome of Task.init auto-populating it?
Notice both need to be str
btw, if you need the entire folder, just use StorageManager.upload_folder
I would like to use ClearML together with Hydra multirun sweeps, but I'm having some difficulties with the configuration of tasks.
Hi SoreHorse95
In theory that should work out of the box. Why do you need to manually create a Task (as opposed to just having the Task.init call inside the code)?
Hi @<1523701260895653888:profile|QuaintJellyfish58>
You mean some "daemon service" aborting Tasks that do not end after X hours? Or is it based on CPU/GPU utilization?
And other question is clearml-serving ready for serious use?
Define serious use? KFServing support is in the pipeline, if that helps.
Notice that clearml-serving is basically a control plane for the serving engine; not to neglect its importance, but the heavy lifting is done by Triton :) (or any other backend we will integrate with, maybe Seldon)
Any specific use case for the required "draft" mode?
@<1541954607595393024:profile|BattyCrocodile47> first let me say I love the dark theme you have going on there, we should definitely add that :)
When I run
python set_triggers.py; python basic_task.py
they seem to execute, b...
Seems like you forgot to start the trigger, i.e.
None
(this will cause the entire script of the trigger inc...
Regarding the missing packages, you might want to test with: force_analyze_entire_repo: false
https://github.com/allegroai/trains/blob/c3fd3ed7c681e92e2fb2c3f6fd3493854803d781/docs/trains.conf#L162
Or if you have a full venv you'd like to store instead:
https://github.com/allegroai/trains/blob/c3fd3ed7c681e92e2fb2c3f6fd3493854803d781/docs/trains.conf#L169
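For reference, a sketch of what those trains.conf entries look like (section layout and defaults may differ between versions, so check the linked file for yours):

```
# sketch of the relevant trains.conf entries
sdk {
    development {
        # false: only analyze the calling script's direct imports
        # true:  analyze every file in the repository for imports
        force_analyze_entire_repo: false

        # alternatively, store the entire active venv as-is:
        # detect_with_pip_freeze: true
    }
}
```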
BTW:
What is the missing package?