Hi CheerfulGorilla72
the "installed packages" section is used as "requirements.txt for the agent.
Are you saying the autodetection fails to detect all packages? In "manual execution" (i.e. not when the agent is running the code) you can force it to just take the local requirements.txt:
```
Task.force_requirements_env_freeze(requirements_file="./requirements.txt")
task = Task.init(...)
```
Notice the `force_requirements_env_freeze` call should be executed before `Task.init`.
3. If you clear all the "installed packages" se...
Yes, look for the clearml-serving session ID in the web UI (just go to the home screen and put the UID in the search 🙂)
Hi CurvedHedgehog15
User aborted: stopping task (3)
?
This means "someone" externally aborted the Task; in your case, the HPO aborted it. The sophisticated HyperBand/Bayesian optimization algorithms we use (both Optuna and HpBandster) will early-stop experiments based on their performance, and resume them later if needed.
SpicyCrab51 you can change the task to completed; it is just a state change, nothing will actually change other than the status: `Task.get_task(pass_dataset_id_here).mark_completed()`
Is it not possible to serve a model with preprocessing pipeline from scikit-learn using clearml-serving?
of course it is, did you first try the example here: None
If you need to run your own LogisticRegression call, you can use this example: None
Notice this is where the custom endpoint actually calls the prediction: [None](https...
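For reference, a minimal sketch of such a custom preprocessing class, following the Preprocess convention the clearml-serving examples use (the payload layout and the joblib-saved model file are my assumptions, not from this thread):
```python
from typing import Any

import joblib
import numpy as np


# clearml-serving picks up a user-provided class named Preprocess
class Preprocess(object):
    def load(self, local_file_name: str) -> Any:
        # called once on endpoint startup; load the sklearn pipeline/model
        # (assumes the model artifact was saved with joblib)
        self._model = joblib.load(local_file_name)

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # turn the request payload into model input (payload layout is assumed)
        return np.array(body["features"], dtype=float).reshape(1, -1)

    def process(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> Any:
        # this is where the custom endpoint actually calls the prediction
        return self._model.predict(data)

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # convert the prediction into a JSON-serializable response
        return {"prediction": data.tolist()}
```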
Oh I see, that kind of makes sense
I think this is the section you should use:
None
But instead of the clearml-services container you should use the regular container (or just have it installed as part of the entry point on any Ubuntu-based container)
Notice the important parts here are:
[None](https://github.com/allegroai/clearml-server/blob/6a1fc04d1e8b112fb334c8743d...
Hi @<1645597514990096384:profile|GrievingFish90>
You mean the agent itself runs inside a docker, and then the agent spins up sibling dockers for the Tasks?
How does it work with k8s?
You need to install the clearml k8s glue, and then on the Task request the container; notice you need to preconfigure the glue with the correct Job YAML (rough sketch below)
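For illustration only, a pod template of the kind the glue could be preconfigured with; the image and resource values here are assumptions, not anything specific to this thread:
```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: clearml-task
      # illustrative default image; a Task can still request its own container
      image: nvidia/cuda:11.8.0-runtime-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
```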
Hi @<1545216070686609408:profile|EnthusiasticCow4>
My biggest concern is what happens if the TaskScheduler instance is shut down.
good question, follow-up: what happens to the cron service machine if it fails?!
And yes, you are correct: if someone stops the TaskScheduler instance, it is the equivalent of stopping the cron service...
btw: we are working on moving some of the cron/trigger capabilities to the backend, it will not be as flexi...
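For context, a minimal sketch of how the scheduler is typically made resilient, i.e. running it as its own Task on the services queue so an agent can relaunch it (task ID, queue names and schedule are placeholders):
```python
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# clone-and-enqueue an existing task every day at 07:00 (placeholder ID)
scheduler.add_task(
    schedule_task_id="<task_id_to_schedule>",
    queue="default",
    hour=7,
    minute=0,
)
# run the scheduler itself remotely on the services queue,
# instead of keeping it alive in the local process
scheduler.start_remotely(queue="services")
```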
do you know how I can save all the logs and all the metric images?
These are stored in the clearml-server, no? What am I missing?
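If the goal is pulling them back out of the server programmatically, a rough sketch (the task ID is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="<your_task_id>")

# console log lines as reported to the clearml-server
console_log = task.get_reported_console_output(number_of_reports=10)

# all reported scalars: {title: {series: {"x": [...], "y": [...]}}}
scalars = task.get_reported_scalars()
```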
Okay we got to the bottom of this. This was actually because of the load balancer timeout settings we had, which was also 30 seconds and confusing us.
Nice!
btw:
in the clearml.conf we put this:
for future reference, you are missing the sdk section:
```
sdk.http.timeout: 300
```
the `.` notation works as well as `{}`
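i.e. the same setting written with `{}` would be:
```
sdk {
    http {
        timeout: 300
    }
}
```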
MysteriousBee56 Edit in your ~/trains.conf:
```
api_server: http://localhost:8008
```
to
```
api_server: http://192.168.1.11:8008
```
and obviously the same for web & files
I'll make sure we fix the trains-agent to output an error message instead of trying to silently keep accessing the API server
Getting your machine IP:
just run:
```
ifconfig | grep 'inet addr:'
```
Then you should see a bunch of lines; pick the one that does not start with 127 or 172.
Then to verify, run:
```
ping <my_ip_here>
```
AgitatedTurtle16 could you check with the latest clearml RC (I remember a similar issue was fixed):
```
pip install clearml==0.17.5rc3
```
Then run again:
```
clearml-task ...
```
Now that we have the free tier (a.k.a community server) we might change the default behavior.
The idea is always to allow an easy way to on-board and test the system.
ReassuredTiger98
BTW: what's the scenario where your machine reverted to the default configuration (i.e. no configuration file) ?
I can add files to the data set, even after I finish the experiment?
Correct
https://clear.ml/docs/latest/docs/clearml_data#creating-a-dataset
https://clear.ml/docs/latest/docs/guides/data%20management/data_man_cifar_classification
https://github.com/allegroai/clearml/blob/master/docs/datasets.md#create-dataset-from-code
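A minimal sketch of adding files by creating a new dataset version on top of the old one (names and the parent ID are placeholders):
```python
from clearml import Dataset

# create a child version so the finalized parent stays immutable
ds = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="datasets",
    parent_datasets=["<parent_dataset_id>"],
)
ds.add_files(path="./new_files")
ds.upload()
ds.finalize()
```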
Hi @<1695969549783928832:profile|ObedientTurkey46>
How can I connect clearml to a relational database, and have a sql query as a dataset? (e.g. dataset.add_references(query = "select * from images where label = '1'")).
hmm interesting, you have a couple of options that I can think of:
- You can have your query as an argument to the Task, which means it is logged and can be changed later from the UI when you are relaunching it (see the sketch after this list).
- You can have the query as an argument for a preprocessin...
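For the first option, a sketch of logging the query as an editable Task argument (the actual database access is left out since it depends on your stack):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="sql-backed dataset")

# the query is logged as a parameter and can be edited in the UI on relaunch
params = {"query": "select * from images where label = '1'"}
task.connect(params)

# then execute params["query"] against your database
# (e.g. pandas.read_sql / sqlalchemy) and build the dataset from the result
```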
Requested version: 2.28, Used version 1.0" for some reason
This is fine, it means there is no change in that API
Nice! So out of curiosity why didn't it work this time and you had to do it manually?
BoredHedgehog47 this is basically a wizard explaining the steps, see the 3 tabs 🙂
BTW, you can launch an experiment directly from CLI with clearml-task
https://clear.ml/docs/latest/docs/apps/clearml_task
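e.g. something along these lines (repo, script and queue are placeholders):
```
clearml-task --project examples --name remote-run \
  --repo https://github.com/<user>/<repo>.git \
  --script train.py \
  --queue default
```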
Try to upload something to the file server?
None
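e.g. a quick connectivity check with the SDK (the files server URL is a placeholder):
```python
from clearml import StorageManager

# upload a small local file to the files server to verify access
StorageManager.upload_file(
    local_file="./test.txt",
    remote_url="http://<files_server>:8081/debug/test.txt",
)
```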
OddAlligator72 just so I'm sure I understand your suggestion:
pickle the entire locals() on the current machine.
On the remote machine, create a mock python entry point, restore the locals() and execute the function?
BTW:
Making this actually work regardless of the machine is some major magic in motion ... 🙂
restart_period_sec
I'm assuming `development.worker.report_period_sec`, correct?
The configuration does not seem to have any effect, scalars appear in the web UI in close to real time.
Let me see if we can reproduce this behavior and quickly fix it
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask?
I suppose the same would need to be done for any client PC running clearml such that you are submitting dataset upload jobs?
Correct
That is, the dataset is perhaps local to my laptop, or on a development VM that is not in the clearml system, but from there I want to submit a copy of a dataset, then I would need to configure the storage section in the same way as well?
Correct
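e.g. if the dataset files go to S3, each client's clearml.conf would carry the same storage credentials section (values are placeholders):
```
sdk {
    aws {
        s3 {
            key: "<access_key>"
            secret: "<secret_key>"
            region: "us-east-1"
        }
    }
}
```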
What's the python, torch, clearml version?
Any chance this can be reproduced?
What's the full error trace/stack you are getting?
Can you try to debug it to see where exactly it fails, here:
https://github.com/allegroai/clearml/blob/86586fbf35d6bdfbf96b6ee3e0068eac3e6c0979/clearml/binding/import_bind.py#L48
RoughTiger69 wdyt?
Will the new fix avoid this issue, and does it still require the incremental flag?
It will avoid the issue, meaning even when incremental is not specified, it will work
That said, the issue is that any other logger will be cleared as well, so, just good practice ...
From the logging documentation ...
Hmmm, so I guess Kedro should not use dictConfig?! I'm not sure of the exact use case, but just clearing all loggers seems like a harsh approach
Hi @<1628927672681762816:profile|GreasyKitten62>
Notice that in the GitHub Actions example this specific Task is executed on the GitHub backend, while the Task it creates is executed by the clearml-agent.
So basically:
Action -> Git worker -> task_stats_to_comment.py -> Task Pushed to Queue -> Clearml-Agent -> Task execution is here
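The "Task Pushed to Queue" step is the usual clone-and-enqueue pattern, roughly (IDs and queue name are placeholders):
```python
from clearml import Task

# clone a template task and push the copy to a queue for an agent to pick up
template = Task.get_task(task_id="<template_task_id>")
cloned = Task.clone(source_task=template, name="ci run")
Task.enqueue(task=cloned, queue_name="default")
```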
Does that make sense?