
RobustRat47 are you saying updating the nvidia drivers solved the issue?
BTW: if you only need the git diff, you can just copy it from the UI into a txt file and do: git apply <copied-diff.txt>
They inherit from one another, so it does make sense. Also the add_tags is on the "main" Task and not the backend parent
Thank you! 😊
Okay, I think I lost you...
DilapidatedDucks58 you mean detect at which "iteration" the max value was reported, and then extract all the other metrics for that iteration ?
let me check
report_text does not, this is very weird
Okay this seems to be the issue.
Just making sure: the Task status is "running", and task.get_logger().report_text("something") does not report a thing?
Do you see it on your screen?
Can you test without the "Task.debug_simulate_remote_task / init" ?
I don't know whether you have access to the backend,
Creepy, no I do not 🙂
I can't make anything appear in the console part of the ui
clearml_task.logger.report_text("some text")
should work
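For reference, a minimal sketch that should make text show up in the task's CONSOLE section (project/task names are just placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="report_text check")
# anything reported here should appear under the task's CONSOLE tab in the UI
task.get_logger().report_text("some text")
task.flush(wait_for_uploads=True) `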
The main question I have is why the ALB is not passing the request. I think you are correct that it never reaches the serving server at all, which leads me to think the ALB "thinks" the service is down or not responding, wdyt?
What do you mean by a custom queue ?
In the queues page you have a plus button, this will just create a new queue
I want the model to be stored in a way that clearml-serving can recognise it as a model
Then OutputModel or task.update_output_model(...)
You have to serialize it in a way that your code will later be able to load it.
With XGBoost, when you call model.save ClearML automatically picks it up and uploads it for you,
assuming you called Task.init(..., output_uri=True).
You can also manually upload the model with task.update_output_model or the equivalent OutputModel class.
if you want to dis...
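A minimal sketch of the manual route, assuming a model file already saved locally (file name, model name and project/task names are placeholders):
` from clearml import Task, OutputModel

task = Task.init(project_name="serving", task_name="register model", output_uri=True)

# register an existing weights file on this task so clearml-serving can pick it up
output_model = OutputModel(task=task, name="my-xgb-model", framework="xgboost")
output_model.update_weights(weights_filename="model.xgb")

# or, equivalently, through the task itself:
# task.update_output_model("model.xgb", name="my-xgb-model") `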
I would do something like:
` from clearml import Logger

def forward(...):
    self.iteration += 1
    weights = self.compute_weights(...)
    m = (weights * (target - preds)).mean()
    Logger.current_logger().report_scalar(
        title="debug", series="mean_weight", value=m, iteration=self.iteration
    )
    return m `
Not at all, we love ideas on improving ClearML.
I do not think there is a need to replace feast, it seems to do a lot; I'm just thinking about integrating it into the ClearML workflow. Do you have a specific use case we can start to work on? Or maybe a workflow that would make sense to implement?
trains-agent RC (which they tell me will be out tomorrow) will have a switch to do that, just so it is easier 🙂
Where are you seeing this message?
CooperativeFox72 can you start by checking the latest RC :)
pip install trains==0.15.2rc0
with conda ?!
Check the log to see exactly where it downloaded torch from, just making sure it used the right repository and did not default to pip, where it might have gotten a CPU version...
Hi TightElk12
Are you looking for a way to set the output_uri from an environment variable? Is this it?
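If so, a minimal sketch: read the variable yourself and pass it into Task.init (MY_OUTPUT_URI is just an example name, not a ClearML-defined setting):
` import os
from clearml import Task

# fall back to the default files server (output_uri=True) if the variable is not set
task = Task.init(
    project_name="examples",
    task_name="env output_uri",
    output_uri=os.environ.get("MY_OUTPUT_URI", True),
) `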
Hi EnthusiasticCoyote38
But once one process finished, it changed the task status to completed. Maybe you know a safe way to deal with such a situation? Or maybe the best way is to check the task status before uploading the object?
Well, you can actually forcefully set the state of the Task to running, then add artifacts, then close it?
would that work?
` my_other_task.reload()
my_other_task.mark_started(force=True)
my_other_task.upload_artifact(...)
my_other_task.flush(wait_for_uploads=True)
my_othe...
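For completeness, a self-contained version of that sequence; Task.get_task and the final mark_completed() call are assumptions, since the snippet above is cut off:
` from clearml import Task

my_other_task = Task.get_task(task_id="<task_id>")  # the task the other process already closed
my_other_task.reload()
my_other_task.mark_started(force=True)              # force the status back to "running"
my_other_task.upload_artifact("results", artifact_object={"score": 0.9})
my_other_task.flush(wait_for_uploads=True)          # make sure the upload finished
my_other_task.mark_completed()                      # assumption: set it back to completed `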
In both cases, if I get the element from the list, I am not able to get when the task started. Where is this info stored?
If you are using client.tasks.get_all(...) it should be under the started field.
Specifically you can probably also do:
queried_tasks = Task.query_tasks(additional_return_fields=['started'])
print(queried_tasks[0]['id'], queried_tasks[0]['started'])
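If you are going through the APIClient instead, a minimal sketch (only_fields is optional and just trims the response):
` from clearml.backend_api.session.client import APIClient

client = APIClient()
tasks = client.tasks.get_all(only_fields=["id", "started"])
print(tasks[0].id, tasks[0].started) `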
it would be nice to group experiments within projects
DilapidatedDucks58 you mean as in collapse/expand? Or something like a "sub-project"?
what if for some old tasks I get WARNING:root:Could not delete Task ID=a0908784a2a942c3812f947ec1f32c9f, 'Task' object has no attribute 'delete'? What's the best way of cleaning them?
This seems like an old SDK, no?
Hmm reading this: None
How are you checking the health of the serving pod ?
Hi EnchantingOstrich20
How does clearml get it there?
At runtime it analyzes the code you are running, looking for imports, then checks the versions you have actively used (i.e. the active venv / python) and lists them there.
You can also override those in code, or edit them after you clone the task and before you enqueue it for remote execution.
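For the in-code override, a minimal sketch using Task.add_requirements (call it before Task.init; the package and version are just examples):
` from clearml import Task

# pin a specific package/version instead of the auto-detected one
Task.add_requirements("torch", "1.13.1")
task = Task.init(project_name="examples", task_name="override requirements") `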