Could I just build it and log these parameters using task.set_parameters() so that I can call task.get_parameters() later?
Instead of manually calling set/get, you call task.connect(some_dict_or_object) and it does both:
When running manually (i.e. without an agent) it logs the keys/values on the Task;
when running with an agent, it takes the values from the backend (Task) and sets them on the dict/object.
Make sense ?
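A minimal sketch of that pattern (the dict and its values here are just placeholders):
` from clearml import Task

task = Task.init(project_name='examples', task_name='connect example')

# hypothetical hyper-parameters; task.connect() logs them when running manually,
# and overrides them with the backend values when running under an agent
params = {'lr': 0.001, 'batch_size': 32}
params = task.connect(params) `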
BTW: if you need, you can do the following:
` from clearml import Task
from clearml.automation import PipelineController
task = Task.init(project_name='pipelines', task_name='pipeline test')
task.set_base_docker(...)
# the pipeline object is using the current Task, hence the docker image is set
pipe = PipelineController(...)
pipe.start() `
Basically the links to the file server are saved in both mongo and elastic, so as long as these are host:ip based, at least in theory it should work
follow the backup procedure, it is basically the same process
This one is used when the agent manually downloads wheels (PyTorch mostly), but as you can see it is under the ~/.clearml directory, which is usually already shared on the host
Basically ExuberantBat24, you can think of hyper-datasets as a "feature-store for unstructured data"
UptightMouse31 You can add any metric (KPI) with "manual" logging:
Logger.current_logger().report_scalar("KPI", "metric", iteration=0, value=1.1)
This means you can later add a column KPI/metric to your experiment table.
Will this do the trick ?
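A minimal runnable sketch of that (the project/task names are just placeholders):
` from clearml import Task, Logger

task = Task.init(project_name='examples', task_name='kpi logging')

# reports a scalar under title "KPI", series "metric"; in the experiment table
# you can then add a custom column for KPI/metric
Logger.current_logger().report_scalar("KPI", "metric", iteration=0, value=1.1) `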
I see now.
Let's assume you know which snapshot that was:
` prev_task = Task.get_task(task_id='the_first_training_task_id')
# get the second from last checkpoint
prev_task.models['output'][-2].url
prev_scalars = prev_task.get_reported_scalars()
new_task = Task.init('example', 'new task')
logger = new_task.get_logger()
# do some for loop and report the prev_scalars with logger.report_scalar
new_task.flush(wait_for_uploads=True)
new_task.set_initial_iteration(22000)
# start the training `
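A rough sketch of that "for loop", assuming get_reported_scalars() returns a nested dict of the form {title: {series: {'x': [iterations], 'y': [values]}}}:
` # replay the previous scalars onto the new task
for title, series_dict in prev_scalars.items():
    for series, data in series_dict.items():
        for iteration, value in zip(data['x'], data['y']):
            logger.report_scalar(title, series, iteration=int(iteration), value=value) `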
Hi OddAlligator72
for instance - remove all the metrics from some step onward?
(I think that as long as the Task is not published you could do such a thing directly with the RestAPI (aka APIClient from python)
What's the use case?
Getting the last checkpoint can be done via:
Task.get_task(task_id='aabbcc').models['output'][-1]
OddAlligator72 I like this idea.
The single thing I'm not sure about is the "function entry point"
Why would one do that? Meaning why wouldn't you have a proper python entry-point.
The reason I'm reluctant is that you might have calls/functions/variables in the global scope of the file storing the function, and then users will not know why something broke, and it will be very cumbersome to debug.
A simple script entry point seems trivial to launch and debug locally.
What do you think ? What woul...
Good news, there is an offline mode:
Task.set_offline(True)
If you want your code to be aware, you can do:
from trains import Task
if Task.current_task():
    Task.current_task().get_logger().report_confusion_matrix(...)
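A minimal sketch of the offline pattern (note that set_offline() has to be called before Task.init()):
` from trains import Task

# enable offline mode before creating the Task; everything is recorded locally
# instead of being sent to the server
Task.set_offline(True)

task = Task.init(project_name='examples', task_name='offline test')

# code that should also run without an active Task can guard itself:
if Task.current_task():
    Task.current_task().get_logger().report_scalar("debug", "value", iteration=0, value=1.0) `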
The reason is because it is logged as an image, not a plot 🙂
And you cannot see it in Trains UI?
DefeatedCrab47 yes that is correct. I actually meant if you see it on the tensorboard's UI 🙂
Anyhow, if it is there, you should find it in the Task's Results > Debug Samples
Task.completed(ignore_errors=False)
What are you getting?
I failed to update the "STARTED AT" and the "COMPLETED AT" attributes in the "INFO" tab.
I'm not sure this can actually be overridden...
I couldn't change the task status from draft to complete
Task.completed(ignore_errors=True)
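For reference, a minimal sketch of calling it on a specific task (the task ID is a placeholder):
` from clearml import Task

task = Task.get_task(task_id='aabbcc')

# mark the task as completed; ignore_errors=True suppresses the exception
# if the backend rejects the status change
task.completed(ignore_errors=True) `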
Questions
I want to trigger a retrain task when F1
That means that in inference you are reporting the F1 score, correct?
As part of the retraining I have to train all the models and then have to choose best one and deploy it
Are you passing output_uri to Task.init? Are you storing the model as an artifact?
You can tag your model/task with a "best" tag (and untag the previous one). Then in production, look for the "best" task and get its model
Thoughts?
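A rough sketch of that tagging approach (the project name and task ID are placeholders, and filtering by tags in Task.get_tasks assumes your clearml version supports it):
` from clearml import Task

# after evaluation, tag the winning task (and remove the tag from the previous best)
best_task = Task.get_task(task_id='new_best_task_id')
best_task.add_tags(['best'])

# in production, look up the task carrying the "best" tag and grab its model
candidates = Task.get_tasks(project_name='examples', tags=['best'])
model_url = candidates[-1].models['output'][-1].url `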
Long story short, work in progress.
BTW: are you referring to manual execution or trains-agent?
Hey IntriguedRat44 ,
Is this what you are after?
https://github.com/allegroai/trains/issues/181
So I have a task that just loads a model, but I don't see it as an artifact in the UI
You should see it under Artifacts, Input model if you are calling Keras load function (or similar)
It does not upload; the default behavior is to log the artifact (so you know where you stored it, but not enforce unnecessary uploads)
If you were to change:
task = Task.init(project_name='examples', task_name='Keras with TensorBoard example')
to:
task = Task.init(project_name='examples', task_name='Keras with TensorBoard example', output_uri=" ")
It would also upload the model
If you are using the latest RC:
pip install clearml==0.17.5rc5
You can pass True and it will use the "files_server" as configured in your clearml.conf
I used the http link as a filler to point to the files_server.
Make sense ?
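A short sketch of what that looks like (the project/task names are just placeholders):
` from clearml import Task

# output_uri=True uploads model snapshots to the files_server configured
# in clearml.conf (requires a recent enough clearml version, e.g. the RC above)
task = Task.init(
    project_name='examples',
    task_name='Keras with TensorBoard example',
    output_uri=True,
) `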
Hi ConfusedPig65
Any keras model will be automatically uploaded if you pass an upload url to the Task init:
task = Task.init('examples', 'keras upload test', output_uri=" ")
(You can also pass output_uri="s3://bucket/folder" or change the default output_uri in the clearml.conf file)
After this line any keras model will be automatically uploaded (you will see it under the Artifacts Tab)
Accessing models from executed tasks:
` trains_task = Task.get_task('task_uid_here')
last_check...
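The truncated line above presumably fetches the last checkpoint; based on the earlier snippet in this thread, that would look something like:
` from clearml import Task

trains_task = Task.get_task(task_id='task_uid_here')

# the last registered output model/checkpoint of that task
last_checkpoint = trains_task.models['output'][-1]
print(last_checkpoint.url) `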
You can always log it manually:
from clearml import InputModel
input_model = InputModel.import_model(weights_url='/tmp/keras_example/weight.6.hdf5')
ohh sorry, weights_url=path
Basically url can be the local path to the weights file 🙂
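If it helps, a small sketch of importing a local weights file and attaching it to the current task (the path is illustrative, and connecting an InputModel via task.connect() is my assumption here):
` from clearml import Task, InputModel

task = Task.init(project_name='examples', task_name='manual model logging')

# the weights path is just an example local file
input_model = InputModel.import_model(weights_url='/tmp/keras_example/weight.6.hdf5')

# attach it to the task so it appears as an Input Model in the UI
task.connect(input_model) `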