...I'm not sure I follow; clearml-task is designed so that, at the end, an agent will be running the Task. What am I missing?
@<1523707653782507520:profile|MelancholyElk85> what are you trying to change ? maybe there is a better way?
BTW: if you do step_base_task.export_task() you can take the parts you need from the resulting dict and pass them to the task_overrides argument of add_step (you might need to flatten the nested arguments with '.', and thinking about it, maybe we should do that automatically?!)
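For example, a rough sketch (the project/task names and the specific override keys are placeholders; pick whichever parts of the exported dict you actually need):
```python
from clearml import Task
from clearml.automation import PipelineController

# export the base task's definition as a plain dict
step_base_task = Task.get_task(project_name='examples', task_name='step base')
exported = step_base_task.export_task()

pipe = PipelineController(name='my pipeline', project='examples', version='1.0.0')

# nested dict keys are flattened with '.' for task_overrides, e.g.
# exported['script']['branch'] becomes the override key 'script.branch'
pipe.add_step(
    name='step_one',
    base_task_id=step_base_task.id,
    task_overrides={
        'script.branch': exported['script']['branch'],
        'script.repository': exported['script']['repository'],
    },
)
```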
Hi JitteryCoyote63, report_frequency_sec=30 controls how frequently monitoring events are sent to the server; the default is every 30 seconds (you can switch the UI display to wall-time to review). You can change it to 180 so it only sends an event every 3 minutes (for example).
sample_frequency_per_sec is the sampling frequency it uses internally; it will then average the results over the report_frequency_sec time window and send the averaged result on the repo...
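To make the mechanics concrete, here is a toy loop illustrating the sample-then-average behaviour described above (an illustration only, not the SDK's actual implementation; read_gpu_util stands in for any resource probe):
```python
import time

def monitor(read_gpu_util, sample_frequency_per_sec=2.0, report_frequency_sec=30.0):
    # sample at the internal frequency, average over the reporting window,
    # and emit a single averaged event per window
    samples, window_start = [], time.time()
    while True:
        samples.append(read_gpu_util())
        time.sleep(1.0 / sample_frequency_per_sec)
        if time.time() - window_start >= report_frequency_sec:
            print('report: {:.1f}'.format(sum(samples) / len(samples)))
            samples, window_start = [], time.time()
```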
I am symlinking the .clearml directory to a NAS server and this is perhaps part of the problem.
Yep, that sounds about right, it uses Posix file system for internal lock mechanisms (multi process locks), and my guess is that the NAS for some reason does not support it...
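If you want to verify that theory, a quick way to test whether the NAS mount actually supports POSIX locks (the path below is a placeholder for your symlinked .clearml directory):
```python
import fcntl

# try to take an exclusive, non-blocking lock on a file inside the NAS mount;
# an OSError here means the filesystem does not support POSIX locks
with open('/path/to/nas/.clearml/lock_test', 'w') as f:
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    fcntl.flock(f, fcntl.LOCK_UN)
```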
I think it is in the JWT token the session gets from the server
a bit of a hack but should work 🙂
```python
from clearml import Task

task = Task.current_task()
session = task.session  # or Task._get_default_session()
# decode the JWT the session received from the server and pull the user id
my_user_id = session.get_decoded_token(session.token)['identity']['user']
```
Where is the port? And why https?
Will they get ordered ascending or descending?
Good point, I'll check the docs... but I think they do not specify
https://clear.ml/docs/latest/docs/references/sdk/task#taskget_tasks
From the code it seems the order is not guaranteed.
You can, however, pass order_by=['-last_update'] in the task_filter, which will return the most recently updated first:
```python
task_filter = {
    'page_size': 2,
    'page': 0,
    'order_by': ['last_metrics.{}.{}'.format(title, series), '-last_update']
}
Task.get_tasks(..., task_filter=task_filter)
```
CooperativeFox72 you can start by checking the latest RC 🙂 `pip install trains==0.15.2rc0`
bash: line 1: 1031 Aborted (core dumped)
@<1570583227918192640:profile|FloppySwallow46> seems like the processes crashed.
Oh I see, the pipeline controller itself (not the components) is the one with the repo.
To fix that, add the following at the top of the script:
```python
from clearml import Task
Task.force_store_standalone_script()

@PipelineDecorator.pipeline(...)
```
That should do the trick.
Hi RotundHedgehog76
we have issues with clearml-agent when using standalone mode. ...
What is the use case for standalone mode? Is this venv or docker mode?
How can the first process corrupt the second?
I think that something went wrong and both agents are using the same "temp" folder to set up the experiment.
why doesn't this occur if I run the pipeline from the command line?
The services queue is creating new dockers with everything in them, so they cannot step on each other's toes (so to speak)
I run all the processes as administrator. However, I've tested running the pipeline from command line in non-administrator mode, it works fine....
Hi CooperativeFox72 ,
From the backend guys, long story short: upgrade your machine => more CPU cores, more processes, it's that easy 🙂
ShallowCat10 try something similar to this one; do notice that it might take a while to get all the task objects, so I would start with a single one 🙂
```python
from trains import Task

tasks = Task.get_tasks(project_name='my_project')
for task in tasks:
    scalars = task.get_reported_scalars()
    for x, y in zip(scalars['title']['original_series']['x'],
                    scalars['title']['original_series']['y']):
        # re-report each point under a new series name; the original x value
        # is reused as the iteration
        task.get_logger().report_scalar(
            title='title', series='new_series', value=y, iteration=int(x))
```
LudicrousParrot69 this is an implementation issue: this entire page is based on "task comparison", and a single Task means a totally different interface for querying the data 🙂
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
It will store the entire content of the file; you can then edit it in the UI, and when running remotely it will return a new local copy of the file (based on the data in the UI) for you to read.
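If this is about Task.connect_configuration (my assumption from the description; the file name below is a placeholder), the flow looks roughly like:
```python
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')

# locally this uploads the file content and returns the original path;
# when executed remotely it writes the (possibly UI-edited) content to a
# new local copy and returns that path instead
config_path = task.connect_configuration('my_config.yaml', name='my config')
with open(config_path) as f:
    config = f.read()
```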
WickedGoat98
1. The web UI will look like the demo server 🙂 https://demoapp.trains.allegro.ai/
2. curl http://server-ip:8008 should return something like:
```
{"meta":{"id":"78a9dc77081348e2930d1f429fd7e092","trx":"78a9dc77081348e2930d1f429fd7e092","endpoint":{"name":"","requested_version":1.0,"actual_version":null},"result_code":400,"result_subcode":0,"result_msg":"Invalid request path /","error_stack":null},"data":{}}
```
3. curl http://server-ip:8080 should return something like:
```
<!d...
```
The agent installs the "Installed Packages" section of the Task (think of it as its requirements.txt).
And again, what do you have there? Is it the outcome of Task.init auto-populating it?
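If you need to pin or add entries yourself, something like this should work (the package name and version are placeholders; note it has to run before Task.init):
```python
from clearml import Task

# adds/pins an entry in the "Installed Packages" section that the agent
# will later pip-install; must be called before Task.init()
Task.add_requirements('scikit-learn', '1.3.0')
task = Task.init(project_name='examples', task_name='pinned requirements')
```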
Hi EnviousStarfish54
I remember this feature request, let me check where it stands..
Hi DisgustedDove53
When you say "deployment" there are a lot of ways to interpret that 🙂 what exactly are you looking for?
Hi LazyLeopard18
I think that these toy examples will help:
1. Uploading a local dataset:
https://github.com/allegroai/events/blob/master/odsc20-east/generic/dataset_artifact.py
2. Pre-processing the data:
https://github.com/allegroai/events/blob/master/odsc20-east/generic/process_dataset.py
3. Training example:
https://github.com/allegroai/events/blob/master/odsc20-east/scikit-learn/sklearn_jupyter.ipynb
The pipeline stores the state of its previous run, specifically the executed steps.
In our case the executed step was reset (I assume), so it cannot find the output model you are referring to, hence the crash.
CleanPigeon16, does that make sense?
You will be able to set it.
You will just not see the output in the console log, but everything is running and being executed
Hi CleanPigeon16
I think the issue now is missing git credentials; did you pass git_user / git_pass to the AWS autoscaler?
Since the error says network error, is it possible because I'm in Taiwan? Like downloading from Asia leads to this kind of issue
Can you download it from the browser? (I mean the file size after download, is it 400MB?)
Oh I think I understand what's going on. @<1523701260895653888:profile|QuaintJellyfish58> let me check how to update the cron scheduler while it is running (I really like this idea, so if this is not already supported I'd like us to add this capability 🙂)
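For context, this is the kind of setup in question, a minimal sketch of clearml's TaskScheduler (the task ID and queue names are placeholders); the open question above is whether the schedule can be changed after the scheduler has started:
```python
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# cron-like entry: re-launch the given task every day at 02:30 on 'default'
scheduler.add_task(
    schedule_task_id='replace_with_task_id',
    queue='default',
    minute=30,
    hour=2,
    recurring=True,
)
# run the scheduler itself on the services queue
scheduler.start_remotely(queue='services')
```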