(BTW: any reason not to use the agent?)
Could you download and send the entire log ?
correct on both.
notice that with upload
you can specify any storage (S3/GS/Azure etc.)
if they're mission critical, but rather the clearml cache folder?
hmmm... they are important, but only when starting the process. any specific suggestion ?
(and they are deleted after the Task is done, so they are temp)
Hmm, Notice that it does store sym links to parent data versions (to save on multiple copies of the same file). If you call get_mutable_local_copy() you will get a standalone copy
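For reference, a minimal sketch of getting that standalone copy. `Dataset.get` and `get_mutable_local_copy` are the SDK calls; the helper names, the dataset id and the target folder are placeholders, and the clearml import is deferred so the helper can be tried with a stand-in object:

```python
def open_dataset(dataset_id):
    """Look up an existing dataset version by id (needs a configured clearml SDK)."""
    from clearml import Dataset  # deferred import: only needed when talking to the server

    return Dataset.get(dataset_id=dataset_id)


def standalone_copy(dataset, target_folder):
    """Ask for a writable copy with real files instead of links into the cache."""
    return dataset.get_mutable_local_copy(target_folder=target_folder)


# Typical use (placeholders, not real values):
#   local_path = standalone_copy(open_dataset("<dataset-id>"), "./my_dataset_copy")
```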
Hi TrickyRaccoon92
If you are reporting to tensor-board, then "iteration" equals step. Is this the case?
At the moment I'm querying by paging through the tasks as you recommended, and then filtering with standard python list-comprehension filters...Which is less than ideal.
At least let's do that better:
Use Task._query_tasks:
Task._query_tasks(order_by=['-started'], page_size=10, page=0, only_fields=['id', 'started'])
You will get "lighter" objects returned, then you can filter them with code (but the request will be a lot faster)
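A rough sketch of that flow, assuming the returned objects expose `id` and `started` attributes (matching `only_fields`). `Task._query_tasks` is the private call quoted above and may change between versions; `fetch_page` and `started_after` are hypothetical helper names:

```python
from datetime import datetime
from types import SimpleNamespace


def fetch_page(page=0, page_size=10):
    """Fetch one page of lightweight task records, newest first."""
    # Deferred import so the filtering helper below works without clearml installed.
    from clearml import Task

    # Private API, copied from the suggestion above.
    return Task._query_tasks(
        order_by=["-started"],
        page_size=page_size,
        page=page,
        only_fields=["id", "started"],
    )


def started_after(records, cutoff):
    """Keep the ids of records whose `started` timestamp is later than `cutoff`."""
    return [r.id for r in records if r.started is not None and r.started > cutoff]


# Stand-in records (assumption: real results expose .id and .started the same way).
sample = [
    SimpleNamespace(id="a1", started=datetime(2023, 5, 2)),
    SimpleNamespace(id="b2", started=datetime(2023, 4, 1)),
    SimpleNamespace(id="c3", started=None),
]
recent_ids = started_after(sample, datetime(2023, 4, 15))
```

Against a live server this would be `started_after(fetch_page(page=0), cutoff)`, repeated per page.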
SuccessfulKoala55 any suggestion on improving that ?
Hmmm, can you view the settings? that's the only thing I can think of at the moment that will be diff between your setup and the working one...
Also, is there a way for you to have the trains-server behind https (on your GCP)
how can I for example convert it back to a pandas dataframe?
You can always report csv file with report_media as well, or if this is not for debugging maybe an artifact ?
Hi @<1671689437261598720:profile|FranticWhale40>
Are you positive the Triton container finished syncing ?
Could you provide the docker log (both the serving and the triton)?
What is the clearml-serving version you are using ?
Could you add a print in the "preprocess" function, just to validate you are getting to the correct model version ?
Hi @<1585078763312386048:profile|ArrogantButterfly10>
Now i want to clone the pipeline and change the hyperparameters of train task, is it possible? If so, how??
the pipeline arguments are for the pipeline DAG/logic, you need to pass one of the arguments as an argument for the training step/task. Make sense ?
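Roughly, that wiring could look like the sketch below. It assumes the training Task keeps its hyperparameter under a `General` section and that the parameter is called `learning_rate`; the pipeline name, project and both helper names are made up for illustration:

```python
def pipeline_override(section, name):
    """Map a step hyperparameter to the pipeline argument of the same name."""
    return {"{}/{}".format(section, name): "${pipeline." + name + "}"}


def build_pipeline(train_task_id):
    """Expose `learning_rate` on the pipeline and forward it to the training step."""
    from clearml import PipelineController  # deferred import, needs a configured SDK

    pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")
    # The pipeline-level argument (what you edit when cloning the pipeline).
    pipe.add_parameter(name="learning_rate", default=0.01)
    # The step receives it through parameter_override.
    pipe.add_step(
        name="train",
        base_task_id=train_task_id,
        parameter_override=pipeline_override("General", "learning_rate"),
    )
    return pipe


override = pipeline_override("General", "learning_rate")
```

With this in place, changing `learning_rate` on a cloned pipeline is what reaches the train step.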
I guess we should have obfuscated the name better
Hmmm, are you running inside pycharm, or similar ?
can someone show me an example of how
PipelineController.create_draft
I think the idea is to store a draft version of the pipeline (not the decorator type, I think, but the one launching pre-executed Tasks).
GiganticTurtle0 I'm not sure I fully understand how / why you are using it, can you expand?
EDIT:
However, my intention is ONLY to create it to be executed later on.
Hmm, so maybe like enqueue it?
@<1545216077846286336:profile|DistraughtSquirrel81> shoot an email to "support@clear.ml" and provide all the information you can on the "lost account" (i.e. the one you had the data on), this means email account that created it (or your colleagues emails), and any other information that might help to locate it.
Just one more question, do you have any idea about how I could change the x-axis label from "Iterations" to "Epochs"
You mean in the UI (i.e. just the title) ? or are you actually reporting iterations instead of epochs? and if so is this auto connected to tensorboard or is it reported manually ?
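If the scalars are reported manually, one option is to pass the epoch index as the `iteration` value, so each point on the x-axis corresponds to an epoch. This is only a sketch: it does not rename the axis title in the UI, and `RecordingLogger` is a stand-in used here instead of a real clearml `Logger`:

```python
def report_per_epoch(logger, losses):
    """Report one scalar point per epoch, using the epoch index as the x value."""
    for epoch, loss in enumerate(losses):
        logger.report_scalar(title="loss", series="train", value=loss, iteration=epoch)


class RecordingLogger:
    """Stand-in for a clearml Logger; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def report_scalar(self, title, series, value, iteration):
        self.calls.append((title, series, value, iteration))


fake = RecordingLogger()
report_per_epoch(fake, [0.9, 0.5, 0.3])
```

With a real Task you would pass `task.get_logger()` instead of the stand-in.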
Hi RattySeagull0
I'm trying to execute trains-agent in docker mode with conda as package manager, is it supported?
It should, that said we really do not recommend using conda as package manager (it is a lot slower than pip, and can create an environment that will be very hard to reproduce due to internal "compatibility matrix" of conda, that might be changing from one conda version to another)
"trains_agent: ERROR: ERROR: package manager "conda" selected, but 'conda' executable...
Do you think it can be fixed somehow? It would be the easiest way to launch new experiments with a different configuration
Let me check, it might be it.
It would be the easiest way to launch new experiments with a different configuration
Definitely
give me a minute to test
Hi @<1635088270469632000:profile|LividReindeer58>
You mean the clearml.conf?
You can do:
from clearml.config import config_obj
you should have the entire configuration file as an object (dict interface)
fyi: under the hood it uses pyHOCON
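A small sketch of reading a value from it. The key names below are only examples, and `read_setting` is a hypothetical helper; the plain dict is a stand-in that treats the dotted key as one flat key, whereas the real object resolves dotted paths through nested sections:

```python
def load_clearml_config():
    """Return the parsed clearml.conf object (needs clearml installed)."""
    from clearml.config import config_obj  # deferred import

    return config_obj


def read_setting(cfg, key, default=None):
    """Read one value through the dict-style interface, with a fallback."""
    return cfg.get(key, default)


# Stand-in config for illustration (not a real clearml.conf).
sample_cfg = {"api.web_server": "http://localhost:8080"}
web_server = read_setting(sample_cfg, "api.web_server")
missing = read_setting(sample_cfg, "sdk.no_such_key", "fallback")
```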
Hi LazyLeopard18 ,
So long story short, yes it does.
Longer version, to really accomplish full federated learning with control over data at "compute points" you need some data abstraction layer. Without a data abstraction layer, federated learning is just averaging derivatives from different locations, this can be easily done with any distributed learning framework, such as horovod or pytorch distributed or TF distributed.
If what you are after is, can I launch multiple experiments with the sam...
@<1539780258050347008:profile|CheerfulKoala77> make sure the AMI id matches the zone of the EC2 machine
/opt/clearml/data/fileserver
this is on the host machine, and it is mounted into the container at /mnt/fileserver
Hi @<1523707131994312704:profile|CrabbyKoala94>
I wanted to use method Task.reset() or Task.delete() however none of that seems to be able to delete
only
the logs in the "console" section in the UI.
So Task.reset
will reset the entire outputs of the Task (and the status), as you noticed. Why would you want to just remove the logs?
You can disable the auto logs altogether if you really want to, see Task.init [auto_connect_streams](https://github.com/allegroai/cl...
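Something along these lines, as a sketch: `auto_connect_streams` accepts either a bool or a dict with `stdout`, `stderr` and `logging` keys. The helper names are made up, and the project/task names are whatever you already use:

```python
def console_capture_kwargs(stdout=False, stderr=False, logging=False):
    """Build the Task.init keyword that controls which streams are captured."""
    return {
        "auto_connect_streams": {"stdout": stdout, "stderr": stderr, "logging": logging}
    }


def init_quiet_task(project_name, task_name):
    """Start a Task that does not capture console output (needs a configured SDK)."""
    from clearml import Task  # deferred import

    return Task.init(
        project_name=project_name, task_name=task_name, **console_capture_kwargs()
    )


kwargs = console_capture_kwargs()
```

Passing `auto_connect_streams=False` directly should have the same effect as turning all three off.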
Hi DeliciousBluewhale87
I think you are correct, there is no way to pass it.
As TimelyPenguin76 mentioned you can either set a default output_uri on the agent's config file, or edit the created Task in the UI.
What is the specific use case ? Maybe we should add this ability, wdyt?
Does adding external files not upload them to the dataset output_uri?
@<1523704667563888640:profile|CooperativeOtter46> If you are adding the links with add_external_files
these files are Not re-uploaded
And you pass:
scheduler.add_task(..., reuse_task=True)
?