Hi @<1523701868901961728:profile|ReassuredTiger98>
Could you send the full log? Also, what's the clearml-agent version?
I am actually saving a dictionary that contains the model as a value (+ training datasets)
How are you specifically doing that? pickle?
@<1538330703932952576:profile|ThickSeaurchin47> can you try the artifacts example:
None
and in this line do:
task = Task.init(project_name='examples', task_name='Artifacts example', output_uri="
")
Yes, I find myself trying to select "points" on the overview tab. And I find myself wanting to see more interesting info in the tooltip.
Yep that's a very good point.
The Overview panel would be extremely well suited for the task of selecting a number of projects for comparing them.
So what you're saying is that this could be a way to multi-select experiments for detailed comparison (i.e. selecting the "dots" on the overview graph)? Is this what you had in mind?
Hi GiganticTurtle0
dataset_task = Task.get_task(task_id=dataset.id)
Hmm, I think that when you retrieve the Task, its "output_uri" is not updated from the predefined Task (you can obviously set it again).
This seems like a bug that is unrelated to Datasets.
Basically any Task that you retrieve will default to the default output_uri (not the stored one).
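Until this is fixed, a minimal workaround sketch (the bucket path is just an example, replace with your own destination):

from clearml import Task

dataset_task = Task.get_task(task_id=dataset.id)
# the retrieved Task falls back to the default output_uri, so set it again explicitly
dataset_task.output_uri = "s3://my-bucket/models"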
The first pipeline step is calling init
GiddyPeacock64 Is this enough to track all the steps?
I guess my main question is: is every step in the pipeline an actual Task/Job, or is it a single small function?
Kubeflow is great for simple DAGs but when you need to build more complex logic it is usually a bit limited
(for example the visibility into what's going on inside each step is missing so you cannot make a decision based on that).
WDYT?
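To give a concrete feel for it, here is a rough PipelineController sketch (project/task names are made up); every step references an existing Task and runs as its own Task/Job, so you get full visibility into each one:

from clearml import PipelineController  # available in recent clearml versions

pipe = PipelineController(name='my pipeline', project='examples', version='1.0')
# each step is an actual Task the agent executes, not just a small function
pipe.add_step(name='preprocess', base_task_project='examples', base_task_name='preprocess data')
pipe.add_step(name='train', parents=['preprocess'],
              base_task_project='examples', base_task_name='train model')
pipe.start()  # by default the controller itself is enqueued on the "services" queue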
Hi PanickyMoth78
dataset name is ignored if
use_current_task=True
Kind of, it stores the Dataset on the Task itself (then dataset.name becomes the Task name). Actually we should probably deprecate this feature, I think it is too confusing?!
What was the use case for using it?
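For reference, this is roughly how it behaves (sketch only, project/folder names are placeholders):

from clearml import Task, Dataset

task = Task.init(project_name='examples', task_name='create dataset')
# with use_current_task=True the Dataset lives on this Task,
# so the dataset name/project follow the Task (the values below are effectively ignored)
dataset = Dataset.create(dataset_project='examples', dataset_name='ignored here',
                         use_current_task=True)
dataset.add_files('./data')  # placeholder local folder
dataset.upload()
dataset.finalize()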
Hi ColossalDeer61,
My question is about existing monitors in the trains-server (preferably the web UI)
So the idea is you run the code once; it creates a Task in the system and verifies the Slack credentials are working. Then you can enqueue it in the "services" queue, and voila, you have a monitoring service running that you can control from the UI and that creates alerts to Slack. Unfortunately there is no built-in way to achieve that in the UI, but it should not take more than a few minute...
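Roughly, the skeleton would look like this (sketch only; project/queue names are the usual defaults and the Slack posting itself is omitted):

from clearml import Task

# run this once locally; it registers the Task and lets you verify your Slack credentials
task = Task.init(project_name='DevOps', task_name='Slack alerts',
                 task_type=Task.TaskTypes.monitor)
# ... post a test message to Slack here to verify the token ...
# then hand it over to the agent listening on the "services" queue
task.execute_remotely(queue_name='services', exit_process=True)
# the monitoring loop below this line runs on the services agent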
but DS in order for models to be uploaded, you still have to set: output_uri=True in the
No, if you set the default_output_uri, there is no need to pass output_uri=True in the Task.init() 🙂
It basically sets it for you, makes sense?
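i.e. in clearml.conf on the machine running the training code (the bucket path is just an example):

sdk {
    development {
        # all model/artifact uploads default to this destination,
        # so Task.init() no longer needs output_uri=True
        default_output_uri: "s3://my-bucket/clearml"
    }
}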
Hi @<1724960475575226368:profile|GloriousKoala29>
Is there a way to aggregate the results, such as defining an iteration as the accuracy of 100 samples
Hmm, I'm assuming what you actually want is to store it with the actual input/output and a score, is that correct?
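If that is the case, one way is to report the aggregated score yourself, treating every block of 100 samples as one iteration. Untested sketch with placeholder results:

import random
from clearml import Task

task = Task.init(project_name='examples', task_name='windowed accuracy')
logger = task.get_logger()

results = [random.randint(0, 1) for _ in range(1000)]  # placeholder: 1 = correct prediction

window = 100
for i in range(0, len(results), window):
    block = results[i:i + window]
    # each 100-sample block becomes a single "iteration" on the scalar graph
    logger.report_scalar(title='accuracy', series='per-100-samples',
                         value=sum(block) / len(block), iteration=i // window)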
LOL 🙂
Make sure that when you train the model or create it manually you set the default "output_uri"
task = Task.init(..., output_uri=True)
or
task = Task.init(..., output_uri="s3://...")
Yey 🙂!
So now you can add some logic based on the model object passed as the second argument (see WeightsFileHandler.ModelInfo)
The easiest is based on the model name, see model.local_model_path
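Something like this (a sketch; I'm assuming the pre-callback interface here, double check the signature in your clearml version):

from clearml import Task
from clearml.binding.frameworks import WeightsFileHandler

def skip_debug_models(operation_type, model_info):
    # model_info is a WeightsFileHandler.ModelInfo; returning None skips this model
    if 'debug' in (model_info.local_model_path or ''):
        return None
    return model_info

WeightsFileHandler.add_pre_callback(skip_debug_models)
task = Task.init(project_name='examples', task_name='filtered model upload')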
Okay. And 110 means 11.1 and not 11.0?
110 means 11.0. The odd thing is, it actually installed 11.1, and from the PyTorch website this is exactly how they suggest installing with conda...
Let me know if forcing the CUDA version changes anything
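To force it, I believe you can pin the version in clearml.conf on the agent machine (treat this as an assumption, double check the key in your config):

agent {
    # force the CUDA version the agent uses when resolving the PyTorch wheel,
    # e.g. 110 -> 11.0, 111 -> 11.1
    cuda_version: 111
}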
DepressedChimpanzee34
I might have an idea: based on the log, you are getting LazyCompletionHelp instead of str.
Could it be you installed hydra bash completion?
https://github.com/facebookresearch/hydra/blob/3f74e8fced2ae62f2098b701e7fdabc1eed3cbb6/hydra/_internal/utils.py#L483
E.g. I might need to have different N-numbers for the local and remote (ClearML) storage.
Hmm yes, that makes sense
That'd be a great solution, thanks! I'll create a PR shortly
Thank you! 🙂 🤩
so does the container install anything,
The way the agent works with dockers:
1. Spin up the docker
2. Install the base stuff inside the docker (like git, and make sure it has python etc.)
3. Create a new venv inside the docker, inheriting everything from the docker's system-wide python packages; this means that if you have the "object_detection" package installed, it will be available inside the new venv.
4. Install the specific python packages your Task requires (inside the venv). This allows you to over...
Hi GreasyPenguin14
It looks like you are trying to delete a Task that does not exist
Any chance the cleanup service is misconfigured (i.e. accessing the incorrect server)?
I think the main difference is that I can see a value of having access to the raw format within the cloud vendor and not only have it as an archive
I see, it does make sense.
Two options: one, as you mentioned, use the ClearML StorageManager to upload the files, then register them as external links with Dataset.
Two, I know the enterprise tier has HyperDatasets, which are essentially what you describe, with version control over the "metadata" and "raw storage" on GCP, including the ab...
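For the first option, a rough sketch (the bucket and file names are made up):

from clearml import Dataset, StorageManager

# upload the raw file to your own GCS bucket (or point at one that is already there)
remote_url = StorageManager.upload_file(
    local_file='data/sample_0001.png',
    remote_url='gs://my-bucket/raw/sample_0001.png',
)

dataset = Dataset.create(dataset_project='examples', dataset_name='external raw data')
# register the file by reference; the dataset stores the link, not a copy of the file
dataset.add_external_files(source_url=remote_url)
dataset.upload()
dataset.finalize()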
If this is GitHub/GitLab/Bitbucket what I'm thinking is just a link opening an iframe / tab with the exact entry point script / commit.
What do you think?
Because we are working with very big files, having them stored at multiple locations is something we try to avoid
Just so I better understand, is this for storing files as part of a dataset, or as debug samples?
In other words, can two different processes create the exact same file (image)?
WackyRabbit7 basically starting v1.1 if you are running code without any configuration file, you will get an error (in contrast to previous versions where it defaulted to the demo-server)
but I'm pretty confident it was the size of the machine that caused it (as I mentioned, it was a 1 CPU / 1.5 GB RAM machine)
I have the feeling you are right 🙂
Hi OutrageousGrasshopper93
I think that what you are looking for is Task.import_task and Task.export_task
https://allegro.ai/docs/task.html#trains.task.Task.import_task
https://allegro.ai/docs/task.html#trains.task.Task.export_task
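Roughly like this (sketch; the project/task names are placeholders, and on older versions the import is from trains instead of clearml):

from clearml import Task

source_task = Task.get_task(project_name='examples', task_name='trained model')
exported = source_task.export_task()  # full task configuration as a portable dict

# re-create the task (e.g. on another server/workspace) from the exported data
copied_task = Task.import_task(exported)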
I set up the alert rule on this metric by defining a threshold to trigger the alert. Did I understand correctly?
Yes exactly!
Or the new metric should...
basically combining the two, yes looks good.
My typos are killing us, apologies:
change -t to -it, it will make it interactive (i.e. you can use bash 🙂)