
How would you compare those to ClearML?
KindChimpanzee37 , this time I was away for a week 🙂 . I do not think that I made the mistake you suggested. At the top of the script I wrote `project_name = 'RL/Urbansounds'`
and then later
```
self.original_dataset = Dataset.get(dataset_project=project_name, dataset_name='UrbanSounds example')
# This will return the pandas dataframe we added in the previous task
self.metadata = Task.get_task(task_id=self.original_dataset.id).artifacts['metadata'].get()
```
Last point on component caching: what I suggest is actually providing users the ability to control the cache "function". Right now (a bit simplified, but probably accurate), this is equivalent to hashing the following dict:
`{"code": "code here", "container": "docker image", "container args": "docker args", "hyper-parameters": "key/value"}`
We could allow users to add a function that gets this dict and returns a new dict that will be used for hashing, as sketched below. This way we will e...
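For illustration, a minimal sketch of that idea (the hook name `cache_repr_fn` and the hashing details are made up here, not an existing ClearML API):
```
import hashlib
import json

def my_cache_repr(task_repr: dict) -> dict:
    # Hypothetical user hook: drop the container args from the cache key,
    # so runs that differ only in docker args still hit the cache.
    filtered = dict(task_repr)
    filtered.pop("container args", None)
    return filtered

def cache_hash(task_repr: dict, cache_repr_fn=my_cache_repr) -> str:
    # Hash a canonical JSON encoding of the user-transformed dict.
    canonical = json.dumps(cache_repr_fn(task_repr), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = cache_hash({
    "code": "code here",
    "container": "docker image",
    "container args": "docker args",
    "hyper-parameters": {"lr": 0.01},
})
```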
CostlyOstrich36 any ideas?
KindChimpanzee37 : First I went to the dataset and clicked on "Task information ->" in the bottom right corner of the "VERSION INFO". I suppose that is the same as what you meant by "right click on more information"? Because I did not find any option to "right click on more information". The "Task information ->" leads me to a view in the experiment manager. I posted the two screenshots.
PS: It is weird to me that the data manager leads me to the experiment manager, specifically an experi...
@<1523701083040387072:profile|UnevenDolphin73> : No, I love it ❤ . Now, I just have to read everything 😄 .
@<1523701083040387072:profile|UnevenDolphin73> : A big point for me is to reuse/cache those artifacts/datasets/models that need to be passed between the steps, but have been produced by colleagues' executions at some earlier point. So for example, let the pipeline be A(a) -> B(b) -> C(c), where A, B, C are the steps and their code, excluding configurations/parameters, and a, b, c are the configurations/parameters. Then I might have the situation that my colleague ran the pipeline A(a) -> B(b) -> C(c...
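For illustration, a minimal sketch of that A -> B -> C shape using ClearML pipeline components with caching enabled (names and configuration values are made up; `cache=True` is what I understand enables reusing a step that already ran with the same code and inputs):
```
from clearml import PipelineDecorator

# Each step is cached: if a colleague already executed step_a with the
# same code and the same configuration `a`, the stored result is reused.
@PipelineDecorator.component(cache=True, return_values=["out_a"])
def step_a(a):
    return {"config_a": a}

@PipelineDecorator.component(cache=True, return_values=["out_b"])
def step_b(out_a, b):
    return {**out_a, "config_b": b}

@PipelineDecorator.component(cache=True, return_values=["out_c"])
def step_c(out_b, c):
    return {**out_b, "config_c": c}

@PipelineDecorator.pipeline(name="abc_pipeline", project="examples", version="0.1")
def run_pipeline(a, b, c):
    out_a = step_a(a)
    out_b = step_b(out_a, b)
    return step_c(out_b, c)

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    run_pipeline(a=1, b=2, c=3)
```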
KindChimpanzee37 , any idea 🙂 ?
@<1523701205467926528:profile|AgitatedDove14> : In general: If I do not build a package out of my local repository/project, I cannot reference anything from the local project/repository directly, right? I must make a package out of it, or I must reference it with the `repo` argument, or I must reference the respective functions using the `helper_functions` argument. Did I get this right?
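To make that concrete, a minimal sketch of the `helper_functions` route as I understand it (the function names are made up; `helper_functions` packs local helpers into the step's standalone script):
```
from clearml import PipelineDecorator

def normalize(x):
    # Local helper from my project; the step cannot import it on its own,
    # because each step runs as a standalone script on the agent.
    return x / 255.0

# helper_functions embeds `normalize` into the generated step script.
@PipelineDecorator.component(return_values=["out"], helper_functions=[normalize])
def preprocess(x):
    return normalize(x)
```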
Ah... if I run the same script not from PyCharm, but from the terminal, then it gets completed... puh...
My entire code is
```
from clearml import Task, TaskTypes

task = Task.init(project_name='FirstTrial', task_name='first_trial', task_type=TaskTypes.training)

PACKAGE_VERSION = '0.4.1'
dataset_name = "Demodata"
```
which I - now - also ran as a whole script.
I just see the website that I linked to. I am not sure what is meant by "python environment". I cannot make a screenshot, because I do not know where to look for this in the first place.
```
> pip show clearml
WARNING: Ignoring invalid distribution -upyterlab (c:\users\...\lib\site-packages)
WARNING: Ignoring invalid distribution -illow (c:\users\...\lib\site-packages)
Name: clearml
Version: 1.6.4
Summary: ClearML - Auto-Magical Experiment Manager, Version Control, and MLOps for AI
Home-page: None
Auth...
```
It means "The syntax for the file name, folder name or volume label / disk is wrong" somthing along those lines. The [...] is the directory path to my project, which I opened in PyCharm and from which I run the commands in the Python Console.
But still, in the web app the task is considered to be still "running". I am not sure what to do, so that the task is considered to be "completed".
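For reference, this is the kind of call I would expect to need, assuming I still have the `task` object from `Task.init` (just a guess on my side):
```
task.close()           # flush loggers/artifacts and close the current task
# or, to force the status directly:
task.mark_completed()
```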
CostlyOstrich36 sure:
```
[..]\urbansounds8k\venv\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available.
  warnings.warn("No audio backend is available.")
ClearML Task: overwriting (reusing) task id=[..]
2022-09-14 14:40:16,484 - clearml.Task - INFO - No repository found, storing script code instead
ClearML results page:
Traceback (most recent call last):
  File "[..]\urbansounds8k\preprocessing.py", line 145, in <module>
    datasetbuilder = DataSe...
```
GrittyStarfish67 : In terms of "has a good name", do you literally mean the name, or do you mean they have a good reputation 😄 ?
AgitatedDove14 : Not sure: They also have the feature store (data management), as mentioned, which is pretty MLOps-y 🙂 . Also, they do have workflows ( https://docs.mlrun.org/en/latest/concepts/multi-stage-workflows.html ) and artifacts/model management ( https://docs.mlrun.org/en/latest/store/artifacts.html ) and serving ( https://docs.mlrun.org/en/latest/serving/serving-graph.html ).
I am running it in the Python Console in PyCharm with Task.init. I get the following log:
```
ClearML Task: overwriting (reusing) task id=dfa2dff538d54c18ad97ea1593cbd357
2023-02-14 13:06:44,336 - clearml.Task - WARNING - Failed auto-detecting task repository: [WinError 123] Die Syntax für den Dateinamen, Verzeichnisnamen oder die Datenträgerbezeichnung ist falsch: '[...]\<input>'
ClearML results page: https://app.clear.ml/projects/9acc061c880344a881790461a4baa837/experiments/dfa2dff538d54c1...
```
KindChimpanzee37 : Ok, will do. (More questions from my side, though. :-D) But I need to have a pretty good idea before presenting our concept to the bosses.
@<1523701070390366208:profile|CostlyOstrich36> I read those, but did not understand.
@<1523701070390366208:profile|CostlyOstrich36> : After more playing around, it seems that the ClearML Server does not store the models or artifacts itself. These are stored somewhere else (e.g., in an AWS S3 bucket) or on my local machine, and the ClearML Server only stores configuration parameters and previews (e.g., when the artifact is a pandas dataframe). Is that right? Is there a way to save the models completely on the ClearML server?
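For context, a minimal sketch of what I mean by the storage destination, assuming the `output_uri` argument of `Task.init` is what controls where models/artifacts get uploaded:
```
from clearml import Task

# output_uri=True should upload model snapshots/artifacts to the ClearML
# file server instead of leaving them on the local machine; a URI such as
# 's3://my-bucket/models' would redirect them to that bucket instead.
task = Task.init(
    project_name='FirstTrial',
    task_name='storage_check',
    output_uri=True,
)
```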
KindChimpanzee37 : Thank you so much! I asked follow up questions 🙂 .
Also, I could not find any larger examples on github about Model, InputModel, or OutputModel. It's kind of difficult to build a PoC this way... 😅
As far as I understand, the workflow is like this: I define some model. Then I register it as an OutputModel. Then I train it. During training I save snapshots (no idea how, though) and then I save the final model when training is finished. This way the model is a) connected to the task and b) available in the model store of ClearML.
Later, in a different task, I can load an already trained model with InputModel. This InputModel is read-only (regarding the ClearML model store), but I can ma...
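To make my understanding concrete, a minimal sketch of that workflow as I picture it (parameters and file names are made up; I am not sure this is the intended use of the API):
```
from clearml import Task, InputModel, OutputModel

task = Task.init(project_name='FirstTrial', task_name='model_poc')

# Register an output model that is connected to this task.
output_model = OutputModel(task=task, framework='PyTorch')

# ... training loop; register a snapshot by pointing at the weights file.
output_model.update_weights(weights_filename='model_epoch_10.pt')

# Later, in a different task: load the trained model (read-only).
input_model = InputModel(model_id=output_model.id)
local_weights = input_model.get_weights()  # downloads a local copy
```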
@<1523701087100473344:profile|SuccessfulKoala55> : I referenced this conversation in the issue None