My entire code is
from clearml import Task, TaskTypes
task = Task.init(project_name='FirstTrial', task_name='first_trial', task_type=TaskTypes.training)
PACKAGE_VERSION = '0.4.1'
dataset_name = "Demodata"
which I have now also run as a whole script.
I just see the website that I linked to. I am not sure what is meant by "python environment". I cannot take a screenshot, because I do not know where to look for this in the first place.
As far as I understand, the workflow is like this: I define some model. Then I register it as an OutputModel. Then I train it. During training I save snapshots (no idea how, though) and then I save the final model when training is finished. This way the model is a) connected to the task and b) available in the model store of ClearML.
Later, in a different task, I can load an already trained model with InputModel. This InputModel is read-only (regarding the ClearML model store), but I can ma...
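The workflow described in the last two messages could be sketched roughly like this. This is a hedged sketch, not the original code: the project/model names and the weights file are illustrative, and the imports are kept inside the functions so the sketch can be read without a configured ClearML server.

```python
# Hedged sketch of the OutputModel/InputModel workflow described above.
# Project, task, and file names are illustrative assumptions.
def train_and_register(weights_path='my_model.pt'):
    # Lazy import so the sketch is importable without a ClearML server.
    from clearml import Task, TaskTypes, OutputModel

    task = Task.init(project_name='FirstTrial', task_name='first_trial',
                     task_type=TaskTypes.training)
    # Registering the model connects it to the task.
    model = OutputModel(task=task, name='demo-model')
    for epoch in range(3):
        # ... one training epoch, then persist a snapshot ...
        model.update_weights(weights_filename=weights_path)
    return model.id  # the model is now in the ClearML model store


def load_trained(model_id):
    from clearml import InputModel

    # InputModel is read-only with respect to the ClearML model store.
    model = InputModel(model_id=model_id)
    return model.get_local_copy()  # local path to the downloaded weights
```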
@<1523701070390366208:profile|CostlyOstrich36> I read those, but did not understand.
@<1523701083040387072:profile|UnevenDolphin73> : From which URL is your most recent screenshot?
CostlyOstrich36 sure:
`[..]\urbansounds8k\venv\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available.
warnings.warn("No audio backend is available.")
ClearML Task: overwriting (reusing) task id=[..]
2022-09-14 14:40:16,484 - clearml.Task - INFO - No repository found, storing script code instead
ClearML results page:

Traceback (most recent call last):
File "[..]\urbansounds8k\preprocessing.py", line 145, in <module>
datasetbuilder = DataSe... `
KindChimpanzee37 , any idea 🙂 ?
CostlyOstrich36 any ideas?
KindChimpanzee37: First, I went to the dataset and clicked on "Task information ->" in the bottom right corner of the "VERSION INFO". I suppose that is the same as what you meant by "right click on more information"? I ask because I did not find any option to "right click on more information". The "Task information ->" leads me to a view in the experiment manager. I posted the two screenshots.
PS: It is weird to me that the datamanager leads me to the experiment manager, specifically an experi...
KindChimpanzee37, this time I was away for a week 🙂 . I do not think that I made the mistake you suggested. At the top of the script I wrote
project_name = 'RL/Urbansounds'
and then later
` self.original_dataset = Dataset.get(dataset_project=project_name, dataset_name='UrbanSounds example')
# This will return the pandas dataframe we added in the previous task
self.metadata = Task.get_task(task_id=self.original_dataset.id).artifacts['metadata'].get() `
AgitatedDove14 : Not sure: They also have the feature store (data management), as mentioned, which is pretty MLOps-y 🙂 . Also, they do have workflows ( https://docs.mlrun.org/en/latest/concepts/multi-stage-workflows.html ) and artifacts/model management ( https://docs.mlrun.org/en/latest/store/artifacts.html ) and serving ( https://docs.mlrun.org/en/latest/serving/serving-graph.html ).
The first scenario is your standard "the code stays the same, the configuration changes" for the second step. Here, I want
The second and third scenarios are "the configuration stays the same, the code changes"; this is the case, e.g., if code is refactored but effectively does the same as before.
@<1523701083040387072:profile|UnevenDolphin73> , you wrote
About the third scenario I'm not sure. If the configuration has changed, shouldn't the relevant steps (the ones where the configuration...
"using your method you may not reach the best set of hyperparameters."
Of course you are right. It is an efficiency trade-off of speed vs effectiveness. Whether this is worth it or not depends on the use-case. Here it is worth it, because the performance of the modelling is not sensitive to the parameter we search for first. Being in the ball-park is enough. And, for the second set of parameters, we need to do a full grid search (the parameters are booleans and strings); thus, this wo...
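The two-stage trade-off described above can be sketched in plain Python. This is an illustrative sketch only: the objective function and parameter names are made up, and a real run would plug in the actual training objective.

```python
from itertools import product


def two_stage_search(score, lr_candidates, flag_options, mode_options):
    """Coarse search on the insensitive parameter first, then a full grid
    over the boolean/string parameters (names here are illustrative)."""
    # Stage 1: a ball-park value for the first parameter is enough,
    # so the other parameters are fixed to defaults.
    best_lr = max(lr_candidates,
                  key=lambda lr: score(lr, flag_options[0], mode_options[0]))
    # Stage 2: exhaustive grid over the remaining (boolean/string) parameters.
    best_flag, best_mode = max(
        product(flag_options, mode_options),
        key=lambda cfg: score(best_lr, *cfg))
    return best_lr, best_flag, best_mode


# Toy objective: insensitive enough to lr that a ball-park value suffices.
toy = lambda lr, flag, mode: -(lr - 0.1) ** 2 + flag + (2 if mode == 'b' else 0)
# two_stage_search(toy, [0.01, 0.1, 1.0], [False, True], ['a', 'b'])
# → (0.1, True, 'b')
```

This is cheaper than a full grid over all parameters at the cost of possibly missing interactions between the stages, which is exactly the trade-off described above.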
KindChimpanzee37, I ensured that the dataset_name is the same in get_data.py and preprocessing.py, and that seemed to help. Then I got the error `RuntimeError: No audio I/O backend is available.`, because of which I installed PySoundFile with pip; that helped. Weirdly enough, the old id error then came back. So I re-ran get_data.py and then preprocessing.py; this time the id error was gone again. Instead, I got `raise TypeError("Invalid file: {0!r}".format(self.name))
TypeError:...
@<1537605940121964544:profile|EnthusiasticShrimp49> : The biggest advantage I see in splitting your code into pipeline components is caching. Structuring your code a bit is another, but I was told by the staff that this should not be one's main aim with ClearML components. What is your main takeaway for splitting your code into components?
My HPO on top of the pipeline is already working 🙂 I am currently experimenting with using the HPO in a (other) pipeline that creates two HPO steps (from the same funct...
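The caching advantage mentioned above might look roughly like this. This is a hedged sketch with made-up component and project names; the import is kept inside the function so the sketch can be read without a configured ClearML server.

```python
# Hedged sketch of a cached pipeline component. With cache=True, re-running
# the pipeline with unchanged code and arguments reuses the stored result
# instead of re-executing the step. All names here are illustrative.
def build_pipeline():
    # Lazy import so the sketch is importable without a ClearML server.
    from clearml.automation.controller import PipelineDecorator

    @PipelineDecorator.component(return_values=['prepared'], cache=True)
    def preprocess(dataset_name):
        # ... load the dataset and return the prepared data ...
        return dataset_name

    @PipelineDecorator.pipeline(name='demo-pipeline', project='FirstTrial')
    def run(dataset_name='Demodata'):
        return preprocess(dataset_name)

    return run
```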
@<1523701083040387072:profile|UnevenDolphin73>
@<1523701083040387072:profile|UnevenDolphin73> : If I do, what should I configure how?
I expect either 'var1' to be 'b' or, better, there to be a log of the change, so that I would be able to see how the value changed over time.
Thank you, I found the error.
`myPar = task.connect(myPar, name='from TaskParameters')`
is required.
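For context, a minimal sketch of why the re-assignment matters: `task.connect()` returns the (possibly remotely overridden) parameters, so dropping the return value means UI or remote overrides are silently ignored. The dict contents and names below are illustrative, not the original code, and the import is lazy so the sketch can be read without a ClearML server.

```python
# Hedged sketch of the fix described above; parameter names are assumptions.
def connect_params():
    # Lazy import so the sketch is importable without a ClearML server.
    from clearml import Task

    task = Task.init(project_name='FirstTrial', task_name='connect_demo')
    myPar = {'var1': 'a'}
    # The re-assignment is the crucial part: connect() returns the
    # parameters after any remote/UI overrides have been applied.
    myPar = task.connect(myPar, name='from TaskParameters')
    return myPar
```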
Do you mean "exactly" as in "you finally got it" or in the sense of "yes, that was easy to miss"?
@<1523701083040387072:profile|UnevenDolphin73> : Thanks, but it does not mention the File Storage of "ClearML Hosted Server".