I see, thanks for the explanation. So basically it is initializing the weights from the saved ones instead of random ones, if I understand correctly? So in this case, to prevent this behavior and start from random weights, I should just change the name of the output file with the weights, right?
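A minimal sketch of that idea, assuming the training code picks up whatever weights file already exists under the output name. The helper unique_weights_path, the "models" directory and the .pt extension are hypothetical and not part of ClearML or the original code.

```python
import uuid
from pathlib import Path


def unique_weights_path(output_dir, run_name, suffix=".pt"):
    """Return a weights path that differs on every call.

    Hypothetical helper: a short random tag is appended to the run name so
    a new run never points at a weights file written by an earlier run.
    """
    tag = uuid.uuid4().hex[:8]
    return Path(output_dir) / f"{run_name}-{tag}{suffix}"


# Two runs with the same name still get different output files.
first = unique_weights_path("models", "task1")
second = unique_weights_path("models", "task1")
print(first)
print(second)
```

A timestamp would work as the tag too; the random tag just avoids clashes when two runs start in the same second.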
From my end I can confirm that if I use tqdm to print the progress of training epochs (like for epoch in tqdm(range(num_epochs)): <training code>), then each tqdm update gets printed as well, as seen in the attached image. @<1523702932069945344:profile|CheerfulGorilla72> maybe just reduce the number of updates to be less frequent - I think the tqdm parameter miniters allows you to do that.
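A small sketch of the miniters suggestion, using a dummy loop in place of the training code. The helper run_epochs and the numbers are made up for illustration; miniters, mininterval and file are regular tqdm parameters. Output is written to an in-memory buffer only so the two settings can be compared, and how the ClearML console handles each update is not covered here.

```python
import io

from tqdm import tqdm


def run_epochs(num_epochs, miniters):
    """Run a dummy loop and return everything tqdm wrote for it."""
    buffer = io.StringIO()
    # mininterval=0 removes the time-based throttle, so only miniters
    # decides how often the bar is refreshed in this comparison.
    for epoch in tqdm(range(num_epochs), miniters=miniters,
                      mininterval=0, file=buffer):
        pass  # <training code>
    return buffer.getvalue()


dense = run_epochs(100, miniters=1)    # refresh on every iteration
sparse = run_epochs(100, miniters=25)  # refresh every 25 iterations
print(len(dense), len(sparse))
```

The second run writes far less progress text, which is the effect being suggested for the task console.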
Thank you John for confirmation. I am getting clearml.storage.helper.StorageError: Can't find azure configuration for azure://...
I did put the storage account name and key in the clearml.conf file as indicated in the documentation, and the URI seems to be correct. Do you have any idea what I might be missing?
It seems like today the issue is resolved - the same code takes half a minute.
Hm, but why does it happen if my new model has different hyperparameters and is a separate experiment with a separate name? How can I avoid this behavior?
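from clearml import Dataset

# Keyword arguments avoid relying on positional order
# (the first positional parameter of Dataset.get is dataset_id)
dataset = Dataset.get(
    dataset_project=dataset_project,
    dataset_name=dataset_name,
    dataset_version=dataset_version)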
It is just that - the Dataset is stored in an Azure Storage Account.
I see, thanks for explaining
It has 24 parameters, but two of them are lists of features, so overall the YAML file has around 70 lines.
Just noticed that even though I put the correct credentials in place, the whole segment is commented out by default :face_palm: It works now, thanks 😁
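A quick way to catch this is to check whether the credential keys sit on non-comment lines. The sketch below is only an illustration: the helper active_keys is hypothetical, and the sample text is an assumed layout with placeholder values, not a copy of a real clearml.conf.

```python
import re

# Illustrative text only: layout and values are assumptions.
COMMENTED = """
azure.storage {
    # containers: [
    #     {
    #         account_name: "my-account"
    #         account_key: "secret"
    #     }
    # ]
}
"""

# Same text with the leading "# " markers removed.
UNCOMMENTED = COMMENTED.replace("# ", "")


def active_keys(conf_text, keys):
    """Return the subset of keys that appear on a non-comment line."""
    found = set()
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # this line is commented out, so it has no effect
        for key in keys:
            if re.match(rf"{re.escape(key)}\s*[:=]", stripped):
                found.add(key)
    return found


wanted = ["account_name", "account_key"]
print(active_keys(COMMENTED, wanted))
print(active_keys(UNCOMMENTED, wanted))
```

With the commented text no keys are reported as active; once the markers are removed, both are.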
It was just mentioning that I don't have access rights
Yes, but it is not a matter of version - no version of this package can be installed on Linux, this module is available only on Windows.
When I try to install all the packages from my requirements.txt manually on this Linux machine it seems fine, so I am not sure why pywin32 is added as a requirement when executing the ClearML task.
I was checking my internet speed yesterday and it seemed normal, anyway it seems fine now.
Yes, I can see the hyperparams are reflected correctly, but I mean it shouldn't start from the previous model checkpoint if it is a different run with different hyperparams.
But I don't specify pywin32 in my requirements, so I can't edit its version, and I think it is not just a matter of version but of installing it at all - I've read that you shouldn't install this library on non-Windows machines.
I came up with this minimal example - it is a bit different but the behavior is also not as expected I think:
import numpy as np
import plotly.express as px
from clearml import Task
task = Task.init(project_name='MyProject', task_name='task1',
                 task_type=Task.TaskTypes.training)
logger = task.get_logger()
y_pred = np.random.rand(100)
y_test = np.random.rand(100)
fig = px.line({'y_pred': y_pred, 'y_test': y_test})
logger.report_plotly(title=f'Forecast', series=f'Forecas...
Hi, I just have a YAML file where I keep all the configurations and hyperparams, and for every training run the first thing I do is call a function which initializes the ClearML task, connects the config, and returns it so that the rest of the code can use it.
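import yaml
from clearml import Task
def initialize_task(task_type=Task.TaskTypes.training):
    with open("params.yaml") as f:  # closes the file after loading
        config = yaml.safe_load(f)
    Task.ignore_requirements("pywin32")  # keep pywin32 out of the auto-generated requirements
    task = Task.init(project_name='X', task_name=config['TASK_NAME'],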
...
And previously I was able to retrieve it; I think the trouble started after I finalized the dataset.
I also checked the account_name and account_key in the clearml.conf file, and they seem fine.
Request payload:
{"tasks":["425335514d444646b1077cb8b738ccf7","3c0db63779e940cc80895af728aac5ab"],"model_events":false,"no_scroll":true,"iters":1}
ok, that makes sense, thanks a lot for the clarifications