Thanks! (Maybe this could be added to the docs?)
For me it is definitely reproducible, but the codebase is quite large and I cannot share it. The gist is the following:
```python
import matplotlib.pyplot as plt
import numpy as np
from clearml import Task
from tqdm import tqdm

task = Task.init("Debug memory leak", "reproduce")

def plot_data():
    fig, ax = plt.subplots(1, 1)
    t = np.arange(0., 5., 0.2)
    ax.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
    return fig

for i in tqdm(range(1000), total=1000):
    fig = plot_data()
    ...
```
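As a side note, a possible mitigation sketch (my assumption, not part of the original report): pyplot keeps a reference to every figure in its internal registry, so explicitly closing each figure after use keeps that registry from growing across iterations.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, just for this sketch
import matplotlib.pyplot as plt
import numpy as np

def plot_data():
    fig, ax = plt.subplots(1, 1)
    t = np.arange(0., 5., 0.2)
    ax.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
    return fig

for i in range(50):
    fig = plot_data()
    plt.close(fig)  # release the figure so pyplot drops its reference

# every figure was closed, so the registry stays empty
print(len(plt.get_fignums()))
```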
SuccessfulKoala55 Here is the trains-elastic error
```
# Set the python version to use when creating the virtual environment and launching the experiment
# Example values: "/usr/bin/python3" or "/usr/local/bin/python3.6"
# The default is the python executing the clearml_agent
python_binary: ""

# ignore any requested python version (Default: False, if a Task was using a
# specific python version and the system supports multiple python the agent will use the requested python version)
# ignore_requested_python_version: ...
```
Nice, the preview param will do. Btw, I love the new docs layout!
I understand, but then why is docker mode a CLI option if we always have to use it for things to work?
I am doing:

```python
try:
    score = get_score_for_task(subtask)
except Exception:
    score = pd.NA
finally:
    df_scores = df_scores.append(
        dict(task=subtask.id, score=score), ignore_index=True
    )
    task.upload_artifact("metric_summary", df_scores)
```
trains==0.16.4
I am not using Hydra, I am reading the conf with:

```python
config_dict = read_yaml(conf_yaml_path)
config = OmegaConf.create(task.connect_configuration(config_dict))
```
So the migration from one server to another + adding new accounts with password worked, thanks for your help!
no, at least not in clearml-server version 1.1.1-135 • 1.1.1 • 2.14
AgitatedDove14 I see that the default is `sample_frequency_per_sec=2.`, but in the UI I don't see that resolution (i.e. it logs every ~120 iterations, corresponding to ~30 secs). What is the difference with `report_frequency_sec=30.`?
To help you debug this: in the /dashboard endpoint, all projects were still there, but empty (no experiments inside). No experiments were archived either.
So it seems like it doesn't copy /root/clearml.conf and it doesn't pass the environment variables (CLEARML_API_HOST, CLEARML_API_ACCESS_KEY, CLEARML_API_SECRET_KEY)
Oh wow! Is it possible to not specify a remote task (if I am working with Task.set_offline(True))?
Looks like it's a hurray then!
So it could be that when restarting the docker-compose, it used another volume, hence the loss of data
nvm, the bug might be on my side. I will open an issue if I find an easily reproducible example
So either I specify `agent.python_binary: python3.8` in the clearml-agent config as you suggested, or I enforce the task locally to run with python3.8 using `task.data.script.binary`
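For reference, the first option would look something like this in the `agent` section of clearml.conf (a minimal sketch; the interpreter path is an assumed example, point it at wherever python3.8 lives on the worker):

```
agent {
    # Use this interpreter when creating the task's virtual environment
    # (example path; adjust to your system's python3.8 binary)
    python_binary: "/usr/bin/python3.8"
}
```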
These images are actually stored there and I can access them via the URL shared above (the one written in the pop-up message saying that these files could not be deleted)
CostlyOstrich36, this also happens with clearml-agent 1.1.1 on an AWS instance…
Alright SuccessfulKoala55 I was able to make it work by downgrading clearml-agent to 0.17.2
Ok, so it seems the single quote was the problem; using double quotes works
Thanks SuccessfulKoala55!
There's a reason for the ES index max size
Does ClearML enforce a max index size? what typically happens when that limit is reached?