report_scalar lets you manually report a scalar series; it is the dedicated function for that. There are other ways to report a scalar, for example through TensorBoard: in that case you report to TensorBoard, and ClearML will automatically pick up and report the values.
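For example, something like this (project and task names are just placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="scalar reporting")  # placeholder names
logger = task.get_logger()

for iteration in range(10):
    # "title" is the plot name, "series" is the curve inside that plot
    logger.report_scalar(title="loss", series="train", value=1.0 / (iteration + 1), iteration=iteration) `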
Hello DepravedSheep68 ,
In order to store your info in the S3 bucket you will need two things:
1. Specify the URI where you want to store your data when you initialize the task (see the output_uri parameter of Task.init: https://clear.ml/docs/latest/docs/references/sdk/task#taskinit )
2. Specify your S3 credentials in the clearml.conf file (which you already did)
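For example, something like this (the bucket name and path are placeholders; the credentials themselves stay in clearml.conf):
` from clearml import Task

task = Task.init(
    project_name="examples",   # placeholder
    task_name="s3 output",     # placeholder
    output_uri="s3://my-bucket/clearml-artifacts",  # everything the task outputs is uploaded here
) `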
hey TenderCoyote78
Here is an example of how to dump the plots to jpeg files
` from clearml.backend_api.session.client import APIClient
from clearml import Task
import plotly.io as plio

task = Task.get_task(task_id='xxxxxx')
client = APIClient()

# retrieve all the plots reported by the task
t = client.events.get_task_plots(task=task.id)

for i, plot in enumerate(t.plots):
    # each plot is stored as a plotly JSON string
    fig = plio.from_json(plot['plot_str'])
    plio.write_image(fig=fig, file=f'./my_plot_{i}.jpeg') `
Concerning how to use ParameterSet:
I first declare the set:
my_param_set = ParameterSet([
    {'General/batch_size': 32, 'General/epochs': 30},
    {'General/batch_size': 64, 'General/epochs': 20},
    {'General/batch_size': 128, 'General/epochs': 10}
])
This is a very basic example; it is also possible to use more complex things in the set (see https://clear.ml/docs/latest/docs/references/sdk/hpo_parameters_parameterset/ for UniformParameterRange usage in ParameterSet).
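For instance, a rough sketch mixing fixed values with a UniformParameterRange (the parameter names are illustrative):
` from clearml.automation.parameters import ParameterSet, UniformParameterRange

my_param_set = ParameterSet([
    # a fixed combination
    {'General/batch_size': 32, 'General/epochs': 30},
    # a combination where the learning rate is swept over a uniform range
    {'General/batch_size': 64,
     'General/lr': UniformParameterRange('General/lr', min_value=1e-4, max_value=1e-2, step_size=1e-4)},
]) `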
Then I do ...
What bothers me is that it worked until yesterday, and you didn't change your code. So the only thing I can think of is a bug introduced with the new SDK version that was released yesterday. I am investigating with the SDK team, I will keep you updated ASAP! 🙂
Hi UnevenDolphin73
I have reproduced the error :
Here is the behavior of that line, depending on the version: StorageManager.download_folder('s3://mybucket/my_sub_dir/files', local_folder='./')
1.3.2 downloads the my_sub_dir content directly into ./
1.4.x downloads the my_sub_dir content into ./my_sub_dir/ (so the dotenv module can't find the file)
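In the meantime, a small workaround sketch, assuming clearml 1.4.x and that the file you need is a .env read with python-dotenv (the exact local path is hypothetical, adjust it to wherever the file really lands):
` import os

from clearml import StorageManager
from dotenv import load_dotenv  # assuming python-dotenv is what reads the file

StorageManager.download_folder('s3://mybucket/my_sub_dir/files', local_folder='./')

# with 1.4.x the content lands under ./my_sub_dir/... instead of directly in ./
load_dotenv(dotenv_path=os.path.join('.', 'my_sub_dir', 'files', '.env')) `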
Please keep in touch if you still have issues, or if this helps you solve them.
hi FiercePenguin76
Can you also send your clearml package versions?
I would like to sum up your issue, so that you can check I got it right:
- you have a task with a model, which you use to run inference on a dataset
- you clone the task and would like to run inference on the dataset, but with another model
- the problem is that the cloned task still references the first model...
How have you registered the second model ? Also can you share your logs ?
OK, so here is the example.
The idea is to use the API to reproduce what the WebUI does.
` from clearml.backend_api.session.client import APIClient
from clearml import Task

task = Task.get_task(task_id='xxxx')
# or Task.get_task(project_name='xxx', task_name='xxx')

client = APIClient()
# dump the full task data, the same data the WebUI displays
my_data = client.tasks.get_by_id(task.id).to_dict()

with open('./my_data.csv', 'w') as f:
    for key in my_data.keys():
        f.write("%s, %s\n" % (key, my_data[key])) `
Interesting. We are opening a discussion to weigh the pros and cons of those different approaches - I'll of course keep you updated.
Could you please open a GitHub issue about that topic? 🙏
http://github.com/allegroai/clearml/issues
hi GentleSwallow91
Concerning the warning message, there is an entry in the FAQ. Here is the link :
https://clear.ml/docs/latest/docs/faq/#resource_monitoring
We are working on reproducing your issue
What do you mean? The average time that tasks wait before being executed by an agent, that is to say the average difference between enqueue time and start time?
Do you think you could send us a bit of code, so that we can better understand how to reproduce the bug? In particular, how you use dotenv...
So far, something like this is working normally with both clearml 1.3.2 & 1.4.0:
`
task = Task.init(project_name=project_name, task_name=task_name)
img_path = os.path.normpath("**/Images")
img_path = os.path.join(img_path, "*.png")
print("==> Uploading to Azure")
remote_url = "azure://****.blob.core.windows.net/*****/"
StorageManager.uplo...
Hi UnevenDolphin73
Let me summarize, so that I'll be sure I got it 🙂
- I have a minio server at some_ip on port 9000, which contains a clearml bucket
- If I do StorageManager.download_folder(remote_url='s3://some_ip:9000/clearml', local_folder='./', overwrite=True)
- Then I'll have a clearml bucket directory created in ./ (local_folder), which will contain the bucket files
Hi WickedElephant66, how are you?
Could you try again, providing "s3://localhost:9000/bucket-name"?
Btw, can you send a screenshot of your clearml-agent list output and of the UI please?
I am not sure I get you here.
Pip installing clearml-agent does not fire up any agent. The procedure is that after installing the package, if there isn't any config file yet, you run clearml-agent init
and enter the credentials, which are then stored in clearml.conf. If there already is a conf file, you simply edit it and enter the credentials manually. So I don't understand what you mean by "remove it".
Hi MotionlessCoral18
You need to run some scripts when migrating, to update your old experiments. I am going to try to find you some examples.
I have found some threads that deal with your issue and propose interesting solutions. Can you have a look at them?
You can also specify a package, with or without pinning its version:
https://clear.ml/docs/latest/docs/references/sdk/task#taskadd_requirements
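For example (package names and versions below are just illustrative; add_requirements has to be called before Task.init):
` from clearml import Task

# record extra requirements for the task, before Task.init
Task.add_requirements("scikit-learn")       # no version: latest available
Task.add_requirements("pandas", "1.4.2")    # pinned version

task = Task.init(project_name="examples", task_name="requirements demo")  # placeholder names `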
Hi,
It would be great if you could also send your clearml package version 🙂
hey OutrageousSheep60
What about the process? There must be a clearml-agent process running somewhere, and that is why it can continue reporting to the server.
When the pipeline or any step is executed, a task is created, and its name is taken from the decorator parameters. Additionally, for a step, the name parameter is optional: if not provided, the function name is used instead.
It seems to me that your script fails to create the pipeline controller task because it fails to pull the name parameter, which is weird... weird because in the last error line we can see that name!
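For reference, a minimal sketch of how the names are picked up (all names here are made up):
` from clearml.automation.controller import PipelineDecorator

# if name were omitted here, the step task would simply be named "train"
@PipelineDecorator.component(return_values=["model_path"], name="train_step")
def train(dataset_id):
    model_path = "model.pkl"  # placeholder work
    return model_path

# the controller task takes its name and project from these parameters
@PipelineDecorator.pipeline(name="my_pipeline", project="examples", version="1.0")
def run_pipeline(dataset_id="1234"):
    train(dataset_id) `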
Hi SteepDeer88
I wrote this script to try to reproduce the error. I am passing more than 50 parameters and so far everything works fine. Could you please give me some more details about your issue, so that we can reproduce it?
from clearml import Task
import argparse
'''
COMMAND LINE:
python -m my_script --project_name my_project --task_name my_task --execute_remotely true --remote_queue default --param_1 parameter...
Hi WittyOwl57 ,
The function is :
task.get_configuration_object_as_dict(name="name")
with task being your Task object.
You can find a bunch of pretty similar functions in the docs. Have a look here: https://clear.ml/docs/latest/docs/references/sdk/task#get_configuration_object_as_dict
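A small usage sketch (the task id and the "General" section name are placeholders):
` from clearml import Task

task = Task.get_task(task_id='xxxxxx')  # placeholder id
# returns the configuration object stored under that name as a python dict
config = task.get_configuration_object_as_dict(name="General")
print(config) `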
btw here is the content of the imported file:
` import torch
from torchvision import datasets, transforms
import os

MY_GLOBAL_VAR = 32

def my_dataloder():
    return torch.utils.data.DataLoader(
        datasets.MNIST(os.path.join('./', 'data'), train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor()
` ...
I see some points that you should fix:
- in the train step you return 2 items, but you have only one in its decorator: add mock (see the sketch below)
- do you really need to init a task in the pipeline controller? You will automatically get one when executing the pipeline
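For the first point, something like this sketch (names are made up), so that the decorator declares both returned items:
` from clearml.automation.controller import PipelineDecorator

# declare as many return values as the function actually returns
@PipelineDecorator.component(return_values=["model", "mock"])
def train(data):
    model = "trained-model"  # placeholder outputs for the sketch
    mock = "extra-output"
    return model, mock `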