Hi RoughTiger69! Can you try adding the files using a Python script, so that we can get an exception traceback? Something like this:
```
from clearml import Dataset

# or just use the ID of the dataset you previously created instead of creating a new one
parent_dataset = Dataset.create(dataset_name="xxxx", dataset_project="yyyyy", output_uri=" ")
parent_dataset.add_files("folder1")
parent_dataset.upload()
parent_dataset.finalize()

child_dataset = Dataset.create(dataset_name="xxxx", dat...
```
@<1643060801088524288:profile|HarebrainedOstrich43> you are right. We actually attempt to copy the default arguments as well. What happens is that we aggregate these arguments in the kwargs dict, then we dump str(kwargs) into the script of the pipeline step. The problem is that str(dict) actually calls __repr__ on each key/value of the dict, so you end up with repr(MyEnum.FALSE) in your code, which is <MyEnum.FALSE: 'FALSE'> . One way to work around this is to add somet...
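To illustrate the issue, here is a minimal standalone snippet (plain Python with a hypothetical MyEnum, not ClearML code) showing why dumping str(kwargs) produces an invalid repr:
```
from enum import Enum

class MyEnum(Enum):
    FALSE = "FALSE"

kwargs = {"flag": MyEnum.FALSE}

# str() on a dict calls __repr__ on each key/value, so the enum is
# rendered as "<MyEnum.FALSE: 'FALSE'>", which is not valid Python
# if the string is pasted into a generated script
print(str(kwargs))  # {'flag': <MyEnum.FALSE: 'FALSE'>}
```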
Hi @<1633638724258500608:profile|BitingDeer35> ! You could attach the configuration using set_configuration_object in a pre_execute_callback .
Basically, you would have something like:
```
def pre_callback(pipeline, node, params):
    node.job.task.set_configuration_object(config)...
```
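As a rough sketch of how the callback could be wired into a pipeline (the pipeline/step names, the config contents, and "my_config" are all placeholders):
```
from clearml import PipelineController

config_text = "section_a:\n  value: 1\n"  # placeholder configuration

def step_one():
    print("running step one")

def pre_callback(pipeline, node, params):
    # attach the configuration to the step's task before it is launched
    node.job.task.set_configuration_object(name="my_config", config_text=config_text)

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0")
pipe.add_function_step(
    name="step_one",
    function=step_one,
    pre_execute_callback=pre_callback,
)
pipe.start_locally(run_pipeline_steps_locally=True)
```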
Yes, you need to call the function every time. The remote run might have some parameters populated which you can use, but the pipeline function needs to be called if you actually want to run the pipeline.
@<1643060801088524288:profile|HarebrainedOstrich43> we released 1.14.1 as an official version
Haha, should not be too complicated to add one. We will consider it. Thanks for reporting the issue
UnevenDolphin73 looking at the code again, I think it is actually correct. It's a bit hackish, but we do use deferred_init as an int internally. Why do you need to close the task exactly? Do you have a script that would highlight the behaviour change between <1.8.1 and >=1.8.1?
Hi! Can you please provide us with code that would help us reproduce this issue? Is it just downloading from GCP?
Hi @<1523705721235968000:profile|GrittyStarfish67> ! This looks like a boto3 error. You could try lowering sdk.aws.s3.boto3.max_multipart_concurrency in clearml.conf and setting max_workers=1 when calling Dataset.get_local_copy
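A minimal sketch of both changes (the dataset name/project are placeholders):
```
from clearml import Dataset

ds = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")
# limit download parallelism to a single worker
local_path = ds.get_local_copy(max_workers=1)
```
and in clearml.conf :
```
sdk {
    aws {
        s3 {
            boto3 {
                max_multipart_concurrency: 1
            }
        }
    }
}
```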
@<1523703472304689152:profile|UpsetTurkey67> It would be great if you could write a script we could use to reproduce
Hi @<1618418423996354560:profile|JealousMole49> ! To disable previews, you need to set all of the values below to 0 in clearml.conf (see the sketch after the list):
dataset.preview.media.max_file_size
dataset.preview.tabular.table_count
dataset.preview.tabular.row_count
dataset.preview.media.image_count
dataset.preview.media.video_count
dataset.preview.media.audio_count
dataset.preview.media.html_count
dataset.preview.media.json_count
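In clearml.conf , that would look something like this (just a sketch, with every preview value zeroed out):
```
sdk {
    dataset {
        preview {
            media {
                max_file_size: 0
                image_count: 0
                video_count: 0
                audio_count: 0
                html_count: 0
                json_count: 0
            }
            tabular {
                table_count: 0
                row_count: 0
            }
        }
    }
}
```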
Also, I believe you could go through each dataset and remove the `Datase...
Hi RoundMole15 ! Are you able to see a model logged when you run this simple example?
```
from clearml import Task
import torch.nn.functional as F
import torch.nn as nn
import torch

class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        s...
```
Regarding number 2: that is indeed a bug, and we will try to fix it as soon as possible.
DangerousDragonfly8 Yes, this is correct, we mixed up the places we call these functions.
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! Can't you just get the values of the hyperparameters and the losses, plot them with something like matplotlib, and then report the plot to ClearML?
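A quick sketch of what that could look like (the values here are made up for illustration):
```
import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="hparam-plot")

# made-up hyperparameter values and corresponding losses
learning_rates = [0.001, 0.01, 0.1]
losses = [0.35, 0.22, 0.41]

fig = plt.figure()
plt.plot(learning_rates, losses, marker="o")
plt.xlabel("learning rate")
plt.ylabel("loss")

# report the figure to ClearML
task.get_logger().report_matplotlib_figure(
    title="loss vs learning rate", series="sweep", figure=fig, iteration=0
)
```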
@<1626028578648887296:profile|FreshFly37> I see that create_dataset doesn't have a repo set. Can you try setting it manually via the repo , repo_branch and repo_commit arguments of the add_function_step method?
Hi @<1570583237065969664:profile|AdorableCrocodile14> ! get_local_copy will always copy/download external files to a folder. To get the external files, there is a property on the dataset called link_entries , which returns a list of LinkEntry objects. Each object has a link attribute that points to an external file (in this case, your local paths prefixed with file:// ).
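For example, a quick sketch of listing those links (dataset name/project are placeholders):
```
from clearml import Dataset

ds = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")

# each LinkEntry points to an external file rather than a copied one
for entry in ds.link_entries:
    print(entry.link)  # e.g. file:///home/user/data/image_0.png
```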
MotionlessCoral18 If you provide the model as a hyperparam, then I believe you should query its value by calling https://clear.ml/docs/latest/docs/references/sdk/task/#get_parameters or https://clear.ml/docs/latest/docs/references/sdk/task/#get_parameter
Hi @<1633638724258500608:profile|BitingDeer35> ! Looks like the SDK doesn't currently allow creating steps/controllers with a designated cwd. For now, you will need to call the set_script function on your steps' tasks and on the controller.
For the controller: if you are using the PipelineDecorator, you can do something like PipelineDecorator._singleton._task.set_script(working_dir="something") before running the pipeline function. In the case of a regular `PipelineControll...
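A sketch of the decorator case (note that _singleton is a private attribute and may change between versions; the names and working_dir value are placeholders):
```
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.pipeline(name="my-pipeline", project="examples", version="1.0")
def run_pipeline():
    pass  # placeholder pipeline body

# set the working directory on the controller's task
# before actually calling the pipeline function
PipelineDecorator._singleton._task.set_script(working_dir="my/sub/dir")
run_pipeline()
```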
Hi ShortElephant92 ! Random images, audio files, tables (trimmed to a few rows) are sent as Debug Samples for preview. By default, they are sent to our servers. Check this function if you wish to log the samples to another destination https://clear.ml/docs/latest/docs/references/sdk/logger/#set_default_upload_destination .
You could also add these entries to your clearml.conf to not send any samples for preview:
```
sdk.dataset.preview.tabular.table_count: 0
sdk.dataset.preview.media.i...
```
Hi!
It is possible to use the same queue for the controller and the steps, but there needs to be at least 2 agents that pull tasks from that queue. Otherwise, if there is only 1 agent, then that agent will be busy running the controller and it won't be able to fetch the steps.
Regarding missing local packages: the step is run in a temporary directory that is different from the directory the script is originally in. To solve this, you could add all the modules/files you are interested in in a...
PanickyMoth78 You might also want to set some lower values for sdk.google.storage.pool_connections/pool_maxsize in your clearml.conf . Newer clearml versions set max_workers to 1 by default, and the number of connections should be tweaked using these values. If it doesn't help, please let us know.
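Something like this in clearml.conf (the values are just a starting point to experiment with):
```
sdk {
    google {
        storage {
            pool_connections: 8
            pool_maxsize: 8
        }
    }
}
```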
Can you see your task if you run this minimal example UnevenDolphin73 ?
```
from clearml import Task, Dataset

task = Task.init(task_name="name_unique", project_name="project")
d = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
d.upload()
d.finalize()
```
@<1545216070686609408:profile|EnthusiasticCow4>
This:
```
parent = self.clearml_dataset = Dataset.get(
    dataset_name="[LTV] Dataset",
    dataset_project="[LTV] Lifetime Value Model",
)
# generate the local dataset
dataset = Dataset.create(
    dataset_name=f"[LTV] Dataset",
    parent_datasets=[parent],
    dataset_project="[LTV] Lifetime Value Model",
)
```
should l...
Hi @<1724235687256920064:profile|LonelyFly9> ! I assume in this case we fail to retrieve the dataset? Can you provide an example when this happens?
Hi JitteryCoyote63 ! Your clearml-agent is likely run with python3.9. Can you try setting this entry https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/docs/clearml.conf#L48 in your clearml.conf to python3.8 , or to the full path of python3.8 if that doesn't work?
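That entry lives under the agent section; a sketch (the interpreter path is an example):
```
agent {
    # force the agent to use this interpreter when creating virtual environments
    python_binary: "/usr/bin/python3.8"
}
```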
With that said, can I run another thing by you related to this? What do you think about a PR that adds the functionality I originally assumed schedule_function was for? By this I mean: adding a new parameter (this wouldn't change anything about schedule_function or how .add_task() currently behaves) that also takes a function, but one that expects to receive a task_id when called. This function is run at runtime (when the task scheduler would normally execute the scheduled task) and use ...
Could you try adding region under credentials as well?
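For instance, in clearml.conf (bucket/key/secret/region values are placeholders):
```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    bucket: "my-bucket"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    region: "us-east-1"
                }
            ]
        }
    }
}
```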
Something like:
```
dataset = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project, parent_datasets=[dataset.id])
```