Hi @<1523705721235968000:profile|GrittyStarfish67> ! This looks like a boto3 error. You could try lowering sdk.aws.s3.boto3.max_multipart_concurrency in clearml.conf and setting max_workers=1 when calling Dataset.get_local_copy
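A minimal sketch of both changes, assuming a recent clearml version where Dataset.get_local_copy accepts max_workers (the dataset ID is a placeholder). In clearml.conf:
` # clearml.conf
sdk.aws.s3.boto3.max_multipart_concurrency: 1 `
And in code:
` from clearml import Dataset

ds = Dataset.get(dataset_id="<dataset_id>")  # placeholder ID
local_path = ds.get_local_copy(max_workers=1)  # single download worker `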
That's unfortunate. Looks like this is indeed a problem 😕 We will look into it and get back to you.
Hi RoundMole15 ! Are you able to see a model logged when you run this simple example?
` from clearml import Task
import torch.nn.functional as F
import torch.nn as nn
import torch

class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        s...
Hi @<1523703107031142400:profile|FlatOctopus65> ! python3.9 introduced a breaking change for codebases that parse code containing slices. You can read more about it here: None . Notably:
* The code that produces Python code from an AST will need to handle indexing with tuples specially (see Tools/parser/unparse.py), because d[(a, b)] is valid syntax (although the parentheses are redundant), but d[(a, b:c)] is not.
What you could do is downgrade to...
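For illustration, a minimal sketch of what changed (assuming python3.9+, where ast.Index was removed and Subscript.slice holds the expression directly):
` import ast

# In python3.9+, d[a, b] parses to Subscript(slice=Tuple(...)) with no Index
# wrapper, so code that rebuilds source from the AST must special-case tuple
# slices: d[(a, b)] is valid, but d[(a, b:c)] is not
tree = ast.parse("d[a, b]", mode="eval")
print(ast.dump(tree.body.slice)) `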
Hi LittleShrimp86 ! Looks like something is broken. We are looking into it
Anyhow, there is a serialization_function argument you could use in upload_artifact. I could imagine that we don't properly serialize your artifacts. You could use the argument to pass a callback that would efficiently serialize the artifact. Notice that getting the artifact back requires a deserialization function
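A minimal sketch, assuming a clearml version that supports serialization_function (the pickle-based serializers here are just placeholders for your own efficient ones):
` import pickle
from clearml import Task

def to_bytes(obj):
    # placeholder serializer; must return bytes
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

def from_bytes(blob):
    return pickle.loads(blob)

task = Task.init(project_name="examples", task_name="custom-serialization")
task.upload_artifact("my_obj", artifact_object={"a": 1}, serialization_function=to_bytes)

# getting the artifact back requires the matching deserialization function
loaded = Task.get_task(task_id=task.id).artifacts["my_obj"].get(deserialization_function=from_bytes) `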
Hi @<1576381444509405184:profile|ManiacalLizard2> ! Can you please share a code snippet that I could run to investigate the issue?
Don't call PipelineController functions after start has finished. Use a post_execute_callback instead
` from clearml import PipelineController

def some_step():
    return

def upload_model_to_controller(controller, node):
    print("Start uploading the model")

if __name__ == "__main__":
    pipe = PipelineController(name="Yolo Pipeline Controller", project="yolo_pipelines", version="1.0.0")
    pipe.add_function_step(
        name="some_step",
        function=some_st...
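The snippet above is cut off; a hedged sketch of how the callback would be attached, continuing the same code and assuming add_function_step accepts post_execute_callback as in recent clearml versions:
`     pipe.add_function_step(
        name="some_step",
        function=some_step,
        # runs on the controller once the step finishes
        post_execute_callback=upload_model_to_controller,
    )
    pipe.start() `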
Hi @<1628202899001577472:profile|SkinnyKitten28> ! What code do you see that is being captured?
Please add it to github! No other info is needed, we know what the issue is
Can you see your task if you run this minimal example UnevenDolphin73 ?
` from clearml import Task, Dataset
task = Task.init(task_name="name_unique", project_name="project")
d = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
d.upload()
d.finalize() `
Can you please provide a minimal example that reproduces this?
Hi @<1691620883078057984:profile|ConfusedSeaanemone5> ! Those are the only 3 charts that the HPO constructs and reports. You could construct other charts/plots yourself and report them when a job completes using the job_completed_callback parameter.
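For reference, a hedged sketch of such a callback, following the signature used in the clearml HPO examples (the exact parameter name and signature may differ by version):
` def job_completed_callback(job_id, objective_value, objective_iteration, job_parameters, top_performance_job_id):
    # build and report any extra chart/plot here once a job completes
    print(f"Job {job_id} completed with objective {objective_value}") `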
Hi @<1668427950573228032:profile|TeenyShells80> , the parent_datasets should be a list of dataset IDs or clearml.Dataset objects, not dataset names. Maybe that is the issue
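A short sketch (project and dataset names are placeholders):
` from clearml import Dataset

parent = Dataset.get(dataset_project="my_project", dataset_name="parent_ds")  # placeholder names
child = Dataset.create(
    dataset_name="child_ds",
    dataset_project="my_project",
    parent_datasets=[parent.id],  # IDs (or Dataset objects), not names
) `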
Hi @<1571308003204796416:profile|HollowPeacock58> ! The changes should be reflected. Do you have a small example that could help us reproduce the issue?
I left another comment today. It’s about something raising an exception when creating a set from the file entries
We would appreciate a PR! Just open a GH issue, then the PR, and we will review it
Hi @<1523707653782507520:profile|MelancholyElk85> ! I left you a comment on the PR
Yes, so even if you use a docker image with 3.8, the agent doesn't really know that you have 3.8 installed. If it is run with 3.9, it will assume that is the desired version you want to use. So you need to change it in the config.
Not really sure why default_python is ignored (we will need to look into this), but python_binary should work...
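A hedged clearml.conf sketch (the interpreter path is just an example for a python3.8 docker image):
` agent {
    # point the agent at the interpreter you actually want inside the container
    python_binary: "/usr/bin/python3.8"
} `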
UnevenDolphin73 looking at the code again, I think it is actually correct. It's a bit hackish, but we do use deferred_init as an int internally. Why do you need to close the task exactly? Do you have a script that would highlight the behaviour change between <1.8.1 and >=1.8.1?
@<1523707653782507520:profile|MelancholyElk85> my bad, I forgot to press on "Submit Review" :face_palm:
@<1526734383564722176:profile|BoredBat47> Yeah. This is an example:
s3 {
    key: "mykey"
    secret: "mysecret"
    region: "us-east-1"
    credentials: [
        {
            bucket: ""
            key: "mykey"
            secret: "mysecret"
            region: "us-east-1"
        },
    ]
}
# some other config
default_output_uri: ""
ShinyPuppy47 Try this: use task = Task.init(...) (no create), then call task.set_base_docker
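A short sketch (the image name is just an example):
` from clearml import Task

task = Task.init(project_name="examples", task_name="base-docker")
# set the docker image the agent should use when executing this task remotely
task.set_base_docker("nvidia/cuda:11.8.0-runtime-ubuntu22.04") `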
Btw, to specify a custom package, add the path to that package to your requirements.txt (the path can also be a github link for example).
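For example (both lines are placeholders; pip accepts local paths and git URLs as requirement lines):
` # requirements.txt
./libs/my_custom_package
git+https://github.com/someuser/somerepo.git@main#egg=my_custom_package `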
Hi @<1523703652059975680:profile|ThickKitten19> ! Could you try increasing the max_iteration_per_job and check if that helps? Also, any chance that you are fixing the number of epochs to 10, either through a hyper_parameter, e.g. DiscreteParameterRange("General/epochs", values=[10]), or it is simply fixed to 10 when you are calling something like model.fit(epochs=10)?
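For reference, a hedged sketch of where max_iteration_per_job is set (task ID and metric names are placeholders):
` from clearml.automation import DiscreteParameterRange, HyperParameterOptimizer, RandomSearch

optimizer = HyperParameterOptimizer(
    base_task_id="<base_task_id>",  # placeholder
    hyper_parameters=[DiscreteParameterRange("General/epochs", values=[10, 50, 100])],
    objective_metric_title="validation",  # placeholder metric
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=RandomSearch,
    max_iteration_per_job=100000,  # raise this so jobs are not stopped early
) `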
Hi @<1603198163143888896:profile|LonelyKangaroo55> ! Each pipeline component runs in a task. So you first need the IDs of each component you want to query. Then you can use Task.get_task None to get the task object, then Task.get_status to get the status None .
To get the ids, you can use something like [None](https://clear.ml/docs/...
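A small sketch, assuming you already have a component's task ID (the ID below is a placeholder):
` from clearml import Task

component_task = Task.get_task(task_id="<component_task_id>")  # placeholder ID
print(component_task.get_status())  # e.g. "completed", "failed", "in_progress" `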
can you try setting the repo when calling add_function_step ?
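A hedged sketch, assuming a clearml version where add_function_step accepts repo/repo_branch (pipe, some_step, and the URL are placeholders from your own pipeline):
` pipe.add_function_step(
    name="some_step",
    function=some_step,
    repo="https://github.com/someuser/somerepo.git",  # placeholder repo URL
    repo_branch="main",
) `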
in the meantime, we should have fixed this. I will ping you when 1.9.1 is out to try it out!