You could try this in the meantime if you don't mind temporary workarounds:
```python
dataset.add_external_files(source_url="...", wildcard=["file1.csv"], recursive=False)
```
Hi @<1523703652059975680:profile|ThickKitten19> ! Could you try increasing the `max_iteration_per_job` and check if that helps? Also, any chance you are fixing the number of epochs to 10, either through a hyperparameter, e.g. `DiscreteParameterRange("General/epochs", values=[10])`, or simply to 10 when calling something like `model.fit(epochs=10)`?
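If it helps, here is a minimal sketch of where `max_iteration_per_job` fits in an optimizer setup; the base task ID, metric names, and numbers are placeholders, not values from your setup:
```python
from clearml.automation import (
    DiscreteParameterRange,
    HyperParameterOptimizer,
    RandomSearch,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<your_base_task_id>",   # placeholder
    hyper_parameters=[
        DiscreteParameterRange("General/epochs", values=[10]),
    ],
    objective_metric_title="validation",  # placeholder metric
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=RandomSearch,
    max_iteration_per_job=100000,         # raise this so jobs are not stopped early
    total_max_jobs=20,
)
optimizer.start()
```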
Hi BoredBat47 ! What `jsonschema` version are you using?
Hi @<1657918706052763648:profile|SillyRobin38> ! If it is compatible with http/rest, you could try setting `api.files_server` to the endpoint, or `sdk.storage.default_output_uri`, in `clearml.conf` (depending on your use case).
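For example, a minimal `clearml.conf` fragment; the URL is a placeholder for your endpoint:
```
api {
    # http/rest-compatible storage endpoint (placeholder URL)
    files_server: "https://files.example.com"
}
```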
Hi RoughTiger69 ! Can you try adding the files using a Python script, so we can get an exception traceback? Something like this:
```python
from clearml import Dataset

# or just use the ID of the dataset you previously created instead of creating a new one
parent_dataset = Dataset.create(dataset_name="xxxx", dataset_project="yyyyy", output_uri=" ")
parent_dataset.add_files("folder1")
parent_dataset.upload()
parent_dataset.finalize()

child_dataset = Dataset.create(dataset_name="xxxx", dat...
```
Hi @<1590514584836378624:profile|AmiableSeaturtle81> ! What function are you using to upload the data?
Pruning old ancestors sounds like the right move for now.
Hi @<1654294828365647872:profile|GorgeousShrimp11> ! `add_tags` is an instance method, so you will need the controller instance to call it. To get the controller instance, you can call `PipelineDecorator.get_current_pipeline()`, then call `add_tags` on the returned value. So: `PipelineDecorator.get_current_pipeline().add_tags(tags=["tag1", "tag2"])`
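For instance, a minimal sketch with placeholder pipeline/project names:
```python
from clearml import PipelineDecorator

@PipelineDecorator.pipeline(name="my-pipeline", project="examples", version="1.0.0")
def my_pipeline():
    # tag the running controller from inside the pipeline body
    PipelineDecorator.get_current_pipeline().add_tags(tags=["tag1", "tag2"])
    # ... pipeline steps go here ...
```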
Hi UnevenDolphin73 ! We were able to reproduce the issue. We'll ping you once we have a fix as well 🙂
Hi @<1533257278776414208:profile|SuperiorCockroach75> ! Try setting `packages` in your pipeline component to your `requirements.txt`, or simply add the list of packages (with their specific versions).
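Something like this sketch; the package pins and function bodies are placeholders:
```python
from clearml import PipelineDecorator

# point the component at a requirements file...
@PipelineDecorator.component(packages="./requirements.txt")
def preprocess(data):
    ...

# ...or pin the packages inline
@PipelineDecorator.component(packages=["pandas==2.0.3", "scikit-learn==1.3.0"])
def train(data):
    ...
```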
@<1654294828365647872:profile|GorgeousShrimp11> Any chance your queue is actually named `megan-testing` and not `megan_testing`?
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! Can't you just get the values of the hyperparameters and the losses, then plot them with something like `matplotlib` and report the plot to ClearML? Maybe you want to use other functions than the ones I quoted, so feel free to read the docs; you should be able to do this.
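Roughly along these lines; the hyperparameter values, losses, and plot labels are made up for illustration, and it assumes a task has been initialized:
```python
import matplotlib.pyplot as plt
from clearml import Task

# placeholder values you would collect from your runs
lrs = [0.001, 0.01, 0.1]
losses = [0.35, 0.28, 0.40]

fig = plt.figure()
plt.plot(lrs, losses, marker="o")
plt.xlabel("learning rate")
plt.ylabel("loss")

# report the figure to the current task
Task.current_task().get_logger().report_matplotlib_figure(
    title="LR vs loss", series="hpo", figure=fig, iteration=0
)
```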
Your object is likely holding a file descriptor or something like that. The pipeline steps all run in separate processes (they can even run on different machines when running remotely), so you need to make sure the objects you return are picklable and can be passed between these processes. You can check that the logger you are passing around is indeed picklable by calling `pickle.dumps` on it and then loading the result in another run (see the sketch below).
The best practice would ...
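A minimal sketch of that round-trip check; `check_picklable` is just a helper name for illustration:
```python
import pickle

def check_picklable(obj):
    """Round-trip an object through pickle, which is effectively what
    passing it between pipeline-step processes does."""
    blob = pickle.dumps(obj)  # raises if obj holds e.g. an open file descriptor
    return pickle.loads(blob)

# example: this would raise TypeError, since open file objects are not picklable
# check_picklable(open("some_file.txt"))
```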
I am honestly not sure if it will work, but we do have an HTTP driver that could query your endpoint. It's worth giving it a try.
Hi @<1569496075083976704:profile|SweetShells3> ! Can you reply with some example code showing how you tried to use `pl.Trainer` with `launch_multi_node`?
I left another comment today. It's about something raising an exception when creating a set from the file entries.
@<1590514584836378624:profile|AmiableSeaturtle81> if you wish for your debug samples to be uploaded to S3 you have 2 options: you can either call `set_default_upload_destination` in your script (see the sketch below), or you can change the `api.files_server` entry to your S3 bucket in `clearml.conf`. This way you wouldn't need to call `set_default_upload_destination` every time you run a new script.
Also, in `clearml.conf`, you can change `sdk.deve...
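For the first option, a minimal sketch; the project, task, and bucket names are placeholders:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="debug-samples")
# per-run alternative to editing clearml.conf
task.get_logger().set_default_upload_destination("s3://my-bucket/debug-samples")
```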
You should alter the name (or else the model will be overwritten)
are you running this locally or are you enqueueing the task (controller)?
@<1566596968673710080:profile|QuaintRobin7> not for now. Could you please open a GH issue about it? Maybe we can fit this in a future patch.
can you share the logs of the controller?
Hi FreshParrot56 ! This is currently not supported 🙂
Hi @<1603198134261911552:profile|ColossalReindeer77> ! The usual workflow is that you modify the fields of your remote run in either the Hyperparameters section or the Configuration section, but not usually both (as in Hydra's case). When using CLI tools, people mostly modify the Hyperparameters section, so we chose to set `allow_omegaconf_edit` to False by default for parity.
HandsomeGiraffe70 your conf file should look something like this:
```
{
    # ClearML - default SDK configuration
    storage {
        cache {
            # Defaults to system temp folder / cache
            default_base_dir: "~/.clearml/cache"
            # default_cache_manager_size: 100
        }
        direct_access: [
            # Objects matching are considered to be available for direct access, i.e. they will not be downloaded
            # or cached, and any download request will ...
```
FlutteringWorm14 we do batch the reported scalars. The flow is like this: the task object creates a `Reporter` object, which spawns a daemon in another child process that batches multiple report events. The batching happens after a certain time in the child process, or the parent process can force it after a certain number of report events have been queued.
You could try this hack to achieve what you want:
```python
from clearml import Task
from clearml.backend_interface.metrics.repor...
```
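Separately, if all you need is to push queued events out at a specific point, a sketch using the public API instead of the internal hack above (assuming a running task):
```python
from clearml import Task

task = Task.current_task()
# ask the SDK to send any batched report events now, waiting for uploads
task.flush(wait_for_uploads=True)
```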
Hi @<1668427950573228032:profile|TeenyShells80> , the `parent_datasets` should be a list of dataset IDs or `clearml.Dataset` objects, not dataset names. Maybe that is the issue. For example:
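A minimal sketch, with placeholder project/dataset names:
```python
from clearml import Dataset

# look up the parent once, then pass its ID
parent = Dataset.get(dataset_project="examples", dataset_name="my-parent")
child = Dataset.create(
    dataset_name="my-child",
    dataset_project="examples",
    parent_datasets=[parent.id],  # IDs or Dataset objects, not names
)
```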
Hi @<1523701240951738368:profile|RoundMosquito25> ! Try using this function: …